Deep-Mining Netflix
October 26, 2006 9:31 PM

Why is Miss Congeniality the most frequently rated DVD on Netflix? Database magic reveals the most contentious movies ever.
posted by muckster (52 comments total) 4 users marked this as a favorite
 
But I like Birth and Full Frontal and I hate The Shawshank Redemption
posted by dobbs at 9:40 PM on October 26, 2006


The Day After Tomorrow?! I still regret not walking out of the theatre during that. Soooo terrible...
posted by aubilenon at 9:47 PM on October 26, 2006


Wow, that's a very methodical way of approaching it. I'm impressed.
posted by spiderskull at 9:51 PM on October 26, 2006


And all three charts pass the smell test. They all just feel right. Very, very cool.

But who loves Pearl Harbor?
posted by Bookhouse at 9:56 PM on October 26, 2006


Not surprisingly, "contentiousness" has an uneven but inversely proportional relationship to "universal love." (Yeah, I graphed it.)

Mmm... data. Neat.
posted by zennie at 10:29 PM on October 26, 2006


Cool stuff. I suppose this Netflix prize contest is exactly what they mean when they say "crowdsourcing." Interesting it's going for such a long period of time (2011, if I read correctly).
posted by treepour at 10:35 PM on October 26, 2006


Lost in Translation is the second most contentious movie ever? I loved it, and a friend of mine hated it. I also liked The Royal Tenenbaums, but not as much. I think it would be interesting to see who rated the movies what.

Turns out if you want to do well in the rental market, just make a quirky movie that some group of people will love.
posted by delmoi at 10:39 PM on October 26, 2006


I was surprised to see such hatred for Coen brothers movies, I thought they'd be in the contentious category.
posted by BrotherCaine at 10:50 PM on October 26, 2006


Miss Congeniality is most contentious. Hilarious.
posted by weapons-grade pandemonium at 10:55 PM on October 26, 2006


The Day After Tomorrow?! I still regret not walking out of the theatre during that. Soooo terrible...

I thought it was a pretty good comedy.
posted by Tacos Are Pretty Great at 11:01 PM on October 26, 2006


Database magic?!?

While the results are interesting enough, there's nothing at all amazing about the SQL, or the methodology:

Essentially, I'm trying to get the rating of the bottom 83.5 percentile of viewers (all of those below the mean, and all of those within one standard deviation above the mean), thus getting rid of the gushing opinions of those in the top 16.5 percentile. I'll call this the haters' rating. I want to amplify the hatred of the haters, so I'm going to square it. Then I'll take the total number of people rating this movie, and de-emphasize it by taking its square root. Finally, I'll multiply the haters' rating by the normalized population size, and that'll be my aggregate 'hatred' number.

IANAStatistician, but seriously, wtf??? "Let's just arbitrarily square this & square-root that, factor in my shoe size, exclude part of the sample, and voila! A table of figures is produced!" Put it into pretty graphical form, lose the SQL & it would be perfect for a management report, but not much good for anything else.
posted by UbuRoivas at 11:20 PM on October 26, 2006


Fascinating.

I'm a regular reader of the Roger Ebert archives, since he's been around long enough to review most movies I care about, and he's pretty trustworthy, though I do disagree with him a good amount of the time.

What I started to find odd recently is that, no matter what Ebert had rated the movie as, the "User Rating" was almost ALWAYS three and a half stars. No matter the movie. I realized, pretty quickly, that users don't rate movies they feel ambivalent about. They rank movies they either love, or hate. They use either the four-star ranking or zero, and generally love/hate for any given movie runs seven or eight passionate loves for every one or two passionate hates.

User ratings are about as useful as straw polls. Still, I'm not surprised to see the contentiousness list being a list of most movies I love (though I was shocked that Fight Club wasn't on any of the lists.)
posted by Navelgazer at 11:27 PM on October 26, 2006


Oh, wow. Holy shit. Look for this: "These are the lowest-ranked movies, in order of ascending average ranking:"

Look at the chart under it. Look at the ninth item. It's Rise of the Undead.

Err... I directed that. No kidding. And my mama never thought I'd amount to anything.
posted by brundlefly at 11:32 PM on October 26, 2006 [8 favorites]


That would be in the first comment, that is.
posted by brundlefly at 11:39 PM on October 26, 2006


The formula doesn't even seem to do what he claims it does ("[get] rid of the gushing opinions of those in the top 16.5 percentile"). I mean, how does the following exclude the top 16.5%?

POW(5 - (rating_avg + rating_stdev), 2)

Doesn't this mean that something with an average of 6 & a stdev of 1 turns out the same as an ave of 5 & stdev of 2? Or one with 4 & 3? And what the hell is that "5-" about? Does it assume that five is some sort of magical starting point? Can somebody who understands statistics better than me please tell me that this guy isn't just pulling calculations out of his ass?
posted by UbuRoivas at 11:44 PM on October 26, 2006
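
For anyone who wants to check the arithmetic in the comment above, here is a minimal Python sketch of the quoted expression POW(5 - (rating_avg + rating_stdev), 2). The function name hater_term and the scale_max parameter are made up for illustration, and the (average, stdev) pairs are the hypothetical ones from the comment, not real Netflix data.

```python
def hater_term(rating_avg, rating_stdev, scale_max=5):
    """Mirror of POW(5 - (rating_avg + rating_stdev), 2) from the quoted SQL.

    scale_max=5 is an assumption that ratings top out at 5 stars.
    """
    return (scale_max - (rating_avg + rating_stdev)) ** 2


# Hypothetical (average, stdev) pairs taken from the comment, not real data.
pairs = [(6, 1), (5, 2), (4, 3)]

for avg, sd in pairs:
    # Only the sum avg + sd enters the expression, so these all match.
    print(avg, sd, hater_term(avg, sd))
```

All three pairs come out to 4, because only the sum of the average and the standard deviation enters the expression. Nothing here filters out any ratings. The term just measures how far "mean plus one standard deviation" sits below the top of the scale, then squares that gap.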


Well, brundlefly, congratulations on all your success. You smell terrific.
posted by id at 11:49 PM on October 26, 2006


Why use a scalpel where a chisel will do? As far as I can tell, this is no statistician, but someone having a bit of fun with some relatively inconsequential data. The stats are a little wack, but the results make sense for the most part. Enjoy with a grain of salt, brothers.
posted by zennie at 11:53 PM on October 26, 2006


I'm sure I would be right there with you with the film I wrote if only Netflix didn't have a prejudice against porn films.
posted by Astro Zombie at 11:57 PM on October 26, 2006


Um, yeah... the statistics are completely insane. You know how you can dress up a dog in a jacket and a hat and slippers, and give him a little corn-cob pipe, but he'll still lick his balls because he's a dog? This is what happens when someone who doesn't know statistics tries to do statistics. Now that I've safely made it into the "haters" section of reviews for this, I will say that it was an amusing idea, and I enjoyed reading it. 4/5 stars.
posted by Humanzee at 11:58 PM on October 26, 2006 [1 favorite]


So... Rise of the Undead isn't that bad? *dances jig*

/ignores the fact that it made it anywhere near that list
posted by brundlefly at 12:00 AM on October 27, 2006


Oh, hang on. If the movies are rated out of 5, it makes some sense. I was assuming they were rated out of 10, as they are everywhere else in the civilised world, in which case the formula wasn't chiselsense, but batshitsense.
posted by UbuRoivas at 12:00 AM on October 27, 2006


"Metafilter: to amplify the hatred of the haters."

Me, I loved this post. As Zennie said, mmm... data. Neat.
posted by salvia at 12:26 AM on October 27, 2006


So, brundlefly, where can I get Rise of the Undead? Is there a torrent? I'm assuming it's going to be hard to get legitimately in Mexico.
posted by Joakim Ziegler at 12:35 AM on October 27, 2006


Let's just arbitrarily square this

That part at least is pretty standard practice with variance statistics to keep positive and negative differences from canceling each other out. I didn't really look closely at what he was doing (my head hurts today) but I wouldn't rule it out as unreasonable right away. At least he is clearly stating how he got what he did.

That said, this is pretty much somebody just playing around with SQL. Genuine cluster analysis would be better (but probably also harder to communicate). I do think the approach here would be a fantastic way to teach SQL. Imagine getting to play with an interesting dataset instead of Salesman Bob sold X widgets in Tacoma.


I didn't think the Shawshank Redemption was that good..
posted by srboisvert at 12:38 AM on October 27, 2006
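
The point about squaring in the comment above can be illustrated with a small Python sketch. The ratings list is invented for the example and the helper names are hypothetical. Note that in the quoted SQL the squared quantity is 5 - (rating_avg + rating_stdev), not a per-rating deviation, so this shows the general practice rather than that particular formula.

```python
def sum_of_deviations(values):
    """Sum of raw deviations from the mean: positives and negatives cancel."""
    mean = sum(values) / len(values)
    return sum(v - mean for v in values)


def population_variance(values):
    """Mean of squared deviations: squaring keeps every term non-negative."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


# Made-up "love it or hate it" ratings on a 1-5 scale, not real data.
ratings = [1, 5, 1, 5, 3]

print(sum_of_deviations(ratings))    # raw deviations cancel out
print(population_variance(ratings))  # squared deviations do not
```

The raw deviations sum to zero even though the ratings are sharply split, which is why the squared version is the one that captures the spread.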


Based on my own personal experiences I'd say most of the movies on the contentious list are indeed "love it or hate it" type movies. However, it would be interesting to compare the ratings of those who saw a particular movie in the theater with those who've seen the same film, but only on the small screen. After all, you can rate films on Netflix without actually renting them. It is possible that people who saw these films in the theaters added the positive ratings, while those who rented the movies added the negative ratings. The experience of seeing a film in a theater versus on a TV screen can make a big difference sometimes. I remember watching an episode of Ebert and Other Guy that was made shortly after Lost in Translation was released on DVD. Ebert talked about how he'd gotten mail from people who rented the movie and absolutely hated it, and who wondered why he had given the movie such a positive review when it was initially released. Ebert concluded that perhaps the whole atmosphere of the film lent itself more to viewing on a big screen. (I rented Lost in Translation and wasn't impressed, but I didn't think it was a bad film per se. Then again, it takes a lot for me to hate a movie)

...anyway, I'm not really sure where I was going with all that. It's late, I think I need to go to bed now.

Oh, wow. Holy shit. Look for this: "These are the lowest-ranked movies, in order of ascending average ranking:"

Look at the chart under it. Look at the ninth item. It's Rise of the Undead.

Err... I directed that. No kidding. And my mama never thought I'd amount to anything.
posted by brundlefly at 11:32 PM PST on October 26


Hey, any PR is good PR, yeah? ;)
posted by kosher_jenny at 12:42 AM on October 27, 2006


...since I'm paying a great deal to say whatever I like to Stanley Kubrick, one of the greatest directors of all time (although not as good as Shannon Hubell who directed Rise of the Undead)...
posted by hal9k at 1:42 AM on October 27, 2006


The first 5 that got him thinking - the most-rated movies - appear to me to be movies that were critically panned but did well at the box office, reflecting a critic-audience disconnect.

That data's not in the Cinematch system, though.
posted by ikkyu2 at 1:56 AM on October 27, 2006


Ebert concluded that perhaps the whole atmosphere of the film lent itself more to viewing on a big screen.

Yeah, Scarlett Johansson's ass has so much more impact when it's a meter high.
posted by grouse at 3:01 AM on October 27, 2006


This is what happens when someone who doesn't know statistics tries to do statistics.

Yep. Exactly. But all this pseudo-statistical stuff aside for a minute, Miss Congeniality was of course a shitty movie, but it does contain one of my favorite jokes about New Jersey:

Q: Why is New Jersey called "The Garden State"?
A: Because the "Oil and Petrochemical Refinery State" wouldn't fit on a license plate.

Congratulations on your success, brundlefly. 2.4 stars in IMDB is certainly nothing to sneeze at.
posted by psmealey at 3:03 AM on October 27, 2006


These are (with the exception of 'One Hour Photo', which I actually enjoyed) some of the worst movies ever. At least, they're the worst blockbusting movies.

But ... Solaris is on that list. Even the new one was pretty good.

And, brundlefly, I wanna see your movie!
posted by The Great Big Mulp at 6:06 AM on October 27, 2006


The Day After Tomorrow?! I still regret not walking out of the theatre during that. Soooo terrible...

I thought it was a pretty good comedy.


If the wolves escape from the zoo in the first act, they must find their way into an abandoned russian freighter in NYC by the end of the movie.
posted by unsupervised at 6:29 AM on October 27, 2006


I didn't think the Shawshank Redemption was that good..

It must've been pretty damn good, since I generally can't stand Tim Robbins (Bull Durham excepted) and I enjoyed Shawshank.
posted by jonmc at 6:31 AM on October 27, 2006


If the wolves escape from the zoo in the first act

Those were wolves? I thought they were some sort of bear / horse hybrid that had been mutated by the Cold Monster that chased Our Heroes through the building.
posted by ROU_Xenophobe at 6:42 AM on October 27, 2006


Yeah, some of this doesn't do what he thinks it does. For example:

POW(rating_avg - rating_stdev, 2)

which he uses to compute the "love score," doesn't "get the top 83.5 percentile of ratings, ignoring the pessimists and haters in the lowest 16.5 percentile." What it actually does is apply a penalty to movies with more variance in their scores (because he's subtracting the standard deviation from the average rating, and then squaring the result). However, all the standard deviations in that list are very close to 1, so it may not be enough of a penalty to matter.

All the other business about multiplying by the root rating count is just to weight for popularity, so you don't get a bunch of zombie movies nobody's ever seen as your "most hated" list (no offense, brundlefly). To his credit he says that this is what he's doing.
posted by myeviltwin at 6:58 AM on October 27, 2006
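
The two effects described in the comment above (a penalty for variance, and a weighting for popularity) can be seen in a short Python sketch. How the original post combines the pieces is an assumption here: the thread only quotes POW(rating_avg - rating_stdev, 2) and mentions multiplying by the root of the rating count. The function name love_score and all the numbers are made up for illustration.

```python
import math


def love_score(rating_avg, rating_stdev, rating_count):
    """Assumed form: POW(rating_avg - rating_stdev, 2) * SQRT(rating_count)."""
    return (rating_avg - rating_stdev) ** 2 * math.sqrt(rating_count)


# Same average and rating count, different spread (invented numbers).
tight = love_score(4.0, 0.8, 10_000)
split = love_score(4.0, 1.2, 10_000)
print(tight > split)  # the higher-variance movie scores lower

# Same average and spread, different popularity (invented numbers).
obscure = love_score(4.0, 1.0, 100)
popular = love_score(4.0, 1.0, 10_000)
print(popular / obscure)  # 100x the ratings gives 10x the score
```

Subtracting the standard deviation never drops any ratings. It only lowers the score of movies whose ratings are more spread out. If every standard deviation is close to 1, as the comment notes, that penalty is nearly the same for every movie and the ranking is driven mostly by the average and the rating count.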


I was surprised that The Blair Witch Project didn't appear on the "most contentious" list.
posted by DevilsAdvocate at 7:21 AM on October 27, 2006


Err... I directed that.

I'm gonna try to check it out!
posted by sonofsamiam at 7:23 AM on October 27, 2006


I want to see brundlefly's movie too. Miss Congeniality, not so much.
posted by djeo at 8:45 AM on October 27, 2006


These lists were pretty neat - as others have said, they somehow feel right. There are a few on the contentious list that I haven't seen yet, I'll have to check those out and see if I love 'em or hate 'em.
posted by miskatonic at 8:56 AM on October 27, 2006


Interesting post. Thank you.
posted by keijo at 9:17 AM on October 27, 2006


I don't understand the distaste for The Ladykillers. I found that to be a smashing, feel good black comedy for the whole family.

Certainly better than Donnie Darko vs. The Weather Channel.
posted by Uther Bentrazor at 9:21 AM on October 27, 2006


I was surprised that The Blair Witch Project didn't appear on the "most contentious" list.

We are witnessing The Blair Witch Project's slow slide into obscurity. See also, American Beauty -aka "the Joe of the 90s"- which also isn't on here anywhere that I could find.
posted by PinkStainlessTail at 9:46 AM on October 27, 2006


You know how you can dress up a dog in a jacket and a hat and slippers, and give him a little corn-cob pipe

No. No I do not.
posted by yerfatma at 10:01 AM on October 27, 2006


I didn't think the Shawshank Redemption was that good..

Did it remind you of the time the bad men touched you?
posted by Critical_Beatdown at 10:03 AM on October 27, 2006


I'm an extra in Miss Congeniality. I walk over their heads, in the distance, during the kiss scene.
posted by swift at 10:16 AM on October 27, 2006


I knew it!
posted by sonofsamiam at 10:22 AM on October 27, 2006


Rise of the Undead, while not the ninth-worst movie in the world, ain't so hot. Kind of incoherent. We didn't schedule enough time for a whole lot of takes, so a lot of the early dialogue ended up being (poorly) improvised. Also, the first twenty minutes or so are a snooze-fest. A whole lot of first-timer mistakes.

Joakim Ziegler: You can get it on Netflix (obviously), as well as Amazon. There are bittorrents available, and I have no problem with you pursuing that. I still haven't seen a dime for the damn thing.
posted by brundlefly at 10:32 AM on October 27, 2006


SQL is cool!
posted by Roger Dodger at 11:49 AM on October 27, 2006


What I find most interesting is that the most contentious movies are a pretty even split between so-called 'blockbusters' like the pap Roland Emmerich churns out every year, and art-house/intellectual films. It's not surprising, really, just fascinating.
posted by wolftrouble at 12:39 PM on October 27, 2006


Astro Zombie writes "I'm sure I would be right there with you with the film I wrote if only Netflix didn't have a prejudice against porn films."

You'll note that my non-porn film has a much, much lower IMDb rating.
posted by brundlefly at 1:59 PM on October 27, 2006


brundlefly : Rise of the Undead

Wow, this just combined my love for low budget movies, metafilter, and the undead all in one happy package.

I'm going to see your movie.

I see from IMDB you have another one coming out as well. Congrats.
posted by quin at 5:14 PM on October 27, 2006


And Astro Zombie, my old nemesis. I will now have to see your film as well. I'll be honest, I thought your above post was a joke till I followed the link to IMDB. That's when I said: "Awesome the AZ wrote a porno. How great is that?"

Then one of my coworkers said "Who are you talking to? And where are your pants? You know you are supposed to wear pants to work right?"

It went on a little longer, but I'll spare you the boring details.
posted by quin at 5:30 PM on October 27, 2006


quin writes "I see from IMDB you have another one coming out as well. Congrats."

Alas, IMDb lies to you. My co-director and I created the listing for Eschaton about a year and a half ago, and posted our hoped-for completion date. We were doing pretty well there on it, then an act of God (that rhymes with "Patrina") intervened. Eschaton was shelved, pretty much.

My co-director is living in LA right now, and has been showing our script around. A few b-movie type people have expressed interest in it (I may be biased, but the script is damn good... an order of magnitude beyond the-ninth-worst-movie-among-movies-no-one-has-seen), but nothing concrete so far. Anyway, it seems that IMDb trusted us on our completion date, and lists the thing as being in the can. Nothing has been shot, unless you count the promo trailer we made to show to investors.

Actually, my second AskMe question was about alternate titles for Eschaton. It's a really fun read.
posted by brundlefly at 3:17 AM on October 28, 2006

