Kaggle
November 13, 2010 5:46 PM
Kaggle hosts competitions to glean information from massive data sets, a la the Netflix Prize. Competitors can enter free, while companies with vast stores of impenetrable data pay Kaggle to outsource their difficulties to the world population of freelance data-miners. Kaggle contestants have already developed dozens of chess rating systems which outperform the Elo rating currently in use, and identified genetic markers in HIV associated with a rise in viral load. Right now, you can compete to forecast tourism statistics or predict unknown edges in a social network. Teachers who want to pit their students against each other can host a Kaggle contest free of charge.
Let's see: 10 contests available so far...
...total prizes: $4,417
...total weeks: 27
Even if you "win" every single one, you could only earn $164 a week, which is about $7,800 a year. And that is rather unlikely, as each challenge's methodology is quite different, and if you were skilled enough to master all of them, well, you wouldn't be part of a crowd, you'd be an expert.
Kaggle isn't going to attract the "World's Best Data Scientists" at that price; it will only leverage the starving students in the first world against the English-speaking, analytically literate of the rest of the planet. Of course, both of those populations could use the $500. Free data is nice too.
posted by zenon at 8:22 PM on November 13, 2010
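For anyone who wants to check the back-of-the-envelope numbers in the comment above, here is a minimal Python sketch. It uses only the two figures quoted there ($4,417 in total prizes over 27 total weeks); the 52-week year and the function names are my own assumptions, not anything from Kaggle. With a full 52-week year the annual figure comes out nearer $8,500 than $7,800, which doesn't change the point being made.

```python
# Figures quoted in the comment above; nothing else comes from Kaggle.
TOTAL_PRIZES_USD = 4417
TOTAL_WEEKS = 27
WEEKS_PER_YEAR = 52  # assumption: contests available every week of the year


def weekly_rate(total_prizes: float, total_weeks: float) -> float:
    """Average prize money per contest-week, assuming you win everything."""
    return total_prizes / total_weeks


def annualized(rate_per_week: float, weeks: int = WEEKS_PER_YEAR) -> float:
    """Scale a weekly rate up to a yearly figure."""
    return rate_per_week * weeks


if __name__ == "__main__":
    per_week = weekly_rate(TOTAL_PRIZES_USD, TOTAL_WEEKS)
    print(f"per week: ${per_week:.0f}")              # per week: $164
    print(f"per year: ${annualized(per_week):.0f}")  # per year: $8507
```

Dividing total prizes by total weeks treats the contests as if they ran one after another, so this is a rough ceiling on a win-everything scenario rather than a realistic income estimate.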
What I am saying: it ain't the cool million that Netflix offered, but it is better than nothing.
posted by zenon at 8:24 PM on November 13, 2010
Am I the only person thinking of Ender's Game right now?
posted by AkzidenzGrotesk at 8:38 PM on November 13, 2010
Zenon, the Hearst Challenge fronts on a different URL but is still a Kaggle competition, and its prize is $25,000. (The challenge problem in this case is something to do with optimizing the logistics of newspaper distribution: predicting how many of each newspaper should be sent to each newsstand.)
But I'm pretty sure the draw is not the prize money but the chance to get lots of interesting problems with real-world-sized data sets. I'm not exactly one of the world's best data scientists, but I'm interested enough to subscribe to their newsletter, and I have done so.
posted by Michael Roberts at 9:05 PM on November 13, 2010
Kinda like an inverse Kickstarter. Or, more ominously, like a 99Designs for data analysis experts.
posted by breath at 10:43 PM on November 13, 2010 [1 favorite]
Funny, I was just looking for some datasets to analyze in order to try out RapidMiner. Thanks for the link!
posted by acheekymonkey at 2:04 AM on November 14, 2010
Is there actually such a thing as freelance data mining?
If yes, I'd be intrigued to know more... like where and how it happens.
Anyway, nice site with some fun challenges.
posted by philipy at 6:52 AM on November 14, 2010
Michael Roberts: Any decent data analyst always throws out the outliers!
Seriously, I sound a little too pessimistic there, but I think breath is on to something. It is going to change how folks approach data analysis. The worst-case scenario in my mind is that the same starving grad students who would be attempting this are the very group most likely to be left out in the cold if small academic studies switch to this model of analysis. Thankfully that is unlikely in the near future, mainly because IRBs are generally pretty conservative, for the reasons spiderskull lists.
posted by zenon at 7:33 AM on November 14, 2010
Finally, someone to come up with reliable metrics for the MeFi Fantasy League!
posted by klangklangston at 9:21 AM on November 14, 2010
posted by spiderskull at 6:10 PM on November 13, 2010 [1 favorite]