One of its distinctive characteristics is its often offensive content
January 10, 2018 10:09 AM
The Anatomy of the Urban Dictionary: The first large-scale study of the Urban Dictionary provides unique insights into the way our language is evolving.
From the study [PDF]: Our study highlights that UD has a higher content heterogeneity than traditional dictionaries. Depending on the goal, this could mean that more effort is needed to filter and process the data (e.g., the removal of opinions) compared to when traditional dictionaries are used. However, UD is unique in capturing many infrequent, informal words and it could therefore complement the traditional dictionaries. Furthermore, while there is more offensive content in UD, highly offensive definitions do get ranked lower through the voting system. We also found that words with more definitions tended to be more familiar to crowdworkers, suggesting that UD content does reflect broader trends in language use to some extent.
Now I want to see Filthy Watson.
The cesspit tag is criminally underused on metafilter.
posted by adept256 at 10:37 AM on January 10, 2018 [5 favorites]
I love Urban Dictionary. But I'd naively assumed the data wouldn't be very useful because there are so many joke entries. Also a bunch of garbage drive-by definitions that aren't meaningful. I skimmed the paper and didn't see them address this question head on, but maybe I missed it. They do seem to be relying heavily on up and down votes as a signal for getting through the noise.
I interviewed the Urban Dictionary founder Aaron Peckham back around 2005, when he applied for a job at Google. He was very modest, seemed genuinely surprised I'd heard of the site and liked it. Also he seemed very smart to me. He took the job and I always kind of wondered what happened to him afterwards. Good things, apparently, and I think he still runs Urban Dictionary independently as a business. I suspect he's doing pretty well with that.
posted by Nelson at 11:00 AM on January 10, 2018 [5 favorites]
Coffee: Someone who is coughed upon.
Fantastic.
posted by dazed_one at 12:18 PM on January 10, 2018 [3 favorites]
I actually used UD the other day to look up "Bowiemas", just to see if the youngins' knew who he was. Saw the definition, and I laughed and laughed.
The kids are all right.
posted by droplet at 12:50 PM on January 10, 2018
In tests it even used the word "bullshit" in an answer to a researcher's query.
This hardly seems like a disqualifying characteristic without more information. Calling out bullshit is a lot more useful than playing Jeopardy. To quote Captain Picard, "It came from us, from our mission records, personal logs, holodeck programs, our fantasies. Now, if our experiences with the [English language speakers with internet access] have been honorable, can't we trust that the sum of those experiences will be the same?" I say, let Filthy Watson live. He is but a reflection of us.
Mostly, I use the urban dictionary these days when figuring out when my friends are making a contemporary television reference. It's easy, because the entry is inevitably incoherent nonsense filled with specific character names that are incomprehensible to anyone who doesn't already get the joke. An algorithm that automates that decision tree and runs it on every phrase in a conversation could be quite useful.
posted by eotvos at 1:25 PM on January 10, 2018 [1 favorite]
I quit checking the Urban Dictionary because any entry that could possibly be written in a misogynist light is written at reddit/4chan levels. "Often offensive content" is either an academic understatement or the authors are using offensive to mean only vulgar language. Because the misogynist crap is absolutely not being voted down; it's being vigorously voted up.
posted by Karmakaze at 2:32 PM on January 10, 2018 [6 favorites]
The name “urban” strikes me as problematic, in that the word is often used as a euphemism for “Black”; this gives the name “Urban Dictionary” an aura of “here's what those black dudes mean by all that wacky jive talk, Mr. Whitebread Suburban Homeowner” or something similar.
posted by acb at 4:14 PM on January 10, 2018 [1 favorite]
"Often offensive content" is either an academic understatement or the authors are using offensive to mean only vulgar language.
I was really surprised that the authors didn’t mention the volume of misogynistic/racist/hate speech entries. They intentionally ignored it because it’s impossible to miss.
posted by not_the_water at 4:41 PM on January 10, 2018 [4 favorites]
Filthy Watson
Great sockpuppet name up for grabs!
posted by Greg_Ace at 6:04 PM on January 10, 2018 [3 favorites]
The word with the next highest number of definitions on Urban Dictionary is love, with 1140. The other words in the top 10 by number of definitions are: god, urban dictionary, chode, Canada’s history, sex, school, cunt, and scene.
Hehe, Canada's history.
posted by batter_my_heart at 11:23 PM on January 10, 2018
In tests it even used the word "bullshit" in an answer to a researcher's query.
Clutching my pearls.
posted by Segundus at 2:37 AM on January 11, 2018
The name “urban” strikes me as problematic....
It is in fact named after Pope Urban.*
*No it isn't
posted by Just this guy, y'know at 3:12 AM on January 11, 2018
This thread has been archived and is closed to new comments
IBM's Watson Memorized the Entire 'Urban Dictionary,' Then His Overlords Had to Delete It
posted by beatThedealer at 10:34 AM on January 10, 2018 [14 favorites]