The OED in two minutes
January 8, 2015 5:31 AM

The OED in two minutes is a visualisation of the change and growth of the English language since 1150, showing the frequency and origin of new words year by year. Notes and explanations about the project.
posted by dng (18 comments total) 22 users marked this as a favorite
 
This is fascinating. Of course, I'm a history nerd, and seeing the old word shifts is fun, but I think it's really interesting to look at more recent years and see words like otaku show up.

Even weirder is to see words that I didn't think were, like, a special THING show up really recently. Portobello in 1990? Polyamory in 1992? Chav in 1998?
posted by chainsofreedom at 5:58 AM on January 8, 2015 [1 favorite]


I love this kind of thing, thanks for posting.
posted by mareli at 6:13 AM on January 8, 2015 [1 favorite]


There are two points which are easy to see with this map:

1. English has long since stopped being a borrowing language in the way it was in medieval times.
2. The scatter of low frequency words from around the world is utterly outnumbered by those of French, Latin and Greek origin. Even into the 1970s there is still a marked preference for those words.
posted by Thing at 6:46 AM on January 8, 2015 [1 favorite]


Hey this is neat! But isn't it more of a view of the written word and the OED's research capability than it is the actual language? Not that I have a better source for the actual language...

What's the state of the OED online, anyway? The words in this visualization link to full entries available for free. Is there no more paywall? Although it's hilarious the URLs are like http://www.oed.com/view/Entry/141900; was there some deliberate decision not to put the subject word ("petrify", in this case) in the URL? That's like anti-SEO.
posted by Nelson at 7:31 AM on January 8, 2015


English has long since stopped being a borrowing language in the way it was in medieval times

Is this really true? Or was the rate of growth higher during medieval times because the base size of the vocabulary was much smaller?
posted by Pararrayos at 8:14 AM on January 8, 2015


Or, perhaps the OED is a lot more conservative with what it considers as new words in the language in recent years.
posted by graventy at 9:00 AM on January 8, 2015


> English has long since stopped being a borrowing language in the way it was in medieval times

Is this really true? Or was the rate of growth higher during medieval times because the base size of the vocabulary was much smaller?


The latter. The circle representing the rate of growth gets smaller and smaller because the existing vocabulary has gotten so huge it's hard to make a dent in it. English has been borrowing like a drunk gambler for centuries and shows no sign of slowing down.

> Or, perhaps the OED is a lot more conservative with what it considers as new words in the language in recent years.

No, not at all.

This is a great post, thanks!
posted by languagehat at 9:06 AM on January 8, 2015 [5 favorites]


Thing: The scatter of low frequency words from around the world is utterly outnumbered by those of French, Latin and Greek origin.

I was surprised to see how few English words are derived from Greek words, compared to Latin. I think of "Greek and Latin" as going hand in hand, both as classical languages and as equals in terms of etymologies. But this visualization shows that as of 1700, Greek-derived words are about 0.1% of the language, and as of 2010, Greek-derived words are now about 0.2%. In both 1700 and 2010, about 7% of English words are Latin-derived.
posted by chickenmagazine at 9:19 AM on January 8, 2015


The latter. The circle representing the rate of growth gets smaller and smaller because the existing vocabulary has gotten so huge it's hard to make a dent in it. English has been borrowing like a drunk gambler for centuries and shows no sign of slowing down.

The size of the circle doesn't stand for percentage of vocabulary, but frequency of the word borrowed. Words borrowed after 1650 make up a tiny percentage of not just words in the total vocabulary, but tokens in any text. In medieval times English borrowed words which were or became absolutely essential to the language. That seldom happens any more.

I was surprised to see how few English words are derived from Greek words, compared to Latin. I think of "Greek and Latin" as going hand in hand, both as classical languages and as equals in terms of etymologies. But this visualization shows that as of 1700, Greek-derived words are about 0.1% of the language, and as of 2010, Greek-derived words are now about 0.2%. In both 1700 and 2010, about 7% of English words are Latin-derived.

Words derived from Greek are much smaller in frequency than Latin or French, but still huge in comparison to most languages. In 2010, the figures for frequency are:

Germanic: 49%
Compound: 26%
Romance: 18%
Latin: 7%
Greek: 0.2%
All other: 0.2%

This is something which is too often overlooked when people say things like "English has been borrowing like a drunk gambler for centuries". Were that true, then it would be hard to reconcile the numbers. English has been rather selective in the languages which it borrows from, French, Latin, and Greek being the main three. This is not a linguistic fact of English, but rather a cultural fact of England and English-speaking people. It's also historical. The cultural influence of these three languages has waned, and with it the huge borrowing that we saw in medieval times. Too many people get so carried away with the history of English that they fail to see what is happening now.
posted by Thing at 9:40 AM on January 8, 2015
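
The distinction drawn above between a word's share of the vocabulary and its share of the tokens in a text can be sketched in a few lines of Python. This is only a toy illustration: the sample sentence, the `ORIGIN` table, and the `origin_shares` helper are all made up for the example, the origin labels are simplified, and none of the numbers come from the OED or the visualisation.

```python
from collections import Counter

# Hypothetical, simplified origin labels for a toy sentence.
# These are illustrative assumptions, not OED data.
ORIGIN = {
    "the": "Germanic", "of": "Germanic", "and": "Germanic",
    "is": "Germanic", "old": "Germanic", "new": "Germanic",
    "very": "Romance", "language": "Romance",
    "status": "Latin",
    "idea": "Greek",
    "otaku": "Other",
}

TEXT = "the idea of the language is very old and the status of otaku is new"


def origin_shares(tokens, origin):
    """Return (type_share, token_share) per origin group.

    type_share counts each distinct word once (share of vocabulary);
    token_share counts every occurrence (share of running text).
    """
    types = set(tokens)
    type_counts = Counter(origin[w] for w in types)
    token_counts = Counter(origin[w] for w in tokens)
    type_share = {k: v / len(types) for k, v in type_counts.items()}
    token_share = {k: v / len(tokens) for k, v in token_counts.items()}
    return type_share, token_share


type_share, token_share = origin_shares(TEXT.split(), ORIGIN)

for group in sorted(type_share):
    print(f"{group:9s} types {type_share[group]:.0%}  tokens {token_share[group]:.0%}")
```

In this toy sentence the borrowed words make up a larger share of the distinct words than of the running text, because the most repeated words ("the", "of", "is") are all inherited. That is the same shape of argument as the frequency figures above, though a real comparison would need a proper corpus and the OED's own etymology data.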


This is something which is too often overlooked when people say things like "English has been borrowing like a drunk gambler for centuries".

Surely, the question of who, exactly, English has been borrowing like is more dependent on how other languages borrow than the number of languages from which English has borrowed. Since English is a Germanic language, the numbers you cite above suggest to me that about half of all English vocabulary has been borrowed from other languages, which seems high to me, but I admit I have no idea what percentage of most modern languages are borrowed from other languages. If they all clock in at 50%, then English is pretty average in its borrowing and, so, would not be much like a drunken gambler in this regard. If the percentage is more like 5% or 10%, English would be borrowing noticeably more, perhaps, indeed, in the manner of a drunken gambler. Can anyone give a sense of borrowing rates for other languages?

Of course, I suspect that we would have to make some decisions on what, exactly, is a loanword. "Lederhosen" is a word that makes sense in English, but only to refer to the German garment -- is that a loanword? Something like "ennui" has been much more thoroughly incorporated and is a loanword by pretty much any definition you'd like. Should the former be counted against English's "borrowed word total?"

Regardless, the number of languages providing words is kind of irrelevant to a discussion of how much of a borrower a language is.
posted by GenjiandProust at 10:51 AM on January 8, 2015


> The size of the circle doesn't stand for percentage of vocabulary, but frequency of the word borrowed.

You're talking about the circle representing a word, I'm talking about the circle at the lower left representing how fast vocabulary was growing, which is at its largest (if I recall correctly) around the fifteenth century. It shrinks noticeably (downright alarmingly) in the last few centuries, which surprised me until I realized what was going on.

> Of course, I suspect that we would have to make some decisions on what, exactly, is a loanword. "Lederhosen" is a word that makes sense in English, but only to refer to the German garment -- is that a loanword? Something like "ennui" has been much more thoroughly incorporated and is a loanword by pretty much any definition you'd like. Should the former be counted against English's "borrowed word total?"

All those are loanwords. A loanword is not a foreign-sounding word, it's a word that was borrowed from another language rather than inherited. Cheese is a loanword (it's from Latin), for example.

> Regardless, the number of languages providing words is kind of irrelevant to a discussion of how much of a borrower a language is.

Uh, sort of? I guess? Depending on how you approach it? I don't know, but in any case it doesn't have anything to do with what I was talking about.
posted by languagehat at 11:11 AM on January 8, 2015 [3 favorites]


I did not expect to see a Hungarian->English borrowing that I was unfamiliar with, as those are the two languages of the entire world that I know best, but that's what happened. And it's apparently a significant musical instrument that I've never heard of. Huh.
posted by Wolfdog at 12:44 PM on January 8, 2015


Very cool.

A loanword is not a foreign-sounding word, it's a word that was borrowed from another language rather than inherited
Does that mean that every word not deriving from Germanic languages is a loanword?
posted by dg at 1:38 PM on January 8, 2015


Is there a link to a list of the words used in the video that I'm missing somewhere?
posted by BlueHorse at 1:45 PM on January 8, 2015


You're talking about the circle representing a word, I'm talking about the circle at the lower left representing how fast vocabulary was growing, which is at its largest (if I recall correctly) around the fifteenth century. It shrinks noticeably (downright alarmingly) in the last few centuries, which surprised me until I realized what was going on.

I am talking about the bottom left circle too. It's a sum total of the frequency of borrowed words, not total number of words. The drop-off after 1650 is reflective of the fact that few high-frequency words have been borrowed since then.

Surely, the question of who, exactly, English has been borrowing like is more dependent on how other languages borrow than the number of languages from which English has borrowed. Since English is a Germanic language, the numbers you cite above suggest to me that about half of all English vocabulary has been borrowed from other languages, which seems high to me, but I admit I have no idea what percentage of most modern languages are borrowed from other languages. If they all clock in at 50%, then English is pretty average in its borrowing and, so, would not be much like a drunken gambler in this regard. If the percentage is more like 5% or 10%, English would be borrowing noticeably more, perhaps, indeed, in the manner of a drunken gambler. Can anyone give a sense of borrowing rates for other languages?

But there's a difference between current borrowing and cumulative historical borrowing. It is absolutely true that English has a lot of borrowed words in the vocabulary. It's not the highest in the world, but it's above average. However, many of these are old, and the recently borrowed words are smaller both in absolute number and in frequency. We shouldn't regard what happened from 1250-1650 as representative of what is happening now.
posted by Thing at 1:57 PM on January 8, 2015


All those are loanwords. A loanword is not a foreign-sounding word, it's a word that was borrowed from another language rather than inherited. Cheese is a loanword (it's from Latin), for example.

Ok, and I am asking this from ignorance, but doesn't a loanword need to be incorporated into the language in some way? Lederhosen has no English meaning beyond naming a particular garment. Is that enough for something to be a loanword?
posted by GenjiandProust at 5:14 PM on January 8, 2015


> Does that mean that every word not deriving from Germanic languages is a loanword?

It means that every word not inherited by English from Proto-Germanic is a loanword. If it's borrowed from German, it's still a loanword.

> Ok, and I am asking this from ignorance, but doesn't a loanword need to be incorporated into the language in some way? Lederhosen has no English meaning beyond naming a particular garment. Is that enough for something to be a loanword?

I'm not sure what you mean by "incorporated into the language in some way"; that's true of every word in the dictionary. Since you can say "He's wearing lederhosen" and people will know what you mean, it's ipso facto an English word. (Of course, the question of what is an English word is a vexed one, and I am more liberal than a lot of people on the topic -- in fact, about as liberal as it's possible to be. There was a long and contentious discussion at LH a couple of years ago about it.)

> I am talking about the bottom left circle too. It's a sum total of the frequency of borrowed words, not total number of words. The drop-off after 1650 is reflective of the fact that few high-frequency words have been borrowed since then.

Ah, I misunderstood then -- thanks for edumacating me!
posted by languagehat at 5:49 PM on January 8, 2015 [1 favorite]


GenjiandProust: " I admit I have no idea what percentage of most modern languages are borrowed from other languages."

I also admit to having no idea of percentages, but I know that English doesn't hold a candle to Japanese or Korean when it comes to adoption. Lots of people recognize Japanese as having a lot of English loanwords like "biru" (beer) or "aidoru" (idol) or the like, but people forget about all the Chinese loanwords, which make up an absolutely enormous swath of the Japanese language that dwarfs the Western-derived loanwords (from what I can tell, the same is true for Korean as well). I'm very curious about other nearby languages, especially ones that used to be written with Chinese characters, like Vietnamese. But the enormous influence of ancient Chinese civilization may make East Asia a bit of an anomaly, I dunno.
posted by Bugbread at 9:03 PM on January 8, 2015



