Semantic Wikipedia
June 8, 2006 9:02 PM

New Scientist reports on German project to create a Semantic MediaWiki (MediaWiki is the software behind Wikipedia, Semantic is hidden/encoded meta-data). A sample page markup using the relations annotation. What do the Wikipedia (Wikimedia) folks think of implementing it? "Some members are keen, but some are dubious about additional complexity."
posted by stbalbach (11 comments total)
 
Cool idea. The term 'mindpixel' comes to mind as trying this already. That said, Wikipedia has proved that the masses can produce some pretty amazing things if led properly, and a globally built ontology full of all kinds of things would be a boon to AI research.
posted by delmoi at 9:58 PM on June 8, 2006


a globally built ontology full of all kinds of things would be a boon to AI research

That's what RDF is. That's also what Google Base is. Talking about it as a hypothetical just makes it easier to forget the reality: computers continue to be dumb no matter how much data we throw at them.
posted by scottreynen at 10:27 PM on June 8, 2006


But, boy, can they ever compute!
posted by TwelveTwo at 10:29 PM on June 8, 2006


New Scientist reports on German project to create a Semantic MediaWiki

Germans?

So long as it's not anti-Semantic.
posted by uncanny hengeman at 11:00 PM on June 8, 2006 [2 favorites]


The power of the approach is to combine small pieces of information to make the structural data more easy to manage. Complicated categories as they are currently found in Wikipedia can then be obtained by simple composition (numerous categories such as "Cities in England" become obsolete).
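
As a rough illustration of that composition, here is a minimal Python sketch. It assumes annotations are written in the form `[[relation::target]]`; the relation names ("is a", "located in"), the sample pages, and both helper functions are hypothetical and not taken from the project.

```python
import re

# Assumed annotation form: [[relation::target]]. Plain links like [[England]] are ignored.
ANNOTATION = re.compile(r"\[\[([^\[\]:|]+)::([^\[\]|]+)\]\]")

def extract_relations(title, wikitext):
    """Return (subject, relation, target) triples found in one page's wikitext."""
    return [(title, rel.strip(), target.strip())
            for rel, target in ANNOTATION.findall(wikitext)]

def pages_where(triples, conditions):
    """Return the pages whose triples include every (relation, target) pair given."""
    facts = {}
    for subject, relation, target in triples:
        facts.setdefault(subject, set()).add((relation, target))
    wanted = set(conditions)
    return sorted(subject for subject, found in facts.items() if wanted <= found)

# Hypothetical sample pages with hypothetical relation names.
pages = {
    "London": "London is a [[is a::City]] in [[located in::England]].",
    "Leeds": "Leeds is a [[is a::City]] in [[located in::England]].",
    "Los Angeles": "Los Angeles is a [[is a::City]] in [[located in::California]].",
}
triples = [t for title, text in pages.items() for t in extract_relations(title, text)]

# "Cities in England" is composed from two simple facts instead of a hand-kept category.
cities_in_england = pages_where(triples, [("is a", "City"), ("located in", "England")])
print(cities_in_england)
```

Under these assumptions nobody maintains a "Cities in England" category by hand; it falls out of the two smaller annotations.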

I didn't know cities in England was a complicated category. No wonder people keep on yelling at me when I try and add Los Angeles.

Personally, I like lists over the idea of adding more to the coding side.
posted by Atreides at 4:28 AM on June 9, 2006


I'd love to see something like this make it on to Wikipedia proper. Here's a great example of why this sort of thing is important. Consider today's featured article, USS Wisconsin. See that box on the right-hand side of the page? There's one of those for nearly every naval ship article on the site, and every one of them has been constructed by hand. Check out the markup: it's a mess of nasty pseudo-table markup.

Imagine if that stuff had been added semantically. Firstly, that box could be automatically created by a macro (saving a huge amount of work for the contributors). More importantly, people could programmatically create things like "browse navy ships by the date they were commissioned", as a timeline, or search by number of crew members (or whatever).
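
A minimal Python sketch of what that could look like, assuming each ship's facts were available as a structured record. The field names, the placeholder ships, and all three functions are hypothetical; none of this is real ship data or an existing MediaWiki feature.

```python
from datetime import date

# Hypothetical records: names and values are placeholders, not real ship data.
ships = [
    {"name": "Example Ship A", "commissioned": date(1944, 4, 16), "crew": 1900},
    {"name": "Example Ship B", "commissioned": date(1912, 3, 12), "crew": 1000},
    {"name": "Example Ship C", "commissioned": date(1961, 11, 25), "crew": 4600},
]

def render_infobox(record, fields):
    """Build a plain-text info box from one record, one 'field: value' line each."""
    lines = [record["name"]]
    for key in fields:
        lines.append(f"{key}: {record[key]}")
    return "\n".join(lines)

def by_commission_date(records):
    """Order records as a timeline, earliest commissioning first."""
    return sorted(records, key=lambda r: r["commissioned"])

def with_crew_at_least(records, minimum):
    """Keep only the records whose crew size meets the minimum."""
    return [r for r in records if r["crew"] >= minimum]

timeline = [s["name"] for s in by_commission_date(ships)]
large_crews = [s["name"] for s in with_crew_at_least(ships, 1500)]
print(render_infobox(ships[0], ["commissioned", "crew"]))
```

The same records drive both the generated box and the queries, which is the point: the data is entered once and reused.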

As with all metadata, the challenge is coming up with an interface for entering it that is fast and simple enough not to put people off (hence why tagging has taken off in such a big way). OntoWorld's proposal is the best attempt at hitting the right balance that I've seen yet.
posted by simonw at 7:19 AM on June 9, 2006


Wikipedia already has that kind of thing for some data: the taxobox on biology articles, for example.

Semantic markup needn't be "hidden or encoded" as this FPP states. It just needs to be regular enough that it can be traversed by machine. The taxobox is an example of a regular, machine-parsable, but not hidden or encoded semantic link.
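
To make "regular enough to be traversed by machine" concrete, here is a minimal Python sketch that reads a template call of the form `{{Name | key = value | ...}}` into a dictionary. The sample text and its values are made up, the parameter names are assumed, and the parser deliberately ignores nested templates and piped links, so it is an illustration and not a real wikitext parser.

```python
def parse_template(wikitext, name):
    """Parse the first {{name | key = value | ...}} call into a dict.

    Simplification: values containing nested templates or piped links
    are not handled.
    """
    opener = "{{" + name
    start = wikitext.find(opener)
    if start == -1:
        return {}
    end = wikitext.find("}}", start)
    if end == -1:
        return {}
    body = wikitext[start + len(opener):end]
    fields = {}
    for part in body.split("|"):
        if "=" in part:
            key, value = part.split("=", 1)
            fields[key.strip()] = value.strip()
    return fields

# Hypothetical sample; parameter names and values are assumptions.
sample = """Intro text.
{{Taxobox
| name = Example organism
| regnum = Animalia
| genus = Examplus
}}
More article text."""

taxobox = parse_template(sample, "Taxobox")
print(taxobox["regnum"])
```

Nothing here is hidden or encoded: the same visible markup that renders the box is what the program reads.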
posted by hattifattener at 11:47 AM on June 9, 2006


simonw, actually, there is a standard template for infoboxes in ship articles. The USN editors still use a straight table for ship info, though. But there are plenty of such infoboxes like the taxobox that can just be filled in, and then they're "semantic" in that they have standardized data. I see that doing an article semantically would make infoboxes a lot easier to implement, but then infoboxes tend to have their own data requirements that then spur improvements in articles, so it works both ways.

They're talking about implementing a "category math" feature on MediaWiki, so that you could say, for example, "list all actors born in Wisconsin in 1952". That would be a step in the same direction.
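
In set terms, that query is just an intersection of category memberships. A minimal Python sketch, with hypothetical category names and placeholder members (this is not how MediaWiki implements or plans to implement the feature):

```python
# Hypothetical category membership; every name is a placeholder.
categories = {
    "Actors": {"Person A", "Person B", "Person C"},
    "People from Wisconsin": {"Person A", "Person C", "Person D"},
    "1952 births": {"Person A", "Person D", "Person E"},
}

def intersect_categories(membership, *names):
    """Return the pages that belong to every named category, sorted by title."""
    if not names:
        return []
    return sorted(set.intersection(*(membership[n] for n in names)))

# "List all actors born in Wisconsin in 1952" as a three-way intersection.
matches = intersect_categories(
    categories, "Actors", "People from Wisconsin", "1952 births"
)
print(matches)
```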
posted by dhartung at 2:11 PM on June 9, 2006


Imagine if that stuff had been added semantically. Firstly, that box could be automatically created by a macro (saving a huge amount of work for the contributors).

There's a good reason to be wary of stuff like this: most of Wikipedia's success lies in that it's cobbled together a bit at a time by actual people. Automating part of the process could have unforeseen consequences that could upset the social dynamics that keep Wikipedia humming -- or it could be great. There's really no way to know in advance, which is why it should be integrated through a greasemonkey-style plugin, not foisted directly onto en.wikipedia.org (or de.wikipedia).

Self-link about making Wikipedia (the encyclopedia) more dynamic without changing en.wikipedia.org (the website).
posted by Tlogmer at 6:27 PM on June 9, 2006


Tlogmer, thanks for posting, I just checked out your blog and will be adding it to my daily feed.
posted by stbalbach at 7:22 PM on June 9, 2006


Wikipedia needs a rating system for its articles, one that compiles a short list of readers' criticisms for each entry. With those ratings, articles can be targeted and even protected. Currently, some entries are created and defended by unschooled believers; they are not just slanted, but contain disinformation and malicious content that serves the belief. The last hurdle is feedback, which will hopefully expose some of the worst abuses and abusers.
posted by Brian B. at 9:59 AM on June 11, 2006

