Human language processing

Semantic annotation in collaborative content systems (such as wikis)

Collaborative content systems have accumulated, in a short time, a huge amount of good-quality, well-edited data, and in some cases have become a real alternative to traditional lexicons. The natural next step is to exploit this human language data programmatically: to make it machine-accessible so that the "machine" can understand it. Although some information can be extracted with the general information extraction approaches used for web data, enriching the texts with some kind of semantic annotation would certainly lead to a new level of utility. As a first step, this annotation is to be done manually by the editors. One important constraint is therefore ease of use (simple editing, simple syntax), which is arguably an important factor in the success of wiki systems. More generally, the annotation should integrate seamlessly into the existing system, so that users are not required to learn much that is new or to readapt to a changed environment. As a second step, the structured information that users enter in this simple syntax has to be extractable and convertible to standard formats, particularly RDF (Resource Description Framework). The third step is to provide this information as a data service.
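To make the extraction and conversion step more concrete, here is a minimal sketch in Python. The topic description does not fix an annotation syntax, so the inline form [[property::value]] is only an assumption (it resembles the notation used by Semantic MediaWiki). The namespace http://example.org/wiki/ and the function names extract_triples and to_ntriples are likewise hypothetical. For simplicity, every annotated value is treated as a resource rather than a literal, and the output is written as N-Triples, one of the standard RDF serializations.

```python
import re
from urllib.parse import quote

# Hypothetical namespace for pages and properties (an assumption, not from the source).
BASE = "http://example.org/wiki/"

# Assumed annotation syntax: [[property::value]] embedded in ordinary wiki text.
ANNOTATION = re.compile(r"\[\[([^\[\]:|]+)::([^\[\]|]+)\]\]")


def to_uri(name):
    """Turn a page or property name into an N-Triples URI reference."""
    return "<" + BASE + quote(name.strip().replace(" ", "_")) + ">"


def extract_triples(page_title, wikitext):
    """Return one (subject, predicate, object) triple per annotation on the page."""
    subject = to_uri(page_title)
    return [
        (subject, to_uri(prop), to_uri(value))
        for prop, value in ANNOTATION.findall(wikitext)
    ]


def to_ntriples(triples):
    """Serialize triples as N-Triples, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


text = (
    "Berlin is the capital of [[capital of::Germany]] "
    "and lies on the river [[located on::Spree]]."
)
triples = extract_triples("Berlin", text)
print(to_ntriples(triples))
```

The editor only wraps a word in a property annotation while writing normal text; the page title supplies the subject, so no separate form or schema editor is needed. A data service could then expose the serialized statements per page.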
