DDC is a robust, scalable corpus query system.
http://sourceforge.net/projects/ddc-concordance/
It takes a collection of texts as input and creates multiple indexes to allow fast results even for complex queries, e.g.: (adjective or article), then one to x nouns (matching /regexp/), followed by a verb ending in "nt" within the next 5 words.
Even for collections of several hundred million words, the results are returned within one second, together with the exact number of hits.
The result is returned as a paginated list of hits with the matching text and surrounding context.
Various sorting and filtering options are available (depending on the annotation level of the input data).
All of this is nice but not especially distinctive, as there are other corpus query systems showing similar capabilities.
The presumably unique feature of this system is its capability for distributed service, i.e. it is possible to run multiple servers, which act as threads. The user's query is sent by the "asked" server to its peers and processed there simultaneously; the partial results are sent back to the "asked" server and merged there. The user receives the merged result, so the distribution of the system is completely transparent and costs next to no extra time. Especially with regard to the optional sorting, this is a display of remarkable performance.
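The scatter-and-merge idea described above can be illustrated with a minimal Python sketch. It is not DDC's actual implementation: the peers are simulated by in-process functions working on small lists of strings, and the names query_peer and distributed_query are hypothetical. It only shows how pre-sorted partial results and per-peer hit counts could be combined into one sorted, paginated result with an exact total.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def query_peer(shard, predicate):
    """Stand-in for one peer server (hypothetical): search its own shard
    and return (hit_count, hits sorted in ascending order)."""
    hits = sorted(s for s in shard if predicate(s))
    return len(hits), hits


def distributed_query(shards, predicate, offset=0, limit=10):
    """Query all peers in parallel, then merge their partial results."""
    with ThreadPoolExecutor(max_workers=max(1, len(shards))) as pool:
        partials = list(pool.map(lambda sh: query_peer(sh, predicate), shards))
    # The exact number of hits is the sum of the per-peer counts.
    total = sum(count for count, _ in partials)
    # Each partial list is already sorted, so a lazy k-way merge is enough.
    merged = heapq.merge(*(hits for _, hits in partials))
    # Only the requested page is materialised.
    page = list(islice(merged, offset, offset + limit))
    return total, page


if __name__ == "__main__":
    shards = [
        ["the cat sat", "a dog ran"],
        ["the bird sang", "an owl slept"],
        ["the ant crawled"],
    ]
    total, page = distributed_query(shards, lambda s: s.startswith("the"), limit=2)
    print(total, page)
```

Because every peer returns its hits already sorted, the "asked" server in this sketch never has to sort the full result set; it only merges and cuts out one page.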
One aspect worth pointing out with respect to the distribution is that the communication between the server and its threads is based on sockets exchanging a protocol that is as simple as can be.
This fact is also used by the later-developed APIs in Perl and Python, which do not have to do much more than translate the user's query into a protocol-conformant string, send it via a socket and output the incoming result (which optionally comes as plain text, HTML or self-defined XML).
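A thin client of the kind described above might look roughly like the following Python sketch. The wire format used here (tab-separated fields terminated by a newline, with the reply read until the peer closes the connection) is purely an assumption for illustration and is not the real DDC protocol; build_request, exchange and run_query are hypothetical names, and host and port must be supplied by the caller.

```python
import socket


def build_request(query, fmt="text", offset=0, limit=10):
    """Translate a user query into a request string.
    ASSUMPTION: tab-separated fields plus newline; not the real DDC format."""
    return f"{query}\t{fmt}\t{offset}\t{limit}\n".encode("utf-8")


def exchange(sock, request):
    """Send one request over a connected socket and read the full reply.
    ASSUMPTION: the server replies after we finish sending, then closes."""
    sock.sendall(request)
    sock.shutdown(socket.SHUT_WR)  # signal that the request is complete
    chunks = []
    while True:
        data = sock.recv(4096)
        if not data:  # peer closed the connection: reply is complete
            break
        chunks.append(data)
    return b"".join(chunks).decode("utf-8")


def run_query(host, port, query, fmt="text", offset=0, limit=10, timeout=5.0):
    """Connect to a server, send the query and return the raw result text."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        return exchange(sock, build_request(query, fmt, offset, limit))
```

The point of the sketch is only that such a client needs three steps: build a string, send it, and pass the reply on unchanged in whichever output format was requested.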