Kamusi: The Semantic Search Engine of African Languages
What African language is spoken by one out of every 60 people on earth? Swahili. The Kamusi Project’s mission is to help that number grow with its collaborative database/dictionary: Kamusi.
Late last year KamusiProject.org put out a call for volunteer programmers to help make some changes to its website. The project wanted to move from Perl to PHP, debug its MySQL database, reduce the number of requests from search engines, and expand the database, with a plan to go from two languages to two dozen in two years. Coders championed the cause, and while not all of the updates have been made to date, Kamusi recently announced that it would be going multilingual (adding many other widely spoken languages, African and global) and that it will soon deploy an embeddable widget called Wijiti!
If you aren’t familiar with the Kamusi Project, the owners refer to it as a “living dictionary” that allows users to look up English-to-Swahili and Swahili-to-English translations and definitions. What makes Kamusi stand out are all the collaborative features that have been integrated. Users can upload pictures to illustrate the corresponding words. Users can also request to become editors (think moderators on Wikipedia) to help improve translations and add words. There’s a sidebar that lets you see what other users are searching for in the database, updated with each page refresh.
The biggest problem Kamusi faces is affording enough bandwidth to handle all the calls to their server. One reader asked if they would open the new languages to a Wiki-style system allowing for the contextual database to be populated more quickly. Here is what project director Martin Benjamin had to say in response:
I question the benefit, though. While I think this is a great idea, if the Kamusi board is having problems with bandwidth now, its database will be inundated with queries once the widget is deployed. If a site that runs the widget gets Dugg or featured in Time magazine or something (not to mention the Kamusi site itself), they’ll long for the days when they only had search engine problems! Regardless, take a look at the plans for Wijiti below.
Slide show of the Wijiti specs:
The best thing about Kamusi is that, beyond direct translations, its queries also return modifiers for words and explain what it means to add them, how grammatical context can change a word’s meaning, and what class (part of speech) the word belongs to. There’s also a learning center that features multimedia tutorials, a forum and all sorts of other useful tools for those who want to learn Swahili. Martin Benjamin obviously has some big plans in mind. In his piece called “The End of English” he references the economics of language and describes in detail what makers of semantic web applications all over the world are envisioning for the future of the internet:
What Martin may not realize is that Kamusi could play a huge role in this, allowing his ‘living dictionary’ to add a semantic layer of context to future widgets and web applications. Wijiti truly looks interesting, and it’ll be exciting to see how fans, bloggers and other website owners adopt it.
Follow the Kamusi Project on Twitter