This is our blog. We write about cool experiments, interesting case studies, open talks and technicalities.


What’s that Odor?

I use Gavagai Lexicon extensively to look up the actual use of words present in everyday online language. So, when the notion of “odor”, for some reason, popped up at the office yesterday, I turned to the Lexicon to see what our self-learning semantic memory has learned about it. The annotated screenshot below shows that the semantic memory knows three fairly distinct meanings of “odor”. The most prominent one has to do with “smell, stench, reeked”. A second meaning has to do with odor as instrumental in alerting people to something else, such as “foul odor alerted residents to dead body”. The…


The extraordinary productivity of foul language – Do you and your text analytics solution know these bad words?

By looking into the extraordinary productivity of foul language, this post showcases the ability of Gavagai’s semantic memories to automatically learn and relate terms in a vocabulary. If you are sensitive to swearing and cursing, you should stop reading now! Foul language, profanity, expletives, and bad words. The creativity of the human mind when it comes to inventing impolite, rude or offensive language is simply amazing. But regardless of how productive a single human being might be, she will still never be able to come up with all the variants of a given bad language concept used throughout an…


Poor Panama – The Central American state’s name is heavily associated with the events at Mossack Fonseca

Poor Panama. Since the investigation around the Panama Papers was made public earlier this month, mentions of the Republic of Panama in online media have been heavily associated with negative connotations such as “tax evasions”, “shell companies”, and “leaked documents”. Although intended more as a way of inspecting the state of the semantic memories, the Gavagai Living Lexicon can also serve as a probe into the state-of-mind of the online media. As illustrated in the screenshot of the Lexicon below, “Panama” has, as of this writing, an unfortunate relation to “Mossack Fonseca” (click the image for a larger version). How will…


Business bingo – Is your text analytics system up-to-date with current affairs?

In my role as Chief Data Officer at Gavagai, I meet with lots of leads, clients, and data providers. Many of our conversations are carried out in English, and as a non-native speaker, I sometimes find the choice of wording peculiar, and at times slightly amusing: touch base, reach out, back-to-back, and help me understand, to name but a few. In the game of buzzword bingo, players tick off pre-defined buzzwords available on a bingo-like board. But what to enter as buzzwords? How would you recognize such a word? In my view, many of the business terms I’ve encountered would qualify as…


What do Czech, Hebrew and Italian have in common?

Answer: they have just been added to the Gavagai Living Lexicon, which is an unsupervised semantic memory that continuously learns language by reading large amounts of online news and social media. You can think of the lexicon as a brain in silico (or, equivalently, as a piece of artificial intelligence) that tirelessly reads online media and learns how terms are related to each other. As of today (2016-02-16), the Gavagai Living Lexicon contains the following 20 languages: Czech, Danish, Dutch, English, Estonian, Finnish, French, German, Hebrew, Hungarian, Italian, Latvian, Lithuanian, Norwegian, Polish, Portuguese, Romanian, Russian, Spanish, Swedish …and more is…


Spanish in Gavagai’s Living Lexicon

Spanish is now live in Gavagai Living Lexicon. It took about 3 days to add. Since the lexicon is at the heart of our systems, Spanish is now available in all our applications and the API. So now you can add targets in Spanish in the Monitor, explore themes in Spanish for answers to open-ended survey questions in Explorer, look up relationships between words in Spanish in Living Lexicon, and make API calls with Spanish text, for example to get multi-dimensional sentiment (“love”, “hate”, “fear”, “violence”, “desire”, “skepticism”, “positive sentiment”, and “negativity”) or to cluster the most relevant sentences of Spanish texts (just like in…


Gavagai Living Lexicon online

We are proud to announce the release of the Gavagai Living Lexicon – an online lexicon that gives you access to the knowledge our distributional semantic models gather about terms in language as it is used by people in every corner of the known world. The lexicon is based on Gavagai’s distributional semantic models that learn language constantly from live data feeds with millions of documents per day from both social and news media. This means that the living lexicon is continuously evolving and always à jour with current language use. As an example, try searching for some topical term…


Pulled pork is now officially mainstream cuisine

Pulled pork, which until recently has been the most archetypical hipster meal component in Stockholm, has now, according to our live lexicon, become part of the Swedish mainstream. Its closest neighbours in Swedish word space are less than edgy.