Google research beyond LSI

Google picked up Amit Gruber, who is doing an internship with them.  He’s pretty valuable because of his PhD research in statistical text analysis (the field LSI belongs to).  His method uses Hidden Topic Markov Models (HTMM), and a working version was released in 2007.

In their post Google mention PLSI (Probabilistic Latent Semantic Indexing) and also Latent Dirichlet Allocation as examples of variants of LSI.
HTMM is different because instead of treating the document as a bag of words, it uses a temporal Markov structure.
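To make that distinction a bit more concrete, here is a rough Python sketch. It is not taken from OpenHTMM; the function names, the example sentences, and the parameters `theta` (topic weights for a document) and `epsilon` (chance of switching topic between sentences) are my own assumptions, based on a simplified reading of the HTMM idea. The first part shows that a bag of words cannot tell two differently ordered documents apart. The second part samples one topic per sentence from a simple Markov chain, so neighbouring sentences tend to share a topic.

```python
import random
from collections import Counter


def bag_of_words(text):
    """Order-free view of a document: word -> count."""
    return Counter(text.lower().split())


def sample_sentence_topics(n_sentences, theta, epsilon, rng):
    """Toy topic chain (hypothetical helper, not the OpenHTMM API).

    Each sentence keeps the previous sentence's topic unless, with
    probability epsilon, a fresh topic is drawn from the weights theta.
    """
    topics = []
    topic = None
    for i in range(n_sentences):
        if i == 0 or rng.random() < epsilon:
            topic = rng.choices(range(len(theta)), weights=theta)[0]
        topics.append(topic)
    return topics


# Same words, different order: the bag of words is identical.
doc_a = "the bank raised rates . the river flooded the bank"
doc_b = "the river raised the bank . the bank flooded rates"
same_bag = bag_of_words(doc_a) == bag_of_words(doc_b)

# With epsilon = 0 the topic never changes after the first sentence.
sticky_topics = sample_sentence_topics(5, [0.5, 0.3, 0.2], 0.0, random.Random(0))

# With a larger epsilon the topic may switch between sentences.
loose_topics = sample_sentence_topics(5, [0.5, 0.3, 0.2], 0.5, random.Random(0))
```

A bag-of-words model sees `doc_a` and `doc_b` as the same document, whereas a model with a chain over sentence topics at least has somewhere to record that word order and position matter.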
Read the Google post here, and OpenHTMM is available here.  Good old Google, thanks for sharing.
This supports my post about how LSI in its very basic form, as summarized in various places including the excellent Wikipedia, is not the variety used in Google, whatever Matt Cutts says.  Yes, it is used, but he doesn’t give away the important information; what he presents is a very, very basic version.  It’s like saying “Yes, we use glue in our computer chips” or “Yes, here at NASA we use glue as an adhesive for our rockets”.  It’s unlikely to be the glue your child uses at playschool :)

© 2009-2013 Science for SEO All Rights Reserved