Search engines are personalising their results, not only in the SERPs but also in paid ads and in services such as Google Alerts. It’s important to understand why personalisation is difficult, how it works and where it is heading. Without this information it’s very hard to get a handle on why things are changing, and what the changes mean for you and your business.
The tutorial below describes user profile creation, document modelling techniques (including LSI and PLSI) and the use of semantics in personalisation. LSI is a recurring topic, and an algorithm that the search engine optimisation community often misunderstands. Some articles and posts have said that it’s useless and redundant in search engines, and others have hailed it as the next big thing. Neither is correct. LSI is very much alive and well and used in lots of cases. It isn’t the next big thing either; it’s been around since the ’80s. This tutorial is really useful for eradicating misconceptions and learning more about this nifty little algo.
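For readers who have never seen LSI in action, here is a minimal sketch of the core idea: build a term-document matrix, take a truncated SVD, and compare queries and documents in the reduced space. The toy corpus, the choice of k = 2 and the helper name `lsi_similarities` are all assumptions made up for illustration; this is not how any particular search engine implements it.

```python
import numpy as np

# Hypothetical toy corpus, invented purely for illustration.
docs = [
    "car engine repair",
    "automobile engine service",
    "car automobile dealer",
    "apple banana fruit",
    "banana fruit smoothie",
]

vocab = sorted({word for doc in docs for word in doc.split()})
index = {word: i for i, word in enumerate(vocab)}

# Term-document count matrix A (rows = terms, columns = documents).
A = np.zeros((len(vocab), len(docs)))
for j, doc in enumerate(docs):
    for word in doc.split():
        A[index[word], j] += 1.0

# Truncated SVD: keep only the k strongest latent dimensions.
k = 2  # assumed value, chosen for this toy example
U, s, Vt = np.linalg.svd(A, full_matrices=False)
U_k = U[:, :k]

# Project every document into the k-dimensional latent space.
doc_vecs = (U_k.T @ A).T  # shape: (n_docs, k)


def cosine(a, b):
    """Cosine similarity, returning 0.0 if either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def lsi_similarities(query):
    """Project a query into the latent space and score it against each document."""
    q = np.zeros(len(vocab))
    for word in query.split():
        if word in index:
            q[index[word]] += 1.0
    q_vec = U_k.T @ q
    return [cosine(q_vec, d) for d in doc_vecs]


for doc, score in zip(docs, lsi_similarities("car")):
    print(f"{score:+.2f}  {doc}")
```

In this toy setup the query "car" scores highly against "automobile engine service" even though that document never contains the word "car", because the two terms co-occur with the same neighbours. That is the effect people usually have in mind when they talk about LSI capturing "related" terms.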
Enjoy.




I’m surprised anyone’s using LSA in production. It’s static (in principle, though it can be hacked in various ways for various purposes) and is notoriously hard to scale. Scaling’s possible using stochastic gradient techniques for Netflix-sized data sets (20K x 500K partial matrix with 100M entries), which have lots of features but are partial rather than sparse: the missing entries are unknown, not zero. I don’t think it’d be possible to, say, compute a sparse SVD over Gigaword (1M x 100K sparse matrix with 1G non-zero entries), much less over the web, which is several orders of magnitude larger.
So now I’m curious about who’s using it (or any other matrix factorization method or even general factor analysis) and what they’re doing with it.
SVD is becoming increasingly popular for generating predictive structures for cross-task and cross-genre applications, but these are relatively small, static data sets.
PS: Thanks for the pointer to the tutorial.
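To make the stochastic gradient idea from the comment above concrete, here is a rough sketch of factorizing a partial matrix, where only the observed entries contribute to the error. The function name `sgd_factorize`, the hyperparameters and the tiny data set are assumptions for illustration only; it is a toy, not the method used by any Netflix Prize entry.

```python
import numpy as np


def sgd_factorize(entries, n_rows, n_cols, rank=2, lr=0.05, reg=0.01,
                  epochs=500, seed=0):
    """Fit R ~ P @ Q.T by SGD, using only the observed (row, col, value) triples."""
    rng = np.random.default_rng(seed)
    P = 0.1 * rng.standard_normal((n_rows, rank))
    Q = 0.1 * rng.standard_normal((n_cols, rank))
    entries = list(entries)
    for _ in range(epochs):
        # Visit the observed entries in a fresh random order each epoch.
        for idx in rng.permutation(len(entries)):
            i, j, value = entries[idx]
            p_i = P[i].copy()
            err = value - p_i @ Q[j]
            # Gradient step on squared error with L2 regularisation.
            P[i] += lr * (err * Q[j] - reg * p_i)
            Q[j] += lr * (err * p_i - reg * Q[j])
    return P, Q


def rmse(entries, P, Q):
    """Root mean squared error over a list of (row, col, value) triples."""
    errors = [value - P[i] @ Q[j] for i, j, value in entries]
    return float(np.sqrt(np.mean(np.square(errors))))


# Hypothetical 4 x 3 rank-1 matrix; three entries are held out as "unknown".
u = np.array([1.0, 2.0, 1.5, 0.5])
v = np.array([1.0, 0.5, 2.0])
truth = np.outer(u, v)
held_out = {(0, 2), (2, 0), (3, 1)}
observed = [(i, j, truth[i, j])
            for i in range(4) for j in range(3) if (i, j) not in held_out]

P, Q = sgd_factorize(observed, n_rows=4, n_cols=3)
print("RMSE on observed entries:", rmse(observed, P, Q))
for i, j in sorted(held_out):
    print(f"held-out ({i}, {j}): true={truth[i, j]:.2f}  predicted={P[i] @ Q[j]:.2f}")
```

Each update touches only one row of `P` and one row of `Q`, so the cost per epoch grows with the number of observed entries rather than with the full matrix size. That is the property that makes this style of factorization workable on partial matrices of the size the comment describes.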
Most internet marketers don’t know exactly what SE Personalisation is. And yes, it is difficult to understand. Until now, I was one of those internet marketers. As far as I understand from this post, SE Personalisation means giving a user information and services in search based on his/her saved past activity. Am I correct, or do I need to improve my understanding?
You are quite right!