Today I went and presented my research work for my PhD thesis to a group of really knowledgeable researchers. It was excellent to get feedback, pointers, help and some insight into how huge this area of research is. I thought I would share my slides with you. This is of no interest to SEO right now[...]
Archive for the ‘Tutorials’ Category
How does search engine personalisation work?
Search engines are personalising their results, not only in the SERPs but also in the paid ads arena and in things like Google Alerts. It’s important to understand why personalisation is difficult, how it works and what the future directions are. Without this information it’s[...]
Knowledgebase vs Database
With the launch of WolframAlpha the mainstream public has been exposed to new terms and new concepts: “Knowledgebase” and “knowledge engine”. There has been an awful lot of confusion over this, and the multiple posts comparing WolframAlpha to Google are testimony to it.[...]
Make a web spider
I’m often asked how to make a spider to crawl the web, or a specific site, a directory, look for hubs, and so on. They’re not hard to make and there are a ton of them out there. There is really no need to reinvent the wheel and write a load of other ones [...]
Text Mining tutorial
This is a fantastic presentation which takes you through all sorts of commonly used algorithms in text mining and for processing web data. Concepts and methods from machine learning are presented and there’s even some little stick men, so it’s gotta be good. Who is this for? Those of you[...]
10 papers you need to read
This is a list of my top 10 freely available papers on the topic of information retrieval. You will notice that they are rather old, but the techniques described and the findings are not always dated. Those that are dated are important nonetheless because they provide a good foundation to under[...]
How does a search engine know what words mean?
Word sense disambiguation (WSD) belongs to the field of computational linguistics. It’s the research area dedicated to finding ways for machines to understand the meaning of words. More precisely, it’s about determining the word sense of a particular word in a context. This[...]



