I got to the Hitwise report through a link from Dave to a post on Search Engine Land written by Matt McGee (a convoluted journey, I know). I liked the post because I have tons of papers and things to read each day, and the concise writeup pleased me greatly. The report (which isn't that long, actually) says that queries of 5+ words increased 10% from January 2008 to January 2009. That sounds like quite a lot, doesn't it?
Matt McGee's take is basically that as users become more sophisticated, they use more words. In my experience of evaluating user search behaviour in a scientific environment, it means instead that users are having to "repair" the query more often: they need to reformulate more often to get what they want. This is considered bad. The fewer words in a query, the better; the more there are, the lower the precision of the engine.
Jeff, who commented on Matt's post, said "Search engines have gotten better at ranking relevant long-tail results" and also said they convert better. I don't know whether they convert better because I haven't seen sufficient data and analysis to deduce that, but I'd welcome seeing any if it's available. I find it interesting also because some SEOs are saying long-tail converts and is very good.
The science says otherwise.
Here are a couple of evaluations that disagree; so far I have found none that say a search engine performs well on long-tail queries:
“Understanding the Relationship between Searchers’ Queries and Information Goals” by Downey, Dumais, Liebling and Horvitz (CIKM 08)
They found that a user’s query is often far more specific or more general than their underlying information goal. This means that the search engine has more chance of success when the relative frequency of the query matches that of the need. Common queries require far less repair than specific, rare ones. They performed extensive data analysis and found that search engines perform poorly on long-tail queries (and URLs, as measured by SERP clicks and requeries).
Their results show that “the probability of a requery drops (from almost 50% to less than 20%) as query frequency increases from the tail to 100+ occurrences”. They also state that the observed differences in the distribution of actions following a query seem to be related to query frequency rather than query length. In other words, search engines do far better on common queries, due to the matching techniques they use. The best result happens when there is an alignment between the “frequency of goals and expression of those goals”.
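To make the requery-as-repair measure concrete, here is a minimal Python sketch of how you might compare requery rates for head versus tail queries in a session log. The toy sessions and the frequency threshold are invented for illustration; they are not the paper's data.

```python
from collections import Counter

# Toy session log (invented): each session is the ordered list of queries a
# user issued. A query followed by another query in the same session counts
# as a "requery" (a crude proxy for repair).
sessions = [
    ["weather"],
    ["weather"],
    ["weather"],
    ["obscure 1999 query log paper",
     "obscure 1999 query log paper pdf",
     "lau horvitz query refinement"],
]

# How often each query string occurs across the whole log.
freq = Counter(q for s in sessions for q in s)

def requery_rate(bucket):
    """Fraction of queries in `bucket` that were followed by another query."""
    followed, total = 0, 0
    for s in sessions:
        for i, q in enumerate(s):
            if q in bucket:
                total += 1
                if i < len(s) - 1:
                    followed += 1
    return followed / total if total else 0.0

head = {q for q, c in freq.items() if c >= 3}  # common queries
tail = {q for q, c in freq.items() if c < 3}   # rare, long-tail queries

print(requery_rate(head), requery_rate(tail))  # the tail repairs more often
```

On real logs you would bucket by finer-grained frequency bands, as the paper does, rather than a single head/tail split.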
Here we see a clear contradiction with the statement that search engines are doing fine with long-tail queries, because the matching techniques in use are not tailored to long-tail queries. I get excellent results when I search for the title of a particular paper, which may be 12 terms long, but that is very, very specific: there can only be one result, so the matching techniques cope easily. If I type in just a few keywords, I don’t get my paper.
For that particular paper to get a lot of traffic, many people must be searching specifically for it. How does this work in online businesses? Apparently long-tail converts. It would be interesting to get a substantial amount of this data and analyse it to see whether that is indeed true. Maybe if you’re getting a return from these queries you should leave it all alone if it’s not broken, but perhaps there are a lot of other variables to take into account:
Query and task complexity and frequency, how they impact your users, query and goal rarity, seeing where you’re losing your long-tail searchers… not just conversion alone, for example.
“Understanding the Relationship of Information Need Specificity to Search Query Length” by Phan, Bailey and Wilkinson (SIGIR 07)
They found that the longer the query, the more specific it was. In this sense we can talk of repair, but how many people here type in a long query first off? “We found an average cross-over point of specificity from broad to narrow of 3 words in the query”. “Broad” and “narrow” queries are used to characterise the “quality of current knowledge and knowledge state specificity”. They found a “statistically highly significant relationship (99% confidence level) between narrow/broad specificity and query length”, with the cross-over between broad and narrow observed at 3 words. They conclude that “as query length increases, the corresponding information need is more likely to be perceived to be narrow”; basically, there is a correlation between query length and the degree of specificity of a query. There are, however, such things as short but specific queries, so they want to look at those next.
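Their cross-over finding suggests a trivially simple classifier: call the underlying need "narrow" once the query passes 3 words. A sketch, with made-up labeled queries standing in for the study's data:

```python
# Hypothetical labeled sample (invented for illustration): queries tagged as
# reflecting a "broad" or "narrow" underlying need, as in Phan et al.'s study.
labeled = [
    ("jaguar", "broad"),
    ("python", "broad"),
    ("cheap flights", "broad"),
    ("jaguar xk8 1997 fuel pump replacement", "narrow"),
    ("python asyncio gather exception handling", "narrow"),
    ("sigir 2007 phan bailey wilkinson query length", "narrow"),
]

CROSSOVER = 3  # the cross-over point reported in the paper

def predict(query):
    """Classify a need as narrow once the query exceeds the cross-over length."""
    return "narrow" if len(query.split()) > CROSSOVER else "broad"

accuracy = sum(predict(q) == label for q, label in labeled) / len(labeled)
print(accuracy)
```

On this hand-picked toy set the length threshold gets everything right; the short-but-specific queries they mention as future work are exactly the cases where it would fail.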
So, like me searching for a particular paper, people look for similarly specific things. I don’t know if most of the population looks for that particular paper but I think not many. How generic can long-tail be?
“A Study of Query Length” by Arampatzis and Kamps (SIGIR 08)
An interesting paper describing an analysis of query length distributions, fitting power-law and Poisson models. The main thing to take away: “The relative steepness of the power-law indicates that users do not need many words to formulate information needs or that the diminishing value of adding words appears soon.”
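A power law is a straight line in log-log space, so the "steepness" they mention can be estimated as the slope of a least-squares fit over a query-length histogram. A sketch with an invented histogram shaped like the steep drop-off the paper describes:

```python
import math

# Hypothetical query-length histogram (invented): counts of queries with
# 1..8 words, dropping off steeply as in the paper's description.
length_counts = {1: 5000, 2: 2600, 3: 900, 4: 350, 5: 140, 6: 60, 7: 25, 8: 10}

# A power law P(len) ~ len**(-alpha) is linear in log-log space, so estimate
# alpha via a least-squares fit on (log len, log count).
xs = [math.log(l) for l in length_counts]
ys = [math.log(c) for c in length_counts.values()]
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
    (x - mx) ** 2 for x in xs
)
alpha = -slope
print(round(alpha, 2))  # steep alpha => little probability mass on long queries
```

The steeper the fitted alpha, the faster the probability of seeing an extra word decays, which is exactly their "diminishing value of adding words".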
This is an old paper from 1999, but you should read it because it describes good evaluation methods, and the results haven’t changed much in 10 years:
“Patterns of Search: Analyzing and Modeling Web Query Refinement” by Lau and Horvitz
The goal of their research was to use Bayesian networks to infer the probability of a user’s next action. Again we see that specialized queries contain more words than others, and that queries become longer as query refinement occurs. They break the data down into different categories, such as education, and found that the overall average query length was 2.30 words; the longest queries were in the education category, with a mean of over 3 words per query.
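Their per-category averages are straightforward to reproduce on any log. A sketch with a made-up log, using the same kind of category breakdown:

```python
from statistics import mean

# Toy log (invented): (category, query) pairs, echoing Lau & Horvitz's
# breakdown of queries into topical categories.
log = [
    ("entertainment", "mp3"),
    ("entertainment", "free music downloads"),
    ("education", "university of washington admissions requirements"),
    ("education", "how to cite a web page apa"),
    ("computing", "windows 98 drivers"),
]

def qlen(query):
    """Query length in whitespace-separated words."""
    return len(query.split())

overall = mean(qlen(q) for _, q in log)

by_cat = {}
for cat, q in log:
    by_cat.setdefault(cat, []).append(qlen(q))
by_cat = {cat: mean(lengths) for cat, lengths in by_cat.items()}

print(overall, by_cat)
```

With real data this would surface the same pattern they report: an overall mean near 2.3 and a noticeably higher mean for education-style queries.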
From the Hitwise data we can see that there are more instances of 5+ word queries than before. But the data also shows that the most prevalent queries are those of 1, 2 and 3 words; those are still, quite significantly, the most used query lengths.
Are the search engines cleverer? Well, the longer your queries the better for me: I’ve spent 5 years trying to prove that people are ready for natural language querying, so please continue! I work with conversational systems, whose point is to retrieve information and present it to you in natural language. What they share with a search engine is the IR part (which is quite vast!). It’s harder in natural language systems because they have more words to deal with, plus grammar, anaphora, etc. A standard search engine matches on fewer dimensions. For both, if the user needs to keep reformulating, the system is not performing well.
For some SEOs the take is that you go for long-tail when you can’t rank for the competitive short queries. They also maintain that long-tail does not convert well enough to be worth it. I’m reading very conflicting things about this.
As a community of SEO practitioners, we should be doing our own experiments. This doesn’t mean in isolation, each looking at our own data, but rather pooling together anonymized logs and other data so we can verify these things. Obviously on a practical level this isn’t so easy, I know, and we have to make money and not shuffle about. In my experience, though, analysing the data well pays off in the long term. As an SEO professional I have never had the opportunity to do this at that scale. I’m certainly ready to, though.