Today’s paper is all about website visitors and how to get the most out of the data you collect. A good example of what to look for and what can be identified is presented by Bruno Goncalves and Jose J. Ramasco, who wrote a paper called “Towards the characterization of individual users through web analytics”. This one is freely available online, so my synopsis will be shorter than usual, as all of you can go and read the full thing.
They looked at temporal patterns, focusing on users’ return to a particular page. They measured the interval between visits and also how likely it was that these visitors would return. The random surfer model has been proved wrong many times, and this study goes on not only to show that once again but also to shed light on the factors that do typify a web user.
They used data from the Emory University website, which received “over 3 million visitors to about 2.5 million pages for a grand total of over 53 million requests”. They measured P(T), the probability of finding an inter-visit time of length T, and R(T), the probability of a user returning to the same site a time T after the first visit. They found a steady and slow decrease in both functions.
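To make the two quantities concrete, here is a minimal sketch of how the raw inputs for P(T) and R(T) could be pulled out of a request log. It assumes each request is a `(visitor_id, page, timestamp_in_seconds)` tuple; that layout and the function name `inter_visit_times` are my own illustration, not something taken from the paper.

```python
from collections import defaultdict


def inter_visit_times(requests):
    """Collect the gaps behind P(T) and R(T) from a request log.

    `requests` is an iterable of (visitor_id, page, timestamp_seconds)
    tuples. This record layout is an assumption made for this sketch.
    """
    # Group timestamps by (visitor, page) so gaps are per user and per page.
    visits = defaultdict(list)
    for visitor, page, timestamp in requests:
        visits[(visitor, page)].append(timestamp)

    intervals = []    # gaps between consecutive visits (input for P(T))
    since_first = []  # gaps measured from the first visit (input for R(T))
    for times in visits.values():
        times.sort()
        for previous, current in zip(times, times[1:]):
            intervals.append(current - previous)
        for current in times[1:]:
            since_first.append(current - times[0])
    return intervals, since_first


log = [
    ("a", "/x", 0),
    ("a", "/x", 10),
    ("a", "/x", 40),
    ("b", "/x", 5),   # a single visit produces no gap
    ("a", "/y", 7),
]
intervals, since_first = inter_visit_times(log)
```

Binning these two lists into histograms (logarithmic bins suit data spread over many orders of magnitude) would give empirical estimates of the two functions.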
Their findings reveal that:
- Visitors didn’t return to a previously visited page
- Automated processes (like bots) could be identified through their regular dynamics
- Intervals between 2 consecutive requests to a given web page obey a power-law distribution with exponent over about 5 orders of magnitude in time.
- This finding showed that there is no meaningful average and that big fluctuations are possible, so the “normal” user does not exist.
- Around 68% of requests were due to the visitor already having seen that page in the past
- They found that they could classify visitors into categories: those who hit refresh or “back”, and students and staff who come back (usually after a 16-hour period) to finish a task or check for updates.
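If you want to check whether the intervals on your own site look like a power law, one common approach is the continuous maximum-likelihood estimate of the exponent. The sketch below is that standard estimator, not necessarily the fitting procedure the authors used, and the cutoff `t_min` is a value you have to choose yourself.

```python
import math


def power_law_exponent(intervals, t_min):
    """Maximum-likelihood estimate of alpha for p(t) ~ t^(-alpha), t >= t_min.

    Uses the continuous estimator alpha = 1 + n / sum(ln(t_i / t_min)).
    `t_min` is an analyst-chosen cutoff (an assumption of this sketch).
    """
    tail = [t for t in intervals if t >= t_min]
    log_sum = sum(math.log(t / t_min) for t in tail)
    if log_sum == 0:
        raise ValueError("need at least one interval strictly above t_min")
    return 1.0 + len(tail) / log_sum


# Synthetic check: draw intervals from a known power law by inverse transform.
true_alpha, t_min, n = 2.5, 1.0, 10_000
sample = [
    t_min * (1 - (i + 0.5) / n) ** (-1 / (true_alpha - 1))
    for i in range(n)
]
estimate = power_law_exponent(sample, t_min)
```

On the synthetic sample the estimate should land close to the exponent used to generate it. On real logs it is worth trying several values of `t_min`, since the fit only describes the tail above that cutoff.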
The main idea to take away for site owners and SEOs is that categories of visitors need to be considered. Instead of lumping all visitors together and looking at the overall patterns, it is probably more insightful to identify different categories of users based on their behaviour. These categories could be identified by looking at browsing patterns, temporal information, purchase history or product interest (identified through page interest). Many other variables can be found in website data if it’s filtered and analysed correctly. Metrics provided by analytics systems are useful, but carrying out further data mining work on them can give you more detail and insight.
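As a rough illustration of that kind of segmentation, here is a small sketch that labels a visitor from the gaps between their requests. The category names and both thresholds (`regular_cv` and `short_gap`) are made up for this example and would need tuning against real data; the paper does not prescribe them.

```python
import statistics


def classify_visitor(intervals, regular_cv=0.1, short_gap=300.0):
    """Label a visitor from the gaps (in seconds) between their requests.

    The labels and thresholds here are hypothetical, chosen only to
    illustrate the idea of behaviour-based categories.
    """
    if len(intervals) < 3:
        return "unknown"  # too little history to say anything

    mean = statistics.fmean(intervals)
    # Very regular gaps (low coefficient of variation) suggest a bot.
    if mean > 0 and statistics.pstdev(intervals) / mean < regular_cv:
        return "automated"
    # Mostly short gaps look like refresh or "back" behaviour.
    if statistics.median(intervals) < short_gap:
        return "refresher"
    # Otherwise the visitor tends to come back after a long pause.
    return "returner"


labels = {
    "crawler": classify_visitor([3600, 3600, 3601, 3599]),
    "impatient": classify_visitor([5, 40, 2, 120]),
    "student": classify_visitor([57600, 60000, 90000, 30000]),
}
```

The same pattern extends to the other signals mentioned above: compute a few features per visitor and segment on them, rather than reporting one site-wide average.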