I came across this presentation by the National Library of Medicine and thought it offered great insight into working with Twitter data. They looked at tracking the H1N1 virus using the MEDLINE prototype.
MEDLINE is the National Library of Medicine’s online library that contains 11 million citations and abstracts from health and medical journals. MEDLINE is the main component of PubMed, which is a huge database of citations all related to the field of health. Those working in health research will be familiar with it, but I would say that computer scientists working in information retrieval know the collection more intimately. It is well structured, so we use it to test algorithms.
This is a nice presentation about taking the MEDLINE idea and applying it to Twitter data – enjoy.





It is interesting. Do you know if they have a paper?
I have no idea, sorry!