Content analysis toolkit (CAT)

It’s not a list of tools; it’s one easy-to-use tool. CAT is used to explore relationships in text and to get a better overview of how the text is made up and which elements are important in it. CAT also works on an entire folder, so you can get some pretty interesting information on what your collection of documents is all about. It identifies overarching topic groups and constructs a model that represents your text(s).

Topics in CAT consist of a number of words ordered by descending probability (i.e. the words that best describe the topic precede words that describe it to a lesser extent). Associated documents and other closely related topics further characterise a topic. The user can navigate the model by focusing on a given element (a topic, document or word) and following its relationships to associated elements (the most relevant topics, documents and words). The user, as an expert or a student in the field, may supplement each topic with a label to describe it better. The user may also open any document in the analysed collection directly from the CAT interface to examine its content.
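
To make that a bit more concrete, here is a rough sketch in Python of how a model like this could be represented and navigated. It is not CAT's own code or data format: the `Topic` class, the helper functions, the file names and the probabilities are all made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Topic:
    """Hypothetical topic record; not CAT's actual data structure."""
    word_probs: Dict[str, float]                 # word -> probability within the topic
    doc_weights: Dict[str, float] = field(default_factory=dict)  # document -> relevance
    label: str = ""                              # optional user-supplied label

    def top_words(self, n: int = 5) -> List[str]:
        # Words ordered by descending probability, best descriptors first.
        ranked = sorted(self.word_probs.items(), key=lambda kv: kv[1], reverse=True)
        return [word for word, _ in ranked[:n]]


def topics_for_word(topics: Dict[str, Topic], word: str, n: int = 3) -> List[str]:
    # Focus on a word and list the topics in which it is most probable.
    scored = [(name, t.word_probs[word]) for name, t in topics.items()
              if word in t.word_probs]
    scored.sort(key=lambda kv: kv[1], reverse=True)
    return [name for name, _ in scored[:n]]


def documents_for_topic(topic: Topic, n: int = 3) -> List[str]:
    # Focus on a topic and list its most strongly associated documents.
    ranked = sorted(topic.doc_weights.items(), key=lambda kv: kv[1], reverse=True)
    return [doc for doc, _ in ranked[:n]]


# Toy data, invented for this example.
topics = {
    "t0": Topic({"search": 0.30, "ranking": 0.25, "query": 0.20, "index": 0.05},
                {"seo_notes.txt": 0.8, "ir_paper.pdf": 0.6}),
    "t1": Topic({"topic": 0.35, "model": 0.30, "document": 0.15, "query": 0.02},
                {"lda_intro.pdf": 0.9, "ir_paper.pdf": 0.3}),
}
topics["t1"].label = "Topic modelling"  # a user-supplied label

print(topics["t0"].top_words(2))
print(topics_for_word(topics, "query"))
print(documents_for_topic(topics["t1"], 1))
```

Hopping from a word to its topics and from a topic to its documents is the kind of navigation described above, just with dictionaries instead of a graphical interface.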

As part of exploring and understanding the content of a collection of documents, users may also rate documents according to their perceived usefulness or quality. For each analysis, CAT returns a vocabulary of all the significant words and terms it encountered, along with a weight indicating how general or specific each word or term is. Users can use the integrated look-up facilities to establish the meaning of a given word, acronym or term using Google, on-line dictionaries, gazetteers or Wikipedia. Words or terms may be supplemented with a definition or description to capture the newly discovered meaning of an unknown word or term.
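
I don't know how CAT actually computes its generality/specificity weight, so the sketch below swaps in a well-known stand-in: inverse document frequency (IDF), where a word that appears in every document scores 0 and rarer words score higher. The function name `vocabulary_weights` and the three toy documents are assumptions for illustration only.

```python
import math
import re
from collections import Counter
from typing import Dict, List


def vocabulary_weights(docs: List[str]) -> Dict[str, float]:
    """Map each word to an IDF-style specificity weight.

    IDF is used here as an assumed stand-in; it is not CAT's documented method.
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for text in docs:
        # Count each word at most once per document.
        doc_freq.update(set(re.findall(r"[a-z]+", text.lower())))
    # log(N / df): 0.0 for words in every document, larger for rarer words.
    return {word: math.log(n_docs / df) for word, df in doc_freq.items()}


# Toy collection, invented for this example.
docs = [
    "search engines rank pages",
    "topic models describe pages",
    "search queries hit pages",
]
weights = vocabulary_weights(docs)

# List the vocabulary from most specific to most general.
for word, weight in sorted(weights.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{word:10s} {weight:.3f}")
```

In this toy collection "pages" shows up in every document and so comes out as the most general word, while words found in a single document come out as the most specific.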

Here’s a screenshot representing one of the smaller folders in my ever-growing, messy library! CAT is useful because I can tell what’s in each folder, and when I have hundreds of texts in there, it makes life a lot easier. Another good use for CAT is looking at content across an entire site rather than just one page at a time. Now that’s pretty interesting, don’t you think? You can get a lot more stats out of CAT than this, though.

CAT graph

The software is currently in public beta, so you can just go and download it and start playing with it right away if you think this is the sort of thing that might be useful to you. Knowing what the overriding topics are in a large site is pretty interesting. In fact, I’d like to see a version that can spider a site and pick all that out; it’d be great to have some comparative data.

© 2009-2010 Science for SEO All Rights Reserved -- Copyright notice by Blog Copyright
