Search engines to calculate implicit semantic relations

 


I came across a cool paper called “Measuring the Similarity between Implicit Semantic Relations using Web Search Engines” by Bollegala, Matsuo and Ishizuka from the University of Tokyo (WSDM 2009).

It’s all about measuring how similar the implicit semantic relations between word pairs are, using search engines.  You enter two words and it works out the relation between them.

For example:

[Google, Youtube] – here the relation is ACQUIRER-ACQUIREE.  A similar pair would be [Yahoo, Inktomi].  You could find all the pairs that share this type of relation if you wanted to.

[ostrich, bird] and [lion, cat] – an ostrich is a large bird and a lion is a large cat, so the implicit relation is LARGE.

[Muslim church] should return “mosque”

[Hindu bible] should return “the Vedas”

Existing keyword-based search engines can’t really do this because they retrieve documents that match the user query and not the relationships between the keywords provided.  

How it works:

1 – A query is entered

2 – A web search is run to find the context of the word pair

3 – Lexical patterns are extracted

4 – The patterns are clustered using feature vectors

5 – Inter-cluster correlation is computed

6 – The relational similarity score is calculated
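Step 3 is the easiest one to picture in code. Here is a minimal sketch, assuming a much simpler extractor than the one in the paper: the two query words are replaced by the placeholders X and Y, and any short run of tokens that links one placeholder to the other is kept as a pattern. The function names, the `max_gap` parameter and the snippets are all made up for illustration.

```python
import re
from collections import Counter

def extract_patterns(snippet, a, b, max_gap=4):
    """Return lexical patterns linking a and b in one snippet.

    Assumes a and b are single tokens. At most max_gap tokens may sit
    between the two placeholders.
    """
    tokens = re.findall(r"\w+|[^\w\s]", snippet.lower())
    a, b = a.lower(), b.lower()
    marked = ["X" if t == a else "Y" if t == b else t for t in tokens]
    patterns = []
    for i, start in enumerate(marked):
        if start not in ("X", "Y"):
            continue
        # Look ahead for the nearest placeholder within the allowed gap.
        for j in range(i + 1, min(i + max_gap + 2, len(marked))):
            end = marked[j]
            if end in ("X", "Y"):
                if end != start:
                    patterns.append(" ".join(marked[i:j + 1]))
                break
    return patterns

def pattern_counts(snippets, a, b):
    """Count how often each pattern occurs across all snippets."""
    counts = Counter()
    for snippet in snippets:
        counts.update(extract_patterns(snippet, a, b))
    return counts

# Made-up snippets standing in for real search results.
snippets = [
    "Google acquired YouTube in 2006.",
    "YouTube was bought by Google.",
]
print(pattern_counts(snippets, "Google", "YouTube"))
```

On these two snippets the sketch finds the patterns "X acquired Y" and "Y was bought by X", once each. Note that both describe the same relation in different words, which is why the later clustering step is needed.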

The lexical patterns are extracted automatically, and the similarity between different semantic relations is computed using an inter-cluster correlation matrix.
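To give a feel for the clustering step, here is a rough sketch. It assumes a greedy one-pass scheme: each pattern is described by a vector of counts over the word pairs it occurred with, and a pattern joins the most similar existing cluster if the cosine similarity reaches a threshold, otherwise it starts a new cluster. The threshold, the data and the function names are my own assumptions, and the paper's procedure may differ in its details.

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[k] * v.get(k, 0) for k in u)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def cluster_patterns(pattern_vectors, threshold=0.5):
    """Greedily group patterns that occur with similar word pairs.

    pattern_vectors maps pattern -> {word_pair: count}.
    Returns a list of clusters, each a list of patterns.
    """
    clusters = []  # each cluster keeps its patterns and a summed centroid
    # Visit frequent patterns first so they seed the clusters.
    order = sorted(pattern_vectors,
                   key=lambda p: -sum(pattern_vectors[p].values()))
    for pattern in order:
        vec = pattern_vectors[pattern]
        best, best_sim = None, threshold
        for cluster in clusters:
            sim = cosine(vec, cluster["centroid"])
            if sim >= best_sim:
                best, best_sim = cluster, sim
        if best is None:
            clusters.append({"patterns": [pattern], "centroid": dict(vec)})
        else:
            best["patterns"].append(pattern)
            for pair, count in vec.items():
                best["centroid"][pair] = best["centroid"].get(pair, 0) + count
    return [c["patterns"] for c in clusters]

# Made-up counts, purely for illustration.
vectors = {
    "X acquired Y": {("google", "youtube"): 5, ("yahoo", "inktomi"): 3},
    "Y was bought by X": {("google", "youtube"): 2, ("yahoo", "inktomi"): 2},
    "X is a large Y": {("ostrich", "bird"): 4, ("lion", "cat"): 3},
}
print(cluster_patterns(vectors))
```

With these toy counts the two acquisition patterns end up in one cluster and "X is a large Y" in another, because the acquisition patterns occur with the same word pairs.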

More on the web-search method:

They use text snippets returned by a Web search engine as an approximation of the context of two words.  

They also issue multiple queries per word pair, because the ranking of results changes with the number of wildcards used, and then aggregate the search results.
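As a sketch of what that could look like, the snippet below builds quoted queries with one to three `*` wildcards between the two words, in both orders, and merges the snippets that come back. The exact query format is an assumption on my part, and `search` is a hypothetical callable standing in for whatever search API you have access to, not a real library function.

```python
def wildcard_queries(a, b, max_wildcards=3):
    """Build phrase queries with 1..max_wildcards wildcards, both orders."""
    queries = []
    for n in range(1, max_wildcards + 1):
        gap = " ".join(["*"] * n)
        queries.append(f'"{a} {gap} {b}"')
        queries.append(f'"{b} {gap} {a}"')
    return queries

def aggregate_snippets(queries, search):
    """Run every query and merge the snippets, dropping exact duplicates.

    search is a hypothetical function: query string -> list of snippets.
    """
    seen = set()
    merged = []
    for query in queries:
        for snippet in search(query):
            if snippet not in seen:
                seen.add(snippet)
                merged.append(snippet)
    return merged

def fake_search(query):
    # Stand-in for a real search engine call; returns canned snippets.
    return ["Google acquired YouTube in 2006.", f"result for {query}"]

queries = wildcard_queries("Google", "YouTube", max_wildcards=2)
print(queries)
print(len(aggregate_snippets(queries, fake_search)))
```

The first query in the list is `"Google * YouTube"`. Since each wildcard count pulls in a differently ranked result list, merging them gives a broader sample of contexts than any single query would.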

Similarities between words:

Attributional similarity measure: If two words show a high degree of attributional similarity they are called synonyms.

Relational similarity measure: Word-pairs that show a high degree of relational similarity are considered as analogies.
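To make relational similarity concrete, here is a toy calculation. Each word pair is a vector of counts over pattern clusters, and a correlation matrix says how related the clusters are to each other. The formula below is a generalised cosine that I am assuming for illustration; it is not taken from the paper, and the cluster names, counts and correlation values are invented.

```python
import math

def bilinear(x, m, y):
    """Compute x^T M y for plain Python lists."""
    return sum(x[i] * m[i][j] * y[j]
               for i in range(len(x)) for j in range(len(y)))

def relational_similarity(x, y, corr):
    """Generalised cosine between two word-pair vectors.

    corr[i][j] says how strongly cluster i and cluster j express the same
    relation. It should be symmetric and positive semidefinite so that the
    square root below is defined.
    """
    denom = math.sqrt(bilinear(x, corr, x) * bilinear(y, corr, y))
    return bilinear(x, corr, y) / denom if denom else 0.0

# Clusters: 0 = "X acquired Y", 1 = "Y was bought by X", 2 = "X is a large Y"
corr = [
    [1.0, 0.8, 0.0],
    [0.8, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

google_youtube = [3, 0, 0]   # only seen with the active pattern
yahoo_inktomi = [0, 2, 0]    # only seen with the passive pattern
ostrich_bird = [0, 0, 4]

print(relational_similarity(google_youtube, yahoo_inktomi, corr))
print(relational_similarity(google_youtube, yahoo_inktomi, identity))
print(relational_similarity(google_youtube, ostrich_bird, corr))
```

With the correlation matrix, [Google, YouTube] and [Yahoo, Inktomi] score about 0.8 even though they share no cluster directly. With the identity matrix (plain cosine) they score 0, and so do [Google, YouTube] and [ostrich, bird] either way. That is the point of looking at correlations between clusters: two pairs can be analogous even when the relation was phrased differently in the text.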

Why it’s difficult to do:

1 – Relational similarity is a dynamic phenomenon.  Relations between companies, people and so on change constantly

2 – All relations between the two words in each word pair have to be extracted before similarity can be measured

3 – A particular semantic relation can be expressed in more than one way in a text

4 – WordNet does not cover all the named entities (nouns, proper nouns) that occur in queries

Why it’s good:

It performs really well.  It “significantly outperforms the state-of-the-art relational similarity measure in a relation classification task”.  

It doesn’t require NLP preprocessing to complete the task.

It’s language independent.

Why should you care?

It allows you to find the relationships between different entities, words, or whatever on the web.  It gives really good insight into how the search engines function/could function as far as putting queries in context is concerned.  As an SEO professional it could give you a further method to look into your keywords, and as a computing professional, it gives you an interesting idea to build on.  It also fits nicely into a lot of different systems.



6 Comments Add Yours ↓

  1. 1

    Thanks for a very in-depth article on this very interesting topic. Really enjoyed it.

  2. CJ #
    2

    You’re welcome!

  3. 3

    Thanks for the article, i have an Online Business Optimization Agency sembuilders.com, where i wrote about this a few months ago, still this approach will bring new questions onto this matter, great stuff! we are investigating about semantics and its future on the serp’s but hilltop will always be a secret, even for super senior seo guys ;) .

  4. CJ #
    4

    Thanks Gaston, it definitely is an interesting area to look at.

  5. elhoim #
    5

    It seems you can even get the video lecture:
    http://videolectures.net/wsdm09_bollegala_msrisr/

    or other lectures from WSDM09:
    http://videolectures.net/site/search/?q=wsdm09

    Please make nice summaries, like you do so well! :-)

  6. CJ #
    6

    Thanks for sharing those links and for the compliment :)








© 2009-2010 Science for SEO All Rights Reserved -- Copyright notice by Blog Copyright
