Ranking Methods in Full Text Search

Posted on April 29, 2008



This comes straight from MSDN, but I was asked about it specifically, so I thought I’d post about it. If you have used full-text search, you will know that the default way it ranks results is pretty good. It uses a well-known method called the Jaccard coefficient. However, there are times when you might want something else. SharePoint 2007 (and, I assume, later versions) uses Enterprise Search, which supports the following ranking methods. I have no idea if or when this kind of thing will be supported out of the box in SQL Server 2008 or future versions.

JACCARD COEFFICIENT (default if not specified)

Calculates ranking results from the relative proportion of matching terms, excluding any terms that are not matched.

DICE COEFFICIENT

Calculates ranking results from the frequency of multiple terms found together, compared with the probability that they are found in isolation.
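For intuition, here is a minimal Python sketch of the textbook, set-based versions of the two coefficients above. It is an illustration only: the function names and sample terms are made up, and the search engine's actual formulas (which work on weighted ranks, not plain term sets) are not documented in this post.

```python
def jaccard(query_terms, doc_terms):
    """Textbook Jaccard coefficient: shared terms / all distinct terms."""
    a, b = set(query_terms), set(doc_terms)
    if not a and not b:
        return 0.0  # convention chosen for this sketch: two empty sets score 0
    return len(a & b) / len(a | b)


def dice(query_terms, doc_terms):
    """Textbook Dice coefficient: 2 * shared terms / (size of A + size of B)."""
    a, b = set(query_terms), set(doc_terms)
    if not a and not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


# Hypothetical query and document terms, purely for illustration.
query = ["computer", "software"]
doc = ["computer", "hardware", "review"]

print(jaccard(query, doc))  # 1 shared term out of 4 distinct terms -> 0.25
print(dice(query, doc))     # 2 * 1 / (2 + 3) -> 0.4
```

Both measures reward overlap between the query and the document; Dice simply weights the shared terms more heavily, so it never scores lower than Jaccard for the same pair.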

INNER PRODUCT

Calculates ranking results by using the integral of the products of the ranks of the individual matching documents.

MINIMUM

Calculates ranking results from the lowest rank score from all the matching documents.

MAXIMUM

Calculates ranking results from the highest rank score from all the matching documents.
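The last three methods are easier to picture with numbers. The sketch below is my own reading, not the engine's implementation: it assumes each match term has a query weight and a match score between 0 and 1, and it treats the "integral of the products" as a plain sum of products. The names `weights` and `scores` and the sample values are hypothetical.

```python
def inner_product(weights, scores):
    """Sum of weight * score over all terms (assumed reading of INNER PRODUCT)."""
    return sum(w * s for w, s in zip(weights, scores))


def minimum(weights, scores):
    """Lowest weighted score among the terms (assumed reading of MINIMUM)."""
    return min(w * s for w, s in zip(weights, scores))


def maximum(weights, scores):
    """Highest weighted score among the terms (assumed reading of MAXIMUM)."""
    return max(w * s for w, s in zip(weights, scores))


# Hypothetical values for two match terms, e.g. "computer" and "software".
weights = [1.0, 0.5]   # query-side weight per term
scores = [0.5, 0.25]   # how strongly each term matched

print(inner_product(weights, scores))  # 0.5 + 0.125 -> 0.625
print(minimum(weights, scores))        # 0.125
print(maximum(weights, scores))        # 0.5
```

MINIMUM behaves like a strict AND (one weak term drags the rank down), MAXIMUM like an OR (one strong term is enough), and INNER PRODUCT sits in between by accumulating every contribution.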

Now, if you want to specify the RANKMETHOD when you perform your ISABOUT query, you can simply state the method you want to use.

ISABOUT ( <match_terms> RANKMETHOD <rank_method> )

An example of which is:

WHERE CONTAINS(Description, 'ISABOUT("computer","software") RANKMETHOD INNER PRODUCT')
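If you build these queries in code, a small helper can guard against misspelled method names. The following Python sketch is hypothetical (the function `isabout_predicate` is not part of any SharePoint API); it only assembles a predicate string laid out like the example above and checks the method against the five names listed in this post.

```python
# The five rank methods named in this post.
RANK_METHODS = {
    "JACCARD COEFFICIENT",
    "DICE COEFFICIENT",
    "INNER PRODUCT",
    "MINIMUM",
    "MAXIMUM",
}


def isabout_predicate(column, terms, rank_method="JACCARD COEFFICIENT"):
    """Build a CONTAINS(... ISABOUT ... RANKMETHOD ...) predicate string."""
    method = rank_method.strip().upper()
    if method not in RANK_METHODS:
        raise ValueError(f"Unknown rank method: {rank_method!r}")
    if not terms:
        raise ValueError("At least one match term is required")
    for term in terms:
        # Keep the sketch simple: refuse terms that would break the quoting.
        if '"' in term or "'" in term:
            raise ValueError(f"Quotes are not allowed in terms: {term!r}")
    quoted = ",".join(f'"{term}"' for term in terms)
    return f"CONTAINS({column}, 'ISABOUT({quoted}) RANKMETHOD {method}')"


print("WHERE " + isabout_predicate("Description", ["computer", "software"], "inner product"))
```

Rejecting quotes outright is a deliberately blunt choice for the sketch; real code would need proper escaping for whatever query API it targets.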

I hope this helps clarify the different ranking methods there are.
