Ranking and Navigation
Rubicon 2 offers a variety of ranking and navigation options.
These options affect the ordering of records and can involve significant performance
tradeoffs, depending on the number of matches and the component configuration.
The purpose of this document is to provide an overview of ranking and navigation. You should be familiar with RankMode, SearchMode, and the FindFirst/Next/Prior/Last properties and methods. For a detailed description of the classes, methods, and properties discussed here, please see the Rubicon documentation and help files.

Rubicon indexes only contain information on whether a word appears in a given location. The indexes do not contain the number of times a word appears in a location, nor any positional information about the word. This keeps the Words table much smaller than it otherwise would be, but it requires that ranking be performed dynamically.

Since ranking can be time consuming, it is not performed until it must be. For instance, a search that brings back a large number of matches may never be viewed, so time should not be spent ranking it. Therefore, ranking, if any, usually occurs when RankArray is first accessed, typically by the FindFirst method or when TrbMatchMaker is executed.

Ranking is based on how many times the words that matched the search criteria appear in the matching location. For searches using slNear or slPhrase SearchLogic, the ranking process only scores the matching words in the record, not whether the words are near each other or in sequence.

Ranking usually requires that all the matching records be read and ranked, and that the rank results be sorted in an internal array. If there are 100 matching records and the RankMode is rmCount, a call to FindFirst has to rank all the matching records in order to move to the record with the highest rank. Subsequent calls to FindFirst do not require re-ranking the results unless the search has changed.

There are four RankModes: rmNone, rmCount, rmPercent, and rmPresense.

rmNone leaves the records in index or natural order, as determined by the IndexFieldName used to make the TextLink. Since the TextLink may be open on another index, the ordering of records may not be consistent with other views of the table.

rmCount orders the records by a count of the matching words.

rmPercent is like rmCount, except that results are normalized to a scale of 100.

rmPresense orders the records by counting whether or not each matching word appears in the record. Each matching word has a maximum score or rank of one. If the search is for "delphi or paradox or dbase" and all three words appear in the text, the rank is three (even if a word appears multiple times); if only two words appear, the rank is two, and so on. Note that this mode is only useful when the search uses OR search logic; otherwise all the scores will be the same.
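The scoring rules for the rank modes can be illustrated with a small model. The following is a minimal Python sketch, not Rubicon code: the function names, the whitespace tokenizer, and the sample records are all hypothetical, and the rmPercent scaling (best record maps to 100) is an assumption, since the text only says the results are normalized to a scale of 100.

```python
from collections import Counter

def rank_record(text, terms, mode):
    """Score one record for the search terms under a (modelled) rank mode."""
    words = Counter(text.lower().split())  # naive tokenizer, for illustration
    if mode == "rmNone":
        return 0  # no ranking: records stay in index order
    if mode == "rmPresense":
        # each matching word contributes at most one point
        return sum(1 for t in terms if words[t] > 0)
    # rmCount: total occurrences of all matching words
    return sum(words[t] for t in terms)

def rank_all(records, terms, mode):
    """Score every record; rmPercent rescales the rmCount scores."""
    raw_mode = "rmCount" if mode == "rmPercent" else mode
    scores = [rank_record(r, terms, raw_mode) for r in records]
    if mode == "rmPercent":
        top = max(scores, default=0)
        # Assumption: the highest-scoring record is mapped to 100
        scores = [round(100 * s / top) if top else 0 for s in scores]
    return scores

# Hypothetical sample data
records = ["delphi delphi paradox", "delphi paradox dbase", "dbase"]
terms = ["delphi", "paradox", "dbase"]

count_scores = rank_all(records, terms, "rmCount")        # [3, 3, 1]
presense_scores = rank_all(records, terms, "rmPresense")  # [2, 3, 1]
percent_scores = rank_all(records, terms, "rmPercent")    # [100, 100, 33]
```

In this model the first record outranks nothing under rmCount (it ties with the second), but under rmPresense it drops below the second record because the repeated word counts only once.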
Except as noted, the order of the records in the examples is the same when using the FindFirst/Next/Prior/Last methods and when using TrbMatchMaker. In the above cases, ranking occurs when the first record is accessed by FindFirst or before the first record is added to the Match table (except for rmNone, which has no ranking).

Case #2: Restricting RankLimit
Setting RankLimit to a lower value reduces the time spent ranking, but only the first RankLimit records are ranked. Records 743 and above are not ranked.
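As a rough illustration of the effect of a limit, the sketch below ranks only the first rank_limit matches and leaves the rest unranked. It is a Python model under stated assumptions, not the Rubicon implementation: rank_with_limit, the record ids, and the scores are hypothetical, and how Rubicon presents the unranked records during navigation is not modelled. The reverse flag models processing that starts from the last record and works backward.

```python
def rank_with_limit(match_ids, score_of, rank_limit, reverse=False):
    """Rank at most rank_limit matches; return (ranked, unranked) id lists."""
    # Processing order: natural order, or last-to-first when reverse is set
    ordered = list(reversed(match_ids)) if reverse else list(match_ids)
    to_rank, unranked = ordered[:rank_limit], ordered[rank_limit:]
    # sorted() is stable, so ties keep their processing order
    ranked = sorted(to_rank, key=score_of, reverse=True)
    return ranked, unranked

# Hypothetical record ids and scores
scores = {101: 2, 102: 5, 103: 1, 104: 9, 105: 4}
ids = sorted(scores)

ranked, unranked = rank_with_limit(ids, scores.get, rank_limit=3)
# ranked == [102, 101, 103]; unranked == [104, 105]

ranked_rev, unranked_rev = rank_with_limit(ids, scores.get, rank_limit=3,
                                           reverse=True)
# ranked_rev == [104, 105, 103]; unranked_rev == [102, 101]
```

Note that record 104 has the highest score but is never ranked in the forward run, which is the cost of the limit: less time spent ranking, at the risk of missing strong matches outside the ranked window.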
The results are just like the previous case, except that processing starts from the last record in the table and works backward. Records 731 and below are not ranked, which may be desirable when records are being added sequentially to the database and you are only interested in the most recent records.

When the soNavReverse option is enabled and RankMode is rmNone, FindFirst moves to record 835, FindLast moves to record 708, and the FindNext and FindPrior methods are also reversed.

Case #4: Natural Ordering, Efficient Ranking
When RankMode is rmPercent, the results are just like Case #2, but the records are ordered naturally. Only the first RankLimit records are ranked. The rmCount and rmPresense results rank all 10 records even though the RankLimit is 5.

What is going on? Unlike the previous cases, which had to rank all the records beforehand, this case needs no ranking in advance of the first navigation. Ranking can therefore be postponed until the dataset has actually moved to the record, and since the dataset is already at the matching record, it might as well be ranked. This is the most efficient way to rank.

When using TrbMatchMaker, it may be more efficient to create the dataset as described in this case and then add an index on the Rank field to achieve the ordering of Case #1.
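The idea of ranking each record only when navigation reaches it can be sketched as follows. This is a Python model of the principle, not of Rubicon itself: LazyRankCursor, its attributes, and the sample scores are hypothetical names introduced for illustration.

```python
from itertools import islice

class LazyRankCursor:
    """Walk matches in natural order, ranking each record on first visit."""

    def __init__(self, match_ids, score_of):
        self._ids = list(match_ids)
        self._score_of = score_of
        self.ranks = {}        # record id -> rank, filled in on visit
        self.rank_calls = 0    # how many records have actually been ranked

    def __iter__(self):
        for rec_id in self._ids:
            if rec_id not in self.ranks:
                self.ranks[rec_id] = self._score_of(rec_id)
                self.rank_calls += 1
            yield rec_id, self.ranks[rec_id]

# Hypothetical record ids and scores
scores = {101: 2, 102: 5, 103: 1, 104: 9, 105: 4}
cursor = LazyRankCursor(sorted(scores), scores.get)

first_two = list(islice(cursor, 2))   # only two records are ranked so far
calls_after_two = cursor.rank_calls

rows = list(cursor)                   # a full walk ranks the remaining three
calls_after_all = cursor.rank_calls

# Sorting the finished rows by rank afterwards is analogous to adding an
# index on the Rank field of the Match table.
by_rank = sorted(rows, key=lambda row: row[1], reverse=True)
```

A viewer that only looks at the first few records pays for ranking only those records, and the rank-ordered view is still available later by sorting the completed result.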
Copyright 2003 © Tamarack Associates
www.TamarackA.com | Last updated 10/29/00 | www.FullTextSearch.com