Chapter 4 | RNI Elasticsearch Plugin

4. Interpreting RNI Scores

RNI scores range from 0 to 1. The higher the score, the greater the confidence that the result is a relevant match. A score of 1.0 indicates that the query name string and the result name string are identical, including all name properties. Scores below 1.0 are returned for similar names, where the query name and the index name differ in one or more properties (such as language of origin) and/or in one or more of the following ways:

Variation	Example(s)
Phonetic and/or spelling differences	Nayif Hawatmeh and Nayif Hawatma
Missing name components	Mohammad Salah and Mohammad Abd El-Hamid Salah
Rarity of a shared name component	Two English names that contain Ditters are more likely to match than two names that contain Smith
Initials	John F. Kennedy and John Fitzgerald Kennedy
Nicknames	Bobby Holguin and Robert Holguin
"Cousin" or cognate names	Pedro Calzon and Peter Calzon
Uppercase/Lowercase	Rosa Elena PACHECO and Rosa Elena Pacheco
Reordered name components	Zedong Mao and Mao Zedong
Variable segmentation	Henry Van Dick and Henry VanDick
Corresponding name fields	For [Katherine][Anne][Cox], the similarity with [Katherine][Ann][Cox] is higher than the similarity with [Katherine Ann][Cox]
Truncation of name elements	For Sawyer, the similarity with Sawy is higher than the similarity with Sawi
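
Because a higher score means greater confidence, one common way to consume scores is to keep only results at or above a cutoff. The following is a minimal sketch of that idea, not part of the plugin. The function name filter_matches, the (name, score) pair layout, the 0.8 cutoff, and the sample scores are all assumptions made for illustration; they are not RNI output or recommended settings.

```python
from typing import Iterable, List, Tuple


def filter_matches(
    hits: Iterable[Tuple[str, float]], min_score: float
) -> List[Tuple[str, float]]:
    """Keep hits scoring at or above min_score, highest score first.

    Hypothetical helper: 'hits' is assumed to be (name, score) pairs
    already extracted from a search response.
    """
    kept = []
    for name, score in hits:
        # Scores are documented to lie between 0 and 1.
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score out of range for {name!r}: {score}")
        if score >= min_score:
            kept.append((name, score))
    return sorted(kept, key=lambda hit: hit[1], reverse=True)


# Made-up scores for illustration only; not real RNI output.
sample_hits = [
    ("Robert Holguin", 0.91),
    ("Roberta Holgate", 0.62),
    ("Bobby Holguin", 1.0),
]

# The 0.8 cutoff is an arbitrary example value.
matches = filter_matches(sample_hits, min_score=0.8)
```

A suitable cutoff depends on the data and on how costly false matches are compared with missed matches, so it is best chosen by testing against representative names.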

Scoring is commutative: the scores for two given names are always the same, regardless of which name is in the index and which name is in the query.
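
The commutative property can be checked for any scoring function by scoring each pair of names in both directions. The sketch below shows one way to do that. It does not call RNI: jaccard_score is a stand-in scorer based on shared name tokens, and find_asymmetric_pairs is a hypothetical helper name, both introduced here only to make the example runnable.

```python
from itertools import combinations
from typing import Callable, List, Sequence, Tuple


def jaccard_score(a: str, b: str) -> float:
    """Stand-in scorer (NOT the RNI algorithm): overlap of name tokens."""
    tokens_a = set(a.lower().split())
    tokens_b = set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def find_asymmetric_pairs(
    names: Sequence[str],
    score: Callable[[str, str], float],
    tolerance: float = 1e-9,
) -> List[Tuple[str, str]]:
    """Return the pairs whose score changes when the arguments are swapped."""
    asymmetric = []
    for a, b in combinations(names, 2):
        if abs(score(a, b) - score(b, a)) > tolerance:
            asymmetric.append((a, b))
    return asymmetric


names = [
    "Mao Zedong",
    "Zedong Mao",
    "Mohammad Salah",
    "Mohammad Abd El-Hamid Salah",
]

# An empty list means no pair depended on argument order.
asymmetric = find_asymmetric_pairs(names, jaccard_score)
```

To run the same check against a real deployment, replace jaccard_score with a function that indexes one name, queries with the other, and returns the resulting score.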