String matching algorithm From Wikipedia, the free encyclopedia
Trigram search is a method of searching for text when the exact syntax or spelling of the target object is not precisely known[1] or when queries may be regular expressions.[2] It finds objects that match the maximum number of three-character substrings (i.e. trigrams) of the entered search terms; these objects are generally near matches.[3] Two strings with many shared trigrams can be expected to be very similar.[4] Trigrams also allow for efficiently creating search engine indexes for searches that are regular expressions or that match the text inexactly. Indexes can significantly accelerate searches.[5][6] A threshold for the number of trigram matches can be specified as a cutoff point, below which a result is considered unmatched.[4]
Trigrams are used to accelerate searches in some code search systems, where regular expression queries can be useful,[5][2][7] in search engines such as Elasticsearch,[8] and in databases such as PostgreSQL.[4]
Consider the string "alice". The trigrams of the string would be "ali", "lic", and "ice", not including spaces.[5] Searching for this string in a database with a trigram-based index would involve finding which objects contain as many of the three trigrams as possible.
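The lookup described above can be sketched in Python. This is only an illustration under stated assumptions: the helper names (trigrams, build_index, search), the sample documents, and the min_shared threshold parameter are hypothetical, and real systems such as PostgreSQL or Elasticsearch use their own index structures and scoring.

```python
from collections import Counter, defaultdict


def trigrams(text):
    """Return the set of three-character substrings of text (no padding)."""
    return {text[i:i + 3] for i in range(len(text) - 2)}


def build_index(documents):
    """Map each trigram to the set of document ids whose text contains it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for gram in trigrams(text):
            index[gram].add(doc_id)
    return index


def search(index, query, min_shared=1):
    """Return (doc_id, shared_trigram_count) pairs, highest count first.

    min_shared is a hypothetical cutoff: documents sharing fewer trigrams
    with the query are treated as unmatched.
    """
    counts = Counter()
    for gram in trigrams(query):
        for doc_id in index.get(gram, ()):
            counts[doc_id] += 1
    return [(doc_id, n) for doc_id, n in counts.most_common() if n >= min_shared]


# Made-up sample data for illustration only.
documents = {1: "alice", 2: "malice", 3: "alicia", 4: "bob"}
index = build_index(documents)

print(sorted(trigrams("alice")))
print(search(index, "alice", min_shared=2))
```

With this sample data, "alice" and "malice" share all three trigrams with the query, "alicia" shares two ("ali" and "lic"), and "bob" shares none, so raising min_shared to 3 would drop "alicia" from the results.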
As a concrete example of using trigram search for a regular expression query, consider searching for the pattern "ab[cd]e", where the brackets denote that the third character in the string being searched for could be "c" or "d". In this situation, one could query the index for objects that have the two trigrams "abc" and "bce", or the two trigrams "abd" and "bde". Thus, candidate objects can be located by querying the index directly, without running the regular expression over every object, which can be faster in practice.[2]
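A minimal Python sketch of this regular expression lookup follows. It is not the method of any particular system: the helpers expand and candidates are hypothetical, the pattern support is limited to literal characters and simple bracket classes, and the sample documents are invented. The final re.search pass is an added assumption, included because an object can contain both trigrams without containing them as one overlapping run.

```python
import re
from collections import defaultdict
from itertools import product


def trigrams(text):
    """Return the set of three-character substrings of text."""
    return {text[i:i + 3] for i in range(len(text) - 2)}


def build_index(documents):
    """Map each trigram to the set of document ids whose text contains it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for gram in trigrams(text):
            index[gram].add(doc_id)
    return index


def expand(pattern):
    """Expand a pattern of literals and [..] classes into literal strings.

    Hypothetical helper: handles only plain characters and simple bracket
    classes such as [cd], not ranges, repetition, or alternation.
    """
    parts = re.findall(r"\[([^\]]+)\]|(.)", pattern)
    choices = [cls if cls else ch for cls, ch in parts]
    return ["".join(combo) for combo in product(*choices)]


def candidates(index, pattern):
    """Union, over each expansion, of documents containing all its trigrams."""
    result = set()
    for literal in expand(pattern):
        grams = trigrams(literal)
        if not grams:
            raise ValueError("expansion shorter than three characters")
        postings = [index.get(gram, set()) for gram in grams]
        result |= set.intersection(*postings)
    return result


# Made-up sample data; document 3 has both trigrams but no real match.
documents = {1: "xxabcexx", 2: "abde", 3: "abc bce", 4: "abxe"}
index = build_index(documents)

found = candidates(index, "ab[cd]e")
matches = {doc_id for doc_id in found if re.search("ab[cd]e", documents[doc_id])}
print(sorted(found), sorted(matches))
```

Here the index narrows four documents down to three candidates, and the confirmation pass then discards document 3, which contains "abc" and "bce" only as separate substrings.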