The term fuzzy search encompasses several meanings, all revolving around the concept of approximate matching. The most common interpretation involves fuzzy string matching, where a search engine matches words that don’t match exactly. The classic example is misspellings, where incorrectly spelled queries return correctly spelled results. But fuzzy search extends beyond correcting typos: it can also handle poorly formulated queries, recognize colloquial vocabulary, expand prefixes, or establish loose category relationships between a query and the content being searched.
The fuzzy matching described in this article enables Manticore Search to apply some (slightly) inexact matching techniques to ensure that it returns all relevant results. In this sense, relevance is not an exact science.
Search relevance, in general
The starting point for search is textual matching, which matches a query’s exact characters, letters, and words to the records in the dataset. For example, if you search for “boat”, the search engine will return all records that contain the word “boat”. Not “canoe” or “cruise ship”.
Textual matching goes a long way. With good data, the query “boat” will return every relevant record, as long as it contains the word “boat”, for example, in titles, descriptions, and categories. With even smarter data, when you search for a “boat”, the search engine could return anything that floats, as well as boating lessons or cruise vacations.
But what if the word “boat” does not appear in the records, even though the data contains boat-related information? Here’s where synonyms come in. But what if someone wants to find “canoe” and reasonably spells it “canou”? Without typo tolerance, the search engine would return no results.
Fuzzy matching is about enhancing relevance, making the search experience even more intuitive. When done well, fuzzy matching seamlessly integrates closely-related items, where not doing so would be a serious loss of opportunity. Like doing a puzzle, fuzzy search collects similar pieces to help you locate the exact piece. Additionally, because people define what they need as they search, fuzzy search brings up options based on similarity, like taste-testing before ordering the best meal.
Search tools like Manticore Search, Elasticsearch, Algolia, and others offer fuzzy matching capabilities. They each approach the subject differently. In this article, we’ll explore Manticore’s powerful fuzzy search implementation.
Definitions of fuzzy search & fuzzy matching
What is fuzzy search?
Fuzzy search refers to approximate string matching as opposed to exact string matching. It matches two or more words even if there are typos or misspellings. Fuzzy search compensates for clumsy fingers, rushed or careless typists, mobile keyboards, and the spelling complexities of every language in the world. It can also play a role in matching user-generated data, which is generally unreliable because users misspell words and use alternative or localized spellings for the same words. This can also include matching based on phonetics and sound.
Other terms for fuzzy search include fuzzy or approximate string matching.
Examples:
- Type “helo”, find records with “hello”
- Type “help”, find records with “help” and “hello”
See more examples & how this is implemented in Manticore below.
What is fuzzy matching?
Fuzzy matching extends the fuzziness of search to include finding information based on similarities. It’s a broad term, so we’ll focus on language-based similarities, such as synonyms, grammar (plurals, verb conjugations, noun endings, etc.), dictionary-based techniques, and other heuristics or NLP methods. We’ll also include similarities typically used by search engines such as partial word, phrase, or query matching, and the use of filters.
Examples:
- Type “pants”, find pants, trousers, slacks (synonyms)
- Type “be”, find Beatles, bees (prefix search)
See more examples & how this is implemented in Manticore below.
Other usages of “fuzzy”
Fuzzy sets group words (and objects) based on shared characteristics. For example, the shape of an American football, rugby ball, and an egg make these items more or less similar (oblong shape), but football and rugby are closer because they also share the qualities of bounce and sport (notwithstanding the egg-in-a-spoon game).
Fuzzy logic constructs logical relationships based on relative nearness, not binaries like true and false (two objects match based on both being “tall” rather than having the same exact height).
Fuzzy search using Levenshtein Distance in Manticore
Typo tolerance allows users to make mistakes while typing and still find the records they’re looking for. Manticore implements this through the Levenshtein distance algorithm.
What exactly is a typo?
- A missing letter in a word, “strm” → “storm”
- An extraneous letter, “stoorm” → “storm”
- Inverted letters: “strom” → “storm”
- Substituted letter: “storl” → “storm”
Levenshtein distance algorithm
Manticore’s typo tolerance is based on Levenshtein distance. Distance refers to the difference in spelling between a typed word and its exact match in the index, and it has a precise meaning: the minimum number of operations (character additions, deletions, substitutions, or transpositions) required to change one word into another. Counting a transposition of two adjacent letters as a single operation is the extension usually called Damerau-Levenshtein distance. A perfect match has distance = 0. When there is a perfect match, or the distance is low (one or two letters mistakenly typed), a match is made and the record is added to the results.
For example, if the engine receives a word like “strm”, this can mean “storm” or “strum” (distance = 1 / one letter missing), or “star” or “warm” (distance = 2 / two operations required).
Distance establishes a threshold of tolerance. In Manticore, you can set this threshold using the distance parameter, with a default value of 2: when a word has a distance of 3 or more, it’s “not tolerated” (not included in the results).
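To make the counting concrete, here is a minimal Python sketch of an edit distance in which additions, deletions, substitutions, and adjacent transpositions each cost one operation (the "optimal string alignment" variant). It illustrates the definition only and is not Manticore's internal implementation; the names `edit_distance` and `within_threshold` are made up for this example.

```python
def edit_distance(a: str, b: str) -> int:
    """Minimum number of insertions, deletions, substitutions, and
    adjacent transpositions needed to turn `a` into `b`."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete every character of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert every character of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # adjacent transposition, e.g. "ro" <-> "or"
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def within_threshold(typed: str, indexed: str, max_distance: int = 2) -> bool:
    """True when `indexed` is close enough to `typed` to be tolerated."""
    return edit_distance(typed, indexed) <= max_distance


for word in ("storm", "strum", "star", "warm", "thunder"):
    print(word, edit_distance("strm", word), within_threshold("strm", word))
```

With a threshold of 2, “storm”, “strum”, “star”, and “warm” are all tolerated for the input “strm”, while “thunder” is not.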
More examples of fuzzy search based on Levenshtein distance
Below are a few examples of how Manticore counts the operations needed to transform a word to find records with “Michael”:
- michael - 0 typos (perfect match)
- mickael - 1 typo (substitution: h → k)
- micael - 1 typo (deletion: h)
- mickhael - 1 typo (addition: k)
- micheal - 1 typo (transposition: a ⇄ e)
- mickaell - 2 typos (substitution: h → k, addition: l)
- tichael - 2 typos (substitution: m → t; the error in the first letter is counted as 2 here)
- tickael - 3 typos (first-letter substitution: m → t, counted as 2; substitution: h → k) - beyond the default distance threshold
Ranking and typo tolerance
In terms of relevance: all fuzzy matches based on spelling differences are considered less relevant than exact matches. Thus, records that have a 0 distance – that is, which match exactly – are ranked (ordered) higher than records with typos. Likewise, records with 1 typo are ranked higher than records with 2 typos.
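The ordering rule can be sketched in a few lines of Python. This is an illustration under simplifying assumptions: each candidate already carries its edit distance, and distance is the only ranking signal, whereas a real engine combines it with other relevance factors. The `Match` type and `rank` function are hypothetical names.

```python
from typing import List, NamedTuple


class Match(NamedTuple):
    record: str
    distance: int  # edit distance between the query word and the matched word


def rank(matches: List[Match], max_distance: int = 2) -> List[Match]:
    """Drop matches beyond the threshold, then order exact matches first."""
    tolerated = [m for m in matches if m.distance <= max_distance]
    # sorted() is stable, so matches with equal distance keep their input order
    return sorted(tolerated, key=lambda m: m.distance)


found = [
    Match("Mickael Smith", 1),
    Match("Michael Jones", 0),
    Match("Tickael Brown", 3),
    Match("Mickaell Adams", 2),
]
for match in rank(found):
    print(match.distance, match.record)
```

The record with distance 3 is dropped, and the remaining records come back in the order 0, 1, 2.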
Fuzzy matching with Keyboard Layouts
A unique feature of Manticore’s fuzzy search is its keyboard layout awareness. This functionality understands that users might be typing on different keyboard layouts, which can lead to specific types of errors.
For instance, if someone accidentally types with the wrong keyboard layout active, Manticore can still understand the query by considering the physical key positions across different layouts. Manticore supports 17 different keyboard layouts including QWERTY, AZERTY, QWERTZ, and others, making it extremely versatile for international applications.
For example, a user who means to type “zucker” (the German word for “sugar”) presses the keys where they sit on a German QWERTZ keyboard while the US QWERTY layout is active, so the query arrives as “yucker”. Manticore recognizes the mismatch and maps the input to “zucker”. Without layout awareness, the search would fail because the Y and Z keys are swapped between the two layouts. With it, users find results even when their keyboard layout is misconfigured.
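The idea of mapping by physical key position can be sketched in Python. The example below is a simplified illustration, not Manticore's implementation: it only models the Y/Z swap between the US QWERTY and German QWERTZ letter rows, and the names `US_TO_DE`, `reinterpret`, and `candidates` are assumptions made for this sketch.

```python
# Keys whose letters differ between US QWERTY and German QWERTZ.
# Assumption: only the Y/Z swap is modelled; umlauts and punctuation are ignored.
US_TO_DE = str.maketrans("yzYZ", "zyZY")


def reinterpret(text: str, table=US_TO_DE) -> str:
    """Return what the same key presses would produce on the other layout."""
    return text.translate(table)


def candidates(query: str) -> list:
    """The query as typed, plus its other-layout reading when it differs."""
    alternative = reinterpret(query)
    return [query] if alternative == query else [query, alternative]


print(candidates("yucker"))
print(candidates("hallo"))
```

A search layer built this way would try each candidate against the index, so “yucker” also gets a chance to match “zucker”, while a query with no affected keys is left alone.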
Implementing Fuzzy Search in Manticore
Enabling fuzzy search in Manticore is straightforward, with support for both SQL and JSON APIs.
Using SQL Syntax
SELECT * FROM mytable
WHERE MATCH('someting')
OPTION fuzzy=1, layouts='us,ua', distance=2;
In this example:
- fuzzy=1 enables fuzzy matching
- layouts='us,ua' specifies which keyboard layouts to consider (US and Ukrainian)
- distance=2 sets the maximum Levenshtein distance to 2
Using JSON API
POST /search
{
  "table": "test",
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "*": "ghbdtn"
          }
        }
      ]
    }
  },
  "options": {
    "fuzzy": true,
    "layouts": ["us", "ru"],
    "distance": 2
  }
}
This JSON example shows how Manticore can handle cross-layout typos: “ghbdtn” typed on a US keyboard would actually be “привет” (“hello” in Russian) if typed on a Russian keyboard.
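If you build this request from application code, a small helper keeps the options in one place. The Python sketch below uses only the standard library and mirrors the request body shown above; the helper names and the `http://localhost:9308` base URL are assumptions for illustration, so adjust them to your own deployment.

```python
import json
import urllib.request


def build_fuzzy_search_body(table, text, layouts, distance=2):
    """Build the request body shown above: a fuzzy match on all fields."""
    return {
        "table": table,
        "query": {"bool": {"must": [{"match": {"*": text}}]}},
        "options": {"fuzzy": True, "layouts": list(layouts), "distance": distance},
    }


def post_search(body, base_url="http://localhost:9308"):
    """POST the body to /search. The base_url default is an assumption."""
    request = urllib.request.Request(
        base_url + "/search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


body = build_fuzzy_search_body("test", "ghbdtn", ["us", "ru"])
print(json.dumps(body, indent=2))
```

Calling `post_search(body)` against a running instance would send the request; the snippet itself only builds and prints the body.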
Other methods for fuzzy matching
Partial word matching with Prefix Search
Like other advanced search engines, Manticore supports prefix matching for as-you-type search experiences. It enables the engine to start matching records based on partial words.
For example, records containing “apricot” are returned as soon as a user types “a”, “ap”, “apr”. There’s no need for the engine to wait for a full-word match before displaying results.
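As a rough illustration of why prefix lookups can be answered on every keystroke, the Python sketch below finds all terms sharing a prefix in a sorted term list with a binary search. It shows the general technique only and says nothing about how Manticore stores its dictionary; `prefix_matches` and the sample terms are made up for the example.

```python
from bisect import bisect_left


def prefix_matches(sorted_terms, prefix):
    """Return every term in a sorted list that starts with `prefix`."""
    start = bisect_left(sorted_terms, prefix)  # first term >= prefix
    matches = []
    for term in sorted_terms[start:]:
        if not term.startswith(prefix):
            break  # sorted order: no later term can share the prefix
        matches.append(term)
    return matches


terms = sorted(["apple", "apricot", "april", "banana", "beatles", "bees"])
for typed in ("a", "ap", "apr", "be"):
    print(typed, prefix_matches(terms, typed))
```

Because terms sharing a prefix sit next to each other in sorted order, each keystroke narrows a contiguous range rather than scanning the whole vocabulary.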
Synonyms
Synonyms tell the engine which words and expressions to consider equal – for example, pants = trousers. Thus, a search for “trousers” will return “trousers” and “pants”, and a search for “pants” will also return “trousers” and “pants”.
While general thesaurus-based synonym expansion can be too broad, Manticore allows you to define specific, contextually relevant synonyms for your dataset using wordforms.
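One common way to make synonyms behave as equals is to map every variant to a single canonical form, both when indexing and when querying. The Python sketch below illustrates that idea with a hypothetical mapping; it is not Manticore's wordforms file format or its implementation.

```python
# Hypothetical, dataset-specific mapping: variant -> canonical form.
SYNONYMS = {"trousers": "pants", "slacks": "pants"}


def normalize(token: str) -> str:
    """Lowercase a token and replace it with its canonical form, if any."""
    token = token.lower()
    return SYNONYMS.get(token, token)


def matches(query: str, record: str) -> bool:
    """True when every normalized query word appears in the record."""
    query_terms = {normalize(t) for t in query.split()}
    record_terms = {normalize(t) for t in record.split()}
    return query_terms <= record_terms


print(matches("trousers", "Blue pants"))
print(matches("pants", "Grey trousers"))
print(matches("pants", "Blue shirt"))
```

Because both sides go through the same mapping, a search for “trousers” and a search for “pants” reach the same records.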
How to balance fuzzy matching for optimal results
Adding fuzzy matching, while powerful, comes with some risk. It may return too many results for a user to sift through, or it may return unexpected results that could, in the user’s mind, break the relevance of the results.
It’s the job of the search engine to limit the impact of this expansion of information. For example, by giving higher priority to exact matches, ensuring that exact matches show up at the top of the results, above the fuzzy ones:
- In the case of misspellings, exact matches are considered more relevant than inexact matches (even if the typo is obvious to the user).
- With synonyms, exact matches are more relevant than synonym matches (even if the word in the query has multiple meanings and the synonym would be the better choice).
Essentially, Manticore hedges the bet on the fuzzy algorithm by displaying the fuzzy matches after the exact matches. This is part of Manticore’s robust approach to handling misspellings and variations in search queries.
There are also many UI patterns to manage this. For example, search engines often add “suggested spellings” just below the search bar, or offer autocomplete or query-suggestion features, where the user is given suggested queries (and categories) in a dropdown list as they type and can choose one as their query. Manticore provides comprehensive support for these features as documented in our Autocomplete API.
But the proper solution is the one below the surface, completely transparent to the user, where the search engine returns results that create an intuitive feeling in the user that the displayed items are the best and most relevant results – even if not exactly matching the text of the query.
Real-world Applications
Let’s look at practical applications where Manticore’s fuzzy search excels:
E-commerce
In e-commerce, users often search for products with complicated names or technical specifications they might misspell. Fuzzy search ensures they still find what they’re looking for:
SELECT * FROM products
WHERE MATCH('sansung glaxy')
OPTION fuzzy=1, distance=2;
This query would successfully match products containing “Samsung Galaxy” despite the typos.
Content Management
For content-heavy sites, fuzzy search helps users discover relevant articles even when they misremember terms or titles:
SELECT * FROM articles
WHERE MATCH('artifical inteligence')
OPTION fuzzy=1;
This would match articles about “artificial intelligence” despite the spelling errors.
International Websites
For sites serving multilingual audiences, keyboard layout awareness is particularly valuable:
SELECT * FROM content
WHERE MATCH('ghbdtn')
OPTION fuzzy=1, layouts='us,ru';
This would match content with “привет” (Russian for “hello”) when a user accidentally types with the wrong keyboard layout.
Conclusion
Fuzzy search is no longer a luxury. It’s an essential component of any modern search implementation. Manticore Search provides a sophisticated yet approachable fuzzy search capability that can dramatically improve user experience while accommodating the natural variations and errors in human-generated queries.
By leveraging Manticore’s Levenshtein distance algorithm and unique keyboard layout awareness, you can create search experiences that feel almost magical to users. They’ll find what they’re looking for even when they don’t get the query exactly right.
Whether you’re building an e-commerce platform, content management system, or any application with search functionality, implementing fuzzy search with Manticore can lead to higher user satisfaction, better engagement, and ultimately, improved conversion rates.