Autocomplete: Making Search More User-Friendly

Published: Sep 25, 2025

Introduction

Autocomplete may look like a small feature, but it has a huge impact. It saves typing, helps people find what they want faster, and prevents “no results” searches.

This article is a practical guide to building autocomplete with Manticore Search. You’ll see how to use:

CALL AUTOCOMPLETE and the /autocomplete HTTP endpoint
fast dictionary lookups with CALL KEYWORDS
sentence completion using infix search and highlighting
practical tips for running autocomplete in production

Autocomplete in action

Who should read this

This guide is for developers adding search suggestions to an online store, documentation site, or internal tool — especially if you use or plan to use Manticore Search. The examples are hands-on and can be tried directly against your data.

Quick decision guide

Use CALL AUTOCOMPLETE (or POST /autocomplete) when you need typo-tolerant, multi-word suggestions from indexed data.
Use CALL KEYWORDS when you want lightning-fast, dictionary-like completions for single words or very short phrases.
Use infix search with highlighting when you want phrase or sentence-level suggestions (like showing the rest of a document sentence).
Enable bigram_index if you want two-word predictions (predicting the “next word”).

Prerequisites and Caveats

Buddy is required for CALL AUTOCOMPLETE and the /autocomplete HTTP endpoint. If Buddy isn't installed, these calls will not work.
The target table must have infixes enabled (min_infix_len). Manticore caches the min_infix_len check for ~30 seconds for performance; if you modify min_infix_len (e.g., when tuning search performance or enabling autocomplete on an existing table), you might see a short inconsistency during that window.
CALL AUTOCOMPLETE caches only successful checks. If you disable min_infix_len or remove the table, subsequent autocomplete calls may keep relying on the stale cached check until the cache refreshes or an error surfaces.

CALL AUTOCOMPLETE — quick example (SQL)

This is the most direct way to get suggestion candidates from your index.

CALL AUTOCOMPLETE('ice', 'products');

A typical result set contains rows with a single column named query:

+---------------+
| query         |
+---------------+
| ice           |
| ice cream     |
| iceberg       |
| iceland       |
+---------------+

How CALL AUTOCOMPLETE works (briefly)

CALL AUTOCOMPLETE is a high-level convenience that orchestrates several lower-level primitives inside Manticore to produce fast, relevant suggestions. In practice it combines:

CALL KEYWORDS: fast dictionary lookups that return prefix/infix token candidates and statistics (docs/hits). This is what gives autocomplete strong, low-latency candidates from the index dictionary.
Suggestion routines (historically exposed via CALL SUGGEST / CALL QSUGGEST): routines that produce fuzzy variants for tokens (especially the last word) and help generate typo-tolerant alternatives.
Fuzzy-search logic: the same edit-distance (Levenshtein) and ranking heuristics introduced for fuzzy search, which the autocomplete flow reuses to rank and filter candidates.

These components work together: keywords provide candidates, the suggest/fuzzy routines expand and repair them, and the fuzzy logic ranks and filters results by distance and popularity. Keyboard-layout guessing (when enabled) is applied early so that layout-mistyped input can be corrected before fuzziness is computed.
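To make the "distance and popularity" idea concrete, here is a minimal Python sketch of edit-distance filtering. It is an illustration only: the function names, the `(word, hits)` candidate shape, and the tie-breaking order are assumptions for this example, not Manticore's actual ranking heuristics.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a  # iterate over the longer string, keep rows short
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def rank_candidates(typed, candidates, max_distance=2):
    """Keep (word, hits) candidates within max_distance of the typed word.

    Closest words come first; ties are broken by higher hit counts.
    """
    scored = []
    for word, hits in candidates:
        distance = levenshtein(typed, word)
        if distance <= max_distance:
            scored.append((distance, -hits, word))
    return [word for _, _, word in sorted(scored)]
```

With a `max_distance` of 2, a typo such as "icland" still reaches "iceland", while unrelated words like "iceberg" are filtered out.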

When not to call CALL KEYWORDS directly

CALL KEYWORDS is an excellent, extremely fast tool when you need strict dictionary-based completions (exact prefix/infix matches). However, it does not provide typo tolerance or multi-word suggestion composition. For multi-word, typo-tolerant suggestions prefer CALL AUTOCOMPLETE; use CALL KEYWORDS for very short prefixes, hot lists, or when you explicitly want dictionary-only results.

Options mapping (high-level)

fuzziness in CALL AUTOCOMPLETE caps the allowed edit distance, much like max_edits does in the suggest APIs.
preserve controls whether non-fuzzy tokens are kept alongside fuzzy matches.
layouts enables keyboard-layout guessing prior to fuzzy evaluation.

Note: the internal combination of primitives described here reflects the implementation approach but should be treated as an implementation detail — rely on the documented CALL AUTOCOMPLETE API and options rather than internal behavior when designing your application.

These internal combinations are why CALL AUTOCOMPLETE often gives better, more usable suggestions than running CALL KEYWORDS alone.

HTTP/JSON example (Buddy)

If you prefer HTTP or have a frontend that talks JSON, use the /autocomplete endpoint provided by Buddy:

POST /autocomplete
{
  "table": "products",
  "query": "ice"
}

The JSON response returns an array of suggested completions (and metadata) you can present in the UI.
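As a rough sketch, the same request can be built from Python's standard library. The base URL below (`http://localhost:9308`) and both helper names are assumptions for illustration; adjust them to your deployment. The response is returned undecoded beyond JSON parsing, since its exact shape should be taken from the Manticore documentation rather than from this example.

```python
import json
import urllib.request

# Assumption: Manticore's HTTP interface is reachable at this address.
BASE_URL = "http://localhost:9308"


def build_autocomplete_request(table, query, options=None, base_url=BASE_URL):
    """Build (but do not send) a POST /autocomplete request."""
    payload = {"table": table, "query": query}
    if options:
        payload["options"] = options
    return urllib.request.Request(
        url=f"{base_url}/autocomplete",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_suggestions_raw(request, timeout=2.0):
    """Send the request and return the parsed JSON body without reshaping it."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes the payload easy to unit-test without a running server.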

CALL AUTOCOMPLETE options explained

CALL AUTOCOMPLETE accepts several options to tune behavior. Here are the ones you'll use most often:

layouts: Comma-separated keyboard layout codes (us, ru, ua, de, fr, etc.). Use this to detect layout-mistyped input (e.g., typing "ghbdtn" on an English layout when you meant to type "привет" - Russian for "hello"). Requires at least two layouts to compare character positions.
fuzziness: 0, 1, or 2 (default 2). Maximum Levenshtein distance for matching typos. Set to 0 to disable fuzziness.
preserve: 0 or 1 (default 0). If 1, suggestions will include words that did not get fuzzy matches (useful for preserving proper nouns or short tokens).
prepend / append: Boolean (0/1). If true, an asterisk is added before/after the last word to expand prefixes/suffixes (e.g., prepend -> *word, append -> word*).
expansion_len: Number of characters to expand the last token (default 10). Controls how many characters will be considered for expansion.
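If you assemble the statement in application code, a small builder can validate the options listed above before anything reaches the server. This is a sketch under stated assumptions: the helper names are hypothetical, and the quoting assumes backslash-style escaping of string literals, so prefer your client library's own escaping or parameter binding where available.

```python
def sql_quote(value: str) -> str:
    """Quote a string literal, escaping backslashes and single quotes."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def build_call_autocomplete(query, table, *, layouts=None, fuzziness=None,
                            preserve=None, prepend=None, append=None,
                            expansion_len=None):
    """Return a CALL AUTOCOMPLETE statement; options left as None are omitted."""
    if fuzziness is not None and fuzziness not in (0, 1, 2):
        raise ValueError("fuzziness must be 0, 1, or 2")
    for name, flag in (("preserve", preserve), ("prepend", prepend),
                       ("append", append)):
        if flag is not None and flag not in (0, 1):
            raise ValueError(f"{name} must be 0 or 1")

    args = [sql_quote(query), sql_quote(table)]
    if layouts is not None:
        if len(layouts) < 2:
            raise ValueError("layout guessing needs at least two layouts")
        args.append(f"{sql_quote(','.join(layouts))} as layouts")

    numeric_options = {
        "fuzziness": fuzziness,
        "preserve": preserve,
        "prepend": prepend,
        "append": append,
        "expansion_len": expansion_len,
    }
    for name, value in numeric_options.items():
        if value is not None:
            args.append(f"{int(value)} as {name}")
    return f"CALL AUTOCOMPLETE({', '.join(args)});"
```

Leaving an option as `None` omits it from the statement, so the server-side defaults described above still apply.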

Example with options (SQL)

CALL AUTOCOMPLETE('ghbdtn', 'comments', 'us,ru' as layouts, 1 as fuzziness);

This will detect that "ghbdtn" is a layout-mistyped version of "привет" (Russian for "hello") and apply fuzzy matching to find the correct suggestions.
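The idea behind layout guessing can be illustrated with a simple key-position mapping. The sketch below hard-codes the standard US QWERTY and Russian ЙЦУКЕН letter rows; it is a toy illustration of the concept, not the mapping tables Manticore uses internally, and the function name is hypothetical.

```python
# Keys in the same physical position on US QWERTY and Russian ЙЦУКЕН keyboards.
US_KEYS = "qwertyuiop[]asdfghjkl;'zxcvbnm,."
RU_KEYS = "йцукенгшщзхъфывапролджэячсмитьбю"

US_TO_RU = str.maketrans(US_KEYS, RU_KEYS)


def us_to_ru(text: str) -> str:
    """Re-read text typed on a US layout as if the Russian layout had been active."""
    return text.lower().translate(US_TO_RU)


print(us_to_ru("ghbdtn"))  # привет
```

A layout-aware autocomplete can try both the original and the converted reading and keep whichever one actually matches the dictionary.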

CALL KEYWORDS — token-based completions

When you only need to suggest single words (or endings) and want maximum speed, CALL KEYWORDS is an excellent alternative. It uses the index dictionary rather than scanning documents, which makes it very efficient.

Basic syntax:

CALL KEYWORDS('ca*', 'products', 1 AS stats, 'hits' AS sort_mode);

This returns rows with tokenized and normalized forms and optional stats (docs, hits). Sorting by hits surfaces the most popular completions.

Example result (illustrative):

+------+-----------+-------------+------+------+
| qpos | tokenized | normalized  | docs | hits |
+------+-----------+-------------+------+------+
| 1    | ca*       | cat         | 1    | 2    |
| 1    | ca*       | carnivorous | 1    | 1    |
+------+-----------+-------------+------+------+
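On the client side, turning such rows into a suggestion list is mostly a matter of sorting and de-duplicating. The sketch below assumes each row arrives as a dictionary keyed by the column names shown above, with numeric values possibly delivered as strings; the function name is hypothetical.

```python
def completions_from_keywords(rows, limit=10):
    """Return unique normalized tokens, most frequent first."""
    best_hits = {}
    for row in rows:
        word = row["normalized"]
        hits = int(row["hits"])  # some clients return numbers as strings
        best_hits[word] = max(hits, best_hits.get(word, 0))
    # Sort by hits descending, then alphabetically for a stable order.
    ranked = sorted(best_hits.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:limit]]


rows = [
    {"qpos": 1, "tokenized": "ca*", "normalized": "carnivorous", "docs": 1, "hits": 1},
    {"qpos": 1, "tokenized": "ca*", "normalized": "cat", "docs": 1, "hits": 2},
]
print(completions_from_keywords(rows))  # ['cat', 'carnivorous']
```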

bigram_index trick

If your table has bigram_index enabled, the index stores pairs of adjacent words as tokens. This lets you suggest likely next words ("predict the next word") rather than only finishing the current token. It's a simple but powerful way to improve multi-word suggestions without adding an external ML model.
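To see why adjacent-word pairs help with next-word prediction, consider this toy Python model. It only illustrates the concept of counting word pairs; it says nothing about how Manticore stores or queries bigrams, and the sample documents and function names are made up for the example.

```python
from collections import Counter, defaultdict


def build_bigram_counts(documents):
    """Count how often each word is directly followed by another word."""
    following = defaultdict(Counter)
    for text in documents:
        words = text.lower().split()
        for first, second in zip(words, words[1:]):
            following[first][second] += 1
    return following


def predict_next(following, word, limit=3):
    """Return the most frequent followers of `word`, most common first."""
    counts = following.get(word.lower(), Counter())
    return [next_word for next_word, _ in counts.most_common(limit)]


docs = ["ice cream cone", "ice cream cake", "ice cold water"]
model = build_bigram_counts(docs)
print(predict_next(model, "ice"))  # ['cream', 'cold']
```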

Sentence completion with infix search and highlighting

For completion of phrases or sentence tails (e.g., autocompleting the remainder of a sentence from a document), use infix queries with wildcards and highlight the match. For example, to find documents where a field starts with the typed fragment, you can issue queries like these as the user types:
- ^"m*"
- ^"my *"
- ^"my c*"
- ^"my ca*"

Use the ^ anchor to match from the beginning and * to expand the rest. With highlighting enabled, you can extract the matched portion and present it as a suggestion (for example: "My cat loves ...").

This approach is best when you want suggestions that are full phrases or document snippets rather than isolated tokens.
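The query strings above can be generated from raw input with a small helper. This sketch makes one deliberately blunt assumption: all punctuation is dropped so user input cannot inject query operators. The function name is hypothetical, and you may want a gentler escaping strategy for your data.

```python
import re


def sentence_prefix_query(fragment: str) -> str:
    """Turn typed text into an anchored wildcard phrase such as ^"my ca*"."""
    # Replace punctuation with spaces so operators in user input are neutralized.
    cleaned = re.sub(r"[^\w\s]", " ", fragment)
    words = cleaned.split()
    if not words:
        return ""
    body = " ".join(words)
    # A trailing space means the last word is complete: wildcard the next word.
    suffix = " *" if cleaned[-1].isspace() else "*"
    return f'^"{body}{suffix}"'


for typed in ["m", "my ", "my c", "my ca"]:
    print(sentence_prefix_query(typed))
```

Run the resulting string as the full-text part of your search and use highlighting to extract the text to show as the suggestion.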

Integration pattern (frontend → backend)

A minimal, production-friendly flow looks like this:

Debounce user input (150–300 ms) to avoid too many requests.
For 1–2 characters, use CALL KEYWORDS or a hot list.
For 3+ characters, call /autocomplete or CALL AUTOCOMPLETE. Include appropriate options (layouts, fuzziness) based on your users.
Show suggestions with highlighted matches.
Track clicks and reorder suggestions by popularity or business rules.
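The first three steps of this flow can be sketched in a few lines. Debouncing usually lives in the frontend, so the Python below is only a language-neutral illustration of the logic; the class and function names, the 200 ms default, and the explicit-timestamp design are assumptions chosen to keep the example testable.

```python
def choose_strategy(user_input, min_chars_for_autocomplete=3):
    """Pick a suggestion source based on how much the user has typed."""
    text = user_input.strip()
    if not text:
        return "none"
    if len(text) < min_chars_for_autocomplete:
        return "keywords"      # CALL KEYWORDS or a hot list
    return "autocomplete"      # /autocomplete or CALL AUTOCOMPLETE


class Debouncer:
    """Trailing-edge debounce driven by explicit timestamps (in seconds)."""

    def __init__(self, wait=0.2):
        self.wait = wait
        self._pending = None
        self._last_keystroke = 0.0

    def keystroke(self, text, now):
        """Record the latest input; earlier pending input is superseded."""
        self._pending = text
        self._last_keystroke = now

    def poll(self, now):
        """Return pending text once the user has paused long enough, else None."""
        if self._pending is None or now - self._last_keystroke < self.wait:
            return None
        text, self._pending = self._pending, None
        return text
```

Only the text returned by `poll` is sent to the backend, so a burst of keystrokes produces a single request for the final input.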

UX tips

Show 6–10 suggestions on desktop, 3–5 on mobile.
Group by type (e.g., products, docs, people) if relevant.
Always offer a fallback: e.g., "Search for {query}".

Production tips

min_infix_len: Ensure this is set appropriately for your language and use case. Very small values increase index size and CPU use; very large values reduce matching flexibility.
Caching: Allow ~30s for changes to settings like min_infix_len to be picked up by autocomplete.
Rate limiting: Use request throttling or in-memory caching for heavy traffic.
Index maintenance: Consider precomputing a "top suggestions" table for the most common prefixes.
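For the in-memory caching mentioned above, a small time-to-live cache in front of the suggestion call is often enough to absorb repeated prefixes. This is a minimal single-process sketch with hypothetical names and no size limit or eviction; a production setup would likely add a bound on entries or use a shared cache.

```python
import time


class SuggestionCache:
    """Cache suggestion lists per key for a short time-to-live."""

    def __init__(self, ttl_seconds=30.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so tests can control time
        self._entries = {}

    def get_or_fetch(self, key, fetch):
        """Return a fresh cached value, or call fetch(key) and store the result."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        value = fetch(key)
        self._entries[key] = (now, value)
        return value
```

Normalizing keys before lookup (for example, lowercasing and trimming the typed text) raises the hit rate for popular prefixes.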

Troubleshooting checklist

If autocomplete behaves unexpectedly:

Verify Buddy is installed (it is required for both CALL AUTOCOMPLETE and /autocomplete).
Confirm the table has min_infix_len set and infixes enabled.
Retry after 30 seconds to allow the internal cache to refresh if you recently changed table settings.
Try CALL KEYWORDS for the same prefix to ensure tokens exist in the dictionary.
Check encoding and tokenization settings (morphology, stopwords) that might affect results.

Example: full flow (sample)

User types: "ice c"

Client (debounced) sends:

POST /autocomplete
{
  "table": "products",
  "query": "ice c",
  "options": { "fuzziness": 1 }
}

Server returns suggestions such as "ice cream", "ice coffee", "ice cold". The UI shows these; on selection the client navigates to a search URL like /search?q=ice+cream or fetches product details using the selected suggestion.
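The final step, turning the selected suggestion into a search URL, mainly needs correct URL encoding. A minimal sketch, assuming a hypothetical `/search` route with a `q` parameter as in the example above:

```python
from urllib.parse import urlencode


def search_url(suggestion: str, path: str = "/search") -> str:
    """Build the search URL for a selected suggestion, encoding it safely."""
    return f"{path}?{urlencode({'q': suggestion})}"


print(search_url("ice cream"))  # /search?q=ice+cream
```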

Conclusion

Autocomplete is a small UX feature that delivers outsized value: faster searches, fewer dead-ends, and higher user satisfaction. With Manticore Search you have practical choices for implementing suggestions — from the fast, dictionary-driven CALL KEYWORDS to the more flexible, typo-tolerant CALL AUTOCOMPLETE (and the HTTP /autocomplete endpoint via Buddy). Use the approaches described here to balance latency, accuracy, and resource cost for your application.

Try the examples in this article against your index, monitor suggestion quality, and tune options like fuzziness and min_infix_len to match your data and users. If you need a lightweight starting point, build a short hot-list for the most common prefixes and route longer inputs through CALL AUTOCOMPLETE.

Build smarter, faster search with autocomplete.