blog-post

KNN prefiltering in Manticore Search

Vector search rarely happens in isolation. You almost always have filters — a price range, a category, a date window, a geographic boundary. The question is: when do those filters get applied?

The answer makes a surprising difference in result quality.

KNN prefiltering is available in Manticore Search starting from version 19.0.1.

The problem with postfiltering

Consider a product catalog with 10 million items. A user asks for the 10 nearest neighbors to a query vector, restricted to category = 'electronics'. With postfiltering, the KNN search runs first over the entire dataset, then the filter is applied to the results. If electronics make up 5% of the catalog, the graph explores nodes that are mostly irrelevant. Worse, many of the k nearest neighbors may not be electronics at all, so the final result set can be much smaller than requested. Ask for 10 results, get 2.

This is the fundamental limitation of postfiltering: the HNSW graph doesn't know about your filters. It finds the closest vectors overall, not the closest vectors that match your criteria. The more selective the filter, the worse the problem gets.

What prefiltering does differently

Prefiltering passes the filter into the HNSW graph traversal itself. As the algorithm explores candidate nodes, each one is checked against the filter before being added to the result heap. Only matching documents contribute to the final k results. This means you reliably get the k results you asked for, assuming k matching documents exist in the dataset.

In Manticore Search, prefiltering is enabled by default when your query combines KNN search with attribute filters. No special syntax is needed:

SELECT id, title, knn_dist()
FROM products
WHERE knn(embedding, (0.12, 0.45, 0.78, 0.33))
  AND category = 'electronics'
  AND price < 500
LIMIT 10;

Both category = 'electronics' and price < 500 are evaluated during HNSW traversal, not after. The equivalent JSON query:

POST /search
{
    "table": "products",
    "knn": {
        "field": "embedding",
        "query": [0.12, 0.45, 0.78, 0.33]
    },
    "query": {
        "bool": {
            "must": [
                { "equals": { "category": "electronics" } },
                { "range": { "price": { "lt": 500 } } }
            ]
        }
    },
    "limit": 10
}

Naive prefiltering and where it falls short

The obvious first approach is straightforward: traverse the HNSW graph normally, compute distances for every neighbor, but only add filter-matching nodes to the result heap. Filtered-out nodes still participate in navigation — if a non-matching node has a competitive distance, it enters the candidate queue and its neighbors get explored. The filter only gates what goes into the results.

This actually works reasonably well. The graph stays connected because filtered-out nodes are still traversed. But it has a performance problem that gets worse as the filter becomes more selective: every unvisited neighbor gets a distance computation regardless of whether it passes the filter. Distance computation is the most expensive operation in the search. With a filter matching 5% of documents, 95% of that work produces results that are immediately discarded. The algorithm pays full cost for navigation but gets no results from most of the work.

How Manticore solves it: ACORN-1

Manticore uses an ACORN-1-based algorithm (from the ACORN paper , SIGMOD 2024) that improves on naive prefiltering in two ways:

No distance computation for filtered-out nodes. When visiting a node's neighbors, ACORN-1 checks the filter first and only computes distance for nodes that pass. Filtered-out neighbors are never scored. When 95% of nodes fail the filter, this saves roughly 95% of the distance work compared to the naive approach.

Adaptive expansion through filtered-out nodes. When a neighbor fails the filter, the algorithm looks through that node's own neighbors to find filter-passing nodes further away. If those neighbors also fail the filter and not enough matching candidates have been found yet, it keeps going — 3 hops, 4 hops, as far as needed. The more selective the filter, the more aggressively the algorithm expands. This targeted walk through non-matching neighborhoods reaches matching candidates without scoring the non-matching ones along the way.

Think of it as searching for Italian restaurants in a city. The naive approach checks the menu at every restaurant and only keeps the Italian ones. ACORN-1 glances at the sign first — "French, skip; Thai, skip" — without going inside. And when it sees a stretch of non-Italian restaurants, it walks past them, peeking around each corner until it finds an Italian place on the other side.

Manticore activates ACORN-1 when fewer than 60% of total documents pass the filter. Above that threshold, naive prefiltering works well enough on its own.

Automatic brute-force fallback

Prefiltering works well across a wide range of filter selectivities, but there's an extreme case: what if only 50 documents out of 10 million match the filter? Traversing the HNSW graph — even with ACORN-1 — visits far more nodes than just scanning those 50 documents directly.

Manticore detects this automatically. When prefiltering is enabled, the query planner estimates the cost of HNSW traversal versus a brute-force distance scan over the filtered subset. It uses histogram-based selectivity estimates to predict how many documents pass the filter, then compares that against the expected number of nodes HNSW would visit. If brute-force is cheaper, Manticore skips HNSW entirely and scans the filtered documents directly.

This means you don't need to think about edge cases. Prefiltering adapts: ACORN-1 for moderate selectivity, brute-force for extreme selectivity, and standard HNSW when no filter is present.

When to use postfiltering instead

Prefiltering isn't always the best choice. There are cases where postfiltering is preferable:

  • When you want the closest vectors regardless of filters. Postfiltering gives you the k nearest neighbors from the full dataset, then removes non-matching ones. If your application tolerates getting fewer than k results and you care most about vector distance quality, postfiltering is simpler and more predictable.
  • When the filter matches most documents. If 95% of documents pass the filter, prefiltering adds overhead for almost no benefit — nearly every candidate matches anyway.
  • When you're debugging or benchmarking. Postfiltering gives you a clean baseline: pure HNSW results with a filter on top. This makes it easier to isolate whether a quality issue comes from the graph or the filter.

To explicitly request postfiltering in SQL:

SELECT id, knn_dist()
FROM products
WHERE knn(embedding, (0.12, 0.45, 0.78, 0.33), { prefilter=0 })
  AND category = 'electronics'
LIMIT 10;

In JSON, set "prefilter": false inside the knn object:

POST /search
{
    "table": "products",
    "knn": {
        "field": "embedding",
        "query": [0.12, 0.45, 0.78, 0.33],
        "prefilter": false
    },
    "query": {
        "equals": { "category": "electronics" }
    },
    "limit": 10
}

Forcing brute-force

If you know your dataset is small enough or your filters selective enough that a linear scan is the right strategy, you can force brute-force mode directly:

SELECT id, knn_dist()
FROM products
WHERE knn(embedding, (0.12, 0.45, 0.78, 0.33), { fullscan=1 })
  AND category = 'electronics'
LIMIT 10;

This skips HNSW entirely and computes exact distances over all documents that pass the filter. It guarantees perfect recall at the cost of linear-time scanning.

Summary

Prefiltering is the default in Manticore and the right choice for most filtered KNN queries. It guarantees you get k results (if they exist). Manticore automatically picks the best strategy based on how selective the filter is: standard filtered HNSW when most documents match, ACORN-1 when fewer than 60% pass (saving distance computations on filtered-out nodes), and brute-force when the filtered subset is small enough to scan directly. The query planner estimates filter selectivity per-query, per-segment, so there's nothing to tune.

Use postfiltering (prefilter=0 in SQL, "prefilter": false in JSON) when you want the globally closest vectors and can tolerate getting fewer than k results. Use brute-force (fullscan=1 in SQL, "fullscan": true in JSON) when you know a linear scan is the right strategy for your data.

Install Manticore Search

Install Manticore Search