# Understanding Pagination in Manticore Search

分页是任何搜索引擎的重要功能，它允许用户高效地浏览大量结果集。Manticore Search 提供了多种强大的分页方法，每种方法都有其自身的优缺点。本文将探讨 Manticore Search 中可用的不同分页选项，并通过实际的 SQL 示例帮助您为您的使用场景实现最佳解决方案。

## 目录
1. [分页简介](/blog/pagination/#introduction-to-pagination)
2. [设置测试环境](/blog/pagination/#setting-up-a-test-environment)
3. [传统的基于偏移的分页](/blog/pagination/#traditional-offset-based-pagination)
4. [结果集窗口和 max_matches](/blog/pagination/#result-set-window-and-max_matches)
5. [基于滚动的分页](/blog/pagination/#scroll-based-pagination)
6. [通过 HTTP JSON 进行分页](/blog/pagination/#pagination-via-http-json)
7. [分页的排序最佳实践](/blog/pagination/#sorting-best-practices-for-pagination)
8. [限制和性能考虑](/blog/pagination/#limitations-and-performance-considerations)
9. [带分页的面向方面的搜索](/blog/pagination/#faceted-search-with-pagination)
10. [实际使用案例](/blog/pagination/#practical-use-cases)
11. [结论](/blog/pagination/#conclusion)

## 分页简介

在处理大型数据集时，一次性返回所有匹配结果是不切实际的。分页通过将结果划分为可管理的块或“页面”来解决这个问题。Manticore Search 提供了多种分页方法，以适应不同的使用场景，从简单的列表导航到无限滚动界面。

在本文中，我们将探讨三种主要的分页方法：
- 传统的基于偏移的分页 ([第3节](/blog/pagination/#traditional-offset-based-pagination))
- 使用 max_matches 的深度分页 ([第4节](/blog/pagination/#result-set-window-and-max_matches))
- 用于高效浏览大型结果集的基于滚动的分页 ([第5节](/blog/pagination/#scroll-based-pagination))

我们还将讨论它们通过 SQL 和 HTTP JSON 接口的实现方式，性能考虑因素以及实际应用。

## 设置测试环境

在深入分页示例之前，让我们使用 Manticore Load 工具创建一个带有示例数据的测试环境。该工具使生成和加载测试数据以进行基准测试和实验变得容易：

```bash
# Create a products table and populate it with 10,000 records containing "smartphone" in the title
manticore-load --quiet \
  --batch-size=1000 \
  --threads=4 \
  --total=10000 \
  --drop \
  --init="CREATE TABLE products(title text, description text, price int, category string, rating float, stock int)" \
  --load="INSERT INTO products(id, title, description, price, category, rating, stock) VALUES(<increment>, '<text/3/5> smartphone', '<text/20/50>', <int/10/1000>, '<string/5/10>', <float/1/5>, <int/0/100>)"
```

**输出：**
![manticore-load 输出](./pagination/manticore-load-output-for-pagination.png)

有了这个测试环境，我们现在可以探索不同的分页方法。

## 传统的基于偏移的分页

Manticore Search 中最直接的分页方法使用 `LIMIT` 子句与偏移量和计数参数。这种方法对于任何使用过 SQL 数据库的人都很熟悉。

### 基本语法

```sql
SELECT ... FROM ... [LIMIT [offset,] row_count]
SELECT ... FROM ... [LIMIT row_count][ OFFSET offset]
```

### 示例

```sql
-- Return the first 5 results (page 1)
SELECT id, title, price, weight() FROM products 
WHERE MATCH('smartphone') 
ORDER BY weight() DESC, price ASC, id ASC
LIMIT 5;
```

**输出：**
```
+------+----------------------------------------------+-------+----------+
| id   | title                                        | price | weight() |
+------+----------------------------------------------+-------+----------+
| 1179 | Weak spoke? fall. smartphone                 |    10 |     1272 |
| 1388 | 3; her day of. smartphone                    |    10 |     1272 |
| 1636 | Positively bad winter clean. smartphone      |    10 |     1272 |
| 5628 | Foolish from went heard low. smartphone      |    10 |     1272 |
| 8561 | Left sent. Infrequently, try lay. smartphone |    10 |     1272 |
+------+----------------------------------------------+-------+----------+
```

```sql
-- Return results 6-10 (page 2)
SELECT id, title, price, weight() FROM products 
WHERE MATCH('smartphone') 
ORDER BY weight() DESC, price ASC, id ASC
LIMIT 5, 5;
```

**输出：**
```
+------+----------------------------------------------+-------+----------+
| id   | title                                        | price | weight() |
+------+----------------------------------------------+-------+----------+
|  246 | Soft technically took. smartphone            |    11 |     1272 |
| 1105 | Dirty up are! temporarily called. smartphone |    11 |     1272 |
| 3293 | Instantly thick ran that hurt. smartphone    |    11 |     1272 |
| 3736 | Work her wrong locally. smartphone           |    11 |     1272 |
| 6978 | Stale loud passively sweet clean. smartphone |    11 |     1272 |
+------+----------------------------------------------+-------+----------+
```

```sql
-- Alternative syntax with separate LIMIT and OFFSET
SELECT id, title, price, weight() FROM products 
WHERE MATCH('smartphone') 
ORDER BY weight() DESC, price ASC, id ASC
LIMIT 5 OFFSET 10;
-- Results 11-15 (page 3)
```

**输出：**
```
+------+----------------------------------------+-------+----------+
| id   | title                                  | price | weight() |
+------+----------------------------------------+-------+----------+
| 7318 | Her spent at. smartphone               |    11 |     1272 |
| 8436 | His? felt work. Angry fast. smartphone |    11 |     1272 |
| 9699 | Feel spend funny. smartphone           |    11 |     1272 |
| 2326 | Quietly wet drew found. smartphone     |    12 |     1272 |
| 4171 | Set perhaps new send soon. smartphone  |    12 |     1272 |
+------+----------------------------------------+-------+----------+
```

此方法易于实现，适合浏览结果的前几页。然而，由于 Manticore 内部处理查询的方式，它在深度分页时效率较低。有关这些限制的详细信息，请参见 [第8节：限制和性能考虑](/blog/pagination/#limitations-and-performance-considerations)。

## 结果集窗口和 max_matches

默认情况下，Manticore Search 限制返回的匹配项数量为 1000。如果您尝试分页超过此限制，查询将导致错误。可以使用 `max_matches` 选项调整此限制。

### 理解 max_matches

`max_matches` 选项控制 Manticore 在搜索时在 RAM 中保留的匹配项数量。它旨在限制每个查询的内存使用。默认值为 1000，对于大多数常见搜索场景来说已经足够，但当您需要访问更深层的结果页面时，可以增加此值。

```sql
-- Increase max_matches to allow deeper pagination (results 1001-1005)
SELECT id, title, price FROM products 
WHERE MATCH('smartphone') 
ORDER BY weight() DESC, price ASC, id ASC
LIMIT 1000, 5 
OPTION max_matches=1500;
```

**输出：**
```
+------+-----------------------------------------+-------+
| id   | title                                   | price |
+------+-----------------------------------------+-------+
| 3550 | Open sold being late wide. smartphone   |   111 |
| 4896 | Grew. Wrote as hungry. smartphone       |   111 |
|  647 | Went seen had. smartphone               |   112 |
|  883 | Hold lose; have thought. smartphone     |   112 |
| 1774 | Get! began sick. Gone, wild. smartphone |   112 |
+------+-----------------------------------------+-------+
```

```sql
-- For very deep pagination (results 5000-5005)
SELECT id, title, price FROM products 
WHERE MATCH('smartphone') 
ORDER BY weight() DESC, price ASC, id ASC
LIMIT 5000, 5 
OPTION max_matches=5500;
```

**输出：**
```
+------+------------------------------------------------+-------+
| id   | title                                          | price |
+------+------------------------------------------------+-------+
| 4894 | Subtly? inside for spend evening. smartphone   |   507 |
| 5203 | Bad; artistically from yearly. smartphone      |   507 |
| 7446 | Empty probably clever! universally. smartphone |   507 |
| 8053 | In! rich daily irregularly. smartphone         |   507 |
| 8055 | Short cold fall; keep physically. smartphone   |   507 |
+------+------------------------------------------------+-------+
```

请注意，增加 `max_matches` 会带来内存成本。每个匹配项都会消耗内存，因此将此值设置得过高可能会影响服务器性能，尤其是在高负载情况下。对于一种更节省内存的深度分页替代方法，请参见 [基于滚动的分页](/blog/pagination/#scroll-based-pagination)。

## 基于滚动的分页

基于滚动的分页提供了一种高效浏览大型结果集的方法。与传统的基于偏移的分页不同，它使用一个标记来跟踪结果集中的当前位置，这有助于高效地浏览结果的连续页面。

### 工作原理

1. 执行一个带有排序条件的初始查询（必须在 ORDER BY 中包含 `id`）
2. 获取一个封装当前位置的滚动标记
3. 在后续查询中使用此标记来获取下一批结果

### 示例

```sql
-- Initial query with sorting criteria (must include id in ORDER BY)
SELECT id, title, price, weight() FROM products 
WHERE MATCH('smartphone') 
ORDER BY weight() DESC, price ASC, id ASC 
LIMIT 5; SHOW SCROLL\G
```

**输出：**
```
+------+----------------------------------------------+-------+----------+
| id   | title                                        | price | weight() |
+------+----------------------------------------------+-------+----------+
| 1179 | Weak spoke? fall. smartphone                 |    10 |     1272 |
| 1388 | 3; her day of. smartphone                    |    10 |     1272 |
| 1636 | Positively bad winter clean. smartphone      |    10 |     1272 |
| 5628 | Foolish from went heard low. smartphone      |    10 |     1272 |
| 8561 | Left sent. Infrequently, try lay. smartphone |    10 |     1272 |
+------+----------------------------------------------+-------+----------+

*************************** 1. row ***************************
scroll_token: eyJvcmRlcl9ieV9zdHIiOiJ3ZWlnaHQoKSBERVNDLCBwcmljZSBBU0MsIGlkIEFTQyIsIm9yZGVyX2J5IjpbeyJhdHRyIjoid2VpZ2h0KCkiLCJkZXNjIjp0cnVlLCJ2YWx1ZSI6MTI3MiwidHlwZSI6ImludCJ9LHsiYXR0ciI6InByaWNlIiwiZGVzYyI6ZmFsc2UsInZhbHVlIjoxMCwidHlwZSI6ImludCJ9LHsiYXR0ciI6ImlkIiwiZGVzYyI6ZmFsc2UsInZhbHVlIjo4NTYxLCJ0eXBlIjoiaW50In1dfQ==
```

```sql
-- Subsequent paginated query using the scroll token
SELECT id, title, price, weight() FROM products 
WHERE MATCH('smartphone') 
ORDER BY weight() DESC, price ASC, id ASC 
LIMIT 5 
OPTION scroll='eyJvcmRlcl9ieV9zdHIiOiJ3ZWlnaHQoKSBERVNDLCBwcmljZSBBU0MsIGlkIEFTQyIsIm9yZGVyX2J5IjpbeyJhdHRyIjoid2VpZ2h0KCkiLCJkZXNjIjp0cnVlLCJ2YWx1ZSI6MTI3MywidHlwZSI6ImludCJ9LHsiYXR0ciI6InByaWNlIiwiZGVzYyI6ZmFsc2UsInZhbHVlIjo1MiwidHlwZSI6ImludCJ9LHsiYXR0ciI6ImlkIiwiZGVzYyI6ZmFsc2UsInZhbHVlIjo0NCwidHlwZSI6ImludCJ9XX0=';
```

**输出：**
```
+------+----------------------------------------------+-------+----------+
| id   | title                                        | price | weight() |
+------+----------------------------------------------+-------+----------+
| 1179 | Weak spoke? fall. smartphone                 |    10 |     1272 |
| 1388 | 3; her day of. smartphone                    |    10 |     1272 |
| 1636 | Positively bad winter clean. smartphone      |    10 |     1272 |
| 5628 | Foolish from went heard low. smartphone      |    10 |     1272 |
| 8561 | Left sent. Infrequently, try lay. smartphone |    10 |     1272 |
+------+----------------------------------------------+-------+----------+
``` 

基于滚动的分页特别适合在 Web 应用程序中实现“加载更多”或无限滚动功能。有关完整实现示例，请参见 [实际使用案例](/blog/pagination/#practical-use-cases)。

## 通过 HTTP JSON 进行分页

除了 SQL，Manticore Search 还提供通过 HTTP JSON 进行分页。这对于 Web 应用程序或微服务架构特别有用。传统的基于偏移的分页和基于滚动的分页都通过 JSON API 支持。

### 通过 HTTP JSON 的传统基于偏移的分页

HTTP JSON API 使用 `offset` 和 `limit` 参数来控制分页。这等同于 SQL 的 `LIMIT offset, count` 语法。

#### 基本语法

```json
{
  "table": "table_name",
  "query": { "match": { ... } },
  "limit": number_of_results,
  "offset": starting_position
}
```

#### 示例

```bash
# First page (results 1-5)
curl -s 0:9308/search -d '{
  "table": "products",
  "query": { "match": { "*": "smartphone" } },
  "limit": 5,
  "offset": 0,
  "sort": [
    { "_score": "desc" },
    { "price": "asc" },
    { "id": "asc" }
  ]
}'|jq .
```

**输出：**
```
{
  "took": 0,
  "timed_out": false,
  "hits": {
    "total": 10000,
    "total_relation": "eq",
    "hits": [
      {
        "_id": 1179,
        "_score": 1272,
        "_source": {
          "title": "Weak spoke? fall. smartphone",
          "description": "Made down never long 3 angry are went asked closed! cold historically got take can to commonly emotionally; socially asked get tame frequently think nowhere? carefully? scientifically small.",
          "price": 10,
          "category": "hkadwpwwk",
          "rating": 2.300000,
          "stock": 63
        }
      },
      // ... more results ...
    ]
  }
}
```

```bash
# Second page (results 6-10)
curl -s 0:9308/search -d '{
  "table": "products",
  "query": { "match": { "*": "smartphone" } },
  "limit": 5,
  "offset": 5,
  "sort": [
    { "_score": "desc" },
    { "price": "asc" },
    { "id": "asc" }
  ]
}' | jq .
```

**输出：**
```
{
  "took": 0,
  "timed_out": false,
  "hits": {
    "total": 10000,
    "total_relation": "eq",
    "hits": [
      {
        "_id": 246,
        "_score": 1272,
        "_source": {
          "title": "Soft technically took. smartphone",
          "description": "This sometimes empty short normally! been send. Locally kind! quickly might infrequently culturally softly became tried quietly good gradually; inside dead loud.",
          "price": 11,
          "category": "wuiymrjdp",
          "rating": 1.600000,
          "stock": 87
        }
      },
      // ... more results ...
    ]
  }
}
```
### 替代语法

HTTP JSON API 还支持使用 `from` 和 `size` 代替 `offset` 和 `limit` 的替代语法：

```json
{
  "table": "table_name",
  "query": { "match": { ... } },
  "size": number_of_results,
  "from": starting_position
}
```

这与之前的格式等效，但遵循 Elasticsearch 的约定，对某些用户可能更熟悉。

### 通过 HTTP JSON 进行滚动分页

HTTP JSON API 还支持滚动分页以实现更高效的深度分页。这需要设置适当的排序条件并使用滚动令牌。

#### 初始查询

```bash
curl -s 0:9308/search -d '{
  "table": "products",
  "query": { "match": { "*": "smartphone" } },
  "limit": 5,
  "track_scores": true,
  "sort": [
    { "_score": "desc" },
    { "id": "asc" }
  ],
  "options": {
    "scroll": true
  }
}' | jq .
```

**输出：**
```
{
  "took": 0,
  "timed_out": false,
  "hits": {
    "total": 10000,
    "total_relation": "eq",
    "hits": [
      {
        "_id": 1,
        "_score": 1272,
        "_source": {
          "title": "Incorrectly heavy soft. smartphone",
          "description": "Carefully late sit; put draw afraid; gave hardly been vaguely 1000 this at understood gave always east fake, dull internationally sent. At cold; hot.",
          "price": 540,
          "category": "ogqejby",
          "rating": 2.800000,
          "stock": 52
        }
      },
      // ... more results ...
    ]
  },
  "scroll": "eyJvcmRlcl9ieV9zdHIiOiJAd2VpZ2h0IGRlc2MsIGlkIGFzYyIsIm9yZGVyX2J5IjpbeyJhdHRyIjoid2VpZ2h0KCkiLCJkZXNjIjp0cnVlLCJ2YWx1ZSI6MTI3MiwidHlwZSI6ImludCJ9LHsiYXR0ciI6ImlkIiwiZGVzYyI6ZmFsc2UsInZhbHVlIjo1LCJ0eXBlIjoiaW50In1dfQ=="
}
```

注意响应中的 `scroll` 值，这是一个 base64 编码的令牌，表示您在结果集中的当前位置。

#### 后续查询

```bash
curl -s 0:9308/search -d '{
  "table": "products",
  "query": { "match": { "*": "smartphone" } },
  "limit": 5,
  "track_scores": true,
  "options": {
    "scroll": "eyJvcmRlcl9ieV9zdHIiOiJAd2VpZ2h0IGRlc2MsIGlkIGFzYyIsIm9yZGVyX2J5IjpbeyJhdHRyIjoid2VpZ2h0KCkiLCJkZXNjIjp0cnVlLCJ2YWx1ZSI6MTI3MiwidHlwZSI6ImludCJ9LHsiYXR0ciI6ImlkIiwiZGVzYyI6ZmFsc2UsInZhbHVlIjo1LCJ0eXBlIjoiaW50In1dfQ=="
  }
}' | jq .
```

**输出：**
```
{
  "took": 0,
  "timed_out": false,
  "hits": {
    "total": 9995,
    "total_relation": "eq",
    "hits": [
      {
        "_id": 6,
        "_score": 1272,
        "_source": {
          "title": "Negatively! it become. smartphone",
          "description": "Be quiet cold nightly poor hourly had thin got dry, informally work, 100 50 needed specifically serious do bring below began! rich loud.",
          "price": 602,
          "category": "hjpwt",
          "rating": 2.500000,
          "stock": 47
        }
      },
      // ... more results ...
    ]
  },
  "scroll": "eyJvcmRlcl9ieV9zdHIiOiJAd2VpZ2h0IGRlc2MsIGlkIGFzYyIsIm9yZGVyX2J5IjpbeyJhdHRyIjoid2VpZ2h0KCkiLCJkZXNjIjp0cnVlLCJ2YWx1ZSI6MTI3MiwidHlwZSI6ImludCJ9LHsiYXR0ciI6ImlkIiwiZGVzYyI6ZmFsc2UsInZhbHVlIjoxMCwidHlwZSI6ImludCJ9XX0="
}
```

每个响应都包含一个用于下一页的新滚动令牌，允许继续分页。

### HTTP JSON 分页的最佳实践

1. **包含排序条件**：为了获得可预测的分页结果，始终包含显式的排序，例如：
   ```json
   "sort": [
     { "_score": "desc" },
     { "price": "asc" },
     { "id": "asc" }
   ]
   ```
2. **正确处理滚动令牌**：存储每个响应中的滚动令牌，并在下一个请求中使用它。
3. **使用适当的错误处理**：检查错误响应，尤其是在深度分页时。
4. **考虑响应大小**：通过使用 `_source` 参数仅选择所需的字段来控制返回的数据量：
   ```json
   "_source": ["title", "price", "rating"]
   ```
5. **监控性能**：对于非常大的结果集，请监控服务器性能并考虑使用基于滚动的分页。

HTTP JSON 分页方法遵循与 SQL 对应方法相同的原则（在 [第3节](/blog/pagination/#traditional-offset-based-pagination) 和 [第5节](/blog/pagination/#scroll-based-pagination) 中描述），但使用 JSON 语法。这使它们非常适合现代 Web 应用程序和 REST API。

## 分页的排序最佳实践

正确的排序对于有效的分页至关重要，尤其是基于滚动的分页。以下是关键考虑因素：

### 1. 包含唯一标识符

为了获得一致的分页结果，始终在 ORDER BY 子句中包含唯一标识符（通常是 `id` 字段）。这确保了即使多个文档在其他排序条件上具有相同的值，分页仍会产生一致的结果。

```sql
-- Good: Includes id as the tie-breaker
SELECT * FROM products WHERE MATCH('smartphone') ORDER BY weight() DESC, price ASC, id ASC;

-- Bad: No unique identifier, may lead to inconsistent pagination
SELECT * FROM products WHERE MATCH('smartphone') ORDER BY price ASC;
```

### 2. 包含 weight() 用于全文搜索

当在查询中使用 `MATCH()` 时，始终在排序条件中包含 `weight()`，以确保最相关的结果首先出现：

```sql
-- Good: Includes weight() for relevance sorting
SELECT * FROM products WHERE MATCH('smartphone') ORDER BY weight() DESC, price ASC, id ASC;

-- Less optimal: Missing relevance sorting
SELECT * FROM products WHERE MATCH('smartphone') ORDER BY price ASC, id ASC;
```

### 3. 保持排序顺序一致性

当使用传统的基于偏移的分页时，始终在所有分页查询中保持相同的排序顺序。在页面之间更改排序顺序可能导致结果丢失或重复。

```sql
-- First page (correct)
SELECT * FROM products WHERE MATCH('smartphone') ORDER BY weight() DESC, price ASC, id ASC LIMIT 10;

-- Second page (correct - same sort order)
SELECT * FROM products WHERE MATCH('smartphone') ORDER BY weight() DESC, price ASC, id ASC LIMIT 10, 10;

-- Second page (incorrect - different sort order)
SELECT * FROM products WHERE MATCH('smartphone') ORDER BY rating DESC, id ASC LIMIT 10, 10;
```

### 4. 使用确定性排序

对于基于滚动的分页，始终使用确定性排序条件。应避免使用非确定性函数（如 `RAND()`），因为它们会在查询之间产生不一致的结果。

```sql
-- Good: Deterministic sorting
SELECT * FROM products WHERE MATCH('smartphone') ORDER BY weight() DESC, price ASC, id ASC;

-- Bad: Non-deterministic sorting, will break scroll pagination
SELECT * FROM products WHERE MATCH('smartphone') ORDER BY RAND();
```

### 5. 使用 ID 进行排序方向

当使用 `id` 字段作为排序的平局处理时，您可以使用 ASC 或 DESC 顺序，但必须一致应用：

```sql
-- These are both valid for scroll pagination
SELECT * FROM products WHERE MATCH('smartphone') ORDER BY weight() DESC, price ASC, id ASC;
SELECT * FROM products WHERE MATCH('smartphone') ORDER BY weight() DESC, price ASC, id DESC;
```

### 6. 优化排序性能

为了提高排序性能，请考虑使用 `sort_method` 选项：

```sql
-- For pre-sorted data (e.g., by id)
SELECT * FROM products WHERE MATCH('smartphone') ORDER BY id ASC 
LIMIT 5 OPTION sort_method=kbuffer;

-- For other sorting criteria
SELECT * FROM products WHERE MATCH('smartphone') ORDER BY weight() DESC, price ASC, id ASC 
LIMIT 5 OPTION sort_method=pq;
``` 

您选择的排序方法会显著影响分页的性能和一致性。例如，确定性排序对于基于滚动的分页至关重要，而包含唯一标识符对于所有分页方法的一致结果至关重要。有关排序如何影响性能的更多详细信息，请参阅 [限制和性能考虑](/blog/pagination/#limitations-and-performance-considerations)。

## 限制和性能考虑

了解不同分页方法的限制将帮助您为特定用例选择合适的方法。

### 基于偏移的分页限制

1. **大型偏移量性能下降**：随着偏移量值的增加，查询性能会下降。这是因为 Manticore 仍需要处理所有记录直到偏移量值，然后才能返回请求的页面。
2. **max_matches 限制**：默认情况下，您只能分页浏览前 1000 个结果。要超出此限制，必须增加 `max_matches`，这会增加内存使用。
3. **内存使用**：较大的 `max_matches` 值需要每个查询更多的 RAM，这可能在高负载下影响服务器性能。
4. **结果一致性无法保证**：如果在页面请求之间添加或删除文档，可能会看到重复或遗漏的结果。

对于深度分页场景或实现“加载更多”功能，请考虑使用 [基于滚动的分页](/blog/pagination/#scroll-based-pagination)，如 [第5节](/blog/pagination/#scroll-based-pagination) 中所述。

### 基于滚动的分页限制

1. **需要在排序条件中包含 ID**：您必须在 ORDER BY 子句中包含 `id` 字段，这可能与您期望的排序逻辑不一致。
2. **令牌管理**：您需要在请求之间存储和管理滚动令牌，这会增加应用程序的复杂性。
3. **无一致性保证**：基于滚动的分页不保证时间点一致性。如果在请求之间添加、删除或修改文档，结果仍可能受到影响。
4. **无随机访问**：与基于偏移的分页不同，您无法直接跳转到特定页面；必须顺序导航。

## 带分页的面向方面的搜索

面向方面的搜索允许用户通过多个维度过滤和导航搜索结果。当与分页结合使用时，了解分页如何影响面向方面的结果非常重要。

```sql
-- Faceted search with pagination (first page)
SELECT id, title, price, weight()
FROM products 
WHERE MATCH('smartphone') 
ORDER BY weight() DESC, price ASC, id ASC
LIMIT 0, 5
FACET category ORDER BY COUNT(*) DESC;
```

**输出：**
```
+------+----------------------------------------------+-------+----------+
| id   | title                                        | price | weight() |
+------+----------------------------------------------+-------+----------+
| 1179 | Weak spoke? fall. smartphone                 |    10 |     1272 |
| 1388 | 3; her day of. smartphone                    |    10 |     1272 |
| 1636 | Positively bad winter clean. smartphone      |    10 |     1272 |
| 5628 | Foolish from went heard low. smartphone      |    10 |     1272 |
| 8561 | Left sent. Infrequently, try lay. smartphone |    10 |     1272 |
+------+----------------------------------------------+-------+----------+
+-----------+----------+
| category  | count(*) |
+-----------+----------+
| ogqejby   |        1 |
| unmfujgqr |        1 |
| ttefm     |        1 |
| hceihdy   |        1 |
| sicjr     |        1 |
| hjpwt     |        1 |
| tvfqyj    |        1 |
| mvdjhbexo |        1 |
| scayuo    |        1 |
| esmlh     |        1 |
| fvbhplj   |        1 |
| lcphmiqmv |        1 |
| lnjfhb    |        1 |
| qexfdulub |        1 |
| tbswa     |        1 |
| eekarf    |        1 |
| airjuod   |        1 |
| ozkbuvgj  |        1 |
| yafbhr    |        1 |
| duccr     |        1 |
+-----------+----------+
```

```sql
-- Faceted search with pagination (second page)
SELECT id, title, price, weight()
FROM products 
WHERE MATCH('smartphone') 
ORDER BY weight() DESC, price ASC, id ASC
LIMIT 5, 5
FACET category ORDER BY COUNT(*) DESC;
```

**输出：**
```
+------+-----------------------------------------------------------+-------+----------+
| id   | title                                                     | price | weight() |
+------+-----------------------------------------------------------+-------+----------+
|  246 | Soft technically took. smartphone                         |    11 |     1272 |
| 1105 | Dirty up are! temporarily called. smartphone              |    11 |     1272 |
| 3293 | Instantly thick ran that hurt. smartphone                 |    11 |     1272 |
| 3736 | Work her wrong locally. smartphone                        |    11 |     1272 |
| 6978 | Stale loud passively sweet clean. smartphone              |    11 |     1272 |
+------+-----------------------------------------------------------+-------+----------+
+-----------+----------+
| category  | count(*) |
+-----------+----------+
| ogqejby   |        1 |
| unmfujgqr |        1 |
| ttefm     |        1 |
| hceihdy   |        1 |
| sicjr     |        1 |
| hjpwt     |        1 |
| tvfqyj    |        1 |
| mvdjhbexo |        1 |
| scayuo    |        1 |
| esmlh     |        1 |
| fvbhplj   |        1 |
| lcphmiqmv |        1 |
| lnjfhb    |        1 |
| qexfdulub |        1 |
| tbswa     |        1 |
| eekarf    |        1 |
| airjuod   |        1 |
| ozkbuvgj  |        1 |
| yafbhr    |        1 |
| duccr     |        1 |
+-----------+----------+
```

**重要说明：** 分面结果是基于整个匹配查询结果集计算的，而不是仅基于分页部分。这意味着无论您查看的是哪一页，分面计数都代表所有匹配文档的分布情况。LIMIT 子句仅影响主结果集中返回的文档，不影响分面计算。如上例所示，两页的分面结果相同，确认了它们是基于整个结果集计算的。

### 分面结果中的分页

除了对主搜索结果进行分页外，Manticore Search 还允许对分面结果本身进行分页。这在处理大量分面值时特别有用，因为显示所有分面值可能不切实际。

您可以在 FACET 语句中使用 LIMIT 子句来控制返回的分面值数量。

```sql
-- Get only the first 5 categories ordered by count
SELECT id, title FROM products 
WHERE MATCH('smartphone') 
LIMIT 3 
FACET category ORDER BY COUNT(*) DESC, category asc LIMIT 0,5;
```

**输出：**
```
+------+---------------------------------------------+
| id   | title                                       |
+------+---------------------------------------------+
|    1 | Incorrectly heavy soft. smartphone          |
|    2 | Know of make afraid foolish. smartphone     |
|    3 | Poor spring evening drove young. smartphone |
+------+---------------------------------------------+
+------------+----------+
| category   | count(*) |
+------------+----------+
| aaaabzwfzn |        1 |
| aabswb     |        1 |
| aacla      |        1 |
| aaejmtubv  |        1 |
| aaethytj   |        1 |
+------------+----------+
```

您可以在同一查询中为不同的分面提供不同的分页参数：

```sql
-- Multiple facets with different pagination parameters
SELECT id, title FROM products 
WHERE MATCH('smartphone') 
LIMIT 3 
FACET category ORDER BY COUNT(*) DESC, category asc LIMIT 0,5 
FACET category ORDER BY category ASC LIMIT 3,5 
FACET price ORDER BY FACET() ASC LIMIT 5;
```

**输出：**
```
+------+---------------------------------------------+
| id   | title                                       |
+------+---------------------------------------------+
|    1 | Incorrectly heavy soft. smartphone          |
|    2 | Know of make afraid foolish. smartphone     |
|    3 | Poor spring evening drove young. smartphone |
+------+---------------------------------------------+
+------------+----------+
| category   | count(*) |
+------------+----------+
| aaaabzwfzn |        1 |
| aabswb     |        1 |
| aacla      |        1 |
| aaejmtubv  |        1 |
| aaethytj   |        1 |
+------------+----------+
+------------+----------+
| category   | count(*) |
+------------+----------+
| aaejmtubv  |        1 |
| aaethytj   |        1 |
| aaktjgaa   |        1 |
| aalfwcvwil |        1 |
| aaqmumofe  |        1 |
+------------+----------+
+-------+----------+
| price | count(*) |
+-------+----------+
|    10 |        5 |
|    11 |        8 |
|    12 |       10 |
|    13 |        9 |
|    14 |        8 |
+-------+----------+
```

此示例演示了多种分面分页技术：

1. 第一个分面返回按计数排序的前 5 个类别（偏移量 0，限制 5）
2. 第二个分面按字母顺序返回类别，跳过前 3 个（偏移量 3，限制 5）
3. 第三个分面返回前 5 个价格点，按升序排列

分面分页特别适用于：

- 处理大量分面值（例如数千个产品类别）
- 为分面值实现“显示更多”功能
- 创建动态过滤界面，首先显示最受欢迎的过滤器
- 通过限制返回的分面值数量进行性能优化

**注意：** 与主结果分页不同，主结果分页在深度偏移时会影响性能，而分面分页即使在大型偏移时也通常效率较高，因为分面计算发生在文档匹配阶段之后。

### 通过 HTTP JSON 进行分面分页

当使用 HTTP JSON 接口时，也可以使用分面分页。您可以通过聚合规范中的 `size` 参数来限制返回的分面值数量：

```json
{
  "table": "products",
  "query": { "match": { "*": "smartphone" } },
  "limit": 3,
  "aggs": {
    "category_counts": {
      "terms": {
        "field": "category",
        "size": 5
      }
    }
  }
}
```

此示例将类别分面结果限制为前 5 个值。

**重要：** 与 SQL 分面不同，HTTP JSON 接口目前仅支持使用 `size` 参数限制分面值数量，不支持分面内的 `OFFSET` 参数进行分页。您只能指定返回多少结果，而不能指定跳过哪些结果。

有关通过 HTTP JSON 进行分面分页的更多详细信息，请参阅 [Manticore Search 文档](https://manual.manticoresearch.com/Searching/Faceted_search?client=JSON#Size-of-facet-result)。

## 实际使用案例

### 带高亮的分页

```sql
-- Pagination with highlighting
SELECT id, title, price, weight(),
       HIGHLIGHT({limit=100}) as highlight
FROM products 
WHERE MATCH('smartphone') 
ORDER BY weight() DESC, price ASC, id ASC
LIMIT 5, 5;
```

**输出：**
```
+------+----------------------------------------------+-------+----------+-----------------------------------------------------+
| id   | title                                        | price | weight() | highlight                                           |
+------+----------------------------------------------+-------+----------+-----------------------------------------------------+
|  246 | Soft technically took. smartphone            |    11 |     1272 | Soft technically took. <b>smartphone</b>            |
| 1105 | Dirty up are! temporarily called. smartphone |    11 |     1272 | Dirty up are! temporarily called. <b>smartphone</b> |
| 3293 | Instantly thick ran that hurt. smartphone    |    11 |     1272 | Instantly thick ran that hurt. <b>smartphone</b>    |
| 3736 | Work her wrong locally. smartphone           |    11 |     1272 | Work her wrong locally. <b>smartphone</b>           |
| 6978 | Stale loud passively sweet clean. smartphone |    11 |     1272 | Stale loud passively sweet clean. <b>smartphone</b> |
+------+----------------------------------------------+-------+----------+-----------------------------------------------------+
```

### 使用滚动分页实现“加载更多”按钮

基于滚动的分页非常适合实现“加载更多”功能，因为它可以高效地导航大型结果集。以下是实现方法：

```sql
-- Initial query with SHOW SCROLL to get the scroll token
SELECT id, title, price, weight() FROM products 
WHERE MATCH('smartphone') 
ORDER BY weight() DESC, price ASC, id ASC 
LIMIT 5; SHOW SCROLL\G
```

**输出：**
```
+------+------------------------------------------------+-------+----------+
| id   | title                                          | price | weight() |
+------+------------------------------------------------+-------+----------+
| 1179 | Weak spoke? fall. smartphone                   |    10 |     1272 |
| 1388 | 3; her day of. smartphone                      |    10 |     1272 |
| 1636 | Positively bad winter clean. smartphone        |    10 |     1272 |
| 5628 | Foolish from went heard low. smartphone        |    10 |     1272 |
| 8561 | Left sent. Infrequently, try lay. smartphone   |    10 |     1272 |
+------+------------------------------------------------+-------+----------+

*************************** 1. row ***************************
scroll_token: eyJvcmRlcl9ieV9zdHIiOiJ3ZWlnaHQoKSBERVNDLCBwcmljZSBBU0MsIGlkIEFTQyIsIm9yZGVyX2J5IjpbeyJhdHRyIjoid2VpZ2h0KCkiLCJkZXNjIjp0cnVlLCJ2YWx1ZSI6MTI3MiwidHlwZSI6ImludCJ9LHsiYXR0ciI6InByaWNlIiwiZGVzYyI6ZmFsc2UsInZhbHVlIjoxMCwidHlwZSI6ImludCJ9LHsiYXR0ciI6ImlkIiwiZGVzYyI6ZmFsc2UsInZhbHVlIjo4NTYxLCJ0eXBlIjoiaW50In1dfQ==
1 row in set (0.00 sec)
```

```sql
-- When user clicks "Load More", use the scroll token to get the next batch
SELECT id, title, price, weight() FROM products 
WHERE MATCH('smartphone') 
LIMIT 5 
OPTION scroll='eyJvcmRlcl9ieV9zdHIiOiJ3ZWlnaHQoKSBERVNDLCBwcmljZSBBU0MsIGlkIEFTQyIsIm9yZGVyX2J5IjpbeyJhdHRyIjoid2VpZ2h0KCkiLCJkZXNjIjp0cnVlLCJ2YWx1ZSI6MTI3MiwidHlwZSI6ImludCJ9LHsiYXR0ciI6InByaWNlIiwiZGVzYyI6ZmFsc2UsInZhbHVlIjoxMCwidHlwZSI6ImludCJ9LHsiYXR0ciI6ImlkIiwiZGVzYyI6ZmFsc2UsInZhbHVlIjo4NTYxLCJ0eXBlIjoiaW50In1dfQ=='; SHOW SCROLL\G
```

**输出：**
```
+------+---------------------------------------------+-------+----------+
| id   | title                                       | price | weight() |
+------+---------------------------------------------+-------+----------+
|  246 | Soft technically took. smartphone           |    11 |     1272 |
| 1105 | Dirty up are! temporarily called. smartphone|    11 |     1272 |
| 3293 | Instantly thick ran that hurt. smartphone   |    11 |     1272 |
| 3736 | Work her wrong locally. smartphone          |    11 |     1272 |
| 6978 | Stale loud passively sweet clean. smartphone|    11 |     1272 |
+------+---------------------------------------------+-------+----------+

*************************** 1. row ***************************
scroll_token: eyJvcmRlcl9ieV9zdHIiOiJ3ZWlnaHQoKSBERVNDLCBwcmljZSBBU0MsIGlkIEFTQyIsIm9yZGVyX2J5IjpbeyJhdHRyIjoid2VpZ2h0KCkiLCJkZXNjIjp0cnVlLCJ2YWx1ZSI6MTI3MiwidHlwZSI6ImludCJ9LHsiYXR0ciI6InByaWNlIiwiZGVzYyI6ZmFsc2UsInZhbHVlIjoxMSwidHlwZSI6ImludCJ9LHsiYXR0ciI6ImlkIiwiZGVzYyI6ZmFsc2UsInZhbHVlIjo2OTc4LCJ0eXBlIjoiaW50In1dfQ==
```

每次点击“加载更多”时，您将使用前一个查询返回的新滚动令牌。这种方法相比传统的基于偏移的分页有以下优势：

1. **性能更好**：深度分页时不会出现性能下降
2. **无最大匹配限制**：可以高效地分页数百万个结果
3. **顺序访问**：高效地导航连续的结果页

**实现提示：**
- 在应用程序的状态中存储滚动令牌（客户端或服务器端）
- 当用户点击“加载更多”时，在后续请求中包含该令牌
- 始终使用最新的令牌，因为每个令牌代表结果集中的当前位置
- 请记住滚动令牌是临时的，最终会过期

此实现方法解决了传统基于偏移分页在 [第 8 节](/blog/pagination/#limitations-and-performance-considerations) 中描述的许多限制，特别是针对深度分页场景。

## 结论

Manticore Search 提供了灵活的分页选项，以满足不同的使用场景。传统的基于偏移的分页适用于简单界面和有限的页面导航，而基于滚动的分页则在深度分页场景（如无限滚动或数据导出）中提供更好的性能。

在 Manticore Search 中实现分页时：

1. **对于浅层分页**（前几页）：使用传统的基于偏移的分页，并保持一致的排序。
2. **对于深层分页**（多页或大型数据集）：使用基于滚动的分页，并使用适当的基于 ID 的排序标准。
3. **对于一致的结果**：始终在 ORDER BY 子句中包含 `id` 字段，并在查询中保持一致的排序标准。
4. **对于全文搜索**：在排序标准中包含 `weight()` 或 `_score`，以确保最相关的结果首先出现。

通过了解每种方法的优势、限制和最佳实践，您可以为特定的应用需求实现最有效的分页策略。请记住，最佳的分页方法取决于您的使用场景、数据量和用户体验要求。
