# Introduction to full-text operators and basic search

In this tutorial, we will explore the full-text search operators available in Manticore Search.

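All search operations in Manticore Search are based on the standard boolean operators (AND, OR, NOT), which can be combined and used in any order to include or exclude keywords and get more relevant results.

The default and simplest full-text operator is AND, which is assumed when you simply list several words in a search.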

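![Full-Text-Operators-in-Manticore_AND](./manticore-full-text-operators-definitive-guide/Full-Text-Operators-in-Manticore_AND-oqrugx63xg8ezov2jmez5z2j6aehaqp0vmtov5gh6w.png "Full-text operator AND in Manticore")**AND** is the default operator: the query **`fast slow`** returns documents that contain both terms, 'fast' and 'slow'. If one term is present in a document and the other is not, the document is not included in the result list.
By default, words are searched in all available full-text fields.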

```sql
SELECT * FROM testrt WHERE MATCH('fast slow');

```

![Full-Text-Operators-in-Manticore_OR](./manticore-full-text-operators-definitive-guide/Full-Text-Operators-in-Manticore_OR-oqruhgwpwyzfri2eccy54c37ndp6sdvdycivxyn7k8.png "Full-text operator OR in Manticore")**OR** matches either term (or both). The terms are separated by a vertical bar, e.g. **`fast | slow`**, which finds documents containing `fast` or `slow`.

```sql
SELECT * FROM testrt WHERE MATCH('fast | slow');

```
The OR operator has higher precedence than AND, so the query **'find me fast|slow'** is interpreted as 'find me (fast|slow)':


```sql
SELECT * FROM testrt WHERE MATCH('find me fast | slow');
```

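![NOT](./manticore-full-text-operators-definitive-guide/Full-Text-Operators-in-Manticore_NOT-1-ouyihor8jcz7ulw8ge49i39crd81gfa5mhswo6fx8o.png) **NOT** makes sure that a term marked with `-` or `!` is not in the results. Any document containing such a term is excluded. For example, **`fast !slow`** finds documents containing `fast` only if they do not contain `slow`. Use it carefully when narrowing a search, since it can become too specific and exclude good documents.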


```sql
SELECT * FROM testrt WHERE MATCH('find !slow');

SELECT * FROM testrt WHERE MATCH('find -slow');
```

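![MAYBE](./manticore-full-text-operators-definitive-guide/MAYBE-ovsmxs1rjjx4r9zfqopp5gjggdkxynqlt08kfhb56w.png "MAYBE")**MAYBE** is a special operator that works like `OR` but requires the term on the left to always be present in the results, while the term on the right is optional. When both terms are present, the document gets a higher search rank. For example, **`fast MAYBE slow`** finds documents that contain `fast`, and those that also contain `slow` get a higher score.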

```sql
SELECT * FROM testrt WHERE MATCH('find MAYBE slow');
```
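
The four basic operators can be summarized with a small toy model. The sketch below is only an illustration in Python, not how Manticore evaluates queries: the documents, the helper names and the scoring in `maybe_score` are all made up for this example.

```python
# Toy model of the basic boolean operators (not Manticore's implementation).

def words(text):
    """Split a document into a set of lowercase words."""
    return set(text.lower().split())

def match_and(doc, a, b):
    """'a b': both terms must be present."""
    return a in words(doc) and b in words(doc)

def match_or(doc, a, b):
    """'a | b': at least one term must be present."""
    return a in words(doc) or b in words(doc)

def match_not(doc, a, b):
    """'a !b': a must be present and b must be absent."""
    return a in words(doc) and b not in words(doc)

def match_maybe(doc, a, b):
    """'a MAYBE b': only a is required; b influences ranking only."""
    return a in words(doc)

def maybe_score(doc, a, b):
    """Hypothetical score: 2 if both terms are present, 1 if only a, else 0."""
    if not match_maybe(doc, a, b):
        return 0
    return 2 if b in words(doc) else 1

# Made-up documents, used only for this illustration.
docs = ["fast car", "slow car", "fast and slow car", "red car"]

for name, fn in [("AND", match_and), ("OR", match_or),
                 ("NOT", match_not), ("MAYBE", match_maybe)]:
    print(name, [d for d in docs if fn(d, "fast", "slow")])

# MAYBE keeps every 'fast' document but ranks the one with 'slow' first.
ranked = sorted((d for d in docs if match_maybe(d, "fast", "slow")),
                key=lambda d: maybe_score(d, "fast", "slow"), reverse=True)
print("MAYBE ranked", ranked)
```

Note that `MAYBE` never adds documents that lack the left-hand term; it only reorders the ones that match.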

## Usage examples

Let's connect to Manticore using the mysql client:

```bash
mysql -P9306 -h0
```

For boolean searches, the OR operator `|` can be used:


```sql
MySQL [(none)]> select * from testrt where match('find | me fast');
+------+------+------------------------+----------------+
| id   | gid  | title                  | content        |
+------+------+------------------------+----------------+
|    1 |    1 | find me                |  fast and quick|
|    2 |    1 | find me fast           |  quick         |
|    6 |    1 | find me fast now       |  quick         |
|    5 |    1 | find me quick and fast |  quick         |
+------+------+------------------------+----------------+
4 rows in set (0.00 sec)
```
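
To see why row 3 ("find me slow") is missing from this result, it helps to compare the two possible readings of `find | me fast`. The following Python sketch is a toy evaluation over the rows shown in this section's result tables, not Manticore's query parser; the `rows` dictionary and the `has` helper are assumptions made for the illustration.

```python
# Rows as shown in the result tables of this section: id -> (title, content).
rows = {
    1: ("find me", "fast and quick"),
    2: ("find me fast", "quick"),
    3: ("find me slow", "quick"),
    5: ("find me quick and fast", "quick"),
    6: ("find me fast now", "quick"),
}

def has(row_id, word):
    """True if the word occurs in any full-text field of the row."""
    title, content = rows[row_id]
    return word in (title + " " + content).split()

# OR binds tighter than AND: 'find | me fast' -> (find | me) AND fast
or_first = sorted(i for i in rows
                  if (has(i, "find") or has(i, "me")) and has(i, "fast"))

# The other possible reading: find OR (me AND fast)
and_first = sorted(i for i in rows
                   if has(i, "find") or (has(i, "me") and has(i, "fast")))

print(or_first)   # [1, 2, 5, 6]
print(and_first)  # [1, 2, 3, 5, 6]
```

Only the first reading reproduces the four rows returned above, which is consistent with OR taking precedence over AND.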

Because the OR operator has higher precedence than AND, the query `find me fast|slow` is interpreted as `find me (fast|slow)`:


```sql
MySQL [(none)]> SELECT * FROM testrt WHERE MATCH('find me fast|slow');
+------+------+------------------------+----------------+
| id   | gid  | title                  | content        |
+------+------+------------------------+----------------+
|    1 |    1 | find me                |  fast and quick|
|    2 |    1 | find me fast           |  quick         |
|    6 |    1 | find me fast now       |  quick         |
|    3 |    1 | find me slow           |  quick         |
|    5 |    1 | find me quick and fast |  quick         |
+------+------+------------------------+----------------+
5 rows in set (0.00 sec)
```

For negation, the NOT operator can be specified as `-` or `!`:


```sql
MySQL [(none)]> select * from testrt where match('find me -fast');
+------+------+--------------+---------+
| id   | gid  | title        | content |
+------+------+--------------+---------+
|    3 |    1 | find me slow |  quick  |
+------+------+--------------+---------+
1 row in set (0.00 sec)
```

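Note that, by default, Manticore does not support fully negated queries, so it is not possible to run just `-fast` (this will be possible starting with v3.5.2).

Another basic operator is `MAYBE`. A term defined by MAYBE may or may not be present in the document. If it is present, it influences the ranking, and documents that contain it are ranked higher.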


```sql
MySQL [(none)]> select * from testrt where match('find me MAYBE slow');
+------+------+------------------------+----------------+
| id   | gid  | title                  | content        |
+------+------+------------------------+----------------+
|    3 |    1 | find me slow           |  quick         |
|    1 |    1 | find me                |  fast and quick|
|    2 |    1 | find me fast           |  quick         |
|    5 |    1 | find me quick and fast |  quick         |
|    6 |    1 | find me fast now       |  quick         |
+------+------+------------------------+----------------+
5 rows in set (0.00 sec)

```

## Field operator

If we want to restrict the search to a specific field, we can use the operator '@':


```sql
mysql> select * from testrt where match('@title find me fast');
+------+------+------------------------+---------+
| id   | gid  | title                  | content |
+------+------+------------------------+---------+
|    2 |    1 | find me fast           |  quick  |
|    6 |    1 | find me fast now       |  quick  |
|    5 |    1 | find me quick and fast |  quick  |
+------+------+------------------------+---------+
3 rows in set (0.00 sec)

```
We can also specify multiple fields to restrict the search to:


```sql
mysql> select * from testrt where match('@(title,content) find me fast');
+------+------+------------------------+----------------+
| id   | gid  | title                  | content        |
+------+------+------------------------+----------------+
| 1    |    1 | find me                | fast and quick |
| 2    |    1 | find me fast           | quick          |
| 6    |    1 | find me fast now       | quick          |
| 5    |    1 | find me quick and fast | quick          |
+------+------+------------------------+----------------+
4 rows in set (0.00 sec)
```

The field operator can also be used to restrict the search to the first x words of a field only. For example:

```sql
mysql> select * from testrt where match('@title lazy dog');
+------+------+----------------------------------------------------------------------------+---------------------------------------+
| id   | gid  | title                                                                      | content                               |
+------+------+----------------------------------------------------------------------------+---------------------------------------+
|    4 |    1 | The quick brown fox jumps over the lazy dog                                |  The five boxing wizards jump quickly |
|    7 |    1 | The quick brown fox take a step back and  jumps over the lazy dog          |  The five boxing wizards jump quickly |
|    8 |    1 | The  brown and beautiful fox take a step back and  jumps over the lazy dog |  The five boxing wizards jump quickly |
+------+------+----------------------------------------------------------------------------+---------------------------------------+
3 rows in set (0.00 sec)
```

However, if we search only within the first 5 words, we get no results:

```sql
mysql> select * from testrt where match('@title[5] lazy dog');
Empty set (0.00 sec)

```
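
The empty result follows from where 'lazy' and 'dog' sit in the title. A minimal Python sketch of the idea, assuming simple whitespace tokenization (the function name `match_in_field` and its `limit` parameter are hypothetical, chosen only to mimic `@field[N]`):

```python
def match_in_field(field_text, query_words, limit=None):
    """Return True if every query word occurs in the field.

    limit mimics '@field[N]': only the first N words of the field are searched.
    """
    tokens = field_text.lower().split()
    if limit is not None:
        tokens = tokens[:limit]
    return all(w in tokens for w in query_words)

title = "The quick brown fox jumps over the lazy dog"

print(match_in_field(title, ["lazy", "dog"]))            # True
print(match_in_field(title, ["lazy", "dog"], limit=5))   # False
print(match_in_field(title, ["quick", "fox"], limit=5))  # True
```

In all three matching titles, 'lazy' and 'dog' are the last two words, well past the fifth position, so `@title[5] lazy dog` finds nothing.
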
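In some situations, a search may be performed over multiple indexes that do not have the same full-text fields.
By default, specifying a field that does not exist in the index results in a query error. To work around this, the special operator `@@relaxed` can be used: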


```sql
mysql> select * from testrt where match('@(title,keywords) lazy dog');
ERROR 1064 (42000): index testrt: query error: no field 'keywords' found in schema
```

```sql
mysql> select * from testrt where match('@@relaxed @(title,keywords) lazy dog');
+------+------+----------------------------------------------------------------------------+---------------------------------------+
| id   | gid  | title                                                                      | content                               |
+------+------+----------------------------------------------------------------------------+---------------------------------------+
|    4 |    1 | The quick brown fox jumps over the lazy dog                                |  The five boxing wizards jump quickly |
|    7 |    1 | The quick brown fox take a step back and  jumps over the lazy dog          |  The five boxing wizards jump quickly |
|    8 |    1 | The  brown and beautiful fox take a step back and  jumps over the lazy dog |  The five boxing wizards jump quickly |
+------+------+----------------------------------------------------------------------------+---------------------------------------+
3 rows in set, 1 warning (0.01 sec)
```


## Fuzzy search

Fuzzy matching allows matching only some of the words from the query string, for example:


```sql
mysql> select * from testrt where match('"fox bird lazy dog"/3');
+------+------+----------------------------------------------------------------------------+---------------------------------------+
| id   | gid  | title                                                                      | content                               |
+------+------+----------------------------------------------------------------------------+---------------------------------------+
|    4 |    1 | The quick brown fox jumps over the lazy dog                                |  The five boxing wizards jump quickly |
|    7 |    1 | The quick brown fox take a step back and  jumps over the lazy dog          |  The five boxing wizards jump quickly |
|    8 |    1 | The  brown and beautiful fox take a step back and  jumps over the lazy dog |  The five boxing wizards jump quickly |
+------+------+----------------------------------------------------------------------------+---------------------------------------+
3 rows in set (0.00 sec)
```

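In this case we use the `QUORUM` operator and specify that matching 3 of the words is sufficient. A search with `/1` is equivalent to an OR boolean search, while a search with `/N`, where N is the number of input words, is equivalent to an AND search.

Instead of an absolute number, you can also specify a number between 0.0 and 1.0 (standing for 0% and 100%), and Manticore will match only documents containing at least the specified percentage of the given words. The example above can also be run with a fraction, such as `"fox bird lazy dog"/0.3`, which matches documents containing at least 30% of the 4 words. Since 30% of 4 words is 1.2 words, this threshold is looser than `/3`, although it returns the same documents here.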

```sql
mysql> select * from testrt where match('"fox bird lazy dog"/0.3');
+------+------+----------------------------------------------------------------------------+---------------------------------------+
| id   | gid  | title                                                                      | content                               |
+------+------+----------------------------------------------------------------------------+---------------------------------------+
|    4 |    1 | The quick brown fox jumps over the lazy dog                                |  The five boxing wizards jump quickly |
|    7 |    1 | The quick brown fox take a step back and  jumps over the lazy dog          |  The five boxing wizards jump quickly |
|    8 |    1 | The  brown and beautiful fox take a step back and  jumps over the lazy dog |  The five boxing wizards jump quickly |
+------+------+----------------------------------------------------------------------------+---------------------------------------+
3 rows in set (0.00 sec)
```
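
The absolute form of the quorum operator can be sketched in a few lines of Python. This is a toy model under the assumption of whitespace tokenization, and `quorum_match` is a hypothetical helper, not part of any Manticore API. It deliberately leaves out the fractional form, because how a fraction is rounded to a word count is not covered here.

```python
def quorum_match(text, query_words, threshold):
    """Return True if at least `threshold` of the query words occur in the text."""
    tokens = set(text.lower().split())
    found = sum(1 for w in query_words if w in tokens)
    return found >= threshold

title = "The quick brown fox jumps over the lazy dog"
query = ["fox", "bird", "lazy", "dog"]

print(quorum_match(title, query, 3))           # True: fox, lazy and dog are present
print(quorum_match(title, query, len(query)))  # False: behaves like AND, 'bird' is missing
print(quorum_match("a bird", query, 1))        # True: behaves like OR
```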


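## Advanced operators

Besides the simpler operators, there are a number of advanced operators that are used less often but can be absolutely necessary in some cases.

One of the most commonly used advanced operators is the phrase operator.
The phrase operator matches only if the given words are found in exactly the specified order. It also requires the words to be in the same field: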


```sql
mysql> SELECT * FROM testrt WHERE MATCH('"quick brown fox"');
+------+------+-------------------------------------------------------------------+---------------------------------------+
| id   | gid  | title                                                             | content                               |
+------+------+-------------------------------------------------------------------+---------------------------------------+
|    4 |    1 | The quick brown fox jumps over the lazy dog                       |  The five boxing wizards jump quickly |
|    7 |    1 | The quick brown fox take a step back and  jumps over the lazy dog |  The five boxing wizards jump quickly |
+------+------+-------------------------------------------------------------------+---------------------------------------+
2 rows in set (0.00 sec)

```
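A more relaxed version of the phrase operator is the strict order operator.
The order operator requires the words to be found in the specified order, but allows other words between them: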


```sql
mysql> SELECT * FROM testrt WHERE MATCH('find << me << fast');
+------+------+------------------------+---------+
| id   | gid  | title                  | content |
+------+------+------------------------+---------+
|    2 |    1 | find me fast           |  quick  |
|    6 |    1 | find me fast now       |  quick  |
|    5 |    1 | find me quick and fast |  quick  |
+------+------+------------------------+---------+
3 rows in set (0.00 sec)

```
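
The difference from a plain AND is that word order matters. A rough Python illustration, assuming a single field and whitespace tokenization (`in_order` is a hypothetical helper, not Manticore code):

```python
def in_order(text, query_words):
    """Return True if the query words occur in the text in the given order,
    possibly with other words in between."""
    tokens = iter(text.lower().split())
    # 'w in tokens' consumes the iterator up to the first occurrence of w,
    # so each following word is searched only after the previous match.
    return all(w in tokens for w in query_words)

query = ["find", "me", "fast"]
print(in_order("find me quick and fast", query))  # True
print(in_order("find me fast now", query))        # True
print(in_order("fast me find", query))            # False
```
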
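Another pair of operators that work with word positions are the start/end field operators.
These restrict a word to appear at the start or at the end of a field.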

```sql
mysql> SELECT * FROM testrt WHERE MATCH('^find me fast$');
+------+------+------------------------+---------+
| id   | gid  | title                  | content |
+------+------+------------------------+---------+
|    2 |    1 | find me fast           |  quick  |
|    5 |    1 | find me quick and fast |  quick  |
+------+------+------------------------+---------+
2 rows in set (0.00 sec)

```
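The proximity operator is similar to the AND operator, but adds a maximum distance between the words within which they are still considered a match. Let's start with an example that uses only the AND operator: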

```sql
mysql> SELECT * FROM testrt WHERE MATCH('brown fox jumps');
+------+------+----------------------------------------------------------------------------+---------------------------------------+
| id   | gid  | title                                                                      | content                               |
+------+------+----------------------------------------------------------------------------+---------------------------------------+
|    4 |    1 | The quick brown fox jumps over the lazy dog                                |  The five boxing wizards jump quickly |
|    7 |    1 | The quick brown fox take a step back and  jumps over the lazy dog          |  The five boxing wizards jump quickly |
|    8 |    1 | The  brown and beautiful fox take a step back and  jumps over the lazy dog |  The five boxing wizards jump quickly |
+------+------+----------------------------------------------------------------------------+---------------------------------------+
3 rows in set (0.00 sec)
```

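Our query returns 3 results: one in which all the words are next to each other, and two in which some of the words are further apart.
If we want to match only when the words are within a certain distance, we can restrict this with the proximity operator: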


```sql
mysql> SELECT * FROM testrt WHERE MATCH('"brown fox jumps"~5');
+------+------+---------------------------------------------+---------------------------------------+
| id   | gid  | title                                       | content                               |
+------+------+---------------------------------------------+---------------------------------------+
|    4 |    1 | The quick brown fox jumps over the lazy dog |  The five boxing wizards jump quickly |
+------+------+---------------------------------------------+---------------------------------------+
1 row in set (0.00 sec)
```
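
One way to reason about this result is to measure the shortest run of words that contains all query terms. The Python sketch below assumes the rule that `"w1 ... wn"~k` matches when all n words fit in a span of fewer than n + k words; that rule is an assumption of this example, as are the helper names, so treat it as an approximation rather than Manticore's actual matching code.

```python
def min_span(tokens, query_words):
    """Length of the shortest run of tokens containing all query words, or None."""
    needed = set(query_words)
    best = None
    for start in range(len(tokens)):
        for end in range(start, len(tokens)):
            if needed <= set(tokens[start:end + 1]):
                length = end - start + 1
                if best is None or length < best:
                    best = length
                break  # longer runs from this start cannot be shorter
    return best

def proximity_match(text, query_words, distance):
    """Assumed rule: all words fit in a span of fewer than len(words) + distance words."""
    span = min_span(text.lower().split(), query_words)
    return span is not None and span < len(query_words) + distance

query = ["brown", "fox", "jumps"]
titles = {
    4: "The quick brown fox jumps over the lazy dog",
    7: "The quick brown fox take a step back and  jumps over the lazy dog",
    8: "The  brown and beautiful fox take a step back and  jumps over the lazy dog",
}
for doc_id, title in titles.items():
    print(doc_id, min_span(title.lower().split(), query), proximity_match(title, query, 5))
```

Under that assumption only document 4 stays within the limit, which agrees with the single row returned above.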

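A more general version of the proximity operator is the `NEAR` operator. With proximity, a single distance is specified over a bag of words, whereas the `NEAR` operator works with 2 operands, which can be single words or expressions.

In the following example, 'brown' and 'fox' must be within a distance of 2, and 'fox' and 'jumps' within a distance of 6: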

```sql
mysql> SELECT * FROM testrt WHERE MATCH('brown NEAR/2 fox NEAR/6 jumps');
+------+------+-------------------------------------------------------------------+---------------------------------------+
| id   | gid  | title                                                             | content                               |
+------+------+-------------------------------------------------------------------+---------------------------------------+
|    4 |    1 | The quick brown fox jumps over the lazy dog                       |  The five boxing wizards jump quickly |
|    7 |    1 | The quick brown fox take a step back and  jumps over the lazy dog |  The five boxing wizards jump quickly |
+------+------+-------------------------------------------------------------------+---------------------------------------+
2 rows in set (0.00 sec)
```

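That query leaves out one document, which does not satisfy the first `NEAR` condition. Relaxing the first distance to 3 brings it back (it is the last row here):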

```sql
mysql> SELECT * FROM testrt WHERE MATCH('brown NEAR/3 fox NEAR/6 jumps');
+------+------+----------------------------------------------------------------------------+---------------------------------------+
| id   | gid  | title                                                                      | content                               |
+------+------+----------------------------------------------------------------------------+---------------------------------------+
|    4 |    1 | The quick brown fox jumps over the lazy dog                                |  The five boxing wizards jump quickly |
|    7 |    1 | The quick brown fox take a step back and  jumps over the lazy dog          |  The five boxing wizards jump quickly |
|    8 |    1 | The  brown and beautiful fox take a step back and  jumps over the lazy dog |  The five boxing wizards jump quickly |
+------+------+----------------------------------------------------------------------------+---------------------------------------+
3 rows in set (0.09 sec)
```

A variation of the `NEAR` operator is `NOTNEAR`, which matches only if the operands are separated by at least a minimum distance.

```sql
mysql> SELECT * FROM testrt WHERE MATCH('"brown fox" NOTNEAR/5 jumps');
+------+------+-------------------------------------------------------------------+---------------------------------------+
| id   | gid  | title                                                             | content                               |
+------+------+-------------------------------------------------------------------+---------------------------------------+
|    7 |    1 | The quick brown fox take a step back and  jumps over the lazy dog |  The five boxing wizards jump quickly |
+------+------+-------------------------------------------------------------------+---------------------------------------+
1 row in set (0.00 sec)
```

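Manticore can also detect sentences in plain text and paragraphs in HTML content.
To index sentences, the **[index_sp](http://mnt.cr/index_sp)** option must be enabled, while paragraphs also require **[html_strip](http://mnt.cr/html_strip)=1**. Let's take the following example: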


```sql
mysql> select * from testrt where match('"the brown fox" jumps')\G
*************************** 1. row ***************************
id: 15
gid: 2
title: The brown fox takes a step back. Then it jumps over the lazydog
content:
1 row in set (0.00 sec)
```

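The document contains 2 sentences; the phrase is found only in the first one and 'jumps' only in the second.

With the SENTENCE operator, we can restrict the search to match only if the operands are in the same sentence: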


```sql
mysql> select * from testrt where match('"the brown fox" SENTENCE jumps')\G
Empty set (0.00 sec)
```

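We can see that the document is no longer a match. If we correct the search query so that all the words come from the same sentence, we get a match: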

```sql
mysql> select * from testrt where match('"the brown fox" SENTENCE back')\G
*************************** 1. row ***************************
id: 15
gid: 2
title: The brown fox takes a step back. Then it jumps over the lazydog
content:
1 row in set (0.00 sec)
```
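
The behaviour of SENTENCE in these two queries can be imitated with a short Python sketch. It splits on '.', which is a deliberate simplification of real sentence detection, and `same_sentence` is a hypothetical helper written for this illustration only.

```python
def same_sentence(text, left, right):
    """Return True if both operands occur within one sentence of the text.

    Sentences are split on '.', a simplification of real sentence detection.
    """
    for sentence in text.lower().split("."):
        # Normalize whitespace and pad so operands match on word boundaries.
        padded = " " + " ".join(sentence.split()) + " "
        if f" {left} " in padded and f" {right} " in padded:
            return True
    return False

title = "The brown fox takes a step back. Then it jumps over the lazydog"
print(same_sentence(title, "the brown fox", "jumps"))  # False
print(same_sentence(title, "the brown fox", "back"))   # True
```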

To demonstrate PARAGRAPH, let's use the following search:


```sql
mysql> select * from testrt where match('Samsung  Galaxy');
+------+------+-------------------------------------------------------------------------------------+---------+
| id   | gid  | title                                                                               | content |
+------+------+-------------------------------------------------------------------------------------+---------+
|    9 |    2 | <h1>Samsung Galaxy S10</h1>Is a smartphone introduced by Samsung in 2019            |         |
|   10 |    2 | <h1>Samsung</h1>Galaxy,Note,A,J                                                     |         |
+------+------+-------------------------------------------------------------------------------------+---------+
2 rows in set (0.00 sec)
```

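These 2 documents have different HTML tags.

If we add PARAGRAPH, only the document with the search terms inside a single tag remains.

A more general operator is ZONE, along with its variant ZONESPAN. A "zone" is the text inside an HTML or XML tag.

The tags to be used as zones need to be declared in the `index_zones` setting, e.g. `index_zones = h*, th, title`.

For example: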

```sql
mysql> select * from testrt where match('hello world');
+------+------+-------------------------------+---------+
| id   | gid  | title                         | content |
+------+------+-------------------------------+---------+
|   12 |    2 | Hello world                   |         |
|   14 |    2 | <h1>Hello world</h1>          |         |
|   13 |    2 | <h1>Hello</h1> <h1>world</h1> |         |
+------+------+-------------------------------+---------+
3 rows in set (0.00 sec)
```

We have 3 documents, in which 'hello' and 'world' are found in plain text, in different zones of the same type, or in a single zone.

```sql
mysql> select * from testrt where match('ZONE:h1 hello world');
+------+------+-------------------------------+---------+
| id   | gid  | title                         | content |
+------+------+-------------------------------+---------+
|   14 |    2 | <h1>Hello world</h1>          |         |
|   13 |    2 | <h1>Hello</h1> <h1>world</h1> |         |
+------+------+-------------------------------+---------+
2 rows in set (0.00 sec)
```

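In this case, the words are present in H1 zones, but they are not required to be in the same zone. If we want to limit the match to a single zone, we can use ZONESPAN: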

```sql
mysql> select * from testrt where match('ZONESPAN:h1 hello world');
+------+------+----------------------+---------+
| id   | gid  | title                | content |
+------+------+----------------------+---------+
|   14 |    2 | <h1>Hello world</h1> |         |
+------+------+----------------------+---------+
1 row in set (0.00 sec)
```
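
The distinction between ZONE and ZONESPAN can be mimicked with a regular expression over the three sample titles. This is only a sketch: it hard-codes the `h1` tag and uses hypothetical helper names, whereas Manticore works from the zones declared in `index_zones`.

```python
import re

def h1_zones(text):
    """Return the word sets of every <h1>...</h1> zone in the text."""
    return [set(z.lower().split()) for z in re.findall(r"<h1>(.*?)</h1>", text)]

def zone_match(text, query_words):
    """ZONE-like: every word occurs in some h1 zone, not necessarily the same one."""
    zones = h1_zones(text)
    return all(any(w in z for z in zones) for w in query_words)

def zonespan_match(text, query_words):
    """ZONESPAN-like: a single h1 zone contains all the words."""
    return any(set(query_words) <= z for z in h1_zones(text))

docs = ["Hello world", "<h1>Hello world</h1>", "<h1>Hello</h1> <h1>world</h1>"]
for d in docs:
    print(d, zone_match(d, ["hello", "world"]), zonespan_match(d, ["hello", "world"]))
```

The plain-text title matches neither check, the single `<h1>` matches both, and the title with two separate `<h1>` tags passes only the ZONE-like check, mirroring the two result tables above.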

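We hope this article has shown you how the [full-text search operators](https://docs.manticoresearch.com/latest/html/searching/boolean_query_syntax.html) in Manticore work. If you prefer to learn through hands-on practice, you can [try our interactive course](https://play.manticoresearch.com/fulltextintro/) right in your browser.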


### [Interactive course](https://play.manticoresearch.com/fulltextintro/)

[![img](./manticore-full-text-operators-definitive-guide/Manticore-Full-text-operators-Interactive-course.png)](https://play.manticoresearch.com/fulltextintro/)  

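If you want to learn more about full-text matching, try our introductory [interactive course](https://play.manticoresearch.com/fulltextintro/) on full-text operators, which includes a command line for easier learning.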
