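If you have ever had to write multiple queries to catch every variant of a phrase, you know how repetitive and messy that gets. With the new OR support inside phrases, you can match "happy customer" and "sad customer", along with any other variant, in one concise query.
What's new in Manticore Search 13.6.7
We are pleased to announce that Manticore Search 13.6.7 has been released with enhanced support for this useful feature. The OR operator (|) inside the phrase operator (quotes) provides flexible phrase matching that can improve how you build search features.
The magic of OR inside phrases
Traditional search engines make you choose between exact phrase matching and loose keyword matching. But what if you need something in between? That is where the OR operator inside phrases shines. Each alternative is checked at the same position within the phrase, and the phrase matches if any alternative fits that position.
Syntax that makes sense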
"( a | b ) c" -- Either "a c" or "b c"
"( ( a b c ) | d ) e" -- Either "a b c e" or "d e"
"man ( happy | sad ) but all ( ( as good ) | ( as fast ) )" -- Complex nested possibilities
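To make the positional semantics above concrete, here is a minimal Python sketch that expands a phrase containing OR groups into the plain phrases it stands for. It is only an illustration of the syntax as described in this post, not how Manticore evaluates the query internally; the function names (`tokenize`, `parse_alternation`, `parse_sequence`, `expand_phrase`) are hypothetical.

```python
import re
from itertools import product

def tokenize(body):
    # Split the phrase body into "(", ")", "|" and plain words.
    return re.findall(r"[()|]|[^\s()|]+", body)

def parse_alternation(tokens, pos):
    """Parse seq ('|' seq)* and return (list of word tuples, next position)."""
    variants, pos = parse_sequence(tokens, pos)
    while pos < len(tokens) and tokens[pos] == "|":
        more, pos = parse_sequence(tokens, pos + 1)
        variants += more
    return variants, pos

def parse_sequence(tokens, pos):
    """Parse consecutive words and groups, one slot per item."""
    parts = []
    while pos < len(tokens) and tokens[pos] not in ("|", ")"):
        if tokens[pos] == "(":
            group, pos = parse_alternation(tokens, pos + 1)
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("unbalanced parentheses")
            pos += 1
            parts.append(group)
        else:
            parts.append([(tokens[pos],)])
            pos += 1
    # Pick one variant per slot and concatenate them in order.
    combos = [sum(choice, ()) for choice in product(*parts)]
    return combos, pos

def expand_phrase(body):
    """Return every plain phrase that the OR phrase body can match."""
    tokens = tokenize(body)
    variants, pos = parse_alternation(tokens, 0)
    if pos != len(tokens):
        raise ValueError("unbalanced parentheses")
    return [" ".join(v) for v in variants]

print(expand_phrase("( a | b ) c"))
print(expand_phrase("( ( a b c ) | d ) e"))
```

The third example from the list above expands to four plain phrases, one for each combination of its two groups.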
Let's see it in action
Let's build a real-world example using customer feedback data. First, we set up the test environment:
-- Clean slate for easy reproduction
DROP TABLE IF EXISTS phrase_or_demo;
CREATE TABLE phrase_or_demo (title TEXT, content TEXT, category TEXT);
INSERT INTO phrase_or_demo (id, title, content, category) VALUES
(1, 'Happy Customer Review', 'I am a very happy customer with excellent service', 'reviews'),
(2, 'Sad Customer Feedback', 'I am a very sad customer with poor experience', 'reviews'),
(3, 'Customer Service Report', 'The customer was happy but had some concerns', 'reports'),
(4, 'Angry Customer Complaint', 'I am an angry customer demanding refund', 'complaints'),
(5, 'Neutral Customer Survey', 'The customer seemed neutral about our service', 'surveys'),
(6, 'Fast Delivery Service', 'Our delivery service is really fast and reliable', 'services'),
(7, 'Slow Delivery Issues', 'The delivery was extremely slow this time', 'issues'),
(8, 'Good Service Quality', 'We provide good service to all customers', 'services'),
(9, 'Bad Service Report', 'There were complaints about bad service quality', 'reports'),
(10, 'Customer Happy Experience', 'The happy customer left positive feedback', 'feedback'),
(11, 'Premium Quality Product', 'This is a premium quality item with excellent features', 'products'),
(12, 'Budget Quality Option', 'A budget quality alternative for cost-conscious buyers', 'products'),
(13, 'Standard Quality Service', 'Our standard quality offering meets basic needs', 'services');
Example 1: Capturing all emotional states
Query: "(happy | sad | angry) customer"
SELECT * FROM phrase_or_demo WHERE MATCH('"(happy | sad | angry) customer"');
Results:
+------+---------------------------+---------------------------------------------------+------------+
| id | title | content | category |
+------+---------------------------+---------------------------------------------------+------------+
| 2 | Sad Customer Feedback | I am a very sad customer with poor experience | reviews |
| 4 | Angry Customer Complaint | I am an angry customer demanding refund | complaints |
| 1 | Happy Customer Review | I am a very happy customer with excellent service | reviews |
| 10 | Customer Happy Experience | The happy customer left positive feedback | feedback |
+------+---------------------------+---------------------------------------------------+------------+
4 rows in set (0.00 sec)
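Why this matters: instead of writing three separate phrase queries and combining them with OR, one query gives you exact phrase matching for all of them.
Example 2: Service quality variants
Query: "(good | bad | premium | budget | standard) (service | quality)"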
SELECT * FROM phrase_or_demo WHERE MATCH('"(good | bad | premium | budget | standard) (service | quality)"');
Results:
+------+--------------------------+--------------------------------------------------------+----------+
| id | title | content | category |
+------+--------------------------+--------------------------------------------------------+----------+
| 8 | Good Service Quality | We provide good service to all customers | services |
| 9 | Bad Service Report | There were complaints about bad service quality | reports |
| 11 | Premium Quality Product | This is a premium quality item with excellent features | products |
| 12 | Budget Quality Option | A budget quality alternative for cost-conscious buyers | products |
| 13 | Standard Quality Service | Our standard quality offering meets basic needs | services |
+------+--------------------------+--------------------------------------------------------+----------+
5 rows in set (0.00 sec)
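Benefit: one query captures every quality and service combination with exact phrase matching.
Beyond basic phrases: quorum and proximity
The OR operator is not limited to simple phrases. Sometimes you need more flexibility, such as matching documents even when not every term is present, or finding terms that appear close together without being in an exact sequence. That is what the quorum and proximity operators are for, and both combine with OR.
Quorum with OR: flexible fuzzy matching
The quorum operator with OR gives you more sophisticated fuzzy matching, in which only one word from each OR group counts toward the threshold: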
-- Find documents with at least 2 out of these word groups
SELECT id, content FROM phrase_or_demo WHERE MATCH('@content "(excellent | good | premium) (service | quality | experience) customer"/2');
Results:
+------+--------------------------------------------------------+
| id | content |
+------+--------------------------------------------------------+
| 8 | We provide good service to all customers |
| 1 | I am a very happy customer with excellent service |
| 11 | This is a premium quality item with excellent features |
| 2 | I am a very sad customer with poor experience |
| 5 | The customer seemed neutral about our service |
+------+--------------------------------------------------------+
5 rows in set (0.00 sec)
Explanation: this matches documents that contain at least 2 of the 3 groups: (excellent|good|premium), (service|quality|experience), and "customer".
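The counting rule can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not Manticore's implementation: it splits text on whitespace, applies no morphology (so "customers" does not count as "customer"), and looks at a single text value only. The helper names `quorum_hits` and `matches_quorum` are hypothetical.

```python
def quorum_hits(text, groups):
    """Count the groups that have at least one alternative present in text."""
    words = set(text.lower().split())
    # A group contributes at most 1, however many of its alternatives occur.
    return sum(1 for alternatives in groups if words & set(alternatives))

def matches_quorum(text, groups, threshold):
    """True when at least `threshold` groups are represented in text."""
    return quorum_hits(text, groups) >= threshold

groups = [
    {"excellent", "good", "premium"},
    {"service", "quality", "experience"},
    {"customer"},
]

# Rows taken from the demo table above.
premium = "This is a premium quality item with excellent features"
angry = "I am an angry customer demanding refund"

print(quorum_hits(premium, groups), matches_quorum(premium, groups, 2))
print(quorum_hits(angry, groups), matches_quorum(angry, groups, 2))
```

In the first row, "premium" and "excellent" belong to the same group, so together they add only one to the count; "quality" supplies the second group and the row passes the threshold of 2. The second row covers only the "customer" group and does not.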
Advanced quorum example
-- Match documents with at least 50% of these emotion/service combinations
SELECT id, title FROM phrase_or_demo
WHERE MATCH('"(happy | satisfied) (customer | experience) (excellent | good) (service | quality)"/0.5');
Proximity with OR: nearby alternatives
The proximity operator with OR checks each alternative separately within the specified distance:
-- Find "delivery" within 3 words of either "fast" or "slow"
SELECT id, title, content FROM phrase_or_demo WHERE MATCH('"(fast | slow) delivery"~3');
Results:
+------+-----------------------+--------------------------------------------------+
| id | title | content |
+------+-----------------------+--------------------------------------------------+
| 7 | Slow Delivery Issues | The delivery was extremely slow this time |
| 6 | Fast Delivery Service | Our delivery service is really fast and reliable |
+------+-----------------------+--------------------------------------------------+
2 rows in set (0.00 sec)
Complex proximity example
-- Customer and emotional state within proximity distance 2, plus a quality, service, or experience term
SELECT id, title, content FROM phrase_or_demo WHERE MATCH('"customer (happy | sad | angry)"~2 (quality | service | experience)');
Results:
+------+---------------------------+---------------------------------------------------+
| id | title | content |
+------+---------------------------+---------------------------------------------------+
| 10 | Customer Happy Experience | The happy customer left positive feedback |
| 2 | Sad Customer Feedback | I am a very sad customer with poor experience |
| 1 | Happy Customer Review | I am a very happy customer with excellent service |
| 3 | Customer Service Report | The customer was happy but had some concerns |
+------+---------------------------+---------------------------------------------------+
4 rows in set (0.00 sec)
Comparison: traditional vs. modern
Traditional approach (multiple full-text phrases)
-- The old way: one phrase per variant, combined with OR
SELECT id, title FROM phrase_or_demo WHERE MATCH('"happy customer"|"sad customer"|"angry customer"');
Modern approach (a single phrase with OR)
-- The elegant way: one query to rule them all
SELECT id, title FROM phrase_or_demo WHERE MATCH('"(happy | sad | angry) customer"');
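If the alternatives live in application code, the single-phrase form is also easy to generate. The following Python sketch builds the query string from lists of alternatives. The helper `build_phrase_or` and its parameters are hypothetical, and it assumes every term is a plain word, so it does no escaping of special characters.

```python
def build_phrase_or(slots, quorum=None, proximity=None):
    """Build a phrase query with one OR group per slot.

    slots: list of lists of plain words, one inner list per phrase position.
    quorum / proximity: optional suffix value, at most one of the two.
    """
    if quorum is not None and proximity is not None:
        raise ValueError("use either quorum or proximity, not both")
    parts = []
    for alternatives in slots:
        if not alternatives:
            raise ValueError("every slot needs at least one word")
        if len(alternatives) == 1:
            parts.append(alternatives[0])  # single word, no parentheses
        else:
            parts.append("(" + " | ".join(alternatives) + ")")
    query = '"' + " ".join(parts) + '"'
    if quorum is not None:
        query += f"/{quorum}"
    if proximity is not None:
        query += f"~{proximity}"
    return query

print(build_phrase_or([["happy", "sad", "angry"], ["customer"]]))
print(build_phrase_or([["fast", "slow"], ["delivery"]], proximity=3))
```

The two calls reproduce the query strings used earlier in this post: the emotional-state phrase and the delivery proximity phrase. The resulting string still has to be passed to MATCH() by whatever client you use.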
Real-world applications
1. E-commerce product search
-- Capture all color and size variations
"(red | blue | green | black) (shirt | t-shirt | tee) (small | medium | large)"
2. Content management systems
-- Track document status changes
"(draft | published | archived | deleted) (document | article | post)"
3. Customer support ticket analysis
-- Categorize support issues with quorum
"(urgent | critical | high) (priority | importance) (bug | issue | problem)"/2
4. Social media sentiment monitoring
-- Capture brand mentions with emotional context
"@brand (love | hate | like | dislike) (product | service | experience)"~5
5. Medical record search
-- Find patient symptoms with proximity
"patient (experienced | reported | complained) (pain | discomfort | symptoms)"~4
6. Financial transaction analysis
-- Track transaction types and statuses
"(credit | debit | transfer) (completed | pending | failed | cancelled)"
Advanced usage patterns
1. Layered precision
Combine phrase OR with other operators for precise matching:
@title "(urgent | critical) (update | patch)" @body "security"
2. Performance optimization
Using quorum with OR for fuzzy matching may be faster than a wildcard search:
"(run | running | runner | runs) (fast | quick | speed)"/1
3. Contextual flexibility
Use proximity with OR to handle natural-language variation:
"user (wants | needs | requires) (feature | functionality)"~3
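Key benefits
- Precision: keeps the exact phrase structure while accommodating variants
- Maintainability: one query to update instead of many variants to manage
- Analytics: a unified result set makes analysis and ranking more meaningful
- Flexibility: handles real-world language variation effectively
Conclusion
The OR operator inside phrases offers a useful middle ground between strict exact-match search and loose keyword matching. Whether you are building e-commerce search, analyzing customer feedback, or creating a content discovery system, this feature combines the precision of phrases with the flexibility of alternatives.
Manticore Search 13.6.7 includes this feature as part of its full-text search capabilities. Combining the phrase, proximity, and quorum operators with OR gives you additional options for handling complex search requirements.
To learn more about this feature and other improvements, see the Manticore Search 13.6.7 release notes.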