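If you have ever had to write several queries to catch every variant of a phrase, you know how repetitive and messy it gets. With OR now supported inside phrases, a single concise query can match "happy customer", "sad customer", and any other variant.
What's new in Manticore Search 13.6.7
We are pleased to announce that Manticore Search 13.6.7 has been released with improved support for this feature. The OR operator (|) inside the phrase operator (quotes) provides flexible phrase matching that can improve how you build search functionality.
The magic of OR inside phrases
Traditional search engines force you to choose between exact phrase matching and loose keyword matching. But what if you need something in between? This is where the OR operator inside phrases shines. Each alternative is checked at the same position in the phrase, and the phrase matches as long as any alternative fits that position.
Sensible syntax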
"( a | b ) c" -- Either "a c" or "b c"
"( ( a b c ) | d ) e" -- Either "a b c e" or "d e"
"man ( happy | sad ) but all ( ( as good ) | ( as fast ) )" -- Complex nested possibilities
Let's see it in action
Let's build a real-world example from customer feedback data. First, set up the test environment:
-- Clean slate for easy reproduction
DROP TABLE IF EXISTS phrase_or_demo;
CREATE TABLE phrase_or_demo (title TEXT, content TEXT, category TEXT);
INSERT INTO phrase_or_demo (id, title, content, category) VALUES
(1, 'Happy Customer Review', 'I am a very happy customer with excellent service', 'reviews'),
(2, 'Sad Customer Feedback', 'I am a very sad customer with poor experience', 'reviews'),
(3, 'Customer Service Report', 'The customer was happy but had some concerns', 'reports'),
(4, 'Angry Customer Complaint', 'I am an angry customer demanding refund', 'complaints'),
(5, 'Neutral Customer Survey', 'The customer seemed neutral about our service', 'surveys'),
(6, 'Fast Delivery Service', 'Our delivery service is really fast and reliable', 'services'),
(7, 'Slow Delivery Issues', 'The delivery was extremely slow this time', 'issues'),
(8, 'Good Service Quality', 'We provide good service to all customers', 'services'),
(9, 'Bad Service Report', 'There were complaints about bad service quality', 'reports'),
(10, 'Customer Happy Experience', 'The happy customer left positive feedback', 'feedback'),
(11, 'Premium Quality Product', 'This is a premium quality item with excellent features', 'products'),
(12, 'Budget Quality Option', 'A budget quality alternative for cost-conscious buyers', 'products'),
(13, 'Standard Quality Service', 'Our standard quality offering meets basic needs', 'services');
Example 1: Catching every emotional state
Query: "(happy | sad | angry) customer"
SELECT * FROM phrase_or_demo WHERE MATCH('"(happy | sad | angry) customer"');
Result:
+------+---------------------------+---------------------------------------------------+------------+
| id | title | content | category |
+------+---------------------------+---------------------------------------------------+------------+
| 2 | Sad Customer Feedback | I am a very sad customer with poor experience | reviews |
| 4 | Angry Customer Complaint | I am an angry customer demanding refund | complaints |
| 1 | Happy Customer Review | I am a very happy customer with excellent service | reviews |
| 10 | Customer Happy Experience | The happy customer left positive feedback | feedback |
+------+---------------------------+---------------------------------------------------+------------+
4 rows in set (0.00 sec)
Why it matters: Instead of writing three separate phrase queries and combining them with OR, a single query gives you exact phrase matching.
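To make the position-by-position check concrete, here is a toy Python model of it. This is only a sketch and not how Manticore implements matching: the `words` and `phrase_matches` helpers are hypothetical names, only the content field of a few sample rows is modeled, and there is no stemming or ranking.

```python
import re

def words(text):
    # Toy tokenizer: lowercase alphanumeric runs, no stemming or stop words.
    return re.findall(r"[a-z0-9]+", text.lower())

def phrase_matches(text, slots):
    """slots holds one set of acceptable words per phrase position."""
    tokens = words(text)
    for start in range(len(tokens) - len(slots) + 1):
        # Every position must be filled by one of its alternatives.
        if all(tokens[start + i] in slot for i, slot in enumerate(slots)):
            return True
    return False

# A subset of the content values from the sample table.
docs = {
    1: "I am a very happy customer with excellent service",
    2: "I am a very sad customer with poor experience",
    3: "The customer was happy but had some concerns",
    4: "I am an angry customer demanding refund",
    5: "The customer seemed neutral about our service",
    10: "The happy customer left positive feedback",
}

# Models the phrase "(happy | sad | angry) customer".
slots = [{"happy", "sad", "angry"}, {"customer"}]
matched = sorted(i for i, text in docs.items() if phrase_matches(text, slots))
print(matched)  # [1, 2, 4, 10]
```

Document 3 contains both "customer" and "happy" but not in adjacent phrase order, so it is excluded, which agrees with the result table above.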
Example 2: Service and quality variants
Query: "(good | bad | premium | budget | standard) (service | quality)"
SELECT * FROM phrase_or_demo WHERE MATCH('"(good | bad | premium | budget | standard) (service | quality)"');
Result:
+------+--------------------------+--------------------------------------------------------+----------+
| id | title | content | category |
+------+--------------------------+--------------------------------------------------------+----------+
| 8 | Good Service Quality | We provide good service to all customers | services |
| 9 | Bad Service Report | There were complaints about bad service quality | reports |
| 11 | Premium Quality Product | This is a premium quality item with excellent features | products |
| 12 | Budget Quality Option | A budget quality alternative for cost-conscious buyers | products |
| 13 | Standard Quality Service | Our standard quality offering meets basic needs | services |
+------+--------------------------+--------------------------------------------------------+----------+
5 rows in set (0.00 sec)
Benefit: One query covers every combination of the five modifiers and the two nouns with exact-phrase precision.
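One way to see what a phrase with OR groups stands for is to expand it into the plain phrases it covers. The sketch below does that in Python for the syntax shown earlier, including nested groups. The `tokenize` and `expand` helpers are hypothetical names written for this illustration. They assume well-formed, balanced parentheses and say nothing about how Manticore evaluates the query internally.

```python
import re
from itertools import product

def tokenize(pattern):
    # Split into "(", ")", "|" and bare words.
    return re.findall(r"[()|]|[^\s()|]+", pattern)

def expand(pattern):
    """Return every plain phrase the OR-in-phrase pattern can stand for."""
    tokens = tokenize(pattern)
    pos = 0

    def parse_alternation():
        nonlocal pos
        phrases = parse_sequence()
        while pos < len(tokens) and tokens[pos] == "|":
            pos += 1
            phrases += parse_sequence()
        return phrases

    def parse_sequence():
        nonlocal pos
        parts = []  # one list of alternatives per phrase position
        while pos < len(tokens) and tokens[pos] not in ("|", ")"):
            if tokens[pos] == "(":
                pos += 1
                parts.append(parse_alternation())
                pos += 1  # skip the closing ")"
            else:
                parts.append([tokens[pos]])
                pos += 1
        # Pick one alternative per position, in order.
        return [" ".join(combo) for combo in product(*parts)]

    return parse_alternation()

print(expand("( a | b ) c"))          # ['a c', 'b c']
print(expand("( ( a b c ) | d ) e"))  # ['a b c e', 'd e']
combos = expand("(good | bad | premium | budget | standard) (service | quality)")
print(len(combos))                    # 10
```

The Example 2 pattern expands to ten plain phrases, which is the number of separate phrase clauses the single query replaces.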
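Beyond basic phrases: quorum and proximity
The OR operator is not limited to simple phrases. Sometimes you need more flexibility, such as matching documents even when not every term is present, or finding terms that are close to each other but not necessarily in exact order. This is where the quorum and proximity operators come in, and both work with OR.
Quorum with OR: flexible fuzzy matching
Combined with OR, the quorum operator gives you advanced fuzzy matching in which one word from each OR group is enough to count toward the threshold: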
-- Find documents with at least 2 out of these word groups
SELECT id, content FROM phrase_or_demo WHERE MATCH('@content "(excellent | good | premium) (service | quality | experience) customer"/2');
Result:
+------+--------------------------------------------------------+
| id | content |
+------+--------------------------------------------------------+
| 8 | We provide good service to all customers |
| 1 | I am a very happy customer with excellent service |
| 11 | This is a premium quality item with excellent features |
| 2 | I am a very sad customer with poor experience |
| 5 | The customer seemed neutral about our service |
+------+--------------------------------------------------------+
5 rows in set (0.00 sec)
Explanation: This matches documents that contain at least two of the three word groups: (excellent|good|premium), (service|quality|experience), and "customer".
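The group-counting rule can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Manticore's implementation: `word_set` and `quorum_matches` are hypothetical helpers, tokens are compared exactly (no stemming, so "customers" does not count as "customer"), and only the content field of a few sample rows is considered.

```python
import re

def word_set(text):
    # Toy tokenizer: the set of lowercase alphanumeric words.
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def quorum_matches(text, groups, threshold):
    """A group counts once if any of its alternatives occurs in the text."""
    present = word_set(text)
    hits = sum(1 for group in groups if present & group)
    return hits >= threshold

groups = [
    {"excellent", "good", "premium"},
    {"service", "quality", "experience"},
    {"customer"},
]

# A subset of the content values from the sample table.
docs = {
    1: "I am a very happy customer with excellent service",
    2: "I am a very sad customer with poor experience",
    3: "The customer was happy but had some concerns",
    5: "The customer seemed neutral about our service",
    8: "We provide good service to all customers",
    9: "There were complaints about bad service quality",
    11: "This is a premium quality item with excellent features",
}

matched = sorted(i for i, text in docs.items() if quorum_matches(text, groups, 2))
print(matched)  # [1, 2, 5, 8, 11]
```

Document 9 contains both "service" and "quality", but they belong to the same group and so count only once, which is why it stays below the threshold of 2.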
Advanced quorum example
-- Match documents with at least 50% of these emotion/service combinations
SELECT id, title FROM phrase_or_demo
WHERE MATCH('"(happy | satisfied) (customer | experience) (excellent | good) (service | quality)"/0.5');
Proximity with OR: nearby alternatives
Combined with OR, the proximity operator checks each alternative separately within the specified distance:
-- Find "delivery" within 3 words of either "fast" or "slow"
SELECT id, title, content FROM phrase_or_demo WHERE MATCH('"(fast | slow) delivery"~3');
Result:
+------+-----------------------+--------------------------------------------------+
| id | title | content |
+------+-----------------------+--------------------------------------------------+
| 7 | Slow Delivery Issues | The delivery was extremely slow this time |
| 6 | Fast Delivery Service | Our delivery service is really fast and reliable |
+------+-----------------------+--------------------------------------------------+
2 rows in set (0.00 sec)
Complex proximity example
-- Customer and emotional state close together (proximity 2), plus quality terms
SELECT id, title, content FROM phrase_or_demo WHERE MATCH('"customer (happy | sad | angry)"~2 (quality | service | experience)');
Result:
+------+---------------------------+---------------------------------------------------+
| id | title | content |
+------+---------------------------+---------------------------------------------------+
| 10 | Customer Happy Experience | The happy customer left positive feedback |
| 2 | Sad Customer Feedback | I am a very sad customer with poor experience |
| 1 | Happy Customer Review | I am a very happy customer with excellent service |
| 3 | Customer Service Report | The customer was happy but had some concerns |
+------+---------------------------+---------------------------------------------------+
4 rows in set (0.00 sec)
Comparison: traditional vs. modern
Traditional approach (multiple full-text clauses)
-- The old way: one phrase per variant, joined with OR
SELECT id, title FROM phrase_or_demo WHERE MATCH('"happy customer"|"sad customer"|"angry customer"');
Modern approach (a single OR phrase)
-- The elegant way: one query to rule them all
SELECT id, title FROM phrase_or_demo WHERE MATCH('"(happy | sad | angry) customer"');
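If queries are assembled in application code, the difference between the two styles can be sketched as follows. Both helpers, `traditional_query` and `compact_query`, are hypothetical names for this illustration. They only build the MATCH string, do no escaping of special characters, and assume each position is given as a non-empty list of plain words.

```python
from itertools import product

def traditional_query(groups):
    # One quoted phrase per combination, joined with the OR operator.
    phrases = ['"' + " ".join(combo) + '"' for combo in product(*groups)]
    return "|".join(phrases)

def compact_query(groups):
    # One quoted phrase; positions with several alternatives become OR groups.
    parts = []
    for group in groups:
        if len(group) == 1:
            parts.append(group[0])
        else:
            parts.append("(" + " | ".join(group) + ")")
    return '"' + " ".join(parts) + '"'

groups = [["happy", "sad", "angry"], ["customer"]]
print(traditional_query(groups))
# "happy customer"|"sad customer"|"angry customer"
print(compact_query(groups))
# "(happy | sad | angry) customer"
```

Adding a fourth emotion means one more list entry in either case, but the traditional string grows by a whole phrase per combination while the compact form grows by a single word.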
Real-world applications
1. E-commerce product search
-- Capture all color and size variations
"(red | blue | green | black) (shirt | t-shirt | tee) (small | medium | large)"
2. Content management systems
-- Track document status changes
"(draft | published | archived | deleted) (document | article | post)"
3. Customer support ticket analysis
-- Categorize support issues with quorum
"(urgent | critical | high) (priority | importance) (bug | issue | problem)"/2
4. Social media sentiment monitoring
-- Capture brand mentions with emotional context
"@brand (love | hate | like | dislike) (product | service | experience)"~5
5. Medical record search
-- Find patient symptoms with proximity
"patient (experienced | reported | complained) (pain | discomfort | symptoms)"~4
6. Financial transaction analysis
-- Track transaction types and statuses
"(credit | debit | transfer) (completed | pending | failed | cancelled)"
Advanced usage patterns
1. Layered precision
Combine phrase OR with other operators for surgical precision:
@title "(urgent | critical) (update | patch)" @body "security"
2. Performance optimization
Use quorum with OR for fuzzy matching that may be faster than wildcard searches:
"(run | running | runner | runs) (fast | quick | speed)"/1
3. Contextual flexibility
Use proximity with OR to handle natural-language variation:
"user (wants | needs | requires) (feature | functionality)"~3
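Key benefits
- Precision: accommodates variants while preserving the exact phrase structure
- Maintainability: update a single query instead of managing multiple variants
- Analytics: a unified result set makes analysis and ranking more meaningful
- Flexibility: handles real-world language variation effectively
Summary
The OR operator inside phrases offers a useful middle ground between strict exact-match search and loose keyword matching. Whether you are building e-commerce search, analyzing customer feedback, or creating a content discovery system, this feature combines the precision of phrases with the flexibility of alternatives.
Manticore Search 13.6.7 includes it as part of its full-text search capabilities. Combining the phrase, proximity, and quorum operators with OR gives you more options for handling complex search requirements.
To learn more about this feature and other improvements, see the Manticore Search 13.6.7 release notes.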
