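Just wanted to share a neat trick for building a Sphinx / Manticore Search index for testing purposes without having to populate a database with lots of data or anything like that. Below you'll find a complete Sphinx / Manticore Search config that lets you build an index of 1M documents consisting of random 3-character words and geo coordinates, an example indexer command that builds the index, and an example SphinxQL query that searches it. All you need is a connection to any database (in this case, mysql -u root just works).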
[snikolaev@dev01 ~]$ cat sphinx_1m.conf
source min
{
type = mysql
sql_host = localhost
sql_user = root
sql_pass =
sql_db = test
sql_query_range = select 1, 1000000
sql_range_step = 1
sql_query = select $start, mid(md5(rand()), 1, 3) body, rand() * 90 lat, rand($end) * 180 lng
sql_attr_float = lat
sql_attr_float = lng
}
index idx
{
path = idx_1m
source = min
}
searchd
{
binlog_path = #
listen = 9314:mysql41
log = sphinx_1m.log
pid_file = sphinx_1m.pid
}
[snikolaev@dev01 ~]$ indexer -c sphinx_1m.conf --all --rotate
Manticore 2.6.1 9a706b4@180119 dev
Copyright (c) 2001-2016, Andrew Aksyonoff
Copyright (c) 2008-2016, Sphinx Technologies Inc (http://sphinxsearch.com)
Copyright (c) 2017-2018, Manticore Software LTD (http://manticoresearch.com)
using config file 'sphinx_1m.conf'...
indexing index 'idx'...
WARNING: sql_range_step=1: too small; might hurt indexing performance!
collected 1000000 docs, 3.0 MB
sorted 1.0 Mhits, 100.0% done
total 1000000 docs, 3000000 bytes
total 86.580 sec, 34649 bytes/sec, 11549.98 docs/sec
total 5 reads, 0.014 sec, 4512.0 kb/call avg, 2.9 msec/call avg
total 24 writes, 0.031 sec, 1806.1 kb/call avg, 1.3 msec/call avg
rotating indices: successfully sent SIGHUP to searchd (pid=17284).
mysql> select id, geodist(lat,lng,73.9667,40.78, {in=deg,out=km}) dist, lat, lng from idx where dist < 5;
+--------+----------+-----------+-----------+
| id | dist | lat | lng |
+--------+----------+-----------+-----------+
| 636503 | 4.880664 | 73.952385 | 40.929459 |
+--------+----------+-----------+-----------+
1 row in set (0.09 sec)
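If you want to sanity-check the distance in that row outside of searchd, here's a minimal Python sketch using the plain haversine formula. Treat it as an approximation only: I'm not claiming this is the formula GEODIST uses internally, and the mean Earth radius of 6371 km and the helper name `haversine_km` are my own assumptions.

```python
import math

# Mean Earth radius in km. This is an assumption for the sketch,
# not a constant taken from Manticore's GEODIST implementation.
EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1: float, lng1: float, lat2: float, lng2: float) -> float:
    """Great-circle distance in km between two points given in degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


if __name__ == "__main__":
    # Document 636503 from the result set vs. the anchor point in the query.
    print(haversine_km(73.952385, 40.929459, 73.9667, 40.78))
```

The result lands a little under 4.9 km, in the same ballpark as the `dist` value searchd returned, so the `dist < 5` filter behaves as you'd expect.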
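As you can see, the key is using the `sql_query_range` and `sql_range_step` directives to make Manticore loop until it has generated the full 1M-document collection. With a step of 1, indexer runs `sql_query` once per document ID, which is also why it prints the "too small" warning. The downside is that indexing is slower than fetching the same amount of real data from a database, but hey, you're not going to use this in production, right?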
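To make the looping a bit more concrete, here's a rough Python sketch of how a ranged query gets split into windows and how `$start` and `$end` are substituted into each one. It only mimics the documented behaviour and is not indexer's actual code; the names `range_steps` and `expand` are hypothetical, and the query template is a shortened version of the one in the config.

```python
from typing import Iterator, Tuple


def range_steps(min_id: int, max_id: int, step: int) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (start, end) windows that cover [min_id, max_id]."""
    if step < 1:
        raise ValueError("step must be >= 1")
    start = min_id
    while start <= max_id:
        # The last window is clamped so it never runs past max_id.
        end = min(start + step - 1, max_id)
        yield start, end
        start = end + 1


def expand(template: str, start: int, end: int) -> str:
    """Substitute the $start and $end macros into a query template."""
    return template.replace("$start", str(start)).replace("$end", str(end))


if __name__ == "__main__":
    # Shortened, illustrative template; the values 1 and 1000000 are what
    # sql_query_range returns in the config.
    template = "select $start, mid(md5(rand()), 1, 3) body"
    for i, (start, end) in enumerate(range_steps(1, 1000000, 1)):
        if i == 3:
            break
        print(expand(template, start, end))
```

With a step of 1 every window holds exactly one ID, so `$start` and `$end` are equal and the query runs a million times. That's where the slowdown comes from.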
I hope you'll find it helpful when you decide to play with Manticore Search.