Hi
Just want to share an interesting trick for quickly indexing something with Sphinx / Manticore Search for test purposes, without having to populate a database with a lot of data or anything like that. Below is a full Sphinx / Manticore Search config that builds a 1M-document index of random 3-character words and geo coordinates, an example command to build the index, and an example SphinxQL query that searches it. All you need is a connection to any database (in this case ‘mysql -u root’ works).
[snikolaev@dev01 ~]$ cat sphinx_1m.conf
source min
{
    type = mysql
    sql_host = localhost
    sql_user = root
    sql_pass =
    sql_db = test
    sql_query_range = select 1, 1000000
    sql_range_step = 1
    sql_query = select $start, mid(md5(rand()), 1, 3) body, rand() * 180 lat, rand($end) * 90 lng
    sql_attr_float = lat
    sql_attr_float = lng
}

index idx
{
    path = idx_1m
    source = min
}

searchd
{
    binlog_path = #
    listen = 9314:mysql41
    log = sphinx_1m.log
    pid_file = sphinx_1m.pid
}

[snikolaev@dev01 ~]$ indexer -c sphinx_1m.conf --all --rotate
Manticore 2.6.1 9a706b4@180119 dev
Copyright (c) 2001-2016, Andrew Aksyonoff
Copyright (c) 2008-2016, Sphinx Technologies Inc (http://sphinxsearch.com)
Copyright (c) 2017-2018, Manticore Software LTD (http://manticoresearch.com)

using config file 'sphinx_1m.conf'...
indexing index 'idx'...
WARNING: sql_range_step=1: too small; might hurt indexing performance!
collected 1000000 docs, 3.0 MB
sorted 1.0 Mhits, 100.0% done
total 1000000 docs, 3000000 bytes
total 86.580 sec, 34649 bytes/sec, 11549.98 docs/sec
total 5 reads, 0.014 sec, 4512.0 kb/call avg, 2.9 msec/call avg
total 24 writes, 0.031 sec, 1806.1 kb/call avg, 1.3 msec/call avg
rotating indices: successfully sent SIGHUP to searchd (pid=17284).

mysql> select id, geodist(lat,lng,73.9667,40.78, {in=deg,out=km}) dist, lat, lng from idx where dist < 5;
+--------+----------+-----------+-----------+
| id     | dist     | lat       | lng       |
+--------+----------+-----------+-----------+
| 636503 | 4.880664 | 73.952385 | 40.929459 |
+--------+----------+-----------+-----------+
1 row in set (0.09 sec)
As you can see, the tricky part is using the sql_query_range and sql_range_step directives to make Manticore loop until it has collected 1M docs. With a step of 1, sql_query runs once for every id from 1 to 1,000,000, and each run returns a single generated row. The drawback is slower indexing compared to actually fetching the same amount of data from a database, but come on, you’re not going to use this in production, right?
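If the looping is hard to picture, here is a rough Python sketch of how the range gets walked. It only imitates the behaviour the Sphinx docs describe for ranged queries (consecutive [$start, $end] windows of sql_range_step ids, with the last one clipped to the maximum). The function name ranged_queries is made up for illustration and is not part of Sphinx or Manticore, and nothing here talks to a real database.

```python
def ranged_queries(query_template, range_min, range_max, step):
    """Yield one SQL string per range window (hypothetical helper, illustration only)."""
    start = range_min
    while start <= range_max:
        # Each window covers `step` ids; the last one is clipped to range_max.
        end = min(start + step - 1, range_max)
        yield query_template.replace("$start", str(start)).replace("$end", str(end))
        start = end + 1


# The sql_query from the config above, used here as a plain string template.
template = ("select $start, mid(md5(rand()), 1, 3) body, "
            "rand() * 180 lat, rand($end) * 90 lng")

# A range of 1..5 instead of 1..1000000, to keep the output short.
queries = list(ranged_queries(template, 1, 5, 1))
for q in queries:
    print(q)
```

With a step of 1, $start and $end are always the same number, so every generated query yields exactly one document whose id is that number. That is also why the indexer warns about the step being too small: it ends up issuing one query per document.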
I hope you’ll find it helpful when you decide to play with Manticore Search.