Improvements in Manticore Search 2.7: local indexes management

Until now, resource sharing was done with RW locks. Under high load, these locks could cause problems when we tried to make changes to the indexes. To overcome this, we had to rethink the relationship between threads and indexes.

Indexes can be big or even huge, and they are shared among workers. On a multi-core CPU you can fire many queries simultaneously, and they will be distributed across cores while using one and the same index. That’s simple and plain. However, sometimes you need to update the indexes. Seamless rotation worked quite well with the old fork workers: we just loaded the new index files and left the running forks to serve from the old ones. From a certain moment, new queries went to forks that shared the newly loaded index. So there were no delays in serving: you just stepped seamlessly from the old index to the new one. Workers that used the old version would finish (or crash), and the previous index was finally freed.

With thread workers, a simple RW lock (‘shared-exclusive lock’) mechanism was used, where many shared workers (readers) may access a resource simultaneously, but only one exclusive worker (writer) may modify it. So indexes were always either ‘shared’ between queries or ‘locked’ by rotation. They were loaded at the beginning and freed at the end. During a rotation, one thread loaded the new index, but it then needed exclusive access to the active index descriptor. However, query workers hold shared access, and nothing stops them from acquiring new tasks. This means that under heavy load you just can’t perform a ‘seamless’ rotation, because its last step, atomically replacing the old index with the new one, has to wait for exclusive access that the busy readers never give up.
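To make the problem concrete, here is a minimal single-threaded sketch of a reader-preferring shared-exclusive lock. It is only an illustration: `SharedExclusiveLock` and `simulate_rotation_under_load` are hypothetical names, not the daemon's code, and the non-blocking `try_` methods are an assumption chosen to keep the demo deterministic. It shows that as long as queries overlap, the rotation never gets its exclusive turn.

```python
class SharedExclusiveLock:
    """Toy reader-preferring RW lock (non-blocking, for illustration only)."""

    def __init__(self):
        self.readers = 0      # number of queries holding shared access
        self.writer = False   # True while rotation holds exclusive access

    def try_acquire_shared(self):
        if self.writer:
            return False
        self.readers += 1
        return True

    def release_shared(self):
        self.readers -= 1

    def try_acquire_exclusive(self):
        # Exclusive access needs zero readers and no other writer.
        if self.writer or self.readers > 0:
            return False
        self.writer = True
        return True

    def release_exclusive(self):
        self.writer = False


def simulate_rotation_under_load(overlapping_queries):
    """Return (failed rotation attempts, success once the load drains)."""
    lock = SharedExclusiveLock()
    lock.try_acquire_shared()          # one query is already running
    failed = 0
    for _ in range(overlapping_queries):
        if lock.try_acquire_exclusive():
            lock.release_exclusive()   # not reached while queries overlap
        else:
            failed += 1
        lock.try_acquire_shared()      # a new query starts...
        lock.release_shared()          # ...before the previous one finishes
    lock.release_shared()              # load finally drains
    rotated = lock.try_acquire_exclusive()
    return failed, rotated
```

With five overlapping queries, `simulate_rotation_under_load(5)` returns `(5, True)`: every attempt made under load fails, and the swap only succeeds once no query is running.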

In the new model introduced in 2.7, indexes live independently of threads, similar to the old ‘fork’ case: they are now neither ‘shared’ nor ‘locked’, but simply immutable. This makes things simpler: a worker doesn’t care about ‘locking’ or ‘sharing’ the index, it just uses it. When you need to rotate (load a new index), the daemon just does it without any regard for running queries (and their threads don’t care about the ‘writer’ either). Finally, rotation just switches the pointer of the active index to the newly loaded one, and then everything works as with forks: old queries still use the previous index, and new ones are directed to the one just loaded. The only situation where an exclusive lock is still used is when doing UPDATEs.
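The pointer switch can be sketched as follows. This is a simplified model under stated assumptions, not the daemon's implementation: `Index` and `IndexHolder` are hypothetical names, and Python's reference counting stands in for whatever mechanism the daemon uses to free the old index once the last query lets go of it.

```python
import threading


class Index:
    """Immutable snapshot of an index; never modified after construction."""

    def __init__(self, version, docs):
        self.version = version
        self.docs = tuple(docs)

    def search(self, word):
        return [doc for doc in self.docs if word in doc]


class IndexHolder:
    """Holds the pointer to the active index (hypothetical helper)."""

    def __init__(self, index):
        self._active = index
        self._rotate_lock = threading.Lock()  # serializes rotations only

    def acquire(self):
        # Queries take no lock: they grab the current reference and keep
        # using that snapshot even if a rotation happens meanwhile.
        return self._active

    def rotate(self, new_index):
        # The new index is fully built before this call; the swap itself
        # is a single reference assignment.
        with self._rotate_lock:
            old = self._active
            self._active = new_index
        return old


holder = IndexHolder(Index(1, ["old document"]))
running_query = holder.acquire()             # query started before rotation
holder.rotate(Index(2, ["new document"]))    # does not wait for the query
new_query = holder.acquire()                 # query started after rotation
```

Here `running_query` still sees version 1 while `new_query` sees version 2, which mirrors the behaviour described for forks: old queries finish on the previous index, and new ones go to the one just loaded.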

Another advantage of the new model is that we can change the type of an index on the fly with a configuration reload. Before, it was not always possible to change the type of an index by editing the configuration file and doing a reload (by HUP signal), for example to switch a ‘plain’ index to ‘distributed’, or ‘distributed’ to ‘template’. Now, when you reload a new config in which all that remains of the old index is its name, the daemon first parses the new definition. Then, if it can be used immediately (which is the case for templates and distributed indexes), the daemon swaps them seamlessly. If the new indexes are ‘heavy’ (i.e. they need to be precached into RAM, which may take minutes), the rotation is deferred until they are loaded. ‘Deferred’ means that the caller sees an ‘ok’ return, ‘(all rotated)’, but the workers will still send all new queries to the old indexes until the new one is finally loaded and active.
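The reload logic described above can be modelled roughly like this. It is a sketch under assumptions: the `Daemon` class, its `serving` and `pending` tables, and the `finish_loading` callback are invented for illustration, and only the split between immediately usable types and ‘heavy’ ones comes from the text.

```python
# Types that, per the text, can be put into service immediately.
LIGHT_TYPES = {"template", "distributed"}


class Daemon:
    """Hypothetical model of config reload with deferred rotation."""

    def __init__(self, config):
        self.serving = dict(config)  # name -> type answering queries now
        self.pending = {}            # name -> type still being loaded

    def reload(self, new_config):
        for name, index_type in new_config.items():
            if self.serving.get(name) == index_type:
                continue                          # type unchanged
            if index_type in LIGHT_TYPES:
                self.serving[name] = index_type   # swapped at once
            else:
                self.pending[name] = index_type   # heavy: deferred
        # The caller sees success even if some swaps are still deferred.
        return "ok"

    def finish_loading(self, name):
        # Called once a heavy index is loaded; only now is it served.
        self.serving[name] = self.pending.pop(name)


daemon = Daemon({"a": "plain", "b": "distributed"})
status = daemon.reload({"a": "distributed", "b": "plain"})
served_during_load = dict(daemon.serving)  # "b" is still the old index here
daemon.finish_loading("b")
```

After `reload`, index `a` is already served as ‘distributed’, while `b` keeps answering from its old definition until `finish_loading("b")` runs, even though `status` is already `"ok"`.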

 


© 2018 Manticore Software Ltd. Registered Address: Office 2, Derby House, 123 Watling Street, Gillingham, Kent, ME7 2YY
Company No. 10772872