# Threads in Manticore Search

In this article we look at the worker modes currently implemented in Manticore Search and how to tune their parameters.



Manticore Search currently has two multi-processing modes (MPM), controlled by the [workers](https://docs.manticoresearch.com/latest/html/conf_options_reference/searchd_program_configuration_options.html#workers) directive. The default MPM is currently ***thread\_pool***, with ***threads*** as the alternative.

### Threads


In the '**[threads](https://docs.manticoresearch.com/latest/html/conf_options_reference/searchd_program_configuration_options.html#workers)**' multi-processing mode, a new dedicated thread is created for **every** incoming network ***connection***. The thread stays active until the client disconnects, that is, for as long as the connection is alive. During this time, the thread executes the queries arriving on that connection.
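The thread-per-connection model can be illustrated with a minimal Python sketch. It is not Manticore's implementation: the function names are hypothetical, and upper-casing a string stands in for real query execution. It only shows that the number of threads follows the number of connections, including idle ones.

```python
import threading

def serve_connection(queries, results, index):
    # One dedicated thread runs every query that arrives on its connection.
    # Upper-casing is only a stand-in for executing a query.
    results[index] = [query.upper() for query in queries]

def threads_mode(connections):
    # Hypothetical model: `connections` is a list of per-connection query lists.
    results = [None] * len(connections)
    threads = [
        threading.Thread(target=serve_connection, args=(queries, results, i))
        for i, queries in enumerate(connections)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results, len(threads)

# The third client is connected but sends nothing; it still gets a thread.
connections = [["select 1", "select 2"], ["select 3"], []]
results, thread_count = threads_mode(connections)
print(thread_count)
```

In this toy run three threads are created for three connections, even though one of them never sends a query.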

By default, the engine will spawn an unlimited number of threads, if the operating system allows it. While this might look good, in reality allowing unlimited thread creation comes with costs. First, creating and destroying a thread has a CPU cost. While this is low compared to forking a process, it still exists.

Spawning hundreds or thousands of threads per second can affect the performance of the system, depending on the number of CPU cores available. Having many more threads than available cores leads to these threads competing for resources: first the CPU cores, and second, possibly, the storage.

For example, if we have a 16-core system but allow 160 search threads to spawn, that means 10 threads per CPU core. The threads will not run simultaneously, but will have to wait in line for a CPU core to allocate cycles to them. On the storage side, we can have 160 potential requesters reading data. If the storage can't keep serving the threads without delays (latency), those delays can affect scheduling on the CPU cores, as a thread could get the green light to run its calculations on a core but still have to wait for its data.

The **[max\_children](https://docs.manticoresearch.com/latest/html/conf_options_reference/searchd_program_configuration_options.html#max-children)** setting puts a limit on how many threads can be active at once. When the limit is reached, the engine starts to reject incoming connections with a **'maxed out'** error. When sizing max\_children, keep in mind that a thread is killed only when the client closes the connection. If the connection is ***idling***, the thread remains active and is counted against max\_children.

max\_children can be raised up to a limit that depends on the server's capacity to handle a number of active queries. This includes processing capacity (CPU) and IO operations (storage). A common mistake is to raise max\_children to very high values. There is a point, depending on system capabilities, beyond which spawning more threads only slows queries down. If there is nothing left to optimize in the indexes and queries, a new search server should be considered for splitting the load.
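The sizing reasoning for threads mode can be put into a small Python sketch. The function names are hypothetical, and using `None` to mean "no limit" is a convention of this sketch rather than the daemon's configuration syntax. The point it makes is that idle connections count toward max\_children just like busy ones.

```python
def threads_per_core(active_threads, cpu_cores):
    # Oversubscription ratio: how many threads compete for each core.
    return active_threads / cpu_cores

def accept_connection(open_connections, max_children=None):
    # In threads mode every open connection holds a thread, idle or not,
    # so the limit is compared against open connections, not running queries.
    # max_children=None models "no limit" (an assumption of this sketch).
    if max_children is not None and open_connections >= max_children:
        return "maxed out"
    return "accepted"

print(threads_per_core(160, 16))                    # the 16-core example: 10.0
print(accept_connection(99, max_children=100))      # accepted
print(accept_connection(100, max_children=100))     # maxed out
```

With 100 connections already open, the next client is rejected even if most of those connections are idle.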

 
  
  
### Thread\_pool


In the previous section we talked about threading having a cost and that connections can keep threads alive even if they are not used.

A different strategy for managing threads is not to create a thread per connection, but to use a **pool** of threads for processing the work. In this case, the worker threads are not tied to the incoming connections, as the connections are handled by a different class of threads. When the daemon starts, it creates a number of threads. Incoming network connections are handled by so-called network threads, whose job is to receive the queries from the connections and route them to the threads of the pool.
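As a rough illustration of the pool model, the sketch below uses Python's standard `ThreadPoolExecutor`. It is not how the daemon is implemented: the function names are hypothetical, upper-casing stands in for query execution, and Python creates pool threads lazily whereas the article describes them as created at daemon start. What it shows is that the number of worker threads is bounded by the pool size, not by the number of connections.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

def run_query(query, seen_workers):
    # Record which pool thread handled the query, then "execute" it.
    seen_workers.add(threading.current_thread().name)
    return query.upper()

def thread_pool_mode(connections, pool_size):
    seen_workers = set()
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        # Network-thread role: read queries off every connection
        # and route them to the pool instead of running them itself.
        futures = [
            pool.submit(run_query, query, seen_workers)
            for queries in connections
            for query in queries
        ]
        results = [future.result() for future in futures]
    return results, len(seen_workers)

# Eight connections with one query each, served by a pool of two workers.
connections = [[f"select {i}"] for i in range(8)]
results, workers_used = thread_pool_mode(connections, pool_size=2)
print(len(results), workers_used)
```

Eight connections are served here, but at most two worker threads ever run queries.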

If the pool is too busy (all workers are processing queries), new queries are sent to a **pending queue**. The server will '**max out**' if the pending queue fills up. By default no limit is set; however, letting queries wait in the queue for too long achieves nothing but queries that take a long time. It's a good idea to set a limit on the queue to signal that the server is receiving more queries than it can handle. This can be set with the **[queue\_max\_length](https://docs.manticoresearch.com/latest/html/conf_options_reference/searchd_program_configuration_options.html#queue-max-length)** directive. Compared with max\_children in the threads model, where maxing out starts after a certain number of open connections, in thread\_pool the daemon starts to 'max out' once the active queries exceed max\_children (the number of threads in the pool) + queue\_max\_length.
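The admission rule described above can be written down as a short Python sketch. The function name is hypothetical, and `queue_max_length=None` is this sketch's way of modelling the default "no limit" behaviour. The example values 24 and 100 are arbitrary.

```python
def admit_query(active_queries, max_children, queue_max_length=None):
    # `active_queries` counts queries already running or waiting in the queue.
    if active_queries < max_children:
        return "running"        # a pool thread is free
    if queue_max_length is None:
        return "queued"         # unlimited pending queue (the default)
    if active_queries < max_children + queue_max_length:
        return "queued"         # pool is busy, but the queue has room
    return "maxed out"          # pool and queue are both full

# A pool of 24 threads with a pending queue of 100 entries.
print(admit_query(23, 24, 100))    # running
print(admit_query(24, 24, 100))    # queued
print(admit_query(124, 24, 100))   # maxed out
```

Without a queue limit the last call would return "queued", which is exactly the situation where queries pile up and only get slower.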

With [thread\_pool](https://docs.manticoresearch.com/latest/html/conf_options_reference/searchd_program_configuration_options.html#workers), the [max\_children](https://docs.manticoresearch.com/latest/html/conf_options_reference/searchd_program_configuration_options.html#max-children) directive defines the number of threads in the pool, which are created at daemon start. By default, a value of **1.5x** the number of CPU cores is used. Setting max\_children to high values will not increase the performance or capacity of the server. As these threads only do the processing, they need access to the CPU as soon as possible.

Setting high values only leads, as in the previous section, to CPU contention and slower performance, and also to a slower daemon start. Compared with threads mode, where max\_children can be set to several times the number of CPU cores and values of even several hundred can make sense, in thread\_pool it makes little sense to go to 2-3x the number of cores.
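To make the default concrete, here is a small Python sketch that computes the 1.5x pool size and renders a `searchd` section using the directives discussed so far. The helper names are hypothetical, rounding down is an assumption of this sketch, and the queue length of 100 is an arbitrary illustrative value, not a recommendation.

```python
def default_pool_size(cpu_cores):
    # 1.5x the core count; rounding down is an assumption of this sketch.
    return int(cpu_cores * 1.5)

def render_searchd_section(cpu_cores, queue_max_length):
    # Builds a config fragment from the directives named in this article.
    lines = [
        "searchd",
        "{",
        "    workers = thread_pool",
        f"    max_children = {default_pool_size(cpu_cores)}",
        f"    queue_max_length = {queue_max_length}",
        "}",
    ]
    return "\n".join(lines)

config_text = render_searchd_section(cpu_cores=16, queue_max_length=100)
print(config_text)
```

For a 16-core machine this yields a pool of 24 threads, which is the value the daemon would pick by default anyway; writing it out is mainly useful as a starting point for testing other sizes.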

By default, one dedicated thread handles the network connections. Its job is generally light, as it only acts as a proxy to the thread pool, and it can generally sustain a good amount of traffic. The number of network threads can be increased with the **[net\_workers](https://docs.manticoresearch.com/latest/html/conf_options_reference/searchd_program_configuration_options.html#net-workers)** directive. Raising the number of network threads has shown gains for setups with extreme rates of queries per second.

The network thread(s) also have several fine-tuning settings. A network thread might not always be busy, as there can be times (in sub-second terms) when it doesn't receive queries. The thread will then go to sleep and lose its priority on the CPU. The time spent waking up and getting back its place on the CPU is small, but for high performance every millisecond matters.

To overcome this, a busy loop can be activated with the **[net\_wait\_tm](https://docs.manticoresearch.com/latest/html/conf_options_reference/searchd_program_configuration_options.html#net-wait-tm)** directive. If the net\_wait\_tm value is positive, the thread will 'tick' the CPU every 10\***net\_wait\_tm** (where the net\_wait\_tm value represents milliseconds). If net\_wait\_tm is zero, the busy loop runs continuously; note that this generates extra CPU usage even if the network thread doesn't receive much traffic. To disable the busy loop, a negative value of -1 should be used. By default net\_wait\_tm has the value 1, meaning the busy loop ticks every 10ms.
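The three cases of net\_wait\_tm can be summarized in a few lines of Python. The function name is hypothetical, and treating every negative value like -1 is an assumption of this sketch; the article only names -1 as the value that disables the busy loop.

```python
def busy_loop_interval_ms(net_wait_tm=1):
    # Returns the tick interval in milliseconds, 0 for a continuous loop,
    # or None when the busy loop is disabled.
    if net_wait_tm < 0:
        return None              # disabled (assumed for any negative value)
    if net_wait_tm == 0:
        return 0                 # continuous busy loop, extra CPU usage
    return 10 * net_wait_tm      # positive value: tick every 10 * net_wait_tm ms

print(busy_loop_interval_ms())      # default value 1: 10
print(busy_loop_interval_ms(0))     # 0
print(busy_loop_interval_ms(-1))    # None
```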

The network thread can also be throttled. The **[net\_throttle\_accept](https://docs.manticoresearch.com/latest/html/conf_options_reference/searchd_program_configuration_options.html#net-throttle-accept-net-throttle-action)** option enforces a limit on how many clients the network thread accepts at once, and **[net\_throttle\_action](https://docs.manticoresearch.com/latest/html/conf_options_reference/searchd_program_configuration_options.html#net-throttle-accept-net-throttle-action)** specifies how many requests to process per iteration. By default, no limit is enforced. Throttling the network thread makes sense in high-load scenarios.
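The effect of a per-iteration limit can be pictured with a toy Python model. It is only a sketch of the idea of throttling, not of the daemon's network loop: the function name is hypothetical, and `None` is used here to model "no limit". The same shape applies to either option, with clients accepted per iteration for net\_throttle\_accept or requests processed per iteration for net\_throttle\_action.

```python
def batches_per_iteration(pending, limit=None):
    # Splits a backlog of pending items into the batches a throttled
    # network thread would handle, one batch per loop iteration.
    if limit is not None and limit < 1:
        raise ValueError("limit must be at least 1, or None for no limit")
    batches = []
    while pending > 0:
        batch = pending if limit is None else min(pending, limit)
        batches.append(batch)
        pending -= batch
    return batches

print(batches_per_iteration(10))           # no limit: [10]
print(batches_per_iteration(10, limit=4))  # throttled: [4, 4, 2]
```

With a limit, a burst is spread over several iterations, so the network thread gets back to its other duties between batches instead of being held by one large burst.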

An aspect that is sometimes misunderstood is the relation between max\_children and [dist\_threads](https://docs.manticoresearch.com/latest/html/conf_options_reference/searchd_program_configuration_options.html#dist-threads). dist\_threads is responsible for multi-threading some operations, such as a local [distributed](https://docs.manticoresearch.com/latest/html/searching/distributed_searching.html#distributed-searching) index, [snippets](https://docs.manticoresearch.com/latest/html/api_reference/additional_functionality.html#build-excerpts) with load\_files, or the [CALL PQ](https://docs.manticoresearch.com/latest/html/sphinxql_reference/call_pq_syntax.html) command. However, these are subprocessing threads and they don't count toward the [max\_children](https://docs.manticoresearch.com/latest/html/conf_options_reference/searchd_program_configuration_options.html#max-children) directive, which counts only the main query threads.
