TF-IDF in a nutshell

Back in 1958, Hans Peter Luhn assumed in his paper “The Automatic Creation of Literature Abstracts” that “the frequency of word occurrence in an article furnishes a useful measurement of word significance”. This is still probably one of the most significant ideas in Information Retrieval, and it is used in all well-known search engines, big and small, from Google and Yahoo to custom search solutions such as ElasticSearch and Manticore Search. […]

Manticore 2.7.5 vs Sphinx 3.1.1

Hi. Here we benchmarked Sphinx 3.0.2 vs Manticore 2.6.2. That was 8 months ago, and both Manticore and Sphinx have changed since then. As stated in the Sphinx 3.0.3 announcement, Sphinx 3.0.3 is up to 2x faster than 3.0.2, so it’s interesting to run another benchmark. This time let’s test on a real dataset – Hacker News comments. The benchmark was conducted under the following conditions: […]

Sphinx UDF example

Hi. Many databases and search engines allow you to customize your queries using your own so-called “user defined functions”, or UDFs. Sphinx and Manticore are no exception. The documentation has a long section about this (https://docs.manticoresearch.com/latest/html/extending.html#udfs-user-defined-functions). Here I want to give just a quick example of how you can make a UDF that enables functionality which can be really useful in some cases but is missing out of the box: a sleep() function. […]

Training

Personal and team training will maximize performance.

Custom development

Need some custom or individual features?

Fill out the form and don’t forget to describe what you need.

Free config review

There are often optimizations that can be made to a Sphinx / Manticore setup by changing some simple directives in the configuration or making quick changes to an index definition.

Some common mistakes and issues can include:

  • running main+delta without kill-lists, even though the delta includes updated records that also exist in the main index
  • using wildcards with very short prefix/infix lengths, which can hammer performance in some cases
  • unintentionally disabling seamless rotation and getting stalls on index rotations
  • adding texts as string attributes even if they are not used for any kind of operation (filtering, grouping, sorting) and are not required in the results
  • using deprecated settings

A quick look at the configuration can reveal existing or potential issues, which is why we want to offer this review as a gift to our growing community!

Before uploading your configuration file, we recommend removing any database credentials.

We also suggest you give as many details as possible about your setup: how big your data is, what typical queries look like, and what issues you experience.

Contact us