# Manticore Buddy: challenges and solutions

Hey there! 🤗 We hope y'all have already checked out our [Buddy Intro](https://manticoresearch.com/blog/manticoresearch-buddy-intro/) and have a good understanding of how it works. We want to share our journey and experiences developing it and the challenges we faced.

At **Manticore Software**, we encountered two main challenges:

- Expanding **Manticore Search** with non-performance-critical features without modifying the C++ codebase.
- Making it easier to contribute to enhancements and new feature implementations.

We were determined to find a solution. So, let's dive into our journey to develop **Buddy** and the issues we faced. Ready? Let's go!

## The beginning of the journey

Our journey started by examining the issue closely. Although **Manticore Search** is an exceptional search database product, we had difficulty releasing new features at the desired speed because the codebase is written in **C++**. Writing **C++** means working close to the machine: data structures, bytes, compilation, and debugging all demand deep knowledge and care, as does finding the best way to write each piece. The payoff is fast program execution, but the cost is development time, and that slowed our releases considerably. We wanted to move quickly, ship more features, and do so consistently.

We came up with the idea of creating a companion for our primary **searchd** process, which could process failed queries from **Manticore Search** and return results to the original client. We didn't take long to decide which language to use and settled on **PHP** for several reasons:

- Most of the core team was familiar with it, so it would take less time to make it work.
- **PHP** is fast, especially in recent versions (8+), even though we did not need high-performance execution from Buddy. It compares well with other popular scripting languages such as **Python** or **JavaScript**, so it was a good fit for our requirements.
- **PHP** is not only fast but also simple, reducing the level of expertise needed to contribute to the future ecosystem.

That's why we chose **PHP** and began implementing basic code to understand what we would need later.

We still use C++ for speed-critical features, where it is the ideal choice. For tasks that don't need much speed, **Buddy** is the better fit.

This is how **Buddy** was created. To make this possible on the C++ side of Manticore Search, we implemented a separate loop and communication between the **searchd** daemon and the **Buddy** PHP process using the **CURL** extension. We developed an internal protocol based on simple **JSON** messages to route queries to **Buddy**, which handles them and sends back an appropriate response to be passed on to the original client.
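To make the idea of a JSON-based exchange concrete, here is a minimal sketch of how a failed query and its reply could be wrapped in JSON messages. It is written in Python purely for illustration (Buddy itself is PHP), and the field names `query`, `error`, and `rows`, as well as the sample command, are assumptions made up for this example rather than the actual protocol.

```python
import json
from typing import Dict, List, Tuple


def encode_request(query: str, error: str) -> str:
    """Daemon side: wrap a failed query and its error in a JSON message.

    The field names are hypothetical, not the real protocol.
    """
    return json.dumps({"query": query, "error": error})


def decode_request(payload: str) -> Tuple[str, str]:
    """Companion side: unpack the message back into (query, error)."""
    data = json.loads(payload)
    return data["query"], data["error"]


def encode_response(rows: List[Dict], error: str = "") -> str:
    """Companion side: reply with result rows, or with an error string."""
    return json.dumps({"rows": rows, "error": error})


if __name__ == "__main__":
    # "DEMO COMMAND" is an invented query used only for this sketch.
    wire = encode_request("DEMO COMMAND", "could not parse query")
    print(decode_request(wire))
    print(encode_response([{"status": "ok"}]))
```

Keeping the message to a flat JSON object means either side can be inspected or replaced with very little tooling, which suits a companion process that is meant to be easy to contribute to.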

## Implementing the communication protocol

When starting a new project, it's important to stay flexible and not overplan. In our case, we began with a basic implementation of communication using the [sockets](https://www.php.net/manual/en/book.sockets.php) extension in **PHP**. While it worked well, it wasn't scalable. Our goal was to connect **Manticore Search** with **Buddy**, and this simple implementation allowed us to validate that idea.

Instead of reinventing the wheel, we researched options for making the system more scalable and non-blocking. We initially considered **OpenSwoole**, but a license issue ruled it out. We then found **ReactPHP**, which had a suitable license, and decided to go with it.

> ReactPHP is a plain PHP framework that allows us to run a TCP server in async mode.

This choice worked well for us since it allowed us to handle multiple requests simultaneously and easily scale the system.

Next, we rewrote our simple **Buddy** MVP and created a core structure that would make it easy to add handling of new SQL commands in the future. The process is as follows:

- **Manticore Search** receives an SQL query from the user and attempts to handle it.
- If **Manticore Search** can handle the query, it returns the response to the client without involving **Buddy**.
- If **Manticore Search** cannot handle the query, it sends a special structure with all information about the input query and any errors to **Buddy**.
- **Buddy** parses the structure and checks if there is a handler for it. If there isn't, it returns the same error that Manticore Search would send to the client.
- If everything is good and we have an implementation to handle the query, we split the process into two steps: preparing the request with the required data and handling it with the handler logic. The request is a simple structure: a class with predefined variables and input parameters parsed from the input SQL query. If anything goes wrong while preparing it, an error is returned to the client.
- The handler then receives the request, does the necessary work, and returns the final result in the HTTP response.
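The steps above can be sketched roughly as follows. This is an illustrative Python outline, not Buddy's PHP code: the `DEMO ECHO` command, the `EchoRequest` class, the `HANDLERS` registry, and the shape of the returned dictionary are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class EchoRequest:
    """Step 1 output: a plain structure holding the parsed parameters."""
    word: str


def prepare_echo(query: str) -> EchoRequest:
    # Hypothetical parser: expects exactly "DEMO ECHO <word>".
    parts = query.split()
    if len(parts) != 3:
        raise ValueError("expected: DEMO ECHO <word>")
    return EchoRequest(word=parts[2])


def handle_echo(request: EchoRequest) -> dict:
    # Step 2: the handler logic works only with the prepared request.
    return {"rows": [{"word": request.word}], "error": ""}


# Registry mapping a query prefix to its (prepare, handle) pair.
HANDLERS: Dict[str, Tuple[Callable, Callable]] = {
    "DEMO ECHO": (prepare_echo, handle_echo),
}


def dispatch(query: str, original_error: str) -> dict:
    """Route a query the daemon could not handle to a matching handler."""
    for prefix, (prepare, handle) in HANDLERS.items():
        if query.upper().startswith(prefix):
            try:
                request = prepare(query)
            except ValueError as exc:
                # Preparation failed: report that error to the client.
                return {"rows": [], "error": str(exc)}
            return handle(request)
    # No handler found: pass back the daemon's own error unchanged.
    return {"rows": [], "error": original_error}


if __name__ == "__main__":
    print(dispatch("DEMO ECHO hello", "syntax error"))
    print(dispatch("SELECT 1", "syntax error"))
```

Adding a new command in this layout means writing one prepare function and one handler and registering them, without touching the dispatch logic.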

This system is simple, easy to maintain, and easy to extend with new functionality. However, there is an issue with this approach: if something in **Buddy** is heavy or has to wait, it can slow down concurrent requests. This is because async is not the same as parallel: **PHP** code still blocks while it runs, and **ReactPHP** relies on a single-threaded event loop and [fibers](https://www.php.net/manual/en/language.fibers.php) to emulate the async approach. We'll discuss this issue in more detail in the next section.
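The effect is easy to reproduce in any single-threaded event loop. The sketch below uses Python's `asyncio` as a stand-in for the ReactPHP loop (an assumption made for illustration only): three handlers that block take roughly three times as long as three handlers that yield while waiting. The function names are invented for this example.

```python
import asyncio
import time


async def blocking_handler() -> None:
    # A blocking call holds the event loop; no other handler can progress.
    time.sleep(0.2)


async def cooperative_handler() -> None:
    # A non-blocking wait hands control back to the loop.
    await asyncio.sleep(0.2)


async def run_concurrently(handler, count: int) -> float:
    """Start `count` handlers at once and return the elapsed wall time."""
    start = time.perf_counter()
    await asyncio.gather(*(handler() for _ in range(count)))
    return time.perf_counter() - start


def measure(handler, count: int = 3) -> float:
    return asyncio.run(run_concurrently(handler, count))


if __name__ == "__main__":
    # Blocking handlers run one after another; cooperative ones overlap.
    print(f"blocking:    {measure(blocking_handler):.2f}s")
    print(f"cooperative: {measure(cooperative_handler):.2f}s")
```

Any handler that performs heavy computation or a blocking call behaves like `blocking_handler` here, which is why moving such work off the loop matters.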

## The async problem in PHP and scaling for concurrency

To handle heavy loads, our team at Manticore Search needed a solution that could handle more requests than **ReactPHP** alone. While **ReactPHP** worked for implementing an async HTTP server and handling some concurrency, it wasn't scalable enough on its own. After a quick search, we chose the **[parallel](https://www.php.net/manual/en/book.parallel.php)** extension over **pthreads** because of its maintainability and reliability.

But what is **parallel**? It's an extension for parallel execution: it creates independent **Runtimes**, each representing a thread that runs in parallel with the others. These threads can communicate through channels, which **parallel** provides, allowing us to send data from the main **ReactPHP** loop process to a detached thread in another **Runtime** without reinventing the wheel.

This approach was our silver bullet to handle high levels of concurrency and keep response time low. To make it happen, we implemented a **Task** component to run tasks in a parallel thread at the **Handler** level. This way, the main process didn't block, allowing us to handle many concurrent requests easily.

Initially, we created a **Runtime** on each request and destroyed it at the end of execution. However, this approach degraded performance when many requests arrived. To handle more requests, we prepared the **Runtimes** at first launch and used them in a round-robin fashion. This also let us cap the number of threads so that it does not exceed the total number of cores, since going beyond that would hurt performance as well.
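The pooling idea can be sketched as follows. This is a Python illustration under stated assumptions, not the PHP implementation: single-thread executors stand in for **parallel** Runtimes, and the `RoundRobinPool` class and `current_worker` helper are hypothetical names. Python threads also do not run CPU-bound code in parallel the way separate Runtimes do, so the sketch shows only the reuse and round-robin selection.

```python
import itertools
import os
import threading
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, Optional


class RoundRobinPool:
    """A fixed set of reusable workers, selected in round-robin order."""

    def __init__(self, size: Optional[int] = None) -> None:
        cores = os.cpu_count() or 1
        # Never create more workers than there are CPU cores.
        self.size = min(size, cores) if size else cores
        # Each executor owns one thread that is started on first use
        # and then reused for every later task sent to it.
        self._workers = [
            ThreadPoolExecutor(max_workers=1, thread_name_prefix=f"worker-{i}")
            for i in range(self.size)
        ]
        self._order = itertools.cycle(self._workers)

    def submit(self, task: Callable, *args) -> Future:
        # Pick the next worker in turn instead of creating a new one.
        return next(self._order).submit(task, *args)

    def close(self) -> None:
        for worker in self._workers:
            worker.shutdown(wait=True)


def current_worker() -> str:
    """Report which worker thread is running the task."""
    return threading.current_thread().name


if __name__ == "__main__":
    pool = RoundRobinPool(size=2)
    print([pool.submit(current_worker).result() for _ in range(4)])
    pool.close()
```

Creating the workers once removes the per-request setup and teardown cost, and the fixed pool size gives a hard upper bound on the number of threads.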

While we solved the performance and concurrency problem with ease, we faced another challenge: deployment. Not all operating systems support the latest **PHP** 8 version, and some uncommon extensions are not included in the default installation. But we found a solution, and you can too.

## Say hi to manticore-executor

We conducted research to find a painless way to ship the new tool to customers and discovered a great approach: [compiling both **PHP** and Buddy into a single static binary](https://github.com/crazywhalecc/static-php-cli/blob/master/README-en.md). This involved injecting the **PHP** application code into the **PHP** sources and building a single runnable binary. However, we hit an obstacle: this would have mixed two different licenses, the **PHP License 3.01** and **GPL 2.0**, which was not feasible. As a result, we chose to pre-build **PHP** separately, link it statically, and name it manticore-executor.

Unfortunately, the process was not simple. We attempted to build it on Ubuntu but ran into a problem: we needed OpenSSL to establish secure connections to external domains, and with a dynamic GCC we couldn't link OpenSSL statically.

Why did we use GCC? It was necessary for compiling PHP and its extensions. The issue was that we required a statically-built GCC to link statically, which is not straightforward and necessitates a lot of work. As a result, we sought out alternatives.

Thankfully, we discovered **MUSL** and **Alpine**, which allowed us to build a fully static version of **PHP** with all required extensions and libs without difficulty! Furthermore, the resulting binary works on any Linux distribution.

**Alpine Linux** is a good choice for compiling C programs thanks to its small size and lightweight nature, which also make it suitable for resource-constrained systems such as embedded devices or containers. Alpine is also built with security in mind: it ships only a few packages by default, which limits the attack surface. This is particularly important for C programs, which can be susceptible to security vulnerabilities.

In addition, Alpine Linux uses **MUSL** libc as its standard C library. It is a lightweight C library designed with static linking in mind, which is exactly what we needed.

So we set up GitHub Actions workflows that use an **Alpine** image and run the build in **Docker**. The beauty of this approach is that it also made building for ARM easier: **Docker** has the `buildx` command, which can use QEMU emulation, so the same flow builds for both AMD64 and ARM architectures on the same machine! Check out our build flow [here](https://github.com/manticoresoftware/executor/blob/main/.github/workflows/release.yml).

**GitHub Actions** automates the building and deploying of packages for all supported operating systems. For users, installation is simple: just run `apt-get install manticore-executor` or `yum install manticore-executor`, and you'll have a **PHP** version ready to use with all necessary packages pre-installed to run any Manticore-shipped **PHP** project. Easy!

## How we ship our source code

At Manticore Search, we faced the challenge of providing our PHP application, made up of multiple source code files, to the user. We had many files that were spread across separate folders and dependencies that had to be installed with Composer, making the installation process complicated.

As mentioned above, we developed a custom PHP build, called [manticore-executor](https://github.com/manticoresoftware/executor), which can be easily installed from repositories. However, this alone did not solve the problem of delivering the entire PHP application to the user.

We found a solution in **PHAR**, which allowed us to build a single file that could be added as a package to the repository. This simplified the installation process. However, ensuring that all dependencies were included correctly in the final **PHAR** archive was tricky. To solve this, we created a separate, external build system, which we also use for our [manticore-backup](https://github.com/manticoresoftware/manticoresearch-backup) tool.

To make the package executable, we use a Bash script with a shebang line that relies on our **manticore-executor** package. The script compares the modification date of the installed **PHAR** with the copy extracted into the system's temporary folder and extracts the **PHAR** data there when needed, so repeated launches stay fast and newly installed versions are picked up. For more information on how we implemented this, you can refer to our [phar_builder](https://github.com/manticoresoftware/phar_builder/blob/main/templates/sh) project on **GitHub**.
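The caching logic described here can be sketched in a few lines. The example below is a Python illustration only, not the actual launcher script: it uses a ZIP file as a stand-in for the **PHAR** archive, and the `ensure_extracted` function and the directory naming scheme are assumptions made for this example.

```python
import tempfile
import zipfile
from pathlib import Path


def ensure_extracted(archive: Path, cache_root: Path) -> Path:
    """Return a directory with the archive's contents, extracting only
    when no copy exists yet for the archive's current modification time."""
    stamp = int(archive.stat().st_mtime)
    # A new modification time (e.g. after a package upgrade) yields a
    # new directory name, so stale copies are never reused.
    target = cache_root / f"{archive.stem}-{stamp}"
    if not target.is_dir():
        with zipfile.ZipFile(archive) as bundle:
            bundle.extractall(target)
    return target


if __name__ == "__main__":
    workdir = Path(tempfile.mkdtemp())
    archive = workdir / "app.zip"
    with zipfile.ZipFile(archive, "w") as bundle:
        bundle.writestr("main.txt", "hello")
    first = ensure_extracted(archive, workdir / "cache")
    second = ensure_extracted(archive, workdir / "cache")
    # The second launch reuses the directory created by the first one.
    print(first == second)
```

A production launcher would also need to guard against an extraction that was interrupted halfway, for example by extracting to a temporary name and renaming it once complete; the sketch leaves that out for brevity.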

## Lessons learned

1. Start with a simple, basic solution without using software design patterns when uncertain about the future success of a project. Prioritize validation first and then refactor and iterate on updates.
2. Concurrency in **PHP** can be challenging, but using threading and async frameworks can help achieve high throughput. For optimal performance, it's recommended to use both. Preallocating runtimes for threads can help reach desired performance.
3. Simplify the shipping process for users. Reduce the number of instructions needed. In our case, one **PHAR** archive and one binary with all included extensions for our custom **PHP** solved the issue.
4. Use the most recent versions of **PHP** or other tools to stay on top of the latest developments and keep your data secure. Outdated software can be vulnerable to hacks and security breaches. Upgrading offers improved performance, the latest features, and an efficient coding process.
5. Look for packages that can solve your problem and examine their dependencies. Choose packages with minimal dependencies to avoid dependency hell. Use small packages like building blocks rather than creating a custom solution.

## Final outcome

Throughout the development of **Buddy**, we faced numerous challenges, which we overcame with excitement. The tool is entirely written in **PHP** and is shipped as an OS package, making it incredibly easy for users to install and for us to maintain and automate builds, thanks to **GitHub Actions**. While there is still room for improvement to make the tool even simpler, our story demonstrates how it's possible to build an easy-to-maintain and easy-to-install tool, all with the power of **PHP**.

We hope you enjoyed reading about our journey leading up to the release of [Manticore 6.0.0](https://manticoresearch.com/blog/manticore-search-6-0-0/). Be sure to stay tuned for our next article, where we explore the new pluggable design in **Buddy** and its easy-to-contribute ecosystem, which benefits the entire community. It's truly an exciting time, and we can't wait to share more with you.
