Since we introduced the backup tool in Manticore Search 6, backing up your data has become significantly easier. But we kept hearing the same question: "What about cloud storage?" Today, we're excited to announce that manticore-backup now supports S3-compatible storage with streaming uploads — no intermediate files, no local disk space headaches, just direct-to-cloud backups.
The Problem with Traditional Backups
When you're running Manticore Search in production, your datasets can grow quickly. Backing up to local storage has its limitations:
- Disk space constraints: You need free space equal to your backup size on the same machine
- Manual transfer steps: Backup locally, then upload to cloud storage
- Time overhead: The copy-then-upload dance doubles your backup window
- Complexity: Scripting reliable uploads with resume capability, encryption, and error handling
Streamable S3 Backup: How It Works
The new S3 storage support streams your backup data directly to S3-compatible storage. Here's what happens under the hood:
- No intermediate files: Data streams from Manticore straight to S3
- Automatic multipart uploads: Large files are automatically chunked and uploaded in parallel
- Built-in encryption: SSE-S3 encryption is enabled by default for AWS S3 (configurable for other providers)
- Compression support: Optional zstd compression reduces transfer time and storage costs
- Manifest-based restore: no `s3:ListBucket` permission required for restores
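The "no intermediate files" point is the heart of the feature: data is read in upload-sized chunks and handed straight to the uploader, so the full backup never materializes on local disk. Here is a minimal Python sketch of that idea (illustrative only — the tool itself is written in PHP, and this is not its actual code):

```python
import io

PART_SIZE = 5 * 1024 * 1024  # S3 multipart parts must be at least 5 MB (except the last)

def stream_parts(source, part_size=PART_SIZE):
    """Yield (part_number, chunk) pairs from a readable binary stream."""
    part_number = 1
    while True:
        chunk = source.read(part_size)
        if not chunk:
            break
        yield part_number, chunk
        part_number += 1

# A BytesIO stands in for a table file being read during backup:
data = io.BytesIO(b"x" * (PART_SIZE + 100))
parts = list(stream_parts(data))
# parts[0] is a full 5 MB part, parts[1] is the 100-byte tail
```

Each yielded part can be sent to the storage backend immediately, so memory and disk usage stay bounded regardless of file size.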
Supported Storage Providers
We've tested with AWS S3, MinIO, and Cloudflare R2, but the implementation uses the standard AWS SDK for PHP, so any storage that speaks the S3 API should work.
Usage
Using S3 backup is as simple as changing your destination path:
CLI
```bash
# Set your credentials
export AWS_ACCESS_KEY_ID=your_access_key
export AWS_SECRET_ACCESS_KEY=your_secret_key
export AWS_REGION=us-east-1

# Backup to S3
manticore-backup --config=/etc/manticore/manticore.conf --backup-dir=s3://my-bucket/manticore-backups

# With a custom endpoint (MinIO, Wasabi, etc.)
export AWS_ENDPOINT_URL=https://minio.example.com
manticore-backup --config=/etc/manticore/manticore.conf --backup-dir=s3://my-bucket/backups
```
Environment Variables
| Variable | Description |
|---|---|
| `AWS_ACCESS_KEY_ID` | Your S3 access key |
| `AWS_SECRET_ACCESS_KEY` | Your S3 secret key |
| `AWS_REGION` | S3 region (e.g., `us-east-1`) |
| `AWS_ENDPOINT_URL` | Custom endpoint for S3-compatible storage |
| `AWS_S3_ENCRYPTION` | Set to `0` to disable SSE-S3 encryption (for MinIO/custom endpoints) |
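To see how these variables fit together, here is a hedged sketch of how a client might assemble its S3 settings from the environment. This is illustrative only — it is not manticore-backup's actual implementation — but it reflects the documented defaults: no endpoint means the provider's default, and encryption is on unless `AWS_S3_ENCRYPTION=0`.

```python
import os

def s3_config_from_env(env=os.environ):
    """Resolve S3 settings from environment variables (illustrative sketch)."""
    return {
        "key": env.get("AWS_ACCESS_KEY_ID"),
        "secret": env.get("AWS_SECRET_ACCESS_KEY"),
        "region": env.get("AWS_REGION"),
        "endpoint": env.get("AWS_ENDPOINT_URL"),           # None -> provider default
        "sse": env.get("AWS_S3_ENCRYPTION", "1") != "0",   # encryption on unless "0"
    }

# Example: a MinIO-style setup with SSE-S3 turned off
cfg = s3_config_from_env({"AWS_REGION": "us-east-1", "AWS_S3_ENCRYPTION": "0"})
```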
Performance Considerations
S3 streaming backup performance depends primarily on your network bandwidth and the S3 provider's upload speeds. Unlike local disk backups where you're limited by disk I/O, S3 backups are network-bound. The key advantage is eliminating the "write locally, then upload" overhead — data streams directly from Manticore to S3 without touching the local filesystem.
For optimal performance:
- Ensure adequate upload bandwidth to your S3 endpoint
- Consider using compression (`--compress`) to reduce data transfer
- Multipart uploads are automatic for files over 5MB, improving reliability for large datasets
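Compressing during the stream means compressed bytes are produced and shipped as data is read, without ever buffering a whole file. A minimal sketch of that idea follows — the tool uses zstd, but zlib stands in below because it ships with Python's standard library, and none of this is the tool's actual code:

```python
import zlib

def compressed_parts(source, part_size=5 * 1024 * 1024):
    """Compress a stream on the fly, yielding upload-sized parts.
    (zstd in the real tool; zlib used here as a stdlib stand-in.)"""
    comp = zlib.compressobj()
    buf = b""
    while True:
        chunk = source.read(64 * 1024)
        if not chunk:
            break
        buf += comp.compress(chunk)
        while len(buf) >= part_size:
            yield buf[:part_size]
            buf = buf[part_size:]
    buf += comp.flush()
    if buf:
        yield buf
```

At no point does more than one part plus a read chunk sit in memory, which is what keeps the streaming pipeline flat regardless of dataset size.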
Restore from S3
Restoring works seamlessly too. The tool downloads files to a temporary directory first, then performs the restore:
```bash
# List available backups
manticore-backup --backup-dir=s3://my-bucket/manticore-backups --list

# Restore a specific backup
manticore-backup --config=/etc/manticore/manticore.conf --backup-dir=s3://my-bucket/manticore-backups --restore=backup-20250115120000
```
Required S3 Permissions
For backup:
s3:PutObjects3:PutObjectAcl(if using ACLs)
For listing backups:
s3:ListBucket
For restore:
s3:GetObject
Note: While listing backups requires `s3:ListBucket`, restoring a specific backup does not. If you know the backup folder name (e.g., `backup-20250115120000`), you can restore directly using `--restore` with just `s3:GetObject` permission. The manifest file tracks all backup contents, so no directory listing is needed.
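To make the manifest idea concrete, here is a hypothetical sketch — the real manifest format may differ — of why restore can skip directory listing entirely:

```python
# Hypothetical manifest shape; the real file format may differ.
manifest = {
    "backup": "backup-20250115120000",
    "files": [
        {"key": "backup-20250115120000/data/products.spd", "size": 1048576},
        {"key": "backup-20250115120000/config/manticore.json", "size": 2048},
    ],
}

def keys_to_restore(manifest):
    """Every object key is known up front, so restore needs only
    s3:GetObject on these keys -- never s3:ListBucket."""
    return [f["key"] for f in manifest["files"]]

keys = keys_to_restore(manifest)
```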
Use Cases
Cloud-Native Deployments
Running Manticore in Kubernetes or Docker? S3 backup fits naturally into cloud-native workflows:
```yaml
# Kubernetes CronJob example
apiVersion: batch/v1
kind: CronJob
metadata:
  name: manticore-backup
spec:
  schedule: "0 2 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: backup
              image: manticoresearch/manticore:latest
              command:
                - manticore-backup
                - --config=/etc/manticore/manticore.conf
                - --backup-dir=s3://my-backup-bucket/manticore
              env:
                - name: AWS_ACCESS_KEY_ID
                  valueFrom:
                    secretKeyRef:
                      name: s3-credentials
                      key: access-key
                - name: AWS_SECRET_ACCESS_KEY
                  valueFrom:
                    secretKeyRef:
                      name: s3-credentials
                      key: secret-key
          restartPolicy: OnFailure
```
Disaster Recovery
Store backups in a different region or even a different cloud provider:
```bash
# Primary backup to local S3-compatible storage
export AWS_ENDPOINT_URL=https://minio.internal.company.com
manticore-backup --backup-dir=s3://backups-primary/manticore

# Secondary backup to AWS S3 for DR
unset AWS_ENDPOINT_URL
export AWS_REGION=eu-west-1
manticore-backup --backup-dir=s3://company-dr-backups/manticore
```
Reducing Local Storage Requirements
For large datasets, local backup storage can be expensive. With S3 streaming:
- No need to provision large backup volumes
- Pay only for the S3 storage you use
- Lifecycle policies can automatically move old backups to cheaper storage classes
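As an example of the lifecycle point, a configuration like the one below tiers backups to cheaper storage classes and eventually expires them. It is shown in the dict shape that boto3's `put_bucket_lifecycle_configuration` expects; the prefix, day counts, and storage classes are hypothetical choices, not recommendations from the tool:

```python
# Hypothetical S3 lifecycle configuration for a backup bucket.
lifecycle = {
    "Rules": [
        {
            "ID": "tier-and-expire-manticore-backups",
            "Filter": {"Prefix": "manticore-backups/"},
            "Status": "Enabled",
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},  # infrequent access after a month
                {"Days": 90, "StorageClass": "GLACIER"},      # cold storage after a quarter
            ],
            "Expiration": {"Days": 365},                      # delete after a year
        }
    ]
}
```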
Technical Details
Streaming Architecture
The S3 storage implementation uses a streaming approach:
- File-by-file streaming: Each table file is read and uploaded as a stream
- Automatic multipart: Files over 5MB automatically use multipart upload for reliability
- Compression on-the-fly: If enabled, zstd compression happens during the stream
- Checksum verification: Each file is checksummed to ensure integrity
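Streaming and checksumming can happen in the same pass: hash each chunk as it goes out the door, so integrity verification costs no extra read of the data. A sketch of that single-pass idea (the hash algorithm here is illustrative; the tool's actual checksum scheme may differ):

```python
import hashlib

def upload_with_checksum(source, upload_chunk, chunk_size=64 * 1024):
    """Stream chunks to an uploader callback while accumulating a digest,
    so the checksum is ready the moment the upload finishes."""
    digest = hashlib.sha256()
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        digest.update(chunk)
        upload_chunk(chunk)
    return digest.hexdigest()
```

The returned digest can be stored alongside the object (e.g., in the manifest) and compared against a fresh hash on restore.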
Storage Interface
The S3 support is built on a new StorageInterface that abstracts storage operations. This means:
- Local filesystem and S3 share the same code path
- Future storage backends (GCS, Azure Blob) can be added easily
- Consistent behavior regardless of storage type
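The abstraction can be pictured with a short Python sketch — the real StorageInterface is PHP and its exact method names may differ, but the shape is the same: callers read and write through one interface and never care where the bytes land.

```python
import io
from typing import Protocol

class Storage(Protocol):
    """Sketch of the storage abstraction; method names are illustrative."""
    def put(self, path: str, stream) -> None: ...
    def get(self, path: str): ...
    def list(self, prefix: str) -> list: ...

class MemoryStorage:
    """Toy in-memory backend satisfying the same interface an S3 or
    local-filesystem backend would."""
    def __init__(self):
        self.objects = {}

    def put(self, path, stream):
        self.objects[path] = stream.read()

    def get(self, path):
        return io.BytesIO(self.objects[path])

    def list(self, prefix):
        return [k for k in self.objects if k.startswith(prefix)]
```

Adding a GCS or Azure Blob backend then means writing one new class, not touching the backup logic.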
Migration from Local Backups
Already using local backups? Migration is straightforward:
- Set up your S3 credentials
- Change `--backup-dir` from `/local/path` to `s3://bucket/path`
- That's it! The same commands work exactly the same way
Your existing local backups remain accessible, and you can gradually transition to S3 or maintain both for redundancy.
Conclusion
S3 streamable backup brings Manticore Search backup capabilities to the cloud era. Whether you're running in a cloud-native environment, need cross-region disaster recovery, or simply want to reduce local storage overhead, direct-to-S3 streaming makes backups simpler and more efficient.
The feature is available now in manticore-backup. Check out the documentation for more details, and let us know what you think!
Ready to try it? Install Manticore Search and start backing up to S3 today. Questions or feedback? Join us on Slack or GitHub.
