Is there a feature/config that allows whole volumes/shards/etc. to be exported to, or hosted from, cold storage (e.g. S3/GCS/Backblaze) and “re-loaded” on the fly?
Ideally this would happen automatically, e.g. by Kubernetes spinning up another shard.
The concept is that I have a large timeseries dataset, 98% of which will never be queried. On the rare occasion that something does need to be queried from a document, that document would become part of the live DB (pay once for IOPS retrieval from cold storage -> SSD).
It seems like the “Data Lake” feature is roughly what I want here, but I don’t want to get saddled with repeated IOPS costs if that 2% of active data becomes very active and is repeatedly queried.
Caching in front of cold storage, basically.
The explicit intent is to minimize bulk storage costs for mostly archival data, using cloud-based storage with “unlimited” scaling.
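To make the pattern concrete, here is a rough sketch of the read-through behaviour I have in mind. It is not MongoDB code: `ReadThroughStore`, `fetch_cold`, and the in-memory `archive` dict are hypothetical stand-ins for the live DB, the cold-storage retrieval call, and the S3/GCS bucket respectively.

```python
from typing import Callable, Dict, Optional


class ReadThroughStore:
    """Hot store in front of a cold store; a cold read promotes the document."""

    def __init__(self, fetch_cold: Callable[[str], Optional[dict]]):
        self._fetch_cold = fetch_cold        # hypothetical cold-storage lookup
        self._hot: Dict[str, dict] = {}      # stands in for the live DB on SSD
        self.cold_reads = 0                  # number of billable cold retrievals

    def get(self, doc_id: str) -> Optional[dict]:
        doc = self._hot.get(doc_id)
        if doc is not None:
            return doc                       # hot hit: no cold-storage cost
        self.cold_reads += 1                 # hot miss: pay for one retrieval
        doc = self._fetch_cold(doc_id)
        if doc is not None:
            self._hot[doc_id] = doc          # promote so later reads stay hot
        return doc


# Hypothetical archive; a plain dict stands in for the object-storage bucket.
archive = {"sensor-1:2019-01-01": {"value": 42}}
store = ReadThroughStore(archive.get)

store.get("sensor-1:2019-01-01")  # first read goes to cold storage
store.get("sensor-1:2019-01-01")  # second read is served from the hot store
print(store.cold_reads)           # prints 1
```

The point is that the retrieval cost is paid once per document rather than once per query. The sketch leaves out eviction of documents that go cold again and caching of lookups for documents that do not exist.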
If that pattern doesn’t already exist in Mongo’s feature set, would the existing plugin/integration architecture support implementing it?
(This question is based on a real-life example of a startup that was/is spending $200K+/year on BigQuery scan costs because BQ is being used as a general app engine behind a data API.)