Description
Is your feature request related to a problem? Please describe.
- Tempo queriers in multitenant mode issue delimited listings (delimiter="/", prefix=None) at startup to discover tenant prefixes.
- We use Dell ECS, which stores object keys in a flat index.
- On Dell ECS, this listing operation is extremely expensive and slow.
- ECS must scan millions of object keys to simulate a folder structure, unlike AWS S3, which optimizes this operation.
- Under heavy load, this delays querier startup, which can lead to crash loops if the queriers do not finish starting before the liveness probes time out.
- The issue is specific to non-AWS object storage platforms that lack hierarchical indexing.
- Our buckets may contain upwards of 100 tenant prefixes, each with many thousands of objects, amplifying the problem.
- We’ve observed this behavior during recent outages, confirmed by storage operations and ECS engineers.
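To illustrate why the listing is costly on a flat index, here is a minimal sketch in Python. It is a simplified model, not ECS's actual implementation: `list_common_prefixes` is a hypothetical helper, and the `block-NNNN/meta.json` key layout is made up for the example. The point is only that every key has to be examined to derive a handful of top-level prefixes.

```python
from typing import Iterable, List, Optional, Tuple


def list_common_prefixes(
    keys: Iterable[str], delimiter: str = "/", prefix: Optional[str] = None
) -> Tuple[List[str], int]:
    """Emulate a delimited listing over a flat key index.

    Returns the common prefixes and the number of keys examined,
    so the cost of the scan is visible.
    """
    base = prefix or ""
    found = set()
    scanned = 0
    for key in keys:
        scanned += 1  # a flat index has to look at every key
        if not key.startswith(base):
            continue
        rest = key[len(base):]
        cut = rest.find(delimiter)
        if cut >= 0:
            # Keep everything up to and including the first delimiter.
            found.add(base + rest[: cut + 1])
    return sorted(found), scanned


# Hypothetical bucket: 3 tenants with 1,000 objects each.
tenants = ["102911-dev", "102911-prod", "284746-qa"]
flat_index = [f"{t}/block-{i:04d}/meta.json" for t in tenants for i in range(1000)]

prefixes, scanned = list_common_prefixes(flat_index, delimiter="/", prefix=None)
print(prefixes)  # ['102911-dev/', '102911-prod/', '284746-qa/']
print(scanned)   # 3000
```

In this model, 3,000 keys are scanned to return 3 prefixes. With around 100 tenants and thousands of objects each, the ratio gets correspondingly worse.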
Describe the solution you'd like
We’d like Tempo to support alternative tenant discovery strategies that avoid delimited listings. Specifically:
- Support for static tenant prefix lists or cached tenant indexes.
- Staggered or delayed querier startup to reduce simultaneous load on the backend.
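As a rough sketch of what we have in mind (Python for brevity; none of these names exist in Tempo today, and the JSON index format is an assumption), tenant discovery could try the cheapest source first and only fall back to the delimited listing when nothing else is configured. A random startup delay would spread out any listings that still happen.

```python
import json
import random
from typing import Callable, List, Optional, Sequence


def discover_tenants(
    static_tenants: Optional[Sequence[str]],
    read_index: Callable[[], Optional[bytes]],
    list_delimited: Callable[[], List[str]],
) -> List[str]:
    """Pick tenant IDs from the cheapest available source (hypothetical)."""
    # 1. Operator-supplied static list: no backend calls at all.
    if static_tenants:
        return sorted(static_tenants)
    # 2. Cached index object: one GET instead of a bucket-wide listing.
    #    Assumed format: {"tenants": ["id1", "id2", ...]}
    raw = read_index()
    if raw is not None:
        return sorted(json.loads(raw)["tenants"])
    # 3. Fallback: the current behaviour, a delimited listing.
    return sorted(p.rstrip("/") for p in list_delimited())


def startup_delay(max_seconds: float, rng: Optional[random.Random] = None) -> float:
    """Random delay in [0, max_seconds] so queriers do not all list at once."""
    if rng is None:
        rng = random.Random()
    return rng.uniform(0.0, max_seconds)


# Usage with stand-in callables; a real backend client would go here.
calls = []


def fake_listing() -> List[str]:
    calls.append("list")
    return ["102911-dev/", "102911-prod/", "284746-qa/"]


from_static = discover_tenants(["284746-qa", "102911-dev"], lambda: None, fake_listing)
from_index = discover_tenants(None, lambda: b'{"tenants": ["102911-prod"]}', fake_listing)
from_listing = discover_tenants(None, lambda: None, fake_listing)
```

Only the last call reaches the delimited listing; the first two never touch it, which is the behaviour we would want on ECS.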
Describe alternatives you've considered
We’ve tried increasing querier startup timeouts, but this doesn’t prevent the initial delimited listing. Dell ECS does not currently support hierarchical indexing, so backend optimizations are limited.
Additional context
This issue is specific to non-AWS object storage platforms like Dell ECS, which do not optimize for delimited listings. AWS S3 handles these queries efficiently due to internal indexing strategies, but ECS must simulate hierarchy by scanning the flat index. We’ve confirmed this behavior with our storage operations team and observed it during recent outages. A sample bucket listing looks like:
PRE 102911-dev/
PRE 102911-prod/
PRE 284746-qa/
...
Each prefix contains hundreds to thousands of objects.
A more storage-agnostic approach would improve compatibility and reliability across platforms.