Sunday, November 23, 2025

Horizontal scaling of Chroma DB

  • Chroma DB has become a critical component of our ads infrastructure: it serves ads to our end users from our indirect ads source.
  • Our Ads Search service uses cosine similarity over the indirect ads source to find the most closely related ads.
  • Growth in user volume has pushed us to explore options for scaling the Chroma DB infrastructure.
  • Our Chroma DB holds ads from our third-party advertisers, which are published once per day, so it is mostly a read-heavy system.
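
To make the similarity step concrete, here is a minimal pure-Python sketch of ranking ads by cosine similarity. It is not the Chroma API and not our service code; the function names, the toy two-dimensional embeddings, and the `k` parameter are assumptions made for illustration only.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # A zero vector has no direction; treat it as unrelated.
        return 0.0
    return dot / (norm_a * norm_b)


def closest_ads(query, ads, k=2):
    """Rank ads (a dict of ad id -> embedding) by similarity, best first."""
    scored = [(ad_id, cosine_similarity(query, emb)) for ad_id, emb in ads.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy data: real embeddings have hundreds of dimensions.
ads = {"shoes": [1.0, 0.0], "laptops": [0.0, 1.0], "sneakers": [1.0, 1.0]}
top = closest_ads([1.0, 0.0], ads, k=2)
```

In production the vector store performs this ranking with an index rather than a full scan; the sketch only shows what "closest" means here.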

Challenges in scaling

  1. The v2 APIs address a collection by a UUID that each Chroma instance generates on its own, so the same collection can have a different id on every node. This defeats the idea of simply putting a load balancer in front of a group of Chroma nodes.
  2. We also need high availability for the Chroma infrastructure, since it is on the critical path and any downtime results in lost revenue.
  3. We currently have east and west Chroma instances, each serving the search service traffic of its own US region. We also want to improve resiliency by letting the search service fall back across regions.
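
The cross-region fallback in the last point could look roughly like the sketch below. It assumes each region is represented by a plain callable that takes an embedding and returns results; `query_with_fallback`, the region labels, and the stand-in query functions are hypothetical names, and a real implementation would wrap the actual Chroma client calls and add timeouts.

```python
def query_with_fallback(region_queries, embedding):
    """Try each (region, query_fn) pair in order; return the first success.

    region_queries is an ordered list, local region first.
    """
    errors = []
    for region, query_fn in region_queries:
        try:
            return region, query_fn(embedding)
        except Exception as exc:  # deliberately broad in a sketch
            errors.append((region, repr(exc)))
    raise RuntimeError(f"all regions failed: {errors}")


# Hypothetical stand-ins for the east and west Chroma query calls.
def east_query(embedding):
    raise ConnectionError("east is down")


def west_query(embedding):
    return ["ad-1", "ad-2"]


served_by, results = query_with_fallback(
    [("east", east_query), ("west", west_query)], [0.1, 0.2]
)
```

Ordering the list with the local region first keeps the normal path in-region and only pays the cross-region latency when the local instances fail.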

Infrastructure


  • Our infrastructure in a given region looks like the picture above.
  • The indirect ads pipeline ingests the data and then filters out undesired traffic using various classifiers. In its final stages it embeds the ad data and publishes it to one of our Chroma write replicas.
  • After publishing is complete, that write replica compresses the SQLite and other metadata files and uploads them to object storage.
  • The pipeline later kicks off a download and restore on each read replica, currently one after another. We plan to move to batches, so that replication can run in parallel across the east and west clusters.
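
The snapshot and batched restore flow above can be sketched with the standard library alone. This is an illustration under assumptions, not our pipeline code: local directories stand in for replicas, the upload to and download from object storage is left out, and all function names and the `batch_size` parameter are hypothetical.

```python
import tarfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def compress_snapshot(data_dir, archive_path):
    """Pack every file under the write replica's data dir into a .tar.gz."""
    with tarfile.open(archive_path, "w:gz") as tar:
        for item in Path(data_dir).iterdir():
            tar.add(item, arcname=item.name)
    return archive_path


def restore_snapshot(archive_path, target_dir):
    """Unpack a snapshot into one read replica's data dir."""
    Path(target_dir).mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(target_dir)
    return target_dir


def restore_in_batches(archive_path, replica_dirs, batch_size=2):
    """Restore replicas batch by batch; replicas within a batch run in parallel."""
    restored = []
    for start in range(0, len(replica_dirs), batch_size):
        batch = replica_dirs[start:start + batch_size]
        with ThreadPoolExecutor(max_workers=len(batch)) as pool:
            restored.extend(
                pool.map(lambda d: restore_snapshot(archive_path, d), batch)
            )
    return restored
```

Restoring in batches rather than all at once keeps some replicas serving reads while others are being refreshed; the right batch size depends on how many replicas can be out of rotation at a time.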

Accomplishments

  • We achieved the goals we set out at the beginning.
  • Along the way we learned that multi-region replication from object storage located in a single geography becomes the bottleneck, the slowest part of the infrastructure, when transfers run to multiple gigabytes. We plan to create a separate storage account for each region.
  • Average latency dropped for the search API, which is the actual consumer of the ads data. This has given us enough confidence that we can scale our services to close to 5 million requests per day, which is what the team hopes to achieve.
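
For a sense of scale, the 5 million requests per day target can be converted to an average request rate. The sketch below does only that arithmetic; the peak factor is a made-up illustrative input, not a number we measured.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_rps(requests_per_day):
    """Average requests per second if traffic were spread evenly over a day."""
    return requests_per_day / SECONDS_PER_DAY


def peak_rps(requests_per_day, peak_factor):
    """Estimated peak rate, given an assumed peak-to-average ratio."""
    return average_rps(requests_per_day) * peak_factor


avg = average_rps(5_000_000)       # roughly 57.9 requests per second
peak = peak_rps(5_000_000, 3.0)    # 3.0 is a hypothetical peak factor
```

Real traffic is not evenly spread, so capacity planning should use the observed peak-to-average ratio rather than the average alone.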






 
