Abstract: Large-scale monitoring systems that require high availability and performance for both ingestion and retrieval of data often encounter rogue data streams with high cardinality. Managing such high-cardinality time-series data in a performance-sensitive system is challenging, primarily because the time-series data typically needs to be loaded into limited memory before results can be returned to the client. This limits the number of incoming queries that can be served simultaneously. Too many time series can degrade read performance and thereby affect user experience. The rate-limiting system proposed here addresses a key availability issue in a high-volume time-series system by combining dynamic cardinality computation with a central assessment service to detect and block high-cardinality data streams. As a result, anomalous logging behavior is detected quickly, affected tenants are notified, and hardware resources are used optimally.
Authors: Deepak K Vasthimal (eBay Inc., USA); Sudeep Kumar (eBay Inc., USA)
Email: firstname.lastname@example.org, email@example.com
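
The detect-and-block idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class `CardinalityAssessor`, the method `admit`, the threshold value, and the tenant and metric names are all hypothetical, and the sketch counts distinct series with exact in-memory sets, whereas a production system would more likely use an approximate counter and a separate service.

```python
from collections import defaultdict


class CardinalityAssessor:
    """Hypothetical central service that tracks distinct series per tenant."""

    def __init__(self, max_series_per_tenant: int) -> None:
        self.max_series = max_series_per_tenant   # assumed fixed threshold
        self.series_seen = defaultdict(set)       # tenant -> set of series keys
        self.blocked = set()                      # tenants currently blocked

    def admit(self, tenant: str, metric: str, tags: dict) -> bool:
        """Return True if the sample may be ingested, False if it is rejected."""
        if tenant in self.blocked:
            return False
        # A series is identified by its metric name plus its sorted tag pairs.
        key = (metric, tuple(sorted(tags.items())))
        seen = self.series_seen[tenant]
        seen.add(key)
        if len(seen) > self.max_series:
            # Cardinality exceeded: block the tenant from further ingestion.
            self.blocked.add(tenant)
            return False
        return True


assessor = CardinalityAssessor(max_series_per_tenant=3)

# A rogue stream: every sample carries a unique request_id tag value.
results = [
    assessor.admit("tenant-a", "http_requests", {"request_id": str(i)})
    for i in range(5)
]

# A well-behaved tenant keeps reusing the same series.
ok = [
    assessor.admit("tenant-b", "http_requests", {"status": "200"})
    for _ in range(5)
]

print(results)
print(ok)
print(assessor.blocked)
```

In this sketch the rogue tenant is rejected once its distinct series count exceeds the threshold, while the tenant that reuses a single series is unaffected. Notifying affected tenants and lifting a block are left out.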