High-Speed Content Distribution Analytics for Disney+ with ClickHouse

Technology Category
  • Infrastructure as a Service (IaaS) - Cloud Computing
  • Infrastructure as a Service (IaaS) - Cloud Middleware & Microservices
Applicable Industries
  • Buildings
  • Construction & Infrastructure
Applicable Functions
  • Logistics & Transportation
Use Cases
  • Last Mile Delivery
  • Time Sensitive Networking
Services
  • System Integration
The Challenge
Disney+'s Observability team needed to process and analyze access logs for the service's content distribution system. Disney+ users generate a massive volume of log data, which calls for a highly scalable, distributed database system. The existing solutions, such as Elasticsearch, Hadoop, and Flink, could not handle that volume efficiently. Elasticsearch, for instance, required a lot of rebalancing and runs on a Java virtual machine, which the team regarded as an unnecessary layer of virtualization. As a result, the team struggled to ingest all of the logs.
About The Customer
The customer in this case study is the Observability team within Disney+. The team manages the peripheral systems that support Disney+'s content distribution system and works with access logs for video data to identify issues such as latency. The team initially struggled to ingest all of its logs because of the data volume; since choosing ClickHouse, it has been able to process and analyze them efficiently.
The Solution
The Observability team at Disney+ chose ClickHouse as the database for its access-log processing and analysis pipeline. ClickHouse was chosen over other options for its simplicity, single-binary setup, and lightweight architecture. The current cluster consists of 20 nodes with 2x replication, providing 160 terabytes of storage and 2.5 terabytes of RAM. With this setup the team writes 3 million rows a second and reads 2 billion rows a second of CDN access logs. The team plans to upgrade to a larger cluster of 32 nodes, each with 154 terabytes of storage and 112 cores, which is expected to significantly improve its data management capabilities. ClickHouse was set up to be accessed like an HTTP server, which makes it flexible and easy to use. The team also recommends keeping logs in loosely structured formats such as JSON or key=value to make the system more flexible.
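The recommendation to log in JSON or key=value form, combined with HTTP access, can be illustrated with a minimal Python sketch. It is not the team's actual pipeline. It assumes ClickHouse's HTTP interface on its default port 8123 and the JSONEachRow input format; the table name `cdn_access_logs`, the field names, and the sample lines are hypothetical. The request is only built, not sent.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request


def parse_key_value(line: str) -> dict:
    """Parse a 'key=value key=value' log line into a dict of strings."""
    fields = {}
    for token in line.split():
        key, sep, value = token.partition("=")
        if sep:  # ignore tokens that contain no '='
            fields[key] = value
    return fields


def to_json_each_row(lines) -> str:
    """Convert mixed JSON / key=value log lines to newline-delimited JSON."""
    rows = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        row = json.loads(line) if line.startswith("{") else parse_key_value(line)
        rows.append(json.dumps(row, sort_keys=True))
    return "\n".join(rows) + "\n" if rows else ""


def build_insert_request(base_url: str, table: str, body: str) -> Request:
    """Build (but do not send) an HTTP POST carrying JSONEachRow data."""
    query = f"INSERT INTO {table} FORMAT JSONEachRow"
    url = f"{base_url}/?{urlencode({'query': query}, quote_via=quote)}"
    return Request(url, data=body.encode("utf-8"), method="POST")


# Hypothetical sample lines: one key=value, one JSON.
sample_lines = [
    "status=200 latency_ms=42 path=/video/segment1.ts",
    '{"status": "404", "latency_ms": "7", "path": "/video/missing.ts"}',
]
payload = to_json_each_row(sample_lines)
request = build_insert_request("http://localhost:8123", "cdn_access_logs", payload)
print(request.full_url)
print(payload, end="")
```

The key=value parser here splits on whitespace, so values that contain spaces would need quoting rules that this sketch does not implement.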
Operational Impact
  • ClickHouse's lightweight architecture has enabled the Observability team to handle the massive volume of log data generated by Disney+ users.
  • The team writes 3 million rows a second and reads 2 billion rows a second of CDN access logs.
  • The planned upgrade to a larger cluster is expected to further enhance the team's data management capabilities.
  • Setting ClickHouse up like an HTTP server has made it easy to use, and logging in formats like JSON or key=value has made the system more flexible.
Quantitative Benefit
  • Able to write 3 million rows a second
  • Able to read 2 billion rows a second of CDN access logs
  • Current setup provides 160 terabytes of storage and 2.5 terabytes of RAM
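
To put these figures in perspective, the following back-of-envelope Python sketch derives per-node and per-day numbers. It rests on assumptions the case study does not confirm: the quoted figures are cluster-wide totals for the 20-node cluster described above, load is spread evenly across nodes, the 160 terabytes is raw capacity before 2x replication, and units are decimal.

```python
# Figures quoted in the case study (assumed to be cluster-wide totals).
NODES = 20
REPLICATION = 2
STORAGE_TB = 160
RAM_TB = 2.5
WRITE_ROWS_PER_SEC = 3_000_000

SECONDS_PER_DAY = 86_400

# Assumption: resources and load are spread evenly across nodes.
storage_per_node_tb = STORAGE_TB / NODES
ram_per_node_gb = RAM_TB * 1000 / NODES  # decimal units assumed
write_rows_per_node = WRITE_ROWS_PER_SEC / NODES

# Assumption: 160 TB is raw capacity, so replication halves the unique data.
unique_data_tb = STORAGE_TB / REPLICATION

# Sustained ingest at the quoted write rate over one day.
rows_per_day = WRITE_ROWS_PER_SEC * SECONDS_PER_DAY

print(f"storage per node: {storage_per_node_tb} TB")
print(f"RAM per node: {ram_per_node_gb} GB")
print(f"writes per node: {write_rows_per_node:,.0f} rows/s")
print(f"unique data capacity: {unique_data_tb} TB")
print(f"rows per day: {rows_per_day:,}")
```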
