FireworkTV's Infrastructure Overhaul: Enhancing Video Recommendation System with AWS
- Analytics & Modeling - Machine Learning
- Sensors - Camera / Video Systems
- System Integration
- Training
FireworkTV, a decentralized short video network, was facing challenges with its existing machine learning (ML) infrastructure: lagging productivity, growing overhead costs, and a lack of automation. These issues were hindering the performance, quality, and reliability of its video recommendation model. The model, which is crucial for engaging users and driving ad revenue, needed to deliver highly accurate, real-time recommendations based on user-video interactions and specific content features. However, the existing infrastructure, built on AWS Lambda and PyTorch, was both expensive and cumbersome to operate, limiting the project's ability to scale. The team sought to build a new, more efficient infrastructure on AWS to drive these improvements.
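For context, a Lambda-based PyTorch serving path typically looks something like the minimal, hypothetical sketch below; the model path, request schema, and handler details are illustrative assumptions, not FireworkTV's actual code. Loading the model in the function's module scope means it is reloaded on every cold start, one source of the latency and per-invocation overhead described above.

```python
import json
import torch

# Hypothetical TorchScript recommendation model bundled with the
# Lambda deployment package; reloaded on every cold start.
MODEL_PATH = "/opt/ml/recommender.pt"
model = torch.jit.load(MODEL_PATH)
model.eval()

def handler(event, context):
    # Assumed payload: a flat feature vector built from user-video
    # interactions and content features.
    body = json.loads(event["body"])
    features = torch.tensor(body["features"], dtype=torch.float32)
    with torch.no_grad():
        scores = model(features.unsqueeze(0)).squeeze(0)
    # Return the indices of the top-10 scored videos as recommendations.
    top_videos = torch.topk(scores, k=10).indices.tolist()
    return {"statusCode": 200, "body": json.dumps({"video_ids": top_videos})}
```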
FireworkTV is the world's first decentralized short video network. It connects creators, fans, and engaged audiences by curating interactive 30-second videos tailored to each person's unique lifestyle and tastes. The company is developing a Netflix-like personalized video recommender system, and to boost app usage and drive ad revenue it relies heavily on that system delivering highly accurate, real-time recommendations.
In collaboration with Provectus, FireworkTV's ML team reviewed the existing infrastructure and inference process and decided to build a new ML infrastructure on Amazon SageMaker. The inference and training pipelines were migrated to Amazon SageMaker, enabling deployment at scale, and the team also proposed moving model serving from GPU to CPU instances. The shift away from Lambda was aimed at decreasing administrative overhead, unifying the ML tool stack for ease of use by engineers, and moving toward an automated pipeline. Comparing the performance and cost of the inference and training pipelines before and after migration demonstrated a significant improvement: the new infrastructure serves and trains ML models more efficiently, improving productivity, cutting overhead costs, and increasing user satisfaction.
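As a rough sketch of what this migration pattern can look like with the SageMaker Python SDK, assume a PyTorch training script and S3 data locations that are purely illustrative; the script name, bucket, role ARN, and instance types below are placeholders, not FireworkTV's actual configuration.

```python
import sagemaker
from sagemaker.pytorch import PyTorch

session = sagemaker.Session()
role = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"  # placeholder

# Managed training job: SageMaker provisions the instance, runs the
# (assumed) train.py script, and stores model artifacts in S3.
estimator = PyTorch(
    entry_point="train.py",
    source_dir="src",
    role=role,
    framework_version="1.13",
    py_version="py39",
    instance_count=1,
    instance_type="ml.m5.2xlarge",
    hyperparameters={"epochs": 10, "batch-size": 256},
    sagemaker_session=session,
)
estimator.fit({"train": "s3://example-bucket/recommender/train"})

# Real-time endpoint on a CPU instance (ml.c5) rather than a GPU,
# matching the serving change described above.
predictor = estimator.deploy(
    initial_instance_count=1,
    instance_type="ml.c5.xlarge",
)

# The default PyTorch predictor serializes NumPy-compatible input,
# so a nested list of feature vectors works for a quick smoke test.
scores = predictor.predict([[0.1, 0.4, 0.7]])
```

Running both training and inference through one SDK is what unifies the tool stack, and the same estimator and endpoint definitions can later be wired into an automated workflow.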