IMVU's Transformation: Leveraging AWS for Advanced Analytics and Data Streaming
IMVU, the world’s largest avatar-based social network, found that its on-premises infrastructure limited its capacity for advanced analytics. The company had been an early adopter of Apache Hadoop and Big Data technologies, but its 90-node Hadoop cluster and in-house tooling had become difficult to support and upgrade. IMVU’s analysts lacked the tools to rapidly generate business-critical reports on customer in-game behavior at scale. They worked with historical data in batches, which made analytics more complex and created multiple bottlenecks. Late reports led to inaccurate assumptions about customer in-game purchases, and in turn to slower sales and lost profit. The analytics team also lacked a test environment in which to check its assumptions efficiently. The company sought to modernize its platform’s data architecture by introducing CI/CD, Infrastructure as Code (IaC), and other best practices, in order to achieve faster analytics iterations, better maintainability, and lower total cost of ownership (TCO).
IMVU is the world’s largest avatar-based social network, with over 7 million monthly users. Users can customize their avatars, chat with friends, shop, hang out at cool parties, and earn real money by creating virtual products. The company was an early adopter of Apache Hadoop, taking advantage of Big Data technologies before they became mainstream. IMVU was looking to re-architect and enhance its platform with advanced analytics and data streaming.
IMVU partnered with Provectus and AWS to drive innovation, implement a streaming architecture, enable advanced analytics, and build several AI-powered apps to improve long-term customer retention. Together they developed a comprehensive migration and modernization strategy toward a NextGen architecture. IMVU’s data pipelines were re-architected to meet the requirements of a modern Data Platform, using open source solutions and AWS services. The Data Platform was designed to use Apache Airflow for both job scheduling and monitoring, and to run on Amazon EKS. Data pipelines were optimized to take advantage of Amazon EMR automatic scaling policies. Provectus optimized Hive and Spark jobs, decoupled the compute and storage layers, introduced a new query engine based on Presto, and implemented concise, self-documenting delivery pipelines. The Business Intelligence (BI) layer was optimized to serve the needs of IMVU’s data analytics team. As part of this work, Provectus migrated IMVU’s Hadoop clusters to Amazon EMR, with data and compute decoupled.
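To make the EMR automatic scaling step more concrete, here is a minimal sketch of what such a policy can look like. It only builds the policy document as a Python dictionary and does not call AWS. The function name, node counts, thresholds, and rule names are hypothetical and chosen for illustration; the case study does not describe IMVU's actual configuration.

```python
import json


def build_emr_scaling_policy(min_nodes, max_nodes,
                             low_memory_pct=15.0, high_memory_pct=75.0,
                             cooldown_s=300):
    """Build an EMR automatic scaling policy document as a plain dict.

    All numeric defaults are illustrative assumptions, not values taken
    from the case study.
    """
    if not 0 < min_nodes <= max_nodes:
        raise ValueError("expected 0 < min_nodes <= max_nodes")

    def rule(name, comparison, threshold, adjustment):
        # One scaling rule: a CloudWatch alarm on available YARN memory
        # that adds or removes `adjustment` nodes when it fires.
        return {
            "Name": name,
            "Action": {
                "SimpleScalingPolicyConfiguration": {
                    "AdjustmentType": "CHANGE_IN_CAPACITY",
                    "ScalingAdjustment": adjustment,
                    "CoolDown": cooldown_s,
                }
            },
            "Trigger": {
                "CloudWatchAlarmDefinition": {
                    "ComparisonOperator": comparison,
                    "EvaluationPeriods": 1,
                    "MetricName": "YARNMemoryAvailablePercentage",
                    "Namespace": "AWS/ElasticMapReduce",
                    "Period": 300,
                    "Statistic": "AVERAGE",
                    "Threshold": threshold,
                    "Unit": "PERCENT",
                }
            },
        }

    return {
        "Constraints": {"MinCapacity": min_nodes, "MaxCapacity": max_nodes},
        "Rules": [
            # Little free memory: the cluster is busy, so add a node.
            rule("ScaleOutOnLowMemory", "LESS_THAN", low_memory_pct, 1),
            # Plenty of free memory: the cluster is idle, so remove a node.
            rule("ScaleInOnHighMemory", "GREATER_THAN", high_memory_pct, -1),
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_emr_scaling_policy(2, 10), indent=2))
```

A dictionary of this shape is what the EMR API accepts as the `AutoScalingPolicy` argument, for example through boto3's `put_auto_scaling_policy` call on an instance group. Scaling on available YARN memory is one common choice; pending containers or HDFS utilization are alternatives, and the right metric depends on the workload.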