IMVU's Transformation: Leveraging AWS for Advanced Analytics and Machine Learning
IMVU, the world’s largest avatar-based social network, was struggling with an aging on-premises data platform. The company wanted to re-architect the platform to support advanced analytics and Machine Learning use cases, but exponentially growing data volume and a monolithic Hadoop architecture made it difficult to use user-generated data efficiently. The existing infrastructure limited both innovation and analytics capacity. IMVU’s analysts lacked the tools to rapidly generate business-critical reports on customer in-game behavior at scale. They worked with historical data in batches, which led to late reports, inaccurate assumptions about customer in-game purchases, slower sales, and lost profit. The analytics team also lacked a test environment in which to check analytics assumptions efficiently. The platform ran on a 90-node on-premises Hadoop cluster that was costly to operate and inefficient.
IMVU is the world’s largest avatar-based social network, with over 7 million monthly users. The platform lets users customize their avatars, chat with friends, shop, hang out at parties, and earn real money by creating virtual products. IMVU was an early adopter of Apache Hadoop, taking advantage of Big Data technologies before they became mainstream. Despite deep internal expertise, the company found it increasingly difficult to support and upgrade its 90-node Hadoop cluster and in-house tooling. It sought to modernize the platform’s data architecture by introducing CI/CD, Infrastructure as Code (IaC), and other best practices.
IMVU partnered with Provectus and AWS to re-architect its data platform for the AWS cloud. The company migrated its Apache Hadoop clusters to Amazon EMR, optimized Hive/Spark jobs, and mirrored Apache Kafka data streams to the cloud. The data pipelines were modernized using open source solutions and AWS services. The platform was designed to use Apache Airflow, running on Amazon EKS, for job scheduling and monitoring. The pipelines use Amazon EMR auto scaling policies to absorb increasing data volumes without delaying report delivery. The Business Intelligence (BI) layer was optimized to serve the needs of IMVU’s data analytics team. Provectus built and deployed a custom solution that uses PrestoSQL for data access and Apache Ranger for managing the security of the Data Platform.
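To make the EMR auto scaling idea concrete, here is a minimal sketch of what a scale-out policy for an EMR instance group might look like. It is not IMVU's or Provectus's actual configuration, which the case study does not disclose. The function name `build_scale_out_policy`, the node limits, the 15% memory threshold, and the rule name are all illustrative assumptions; only the dictionary layout follows the shape that Amazon EMR expects for an automatic scaling policy.

```python
from typing import Any, Dict


def build_scale_out_policy(
    min_nodes: int,
    max_nodes: int,
    memory_threshold_pct: float = 15.0,  # assumed threshold, not from the case study
) -> Dict[str, Any]:
    """Build an EMR automatic scaling policy that adds one node when
    available YARN memory drops below the given percentage."""
    if not 0 < min_nodes <= max_nodes:
        raise ValueError("expected 0 < min_nodes <= max_nodes")

    return {
        # Hard limits the instance group may never leave.
        "Constraints": {"MinCapacity": min_nodes, "MaxCapacity": max_nodes},
        "Rules": [
            {
                "Name": "ScaleOutOnLowYarnMemory",  # hypothetical rule name
                "Description": "Add a node when YARN memory is scarce",
                "Action": {
                    "SimpleScalingPolicyConfiguration": {
                        "AdjustmentType": "CHANGE_IN_CAPACITY",
                        "ScalingAdjustment": 1,  # add one node per trigger
                        "CoolDown": 300,  # seconds to wait before scaling again
                    }
                },
                "Trigger": {
                    "CloudWatchAlarmDefinition": {
                        "ComparisonOperator": "LESS_THAN",
                        "EvaluationPeriods": 1,
                        "MetricName": "YARNMemoryAvailablePercentage",
                        "Namespace": "AWS/ElasticMapReduce",
                        "Period": 300,  # evaluate the metric over 5 minutes
                        "Statistic": "AVERAGE",
                        "Threshold": memory_threshold_pct,
                        "Unit": "PERCENT",
                    }
                },
            }
        ],
    }


# Example: allow the group to grow from 2 to 20 nodes (illustrative numbers).
policy = build_scale_out_policy(min_nodes=2, max_nodes=20)
```

A dictionary of this shape could be passed as the `AutoScalingPolicy` argument of the boto3 EMR client's `put_auto_scaling_policy` call, together with a cluster ID and an instance group ID. A production policy would normally pair this with a matching scale-in rule so that capacity shrinks again once report jobs finish.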