Hadoop to Apache Spark Migration: A Case Study on Performance Improvement

Technology Category
  • Analytics & Modeling - Big Data Analytics
  • Platform as a Service (PaaS) - Application Development Platforms
Use Cases
  • Time Sensitive Networking
Services
  • Data Science Services
The Challenge
The customer, a leading American multinational software firm, was facing significant challenges with its existing Big Data platform. The original solution was built on the Hadoop MapReduce engine and Hive queries (HQL), and this setup was proving inefficient. The main issues were slow code execution, high storage requirements, and workflows that were difficult to maintain. These issues were hurting business performance and slowing the company's digital innovation. As part of a multiyear initiative, the company planned to move its Big Data platform from an on-premises Cloudera Hadoop instance to Cloudera Data Platform (CDP) on Azure. The first step was to review the prioritized MapReduce jobs in their current state and consider migrating them to Spark to reduce execution and processing time.
The Customer

American multinational computer software company

About The Customer
The customer is an American multinational computer software company known for innovations that are redefining the possibilities of digital experiences. As a leader in its field, the company is constantly looking for ways to improve its operations and stay ahead of the competition. Its commitment to digital innovation is evident in the multiyear initiative to move its Big Data platform to Azure. The existing setup, however, was slowing that progress and affecting business performance.
The Solution
WinWire, in collaboration with the customer, converted two prioritized jobs (LTV and AES) from MapReduce to Spark. Both jobs were categorized as high complexity. The WinWire team successfully rewrote the MapReduce code as Spark code, enabling the customer to process data faster and improving the overall performance of the jobs. The transition reduced execution time by more than 50%. The migration not only addressed the immediate issues of slow execution, high storage requirements, and workflow maintenance but also set the stage for the subsequent move of the Big Data platform to Azure.
Operational Impact
  • The migration from Hadoop MapReduce to Apache Spark brought significant operational improvements for the customer. The most notable was a reduction in execution and processing time of more than 50%, which allowed the customer to process data faster and improved the overall performance of the migrated jobs. The transition also made workflows easier to maintain, reducing the time and resources that maintenance required. Finally, the migration paved the way for the next step in the multiyear initiative: moving the Big Data platform to Azure. That move is expected to further improve operational efficiency and support the customer's continued digital innovation.
Quantitative Benefit
  • Reduced the execution and processing time of jobs by more than 50%
  • Transitioned high complexity jobs from MapReduce to Spark
  • Set the stage for the subsequent move of the Big Data platform to Azure
