Optimizing Genome Pipelines: A Case Study of Wellcome Sanger Institute
Technology Category
- Application Infrastructure & Middleware - Database Management & Storage
- Infrastructure as a Service (IaaS) - Cloud Storage Services
Applicable Industries
- Equipment & Machinery
- Oil & Gas
Applicable Functions
- Product Research & Development
Use Cases
- Time Sensitive Networking
Services
- Cloud Planning, Design & Implementation Services
The Challenge
The Wellcome Sanger Institute, a leading center for genomic discovery, faced a significant challenge in managing the volume of data generated by its cancer genome projects. Each cancer sample produced approximately 250 GB of data after initial processing, so efficient data storage was essential. The team needed to make one of its cancer pipelines portable and optimize it for cloud deployment. Most pipelines were written and tested on local machines and then run in parallel on compute clusters with shared storage. However, I/O behavior on a cluster differs markedly from I/O behavior on a local machine, and without comprehensive I/O profiling tools, inefficient I/O patterns could degrade shared storage performance and slow down other work running on the same cluster.
About The Customer
The Wellcome Sanger Institute is a globally recognized center for genomic discovery and understanding. It spearheads ambitious collaborations worldwide to lay the groundwork for further research and transformative healthcare innovations. The Institute's Cancer Genome Project uses high-throughput genome sequencing to identify somatically acquired mutations, with the goal of characterizing cancer genes, mutational processes, and patterns of clonal evolution in human tumors. The Institute is at the forefront of genomic research, carrying out genome projects to find cures for cancer, a disease that, according to Cancer Research UK, will affect 1 in 2 people born after 1960 at some point in their lives.
The Solution
The Wellcome Sanger Institute used Altair Mistral™ to profile the pipeline and identify inefficient I/O patterns. Although parts of the pipeline had already been optimized, Mistral showed that other parts still needed work. In particular, it identified a large number of small reads, which can reduce computational performance and create suboptimal I/O patterns on shared storage. With these small reads optimized, storage could run at maximum bandwidth with minimal impact on other jobs. The team also used Altair Breeze™ to profile the containerized workload in the cloud on Amazon Web Services (AWS). Breeze showed that the default storage option offered better value than the faster, more expensive options.
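To illustrate why a large number of small reads matters, the following minimal Python sketch counts how many read calls are needed to consume the same file with small versus large request sizes. It does not show how Mistral or Breeze work, and it is not taken from the Institute's pipeline. The function names (`count_reads`, `make_sample_file`), the file size, and the chunk sizes are assumptions chosen only for demonstration.

```python
import os
import tempfile


def count_reads(path: str, chunk_size: int) -> int:
    """Read the whole file in chunk_size requests; return the number of
    read calls that returned data."""
    calls = 0
    # buffering=0 opens a raw file object, so every read() below is passed
    # to the operating system instead of being served from a Python buffer.
    with open(path, "rb", buffering=0) as f:
        while True:
            data = f.read(chunk_size)
            if not data:  # empty bytes object means end of file
                break
            calls += 1
    return calls


def make_sample_file(size_bytes: int) -> str:
    """Create a temporary file of size_bytes zero bytes (hypothetical test data)."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(b"\0" * size_bytes)
    return path


if __name__ == "__main__":
    size = 1024 * 1024  # 1 MiB sample file, an arbitrary illustrative size
    path = make_sample_file(size)
    try:
        small = count_reads(path, 64)     # many small requests
        large = count_reads(path, size)   # one large request
        print(f"64-byte reads: {small} calls; 1 MiB reads: {large} calls")
    finally:
        os.remove(path)
```

Both runs transfer the same number of bytes, but the small-request run issues far more calls. On shared or network storage, each call carries its own overhead, which is the kind of pattern the profiling described above is meant to surface.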