
Optimizing Genome Pipelines: A Case Study of Wellcome Sanger Institute

Technology Category
  • Application Infrastructure & Middleware - Database Management & Storage
  • Infrastructure as a Service (IaaS) - Cloud Storage Services
Applicable Industries
  • Equipment & Machinery
  • Oil & Gas
Applicable Functions
  • Product Research & Development
Use Cases
  • Time Sensitive Networking
Services
  • Cloud Planning, Design & Implementation Services
The Challenge
The Wellcome Sanger Institute, a leading center for genomic discovery, faced a significant challenge in managing the immense amount of data generated by its cancer genome projects. Each cancer sample produced approximately 250 GB of data after initial processing, so efficient data storage was essential. The team needed to make one of its cancer pipelines portable and optimize it for cloud deployment. Most pipelines were written and tested on local machines and then run in parallel on compute clusters with shared storage. However, I/O behavior on a cluster differs markedly from that on a local machine, and without comprehensive I/O profiling tools, inefficient I/O patterns could degrade storage performance and hinder other work on the same cluster.
About The Customer
The Wellcome Sanger Institute is a globally recognized center for genomic discovery and understanding. It spearheads ambitious collaborations worldwide to lay the groundwork for further research and transformative healthcare innovations. The Institute's Cancer Genome Project uses high-throughput genome sequencing to identify somatically acquired mutations, with the goal of characterizing cancer genes, mutational processes, and patterns of clonal evolution in human tumors. The Institute is at the forefront of genomic research, carrying out genome projects to find cures for cancer, a disease that, according to Cancer Research UK, will affect 1 in 2 people born after 1960 at some point in their lives.
The Solution
The Wellcome Sanger Institute used Altair Mistral™ to profile the pipeline and identify inefficient I/O patterns. Although parts of the pipeline had already been optimized, Mistral showed where further improvement was needed. In particular, it identified a large number of small reads, which can reduce computational performance and create suboptimal I/O patterns on shared storage. With these small reads optimized, storage could run at maximum bandwidth with minimal impact on other jobs. The team also used Altair Breeze™ to profile the containerized workload in the cloud on Amazon Web Services (AWS). Breeze determined that the default storage option provided better value than the faster, more expensive options.
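To illustrate why many small reads matter, the following is a minimal Python sketch, not the Institute's pipeline and not how Mistral or Breeze work internally. It reads the same temporary file twice, once in 64-byte pieces and once in a single 1 MiB request, and counts the read calls issued. The names `count_reads` and `demo`, and the chunk sizes, are hypothetical choices made for this example.

```python
import os
import tempfile


def count_reads(path, chunk_size):
    """Read a whole file in chunk_size pieces.

    Returns (bytes_read, read_calls). Python's own buffering is disabled
    so that each read() call corresponds to one request to the OS.
    """
    total = 0
    calls = 0
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # empty bytes means end of file
                break
            total += len(chunk)
            calls += 1
    return total, calls


def demo(size=1 << 20):
    """Compare a small-read pattern with a large-read pattern on one file."""
    # Create a throwaway file of `size` random bytes (hypothetical test data).
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(os.urandom(size))
        path = tmp.name
    try:
        small = count_reads(path, 64)       # many tiny requests
        large = count_reads(path, 1 << 20)  # few large requests
    finally:
        os.remove(path)
    return small, large


if __name__ == "__main__":
    small, large = demo()
    print("64-byte reads :", small[1], "calls for", small[0], "bytes")
    print("1 MiB reads   :", large[1], "calls for", large[0], "bytes")
```

Both patterns transfer the same number of bytes, but the small-read pattern issues far more requests. On shared or network storage, each request typically carries fixed overhead, which is the kind of cost a profiler can surface and that reading in larger blocks can reduce.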
Operational Impact
  • The Breeze and Mistral I/O profiling tools saved the Wellcome Sanger Institute significant time and money during a complex, high-value project. The profiling work allowed the Institute's team to optimize its pipeline, making it portable and easy to run. The team identified and implemented 'easy wins' that could only have been found by measuring with the right tools. This optimization became increasingly important when scaling up to full genomes run in parallel, where speed and cost savings are paramount. The tools also helped the team make an informed choice of storage option, reducing costs without compromising performance.
Quantitative Benefit
  • Saved 10% of project costs by choosing a less expensive storage option without compromising performance.
  • Reduced run time from 32 hours to 18 hours, a reduction of roughly 44%.
  • Optimized pipeline to handle large amounts of memory, profile file I/O, and avoid small reads and writes.
