
Clemson University's Adoption of PBS Professional for Enhanced HPC Workload Management

Technology Category
  • Application Infrastructure & Middleware - Data Visualization
  • Networks & Connectivity - Ethernet
Applicable Industries
  • Cement
  • Education
Applicable Functions
  • Procurement
  • Product Research & Development
Use Cases
  • Inventory Management
  • Smart Campus
Services
  • System Integration
  • Training
The Challenge
Clemson University's IT department, Clemson Computing and Information Technology (CCIT), faced a significant challenge in managing the workload of its rapidly growing user base. The department operated the Palmetto cluster, a 17,032-core, 262 TFlops HPC system, as the university's primary HPC resource. The cluster was heavily used by the university's faculty, staff, and students, as well as by 144 external users, including researchers and faculty from other universities. It ran on a 'condo model', in which users could purchase nodes for their own priority usage. However, the open-source Maui scheduler that CCIT had been using could not meet the scalability and reliability needs of this expanding user base. The system crashed frequently, and some advanced features did not function properly, making the scheduler unreliable.
About The Customer
Clemson University is a major land-grant, science- and engineering-oriented research university that ranks in the top 25 among national public universities. The university is committed to teaching and student success, fostering an inclusive, student-centered community characterized by high academic standards, a culture of collaboration, school spirit, and a competitive drive to excel. The university's IT department, Clemson Computing and Information Technology (CCIT), provides cyberinfrastructure resources and advanced research computing capabilities. CCIT supports an array of advanced computing infrastructure made possible through the integration of high-performance computing (HPC), high-performance networks, data visualization, storage architectures, and middleware.
The Solution
To address these challenges, CCIT decided to adopt a commercial-grade workload management solution. After evaluating several vendors, it chose Altair’s PBS Professional® for its massive scalability and technical support. The PBS Professional scheduling software met the university's HPC needs, providing the reliability and scalability that the previous open-source tool could not deliver. Altair's technical team provided comprehensive support, helping CCIT understand the advanced features of PBS Professional before purchase and offering hands-on training before installation. Cost was also a crucial factor in the decision: Altair offered attractive academic pricing that fit within CCIT's budget. The implementation of PBS Professional began in September 2011, supporting 1,623 nodes. Today, the node count has increased to 1,804, and PBS Professional can easily scale to support additional nodes for the rapidly growing user base.
Operational Impact
  • The adoption of PBS Professional has improved usability and productivity for CCIT and the university's users. HPC administration overhead has been significantly reduced, and demand for end-user support has decreased because PBS Professional's hooks plug-in technology gives users immediate, automatic feedback. Users can now submit large numbers of jobs, even queuing up thousands at a time, with confidence that the scheduler will execute them. The system is also integrated with Clemson’s “Hadoop on demand” job framework, which uses myHadoop with Clemson's own customized open-source file system, OrangeFS. This integration has brought major efficiency benefits: PBS jobs can directly access data stored on OrangeFS from any compute node without data staging, and the data persists between jobs.
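
The kind of immediate feedback described above can be illustrated with a minimal sketch of a submission-time check. This is not CCIT's actual hook and does not use the PBS Professional hook API: the `JobRequest` type, the `check_job` function, and the 72-hour walltime limit are all hypothetical names and values chosen for illustration. Only the 1,804-node figure comes from the case study.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical site policy values; only the node count comes from the case study.
MAX_NODES = 1804          # nodes managed by the scheduler (from the case study)
MAX_WALLTIME_HOURS = 72   # assumed limit, for illustration only


@dataclass
class JobRequest:
    """Hypothetical stand-in for the resources a user requests at submission."""
    name: str
    nodes: int
    walltime_hours: float


def check_job(job: JobRequest) -> Optional[str]:
    """Return a rejection message for an invalid request, or None to accept it.

    In a real deployment this logic would run inside a scheduler hook at
    submission time, so the user sees the message immediately instead of
    filing a support ticket after the job fails.
    """
    if job.nodes < 1:
        return f"{job.name}: request at least one node"
    if job.nodes > MAX_NODES:
        return f"{job.name}: requested {job.nodes} nodes, cluster has {MAX_NODES}"
    if job.walltime_hours > MAX_WALLTIME_HOURS:
        return (f"{job.name}: walltime {job.walltime_hours}h exceeds "
                f"the {MAX_WALLTIME_HOURS}h limit")
    return None


if __name__ == "__main__":
    for job in (JobRequest("ok_job", 4, 12.0), JobRequest("too_long", 2, 200.0)):
        message = check_job(job)
        print(message if message else f"{job.name}: accepted")
```

Rejecting a bad request at submission, with a message that names the violated limit, is one way such checks could reduce end-user support demand.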
Quantitative Benefit
  • PBS Professional supports 1,804 nodes, up from 1,623 nodes at the time of implementation.
  • The system is scalable and can support additional nodes for the rapidly growing user base.
  • The Palmetto Cluster is benchmarked at 262 TFlops and is connected to Internet2's 100 GbE Advanced Layer 2 Service.
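
As a quick arithmetic check on the figures above, the following sketch derives the node growth and the average benchmarked throughput per core. The inputs are the numbers quoted in this case study; the per-core value is only an average over a cluster that may contain mixed hardware, so it should be read as a rough indicator.

```python
# Figures quoted in the case study.
NODES_AT_LAUNCH = 1623   # September 2011
NODES_TODAY = 1804
TOTAL_CORES = 17032
BENCHMARK_TFLOPS = 262.0

# Absolute and relative node growth since implementation.
added_nodes = NODES_TODAY - NODES_AT_LAUNCH
growth_percent = 100.0 * added_nodes / NODES_AT_LAUNCH

# Average benchmarked throughput per core, in GFlops (1 TFlops = 1000 GFlops).
gflops_per_core = BENCHMARK_TFLOPS * 1000.0 / TOTAL_CORES

print(f"Nodes added: {added_nodes}")
print(f"Growth: {growth_percent:.1f}%")
print(f"Average per core: {gflops_per_core:.1f} GFlops")
```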
