High Performance Computing Algorithms for Accelerating Peptide Identification from Mass-Spectrometry Data Using Heterogeneous Supercomputers

Haseeb, Muhammad; Saeed, Fahad (2023).

Abstract

Fast and accurate identification of peptides and proteins from mass spectrometry (MS) data is a critical problem in modern systems biology. Database peptide search is the most commonly used computational method for identifying peptide sequences from MS data. In this method, gigabytes of experimentally generated MS data are compared against terabyte-sized databases of theoretically simulated MS data, resulting in a compute- and data-intensive problem that requires days or weeks of computation time on desktop machines. Existing serial and high performance computing (HPC) algorithms strive to accelerate the search and improve its computational efficiency, but they exhibit suboptimal performance due to inefficient parallelization models, low resource utilization, and high overhead costs.