High-Performance Reductive Strategies for Big Data from LC-MS/MS Proteomics

Awan, Muaaz Gul. Western Michigan University (2019).

Abstract

Mass spectrometry (MS)-based proteomics uses high-performance liquid chromatography in tandem with high-throughput mass spectrometers. These experiments produce MS data sets at a speed and volume that can easily reach the petascale, creating storage and computational problems for large-scale systems biology studies. Each spectrum output by a mass spectrometer may consist of thousands of peaks, all of which must be processed to deduce the corresponding peptide. However, only a small percentage of the peaks in a spectrum are useful for further processing; most are either noise or otherwise uninformative. Our experiments have shown that 90 to 95% of the peaks are not required for reliable results. Processing them anyway is largely redundant work and hinders high-throughput processing of big MS data. The existing pre-processing algorithms for noise-removal or …