Exploiting thread-level and instruction-level parallelism to cluster mass spectrometry data using multicore architectures
Saeed, Fahad; Hoffert, Jason D; Pisitkun, Trairak; Knepper, Mark A
Springer Vienna. Network Modeling Analysis in Health Informatics and Bioinformatics 3:1-19 (2014).
Abstract
Modern mass spectrometers can produce large numbers of peptide spectra from complex biological samples in a short time. These data sets contain substantial redundancy because the same peptide may be selected multiple times in liquid chromatography tandem mass spectrometry experiments. Many spectra also cannot be mapped to specific peptide sequences because of the low signal-to-noise ratio of the spectra from these machines. Clustering is one way to mitigate the problems of these complex mass spectrometry data sets. Recently, we presented a graph-theoretic framework, known as CAMS, for clustering large-scale mass spectrometry data. CAMS used a novel metric that exploits the spatial patterns in the mass spectrometry peaks, which allowed highly accurate clustering results. However, comparison of each spectrum with every other spectrum makes the clustering problem …
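To make the cost of all-against-all comparison concrete, here is a minimal Python sketch of graph-style clustering over peak lists. It is an illustration under stated assumptions, not the CAMS implementation: the similarity function `shared_peak_fraction`, the tolerance `tol`, and the `threshold` are hypothetical placeholders standing in for the CAMS metric, which the abstract does not specify. Spectra are assumed to be plain lists of peak m/z values.

```python
from itertools import combinations


def shared_peak_fraction(a, b, tol=0.5):
    """Placeholder similarity (NOT the CAMS metric): fraction of peaks in the
    smaller spectrum that have a peak within `tol` in the other spectrum."""
    if not a or not b:
        return 0.0
    small, large = (a, b) if len(a) <= len(b) else (b, a)
    matched = sum(1 for x in small if any(abs(x - y) <= tol for y in large))
    return matched / len(small)


def cluster(spectra, threshold=0.7):
    """Link every pair of spectra whose similarity reaches `threshold` and
    return the connected components plus the number of comparisons made."""
    n = len(spectra)
    parent = list(range(n))  # union-find forest over spectrum indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    comparisons = 0
    # All-against-all: n * (n - 1) / 2 pairs, the quadratic cost noted above.
    for i, j in combinations(range(n), 2):
        comparisons += 1
        if shared_peak_fraction(spectra[i], spectra[j]) >= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values()), comparisons


spectra = [
    [100.0, 200.0, 300.0],
    [100.2, 200.1, 300.3],
    [500.0, 600.0],
    [500.1, 600.2, 700.0],
]
clusters, comparisons = cluster(spectra)
print(clusters, comparisons)
```

Each pair comparison is independent of every other, so the loop over pairs could be divided among threads or cores. The sketch keeps it sequential so the quadratic pair count stays easy to see.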