CAMS-RS: Clustering Algorithm for Large-Scale Mass Spectrometry Data Using Restricted Search Space and Intelligent Random Sampling

Saeed, Fahad; Hoffert, Jason D; Knepper, Mark A. IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB), IEEE Computer Society Press, 11:128-141 (2014).

Abstract

High-throughput mass spectrometers can produce massive amounts of redundant data at an astonishing rate, and many of the resulting spectra have a poor signal-to-noise (S/N) ratio. These low-S/N spectra may not be interpreted by conventional spectra-to-database matching techniques. In this paper, we present an efficient algorithm, CAMS-RS (Clustering Algorithm for Mass Spectra using Restricted Space and Sampling), for clustering raw mass spectrometry data. CAMS-RS uses a novel metric (called F-set) that exploits temporal and spatial patterns to accurately assess the similarity between two given spectra. The F-set similarity metric is independent of retention time and allows clustering of mass spectrometry data from independent LC-MS/MS runs. A novel restricted search space strategy is devised to limit the number of comparisons between spectra. An intelligent sampling method is executed on individual …