Bioinformatics: Interim Update October 2006

The overall research objective of the Bioinformatics project is to build and evaluate a protein identification server, BUPID. BUPID performs probability-based protein identification by searching sequence databases with peptide mass fingerprint data. It is designed as a web-based search engine that offers robust and accurate statistical modeling for protein identification from MALDI MS data. Developing BUPID and related software independently of commercially available software is intended to give the project full control of the source code and to make it easier to modify the software, add supplementary functionality, and share it with collaborators.
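The general idea behind a peptide mass fingerprint search can be illustrated with a minimal sketch. This is not BUPID's algorithm: BUPID uses a probability-based score, whereas the sketch simply counts matching peaks. The function names, the trypsin cleavage rule, and the default tolerance are assumptions made for illustration only.

```python
# Monoisotopic residue masses in daltons (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728


def tryptic_peptides(sequence, max_missed=1):
    """In silico digest: cleave after K or R unless the next residue is P."""
    cuts = [0]
    for i, aa in enumerate(sequence[:-1]):
        if aa in "KR" and sequence[i + 1] != "P":
            cuts.append(i + 1)
    cuts.append(len(sequence))
    peptides = []
    for i in range(len(cuts) - 1):
        # Join up to max_missed adjacent fragments to model missed cleavages.
        for j in range(i + 1, min(i + 2 + max_missed, len(cuts))):
            peptides.append(sequence[cuts[i]:cuts[j]])
    return peptides


def mh_plus(peptide):
    """Singly protonated monoisotopic mass of a peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER + PROTON


def count_matches(observed, sequence, tolerance_da=0.1, max_missed=1):
    """Count observed peaks within tolerance of any theoretical peptide mass."""
    theoretical = [mh_plus(p) for p in tryptic_peptides(sequence, max_missed)]
    return sum(
        any(abs(peak - mass) <= tolerance_da for mass in theoretical)
        for peak in observed
    )
```

A search would call count_matches for every protein sequence in the database and rank the candidates. A probability-based method replaces the raw count with a statistical score that accounts for how likely such matches are to occur by chance.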

Progress: The BUPID prototype has been developed and is being disseminated to the scientific community through presentations at scientific meetings, a website, and downloadable copies. BUPID researchers have evaluated ways to speed up the software by optimizing its algorithm, and they parallelized the program to run on multiple CPUs. They also improved the sensitivity and specificity of BUPID searches, which are at least as good as those of leading commercial software searches such as SWISS-PROT and Mascot. BUPID now has a better scoring function and additional functionality. Previously, researchers performed large-scale testing of the BUPID server with artificially generated data: over a thousand mass spectra with randomized theoretical peptide masses were generated for testing during the two-year period in which BUPID was receiving sample data from the CPC’s various proteomics projects.
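As an illustration of how artificial test spectra of this kind might be produced, the following sketch draws a random subset of a protein's theoretical peptide masses, perturbs them with a small mass error, and adds random noise peaks. It is not the generator used for BUPID testing; the function name, parameters, and default values are all assumptions.

```python
import random


def synthetic_spectrum(theoretical_masses, rng, keep_fraction=0.5,
                       sigma_da=0.05, n_noise=5,
                       mass_range=(800.0, 3500.0)):
    """Build one artificial peak list from theoretical peptide masses.

    All parameters are hypothetical defaults chosen for illustration.
    """
    masses = list(theoretical_masses)
    if not masses:
        raise ValueError("need at least one theoretical mass")
    # Keep a random subset, since real spectra rarely show every peptide.
    n_keep = max(1, round(keep_fraction * len(masses)))
    kept = rng.sample(masses, n_keep)
    # Simulate measurement error with Gaussian noise on each kept mass.
    peaks = [m + rng.gauss(0.0, sigma_da) for m in kept]
    # Add unrelated peaks drawn uniformly over the mass range.
    peaks += [rng.uniform(*mass_range) for _ in range(n_noise)]
    return sorted(peaks)


# A fixed seed makes the generated test set reproducible.
rng = random.Random(2006)
spectra = [
    synthetic_spectrum([1000.5, 1500.7, 2000.9, 2501.2], rng)
    for _ in range(1000)
]
```

Because the true source protein of each artificial spectrum is known, a test set like this allows sensitivity and specificity to be measured directly.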

Extending BUPID to the cryoFTMS project, researchers built a prototype of the feedback loop between BUPID and the MALDI-cryoFTMS instrument.

The BUPID user interface offers standard search parameters such as taxonomy, enzyme, maximum cleavage, peptide mass tolerance, mass value, and peak type. Advanced options for database searching are also available.
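To make the parameter list concrete, a search request could be modeled as a small record. The sketch below is hypothetical and does not reflect BUPID's actual interface: the field names, default values, and allowed options are assumptions based on what peptide mass fingerprint search forms commonly offer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchParameters:
    """Hypothetical container for the search options listed above."""
    taxonomy: str = "All entries"
    enzyme: str = "Trypsin"
    max_cleavage: int = 1                 # assumed: allowed missed cleavages
    peptide_mass_tolerance: float = 0.1
    tolerance_unit: str = "Da"            # assumed options: "Da" or "ppm"
    mass_value: str = "MH+"               # assumed: how peak masses are reported
    peak_type: str = "monoisotopic"       # assumed: "monoisotopic" or "average"

    def __post_init__(self):
        # Reject values this sketch does not know how to interpret.
        if self.tolerance_unit not in ("Da", "ppm"):
            raise ValueError("tolerance_unit must be 'Da' or 'ppm'")
        if self.max_cleavage < 0 or self.peptide_mass_tolerance <= 0:
            raise ValueError("limits must be non-negative and tolerance positive")

    def tolerance_in_da(self, mass):
        """Convert the tolerance to daltons at a given peptide mass."""
        if self.tolerance_unit == "ppm":
            return self.peptide_mass_tolerance * mass / 1e6
        return self.peptide_mass_tolerance
```

For example, SearchParameters(peptide_mass_tolerance=50.0, tolerance_unit="ppm") describes a 50 ppm tolerance, which corresponds to 0.1 Da for a peptide of mass 2000 Da.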

BUPID software is available online to the research community.

