Gene annotation using peptide mass spectrometry
This webpage presents supplemental material relevant to the
paper "Gene annotation using peptide mass
spectrometry" (Stephen Tanner, Zhouxin Shen, Julio Ng, Liliana Florea, Roderic Guigó,
Steven P. Briggs, and Vineet Bafna), to appear in Genome Research. Please visit
our lab webpage to download our software or learn
more about our research.
Our largest source of mass spectra was HEK293 cell culture. The mass spectra are
available for download (see here for a list of
samples). In addition, several datasets were
downloaded from the PeptideAtlas repository.
The samples analyzed were as follows:
A8_IP
HUPO12_run31
HUPO12_run32
HUPO12_run33
HUPO12_run34
HUPO22_M_CA_S
HUPO28_b1-CIT
HUPO28_b1-SERUM
HUPO28_b2-CIT
HUPO28_b2-SERUM
HUPO28_b3-CIT
HUPO28_b3-SERUM
HUPO28_Ref-CIT
HUPO28_Ref-SERUM
HUPO29_b1-CIT_1
HUPO29_b1-CIT_win1
HUPO29_b1-CIT_win2
HUPO29_b1-EDTA_1
HUPO29_b1-EDTA_win1
HUPO29_b1-EDTA_win2
HUPO29_b1-HEP
HUPO29_b1-SERUM
HUPO34
HUPO37_b1-HEP_2LCQ
HUPO40
The annotation table for the genomic search contains all
spectrum annotations that passed the p-value cutoff. Annotations were compared to
the genomic locations of a corpus of known proteins (the "mapped proteins", as described
in the paper). Each annotation is assigned to one of the following categories:
Cat1 - Perfect match to a mapped exon
Cat2 - Contained in a known exon, but has mismatches. Often a coding SNP; occasionally an out-of-frame match.
Cat3 - A single-exon peptide not matching any known protein
Cat4 - Perfect match to a known protein, spanning an exon
Cat5 - Imperfect match to a known protein, spanning an exon. Often a coding SNP.
Cat6 - An intron-spanning peptide, one end of which hits a known exon.
Cat7 - An intron-spanning peptide, neither end of which hits a known exon.
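The category scheme above can be read as a small decision procedure. The sketch below is one possible reading, not the code used for the paper. The Annotation fields (spliced, known_exon_hits, mismatches) are hypothetical names and do not reflect the columns of the annotation table. It also assumes that Cat4 and Cat5 describe spliced peptides whose two ends both fall in known exons, which the list implies but does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """Hypothetical summary of one peptide annotation (not the table's real schema)."""
    spliced: bool          # True if the peptide's genomic alignment crosses an intron
    known_exon_hits: int   # number of aligned segments that fall in a mapped exon (0 to 2)
    mismatches: int        # residues that differ from the known protein sequence


def categorize(a: Annotation) -> str:
    """Return 'Cat1'..'Cat7' under the assumptions stated in the lead-in."""
    if not a.spliced:
        # Single-exon peptides: Cat1, Cat2, or Cat3.
        if a.known_exon_hits == 0:
            return "Cat3"
        return "Cat1" if a.mismatches == 0 else "Cat2"
    # Spliced peptides: Cat4 to Cat7.
    if a.known_exon_hits >= 2:
        # Assumption: both ends in known exons corresponds to Cat4/Cat5.
        return "Cat4" if a.mismatches == 0 else "Cat5"
    return "Cat6" if a.known_exon_hits == 1 else "Cat7"


if __name__ == "__main__":
    examples = [
        Annotation(spliced=False, known_exon_hits=1, mismatches=0),
        Annotation(spliced=False, known_exon_hits=1, mismatches=1),
        Annotation(spliced=True, known_exon_hits=1, mismatches=0),
    ]
    for ex in examples:
        print(ex, "->", categorize(ex))
```

Under this reading, a single-exon peptide with one mismatch inside a known exon (for example, a coding SNP) lands in Cat2, while a spliced peptide with only one end in a known exon lands in Cat6 regardless of mismatches.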