A bioinformatic solution for identifying non-genomic peptides in the immunopeptidome — ASN Events

A bioinformatic solution for identifying non-genomic peptides in the immunopeptidome (#103)

Kate E Scull 1 , David N Perkins 2 , Nadine L Dudek 3 , Anthony W Purcell 3 , Nicholas A Williamson 2
  1. Biochemistry and Molecular Biology, University of Melbourne, Parkville, VIC, Australia
  2. The Bio21 Molecular Science and Biotechnology Institute, University of Melbourne, Parkville, VIC, Australia
  3. Department of Biochemistry and Molecular Biology, Monash University, Clayton, VIC, Australia

Human leukocyte antigen (HLA) molecules are cell-surface glycoproteins that present peptides for surveillance by T lymphocytes seeking signs of disease. The sequences of some class I HLA-bound peptides are not found in the genome; one example is peptides arising from post-translational peptide splicing in the proteasome (Vigneron and Van den Eynde, 2012). Our group identifies large numbers of HLA-bound peptides by mass spectrometric analysis of purified immunopeptidome (i.e. HLA-bound peptide) samples. However, non-genomic peptides such as spliced peptides cannot be identified by conventional analysis software such as Mascot, since these search methods rely on matching tandem mass spectra to sequences in genome-based databases. Peptide sequences that are absent from the databases are therefore either ignored or falsely identified. Here, we introduce software that addresses this problem by enabling searches to consider non-genomic sequences. Our program generates comprehensive ‘artificial databases’ which include all of the possible permutations of amino acids for peptides of a given length, then searches spectra against them using Mascot-based scoring. Our preliminary results, for which we analysed complex immunopeptidome samples by LC-MS/MS and searched for peptides of 8-11 amino acids in length, show that the program can identify many conventional sequences in agreement with Mascot, ProteinPilot and PEAKS DB searches. Despite statistical challenges that necessitate more manual inspection than desired, the program has also revealed a number of novel, non-genomic sequences for further investigation by multiple reaction monitoring (MRM). This technique has the potential to aid not only studies of the immunopeptidome but also analyses of other peptide samples for which genomic protein databases are incomplete or inapplicable.
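
To give a sense of the scale of an 'artificial database', the following is a minimal Python sketch and not the authors' program. It assumes that "all possible permutations" means every sequence of the 20 standard amino acids at a given length, and it assumes that candidates would be narrowed by precursor mass before any scoring, a step the abstract does not describe. The function names (`database_size`, `peptide_mass`, `candidates`) and the tolerance parameter are hypothetical; the residue masses are standard monoisotopic values.

```python
from itertools import product

# Standard monoisotopic residue masses (Da) for the 20 common amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # mass of H2O added to the residue sum for a free peptide


def database_size(length, alphabet_size=20):
    """Number of distinct sequences of the given length (hypothetical helper)."""
    return alphabet_size ** length


def peptide_mass(sequence):
    """Neutral monoisotopic mass of an unmodified linear peptide."""
    return sum(RESIDUE_MASS[residue] for residue in sequence) + WATER


def candidates(length, target_mass, tol_da, alphabet="".join(RESIDUE_MASS)):
    """Yield every sequence of `length` whose mass is within tol_da of target.

    Assumption: brute-force enumeration, practical only for short lengths or
    reduced alphabets; a real tool would need a far more efficient strategy.
    """
    for combo in product(alphabet, repeat=length):
        if abs(peptide_mass(combo) - target_mass) <= tol_da:
            yield "".join(combo)


if __name__ == "__main__":
    for n in range(8, 12):
        print(n, database_size(n))
    # Toy search over a two-letter alphabet to keep the run time trivial.
    print(list(candidates(2, peptide_mass("GA"), 0.001, alphabet="GA")))
```

Under these assumptions, lengths 8 to 11 together give roughly 2 × 10^14 sequences, which suggests why exhaustive searching raises the statistical challenges mentioned above. Note also that leucine and isoleucine have identical masses, so sequences differing only in L/I cannot be separated by mass alone.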

Vigneron, N., Van den Eynde, B.J. (2012). Proteasome subtypes and the processing of tumor antigens: increasing antigenic diversity. Curr. Opin. Immunol. 24, 84-91.