Automated N-glycopeptide identification in glycoproteomics — ASN Events

Automated N-glycopeptide identification in glycoproteomics (#134)

Ling Yen Lee 1, Edward S.X. Moh 1, Benjamin L. Parker 2, Marshall Bern 3, Nicki H. Packer 1, Morten Thaysen-Andersen 1
  1. Biomolecular Discovery Research Centre, Macquarie University, Sydney, NSW, Australia
  2. Charles Perkins Centre, The University of Sydney, Sydney, NSW, Australia
  3. Protein Metrics Inc., San Carlos, CA

Recent developments in LC-MS/MS-based N-glycoproteomics, including advances in software-driven glycopeptide identification, have facilitated biochemical studies reporting thousands of intact N-glycopeptides. In this early phase, it is particularly important to scrutinize automated glycopeptide identification to ensure confidence in the process. Herein, we explore the accuracy of software-assisted site-specific glycoprofiling using the PTM-centric search engine Byonic (Protein Metrics) relative to manual expert annotation. To allow an appropriately deep comparison, the study utilised acquisition and data analysis strategies typical of glycoproteomics, but applied them to a single glycoprotein, the previously uncharacterised, triply N-glycosylated (Asn160, Asn268 and Asn302) human basigin. Initially, detailed site-specific reference N-glycoprofiles of purified basigin were established using manual annotation and relative quantitation of ion trap (CID) and high-resolution Q-Exactive Orbitrap (HCD) LC-MS/MS data of tryptic N-glycopeptides. Conventional N-glycome profiling supported the manual annotation. The N-glycosylation sites of basigin showed extensive and diverse micro- and macro-heterogeneity. Subsequently, the basigin peptide mixture, with or without a background of complex peptides, was glycoprofiled using Byonic on the same Q-Exactive Orbitrap LC-HCD-MS/MS data. We investigated the influence of multiple search parameters and scoring thresholds in Byonic by monitoring the software-assisted glycoprofiling accuracy and coverage relative to the reference profile. In general, the search criteria and confidence thresholds suggested by the vendor provided highly accurate and sensitive automated glycopeptide detection. As expected, several search parameters, i.e. search space (proteome and glycome size), mass tolerance and peptide modifications, as well as the confidence thresholds, influenced the glycoprofiling accuracy and coverage. This was particularly the case for Asn268 basigin peptides, which displayed extensive peptide heterogeneity formed by incomplete trypsinization, methionine oxidation and, unexpectedly, also carbamidomethylation. The latter produced abundant neutral losses in the HCD-MS/MS spectra that were left unassigned by Byonic, reducing the identification scores. These are valuable lessons in our ambition to ensure high glycoprofiling accuracy and coverage as we transition rapidly into automated FDR-based N-glycopeptide identification in glycoproteomics.
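
The comparison of automated identifications against a manually annotated reference profile can be illustrated with a minimal Python sketch. The abstract does not define its metrics, so the definitions here are assumptions: "accuracy" is taken as the fraction of accepted software identifications that also appear in the reference, and "coverage" as the fraction of reference glycopeptides recovered. The `Identification` type, the `glycoprofiling_metrics` function, the score scale and all example entries are hypothetical; they are neither Byonic output nor data from the study.

```python
from typing import NamedTuple


class Identification(NamedTuple):
    """One software-reported glycopeptide (hypothetical record layout)."""
    site: str     # glycosylation site, e.g. "Asn160"
    glycan: str   # glycan composition string
    score: float  # search-engine score; the scale here is made up


def glycoprofiling_metrics(reference, identifications, min_score):
    """Return (accuracy, coverage) relative to a manual reference profile.

    Assumed definitions (not taken from the abstract):
      accuracy = accepted identifications found in the reference / all accepted
      coverage = accepted identifications found in the reference / reference size
    """
    # Keep only identifications at or above the score threshold.
    accepted = {(i.site, i.glycan) for i in identifications if i.score >= min_score}
    correct = accepted & reference
    accuracy = len(correct) / len(accepted) if accepted else 0.0
    coverage = len(correct) / len(reference) if reference else 0.0
    return accuracy, coverage


# Illustrative, invented reference profile: (site, glycan composition) pairs.
reference = {
    ("Asn160", "HexNAc(2)Hex(5)"),
    ("Asn160", "HexNAc(4)Hex(5)NeuAc(1)"),
    ("Asn268", "HexNAc(2)Hex(6)"),
    ("Asn302", "HexNAc(4)Hex(5)Fuc(1)"),
}

# Illustrative, invented software identifications.
identifications = [
    Identification("Asn160", "HexNAc(2)Hex(5)", 450.0),
    Identification("Asn268", "HexNAc(2)Hex(6)", 180.0),
    Identification("Asn302", "HexNAc(4)Hex(5)Fuc(1)", 320.0),
    Identification("Asn302", "HexNAc(5)Hex(6)", 210.0),  # absent from reference
]

for threshold in (300.0, 150.0):
    acc, cov = glycoprofiling_metrics(reference, identifications, threshold)
    print(f"min_score={threshold}: accuracy={acc:.2f}, coverage={cov:.2f}")
```

With these invented numbers, the stricter threshold accepts only identifications that match the reference but recovers half of the profile, while the looser threshold recovers more of it at the cost of admitting one unmatched identification. This is the kind of trade-off that varying a confidence threshold would be expected to expose.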