Integrated GlycoProteome Analyzer (I-GPA) for Automated Identification and Quantitation of Site-Specific N-Glycosylation

Park, Gun Wook; Kim, Jin Young; Hwang, Heeyoun; Lee, Ju Yeon; Ahn, Young Hee; Lee, Hyun Kyoung; Ji, Eun Sun; Kim, Kwang Hoe; Jeong, Hoi Keun; Yun, Ki Na; Kim, Yong-Sam; Ko, Jeong-Heon; An, Hyun Joo; Kim, Jae Han; Paik, Young-Ki; Yoo, Jong Shin

doi:10.1038/srep21175

Article
Open access
Published: 17 February 2016

Volume 6, article number 21175, (2016)

Scientific Reports

Abstract

Human glycoproteins exhibit enormous heterogeneity at each N-glycosite, but few studies have attempted to globally characterize the site-specific structural features. We have developed the Integrated GlycoProteome Analyzer (I-GPA), which includes a mapping system for complex N-glycoproteomes and combines tandem mass spectrometry methods with a database search and an algorithmic suite. Using an N-glycopeptide database that we constructed, we created novel scoring algorithms with decoy glycopeptides; with these, 95 N-glycopeptides from standard α1-acid glycoprotein were identified with 0% false positives, giving the same results as manual validation. Additionally, an automated label-free quantitation method was developed for the first time that utilizes the combined intensity of the top three isotope peaks at the three highest MS spectral points. The efficiency of I-GPA was demonstrated by automatically identifying 619 site-specific N-glycopeptides with FDR ≤ 1%, and simultaneously quantifying 598 N-glycopeptides, from human plasma samples that are known to contain highly glycosylated proteins. Thus, the I-GPA platform could make a major breakthrough in high-throughput mapping of complex N-glycoproteomes, which can be applied to biomarker discovery and the ongoing global human proteome project.

Introduction

Protein N-glycosylation, one of the most prevalent post-translational modifications (PTMs) in proteins, plays important roles in biological systems via its influence on various processes, including adhesion, signaling through cellular recognition, and response to abnormal biological states. Because each N-glycosite on a glycoprotein consists of a mixture of numerous glycoforms, each protein glycoform is generally present at low concentrations (i.e., sub-stoichiometric). Alterations in the distribution of protein glycoforms, as well as the presence of aberrant glycoforms, are closely associated with a variety of illnesses, including cancer¹ and neurodegenerative diseases². The ability to identify aberrant protein glycoforms and monitor changes in protein glycoform distribution in biological and clinical samples would facilitate a deeper understanding of glycoprotein structure–function relationships and would also aid in the discovery of biomarkers associated with aberrant glycosylation.

Glycoproteomics has attracted a great deal of attention in recent decades. Mass spectrometry (MS) technology, a powerful tool in proteomics in general, is also a core tool in glycoproteomics³. Nonetheless, efficient high-throughput global mapping of complex glycoproteomes is very difficult, mainly due to the exceptional complexity of the chemical and physical features of glycoproteins. To overcome these problems, various approaches have been explored, including glycan composition profiling focusing on glycan moieties released from glycoproteins^4,5, glycosite profiling focusing on deglycosylated sites following endoglycosidase treatment^6,7,8, and differential quantitation of protein glycoforms fractionated by glycan structure^9,10. However, because each of these approaches can provide only limited information, they must be used in combination in order to obtain a comprehensive picture^11,12.

By contrast, because glycopeptides encompass intact glycan and peptide moieties together within the same molecule, a glycoproteomic approach that profiles N-glycopeptides could provide comprehensive information regarding protein N-glycosylation¹³. The concomitant presence of glycan and peptide moieties with different physical and chemical properties makes the full structural characterization of N-glycopeptides extraordinarily difficult. However, previous work has shown that various powerful tandem MS fragmentation techniques enable direct identification of intact N-glycopeptides and monitoring of the site-specific glycoform distributions of pure glycoproteins isolated from clinical specimens such as organ tissue and plasma, in which most proteins are heterogeneously glycosylated in complex mixtures^14,15,16. Recently, large-scale site-specific N-glycopeptide identification has been attempted using complex glycoproteome samples^17,18,19. However, high-throughput global mapping of site-specific glycopeptides of N-glycoproteins in blood samples is much more challenging, due to the extremely high degree of sample complexity, the wide dynamic range of analyte abundance, and the current lack of automatic search algorithms capable of confidently identifying N-glycopeptides from large amounts of tandem MS data. Several search algorithms, such as GlycoFragwork¹⁹, GP Finder²⁰, Sweet-Heart²¹, GPS²², Byonic²³, and MAGIC²⁴, identify glycopeptides with a false discovery rate (FDR) of less than 5%. Even with the best algorithms, however, a reported FDR of 1% does not guarantee that only 1% of identifications are false positives. For example, S. W. Wu et al.²⁵ reported more than 37% false positives in the analysis of a protein despite a claimed FDR of zero. Manual validation of individual tandem mass spectra is therefore frequently needed, which is tedious and time-consuming.

To address these issues, in this study we set out to develop a fast search engine termed Integrated GlycoProteome Analyzer (I-GPA), which is capable of identifying site-specific N-glycopeptides without manual validation and of automated label-free quantitation of large numbers of glycoproteins. We then applied the newly developed platform to high-throughput comparative mapping and quantitation of the glycoproteomes present in the plasma of liver cancer patients (hepatocellular carcinoma, HCC) and healthy donor controls, with the goal of revealing potential novel N-glycopeptide biomarkers. Here we show that I-GPA is a new automated N-glycoproteome analyzer that facilitates high-speed mapping and quantitation of glycoproteins and is suitable for the ongoing chromosome-centric human proteome project (C-HPP)²⁶.

Results

Given that most proteins present in human plasma are glycoproteins, we chose plasma samples for construction of the plasma glycoprotein DB and for analysis of glycoproteins in this study. We designed I-GPA to facilitate the high-throughput analysis of N-glycoproteins using tandem MS and previously built in-house glycoproteome DBs. To this end, we first analyzed the HILIC-enriched site-specific N-glycopeptides of human plasma by nano-reversed-phase liquid chromatography (nRPLC) coupled to MS with both HCD- and CID-MS/MS fragmentation. The resultant data were then computationally analyzed using the suite of specific algorithms within the I-GPA platform: N-glycopeptides were identified against the GPA-DB (id-GPA), quantitated (q-GPA), and finally compared between multiple samples (c-GPA), as outlined in Fig. 1a,b. All resultant data as well as the mass spectrometry raw data have been deposited to the ProteomeXchange Consortium via the MassIVE repository with the dataset identifiers PXD003369 and MSV000079426, respectively. The RAW files are available for download at ftp://massive.ucsd.edu/MSV000079426/.

**Figure 1: Structural and functional components of the Integrated N-glycoproteome analyzer (I-GPA).**

Construction and evaluation of composite GPA-DBs containing GPA-DB-AGP, GPA-DB-Mixture, and GPA-DB-Human Plasma

GPA-DBs were constructed for each sample, using the software GPA-DB-Builder, by combining possible tryptic peptides and 351 N-linked glycans, where 331 retrosynthetic glycans came from references of Kronewitter, S.R. et al.²⁷ and 20 glycans from penta and hexa polylactosamine series of Ozohanics, O. et al.²⁸ (Supplementary Excel 1). The GPA-DB includes isotope pattern information for masses and relative intensities of intact N-glycopeptides. GPA-DB-AGP (n = 4,212), GPA-DB-Mixture (n = 6,318), and GPA-DB-HumanPlasma (n = 254,826)²⁹ were used for the analysis of α1-acid glycoprotein (AGP), three standard protein mixtures, and depleted (or non-depleted) human plasma, respectively (see Supplementary Note 1, Supplementary Fig. 1, Supplementary Table 1, and Supplementary Excels 2, 3).
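
The reported database sizes are each exact multiples of the 351 glycans (4,212 = 12 × 351, 6,318 = 18 × 351, and 254,826 = 726 × 351), which is consistent with pairing every candidate peptide with every glycan composition. The following is a minimal Python sketch of that pairing, not the GPA-DB-Builder software itself. The function names and the input layout are hypothetical, the peptide masses are supplied by the caller, and isotope-pattern generation is omitted.

```python
from itertools import product

# Monoisotopic residue masses (Da) of the four monosaccharide classes.
RESIDUE_MASS = {
    "Hex": 162.05282,
    "HexNAc": 203.07937,
    "Fuc": 146.05791,
    "NeuAc": 291.09542,
}


def glycan_mass(n_hex, n_hexnac, n_fuc, n_neuac):
    """Mass added to a peptide by a glycan of the given composition."""
    return (n_hex * RESIDUE_MASS["Hex"]
            + n_hexnac * RESIDUE_MASS["HexNAc"]
            + n_fuc * RESIDUE_MASS["Fuc"]
            + n_neuac * RESIDUE_MASS["NeuAc"])


def build_db(peptides, glycans):
    """Pair every (sequence, peptide_mass) with every glycan composition.

    peptides: list of (sequence, monoisotopic mass in Da)
    glycans:  list of (Hex, HexNAc, Fuc, NeuAc) count tuples
    """
    return [
        {"peptide": seq, "glycan": comp, "mass": pep_mass + glycan_mass(*comp)}
        for (seq, pep_mass), comp in product(peptides, glycans)
    ]


# Illustrative entry: ENGTISR (775.38242 Da) with glycan 5402.
db = build_db([("ENGTISR", 775.38242)], [(5, 4, 0, 2)])
print(len(db), round(db[0]["mass"], 4))
```

With 12 peptides and 351 glycans, `build_db` would return 4,212 entries, matching the size given for GPA-DB-AGP.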

Automated identification of N-glycopeptides using id-GPA algorithms

id-GPA was designed to automatically identify site-specific N-glycopeptides using raw data converted to MS (.txt) and MS/MS (.mgf) formats (see Supplementary Note 2). Scoring entailed three steps: 1) selection of N-glycopeptide spectra based on 15 glycan-specific oxonium ions in HCD-MS/MS spectra (M-score); 2) selection of candidates by matching the isotope pattern to intact N-glycopeptides in the GPA-DB (S-score); and 3) identification of N-glycopeptides from CID- and HCD-MS/MS fragment ions (Y-score) with FDR ≤ 1%.

Selection of N-glycopeptide spectra using M-score

We noted that glycan oxonium ions, singly protonated mono- and oligosaccharide ions resulting from fragmentation of glycans and glycopeptides, are highly sensitive markers of glycopeptide fragmentation in HCD-MS/MS spectra³⁰. In general, previous studies have used only 3–5 oxonium ions in HCD spectra, merely to manually “flag” glycopeptide data without knowledge of their statistical distribution in MS/MS spectra. By contrast, we used a total of 15 oxonium ions (Fig. 2a, left), weighted differently according to their frequency of appearance in HCD spectra (for example, the m/z 204, 186, 168, and 138 series), and then determined how often they appeared in HCD-MS/MS spectra using the M-score calculated using Equation 1 (Supplementary Note 3).

**Figure 2: Computational algorithm of id-GPA for identification of standard α1-acid glycoprotein (AGP).**

where N is the number of expected oxonium ions, n is the number of matched oxonium ions, I_i is the peak intensity of a matched oxonium ion, I_max(<700Da) is the highest peak intensity below 700 Da, MassError is the absolute difference between the mass of the matched peak and the theoretical mass of the oxonium ion, and C is the theoretical frequency of appearance in HCD spectra. The term MassError + 1.0 is used so that O_i does not become infinite. If the mass error is zero, O_i represents the weighted relative intensity of the matched oxonium ion. The M-score allows us to select only those MS/MS spectra that contain markers of N-glycopeptides from the large number of spectra obtained during an LC-MS run. We analyzed tryptic digests of a standard AGP, which has glycoforms of complex types. Fig. 2a (middle) shows the M-score distribution of the HCD-MS/MS spectra. Most MS/MS spectra had an M-score of <0.5, but higher M-scores were also present. After Gaussian fitting, 1,674 MS/MS spectra with M-score ≥ 1.3 were automatically selected from a total of 5,818 spectra. We manually confirmed that those spectra contained markers from the 15 oxonium ions, with an FDR of 2.5% (Fig. 2a, right). Selection of N-glycopeptides by M-score also proved useful in the analysis of the same sample after HILIC enrichment: the resulting distribution (Supplementary Fig. 2, Supplementary Table 2) showed that most N-glycopeptides had an M-score ≥ 1.6.
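
The definitions above can be turned into a rough Python sketch of oxonium-ion scoring. Because Equation 1 itself is not reproduced in this text, the exact functional form below is an assumption: each matched ion contributes O_i = C × (I_i / I_max(<700Da)) / (MassError + 1.0), and the contributions are summed and scaled by n/N. The ion list is a well-known subset rather than the paper's full set of 15, and the weights, the tolerance, and the use of Da for MassError are hypothetical.

```python
# Nominal label -> theoretical m/z of singly protonated oxonium ions.
# Only a well-known subset is listed; the paper uses 15 ions in total.
OXONIUM_MZ = {
    "138": 138.0550,
    "168": 168.0655,
    "186": 186.0761,
    "204": 204.0867,  # HexNAc
    "274": 274.0921,
    "292": 292.1027,  # NeuAc
    "366": 366.1395,  # Hex-HexNAc
}


def m_score(peaks, weights, tol_da=0.02):
    """Assumed M-score form: (n / N) * sum of O_i over matched ions.

    peaks:   list of (m/z, intensity) from one HCD-MS/MS spectrum
    weights: label -> C, the frequency-based weight (hypothetical values)
    tol_da:  matching tolerance in Da (assumption)
    """
    low_mass = [inten for mz, inten in peaks if mz < 700.0]
    if not low_mass or max(low_mass) <= 0:
        return 0.0
    i_max = max(low_mass)

    o_sum, n_matched = 0.0, 0
    for label, theo_mz in OXONIUM_MZ.items():
        hits = [(abs(mz - theo_mz), inten) for mz, inten in peaks
                if abs(mz - theo_mz) <= tol_da]
        if not hits:
            continue
        mass_error, inten = min(hits)  # closest peak within tolerance
        # Adding 1.0 to the mass error keeps O_i finite for a perfect match.
        o_sum += weights[label] * (inten / i_max) / (mass_error + 1.0)
        n_matched += 1
    return (n_matched / len(OXONIUM_MZ)) * o_sum
```

Spectra would then be kept when their score exceeds a cutoff derived from the score distribution, analogous to the M-score ≥ 1.3 threshold described above, although values from this sketch are not on the same scale as the published scores.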

Selection of N-glycopeptide candidates using S-score

Once the MS/MS spectra were selected by M-score, we obtained isotope patterns of their precursor ions, and then searched against the previously constructed GPA-DB for the best match. We compared the isotope patterns of molecular ions between experimental and theoretical data (Fig. 2b, left) and calculated the similarity (S-score) to select N-glycopeptide candidates according to Equation 2.

where X1 is the m/z of the theoretical isotope peak, X2 is the intensity of the theoretical isotope peak, Y1 is the m/z of the experimental isotope peak, and Y2 is the relative intensity of the experimental isotope peak. When calculating the S-score, we considered mass accuracy and relative intensity at a ratio of 9:1, because the best AUC (area under the ROC curve) value for true N-glycopeptide matching in the analysis of AGP was close to this value (i.e., 0.899) (Supplementary Fig. 3). Here, we assumed that the N-glycopeptides matched against GPA-DB-AGP represented true assignments if they were present in the reference list of Ozohanics et al.²⁸ Fig. 2b (middle) shows the distribution of S-scores among 1,674 precursor ions with M-score ≥ 1.3 in the analysis described above. We manually confirmed that those spectra contained 924 precursor ions of N-glycopeptide candidates with S-score ≥ 98 (FDR of 19.7%, Fig. 2b, right). Of these, 195 were found in the reference list of AGP (estimated FDR of 14.7%).

Identification of N-glycopeptides using Y-score with FDR

CID-MS/MS spectra of N-glycopeptides exhibit specific spectral characteristics: Y-ions (intact peptide ions with a partially fragmented glycan moiety attached) and B-ions (fragments of one or more monosaccharides from the non-reducing end of the attached glycan) (Fig. 2c, left). For example, the precursor of the ENGTISR_5402 (+3) glycopeptide from AGP is fragmented into Y-ions (+2), Y-ions (+1), and B-ions (+1). (Here, the string of digits following the amino-acid sequence of the peptide denotes the composition of the attached glycan: for example, the glycoform with 5 Hex, 4 HexNAc, 0 Fucose, and 2 NeuAc, in that order, was designated 5402.) HCD-MS/MS spectra of N-glycopeptides exhibit only Y-ions (+1) regardless of precursor ion charge, together with y- and b-ions, which reveal the amino-acid sequences of peptides, as well as oxonium ions. We compared experimental CID- and HCD-MS/MS spectra to the theoretical ones expected from the N-glycopeptide candidates selected by S-scoring. Then CID_match and HCD_match, which represent the matches between the experimental and theoretical CID and HCD fragment peaks of N-glycopeptides, respectively, were calculated according to Equation 3 (Supplementary Notes 3 and 4).

where n is the number of peaks, I_max is the intensity of the highest peak in the spectrum, M_i is the matched peak intensity, and S_i is the individual peak intensity in the spectrum. We combined CID_match and HCD_match to select as many true identifications of N-glycopeptide as possible. Eventually, the CID_match:HCD_match ratio of 7:3 yielded the highest AUC value (Supplementary Fig. 4). Therefore we defined the Y-score according to Equation 4.

In order to estimate the number of false-positive identifications, we calculated the FDR using a decoy method. After S-scoring, we obtained N-glycopeptide candidates, including their glycoforms and peptides, and then generated decoy B- and Y-ion candidates by changing the numbers of hexose (Hex), N-acetylhexosamine (HexNAc), fucose (Fuc), and N-acetylneuraminic acid (NeuAc) residues in the glycoforms and the amino-acid sequences of the peptides. Based on this information, we constructed a decoy MS/MS database by 1) exchanging the numbers of Hex and HexNAc, 2) exchanging the numbers of Fuc and NeuAc, and 3) reversing the amino-acid sequence. As an example, the calculated B- and Y-ions and their corresponding decoy ions for the N-glycopeptide ENGTISR_5402 (+3) are listed in Supplementary Table 3.
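
The three decoy rules can be illustrated at the composition level with a short Python sketch. It parses the `PEPTIDE_HHNF` notation used in this article (for example `ENGTISR_5402`) and returns the swapped and reversed decoy. The class and function names are hypothetical, the parser assumes single-digit residue counts, and generating the actual decoy B- and Y-ion m/z lists is left out.

```python
from typing import NamedTuple


class Glycopeptide(NamedTuple):
    peptide: str
    n_hex: int
    n_hexnac: int
    n_fuc: int
    n_neuac: int


def parse(label):
    """Parse e.g. 'ENGTISR_5402' (Hex, HexNAc, Fuc, NeuAc; one digit each)."""
    peptide, digits = label.split("_")
    if len(digits) != 4 or not digits.isdigit():
        raise ValueError(f"expected four single-digit counts, got {digits!r}")
    n_hex, n_hexnac, n_fuc, n_neuac = (int(d) for d in digits)
    return Glycopeptide(peptide, n_hex, n_hexnac, n_fuc, n_neuac)


def make_decoy(gp):
    """Swap Hex<->HexNAc and Fuc<->NeuAc counts, and reverse the peptide."""
    return Glycopeptide(
        peptide=gp.peptide[::-1],
        n_hex=gp.n_hexnac,
        n_hexnac=gp.n_hex,
        n_fuc=gp.n_neuac,
        n_neuac=gp.n_fuc,
    )


target = parse("ENGTISR_5402")
print(make_decoy(target))
```

For `ENGTISR_5402` the decoy has peptide `RSITGNE` with 4 Hex, 5 HexNAc, 2 Fuc, and 0 NeuAc, from which decoy fragment ions would then be calculated.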

Using this decoy MS/MS database, we obtained a Y-score distribution that enabled us to distinguish between false and true identifications (Fig. 2c, middle). According to this distribution, a threshold of Y-score ≥ 69.5 was chosen for the selection of true identifications at an estimated FDR of 0.9%. We then manually confirmed 456 N-glycopeptide spectra from 95 unique N-glycopeptides at Y-score ≥ 69.5, with 0% false positives, in the analysis of the AGP standard sample (Fig. 2c, right, Supplementary Table 8, Supplementary Excel 4, and Supplementary PDF 1). We then validated the identification of N-glycopeptides by id-GPA with GPA-DBs of various sizes, from the standard AGP to mixture samples (Supplementary Note 5, Supplementary Fig. 5, Supplementary Tables 4–6, and Supplementary Excels 5,6–7).
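
As a rough illustration of how a score cutoff can be derived from target and decoy matches, the Python sketch below combines CID_match and HCD_match at the 7:3 ratio and estimates the FDR at a given threshold. Two points are assumptions, since Equation 4 is not reproduced in this text: the Y-score is taken to be the weighted sum 0.7 × CID_match + 0.3 × HCD_match, and the FDR is estimated as the number of decoy hits divided by the number of target hits above the threshold, which is a common target-decoy convention. All function names are hypothetical.

```python
def y_score(cid_match, hcd_match, w_cid=0.7, w_hcd=0.3):
    """Assumed Y-score: weighted sum of the CID and HCD match scores (7:3)."""
    return w_cid * cid_match + w_hcd * hcd_match


def estimated_fdr(target_scores, decoy_scores, threshold):
    """Decoy hits / target hits at or above the threshold (assumed estimator)."""
    n_target = sum(s >= threshold for s in target_scores)
    n_decoy = sum(s >= threshold for s in decoy_scores)
    return n_decoy / n_target if n_target else 0.0


def lowest_threshold(target_scores, decoy_scores, max_fdr=0.01):
    """Smallest observed target score whose estimated FDR is within max_fdr."""
    for t in sorted(set(target_scores)):
        if estimated_fdr(target_scores, decoy_scores, t) <= max_fdr:
            return t
    return None


# Made-up scores, for illustration only.
targets = [90.0, 80.0, 70.0, 60.0]
decoys = [65.0, 40.0]
print(lowest_threshold(targets, decoys))
```

Scanning thresholds from low to high keeps as many target identifications as possible while staying within the chosen FDR, which mirrors the way the Y-score ≥ 69.5 cutoff is described above.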

High-speed label-free quantitation using q-GPA algorithm

For automated label-free quantitation of the identified N-glycopeptides, we developed q-GPA using a new algorithm named top three-isotopes quantification (3TIQ). 3TIQ uses the combined intensity of the top three isotope peaks at the three highest MS spectral points (Fig. 3). This approach has several advantages. First, because it requires no peak area generation from the extracted ion chromatogram (XIC), it allows high-speed quantitation. Second, as in the case of evaluating the isotope pattern by S-scoring, it effectively removes signal interference from co-eluted ions of similar m/z values. Third, considering the top three isotope peaks provides more sensitive results with better S/N ratios, because for N-glycopeptides (>4,000 Da) the M + 1, M + 2, and M + 3 isotope peaks are generally more intense than the M peak. To determine how many MS data points are needed for 3TIQ-based quantitation, we quantitated N-glycopeptides according to the number of MS spectral points and compared the results (Fig. 3d, Supplementary Excel 8). Requiring R² ≥ 0.95 against the XIC and ≥ 99.0% of N-glycopeptides quantitated, we found that the three data points with the highest values gave the best results (Fig. 3e). Overall, 99.2% of the identified N-glycopeptides were quantitated with a correlation of R² = 0.959 against the XIC. Label-free quantification by 3TIQ was validated with standard RNase B spiked at different concentrations into the AGP standard solution, and the calibration curves of all N-glycopeptides exhibited good linearity (R² ≥ 0.99) (Supplementary Note 5, Supplementary Fig. 6, and Supplementary Table 7).
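
The 3TIQ idea can be sketched in a few lines of Python. The sketch assumes that "combined intensity" means a plain sum, that each MS scan across the elution profile is given as a list of isotope-peak intensities for one glycopeptide, and that the function name `tiq3` is hypothetical. Interference removal by S-scoring is not shown.

```python
def tiq3(scans, n_isotopes=3, n_points=3):
    """Sum the top isotope peaks per scan, then sum the highest scans.

    scans: list of lists; each inner list holds the isotope-peak
           intensities (M, M+1, M+2, ...) of one glycopeptide in one MS scan.
    """
    # Combined intensity of the most intense isotope peaks in each scan.
    per_scan = [sum(sorted(isotopes, reverse=True)[:n_isotopes])
                for isotopes in scans]
    # Keep only the scans with the highest combined intensity.
    return sum(sorted(per_scan, reverse=True)[:n_points])


# Made-up elution profile of five scans, for illustration only.
profile = [
    [1, 5, 4, 3],
    [2, 10, 9, 6],
    [1, 8, 7, 5],
    [0, 2, 2, 1],
    [1, 6, 5, 4],
]
print(tiq3(profile))
```

Because only the apex scans are used, no chromatographic peak has to be integrated, which is the source of the speed advantage described above.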

**Figure 3: q-GPA algorithm for label-free quantitation of standard α1-acid glycoprotein (AGP).**

Comparative analysis of multiple samples using c-GPA algorithm

After quantitation of N-glycopeptides in each sample using q-GPA, we performed comparative analysis using c-GPA. First, we combined all q-GPA data from multiple samples and compiled a total N-glycopeptide list that included information about isotope pattern, retention time, and abundance. As shown in Fig. 4a, if N-glycopeptide A was quantitated in all samples, we used its abundances obtained by q-GPA without additional processing. However, if N-glycopeptides B and C were not identified in some samples, we located them in those samples using the isotope pattern and retention time recorded in the total N-glycopeptide list and obtained their abundances using the 3TIQ method. Here, the similarity of the isotope pattern had to be reflected by an S-score ≥ 98, with a retention time within 5 min of the previously observed one. In comparison with the abundances obtained from the XIC (Fig. 4b, Supplementary Excel 9), all N-glycopeptides at four N-glycosylation sites exhibited a good correlation (R² > 0.93). The Pearson correlation coefficient between the coefficients of variation (CVs) (r = 0.8199, P < 0.0001) also demonstrated the similarity between the quantitative results obtained by c-GPA and conventional XIC (Fig. 4c). We evaluated the reproducibility in biological and technical replicates with benchmark datasets of standard glycoproteins spiked into HeLa cell lysates (Supplementary Note 6 and Supplementary Fig. 7).
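
The gap-filling rule for glycopeptides that were not identified in a given sample can be sketched as a simple filter. The Python below is an illustration under stated assumptions: the dictionary layout of a candidate feature, the function name, and the choice of the best-scoring candidate when several pass are all hypothetical. Only the two criteria (S-score ≥ 98 and a retention time within 5 min) come from the text.

```python
def fill_missing(reference_rt, sample_features, min_s_score=98.0, max_rt_shift=5.0):
    """Return the 3TIQ abundance of an unidentified glycopeptide, or None.

    reference_rt:    retention time (min) from the combined glycopeptide list
    sample_features: candidate isotope clusters in this sample, each a dict
                     with 's_score', 'rt' (min) and 'abundance' (3TIQ value)
    """
    accepted = [
        f for f in sample_features
        if f["s_score"] >= min_s_score
        and abs(f["rt"] - reference_rt) <= max_rt_shift
    ]
    if not accepted:
        return None
    # Assumption: when several candidates pass, keep the best isotope match.
    return max(accepted, key=lambda f: f["s_score"])["abundance"]


# Made-up candidates, for illustration only.
candidates = [
    {"s_score": 99.1, "rt": 42.3, "abundance": 1.8e6},
    {"s_score": 98.4, "rt": 55.0, "abundance": 9.0e5},  # too far in RT
    {"s_score": 95.0, "rt": 41.9, "abundance": 2.5e6},  # poor isotope match
]
print(fill_missing(41.0, candidates))
```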

**Figure 4: c-GPA algorithm for quantitation of three different HILIC-enriched batches of standard α1-acid glycoprotein (AGP).**

Application of I-GPA to analysis of N-glycopeptides in the HCC plasma

To analyze N-glycopeptides that might be differentially expressed in HCC plasma using the I-GPA platform, ten plasma samples from normal individuals, from which six major plasma proteins were either depleted or not depleted, were pooled (Supplementary Note 7). Twelve nano-LC/MS runs were analyzed by I-GPA (Supplementary Fig. 8). Table 1 presents all the results, including the numbers of N-glycopeptide spectra selected by M-score, candidates selected by S-score, and N-glycopeptides identified by Y-score at an FDR of ≤1%. More N-glycopeptide-related data were obtained from a comparative analysis of depleted and non-depleted plasma. All analyses exhibited similar changes in the numbers of spectra and peptides: ~50% of N-glycopeptide candidates selected by M-score were filtered out by S-score, ~20% of N-glycopeptide spectra were identified as N-glycopeptides, and half of those were ultimately characterized as unique site-specific N-glycopeptides.

Table 1 id-GPA search results (M-, S-, Y-score and FDR) by triplicate analysis of pools of normal and HCC human plasma.

Our method identified 123 N-glycoproteins present in plasma at concentrations spanning five orders of magnitude, ranging from highly abundant proteins such as immunoglobulin G (IgG, ~1 mg/ml) to low-abundance proteins such as α-Fetoprotein (AFP, ~10 ng/ml). We also found a total of 619 unique N-glycopeptides (Fig. 5a): 449 and 352 N-glycopeptides, respectively, in normal and HCC plasmas from which six proteins were or were not depleted (Supplementary Fig. 9,10, Supplementary Excels 10,11).

**Figure 5: Analysis of N-glycopeptides in normal and hepatocellular carcinoma (HCC) plasma samples by I-GPA.**

For quantitative analysis, the abundance of N-glycopeptides obtained by the 3TIQ method was globally normalized and compared using c-GPA. Among 566 N-glycopeptides identified from six experiments with depleted plasma, an average of 549 XIC-based and 529 c-GPA-based quantitations were compared (Fig. 5b, Supplementary Excels 12,13–14). Among 440 N-glycopeptides identified from six experiments with non-depleted plasma, an average of 423 XIC-based and 404 c-GPA-based quantitations were compared (Supplementary Fig. 11, Supplementary Excels 15,16–17). Figure 5c shows that the abundances obtained by conventional XIC and by c-GPA based on the 3TIQ method were very similar. In c-GPA, the abundances of N-glycopeptides were averaged when they were quantitated two or more times by 3TIQ in three experiments. Nineteen N-glycopeptides with CV ≥ 30% were excluded. In cases where the same N-glycopeptide was observed in different charge states, the results were combined. A total of 435 and 342 unique site-specific N-glycopeptides from depleted and non-depleted plasma, respectively, were quantitatively compared. Collectively, we were able to automatically identify 619 site-specific N-glycopeptides with FDR ≤ 1% and simultaneously quantitate 598 N-glycopeptides from human plasma.
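
The replicate handling described above (average when a glycopeptide is quantitated at least twice in three experiments, and exclude it when the CV is 30% or more) can be sketched in Python as follows. The function name is hypothetical, missing replicates are represented by `None`, and the use of the sample standard deviation for the CV is an assumption.

```python
from statistics import mean, stdev


def summarize_replicates(abundances, min_observations=2, max_cv_percent=30.0):
    """Average replicate abundances, or return None if the entry is rejected.

    abundances: one value per experiment; None marks a missing quantitation.
    """
    observed = [a for a in abundances if a is not None]
    if len(observed) < min_observations:
        return None  # quantitated fewer than two times
    average = mean(observed)
    if average <= 0:
        return None
    cv_percent = 100.0 * stdev(observed) / average
    # Entries with CV >= 30% are excluded.
    return average if cv_percent < max_cv_percent else None


# Made-up triplicates, for illustration only.
print(summarize_replicates([100.0, 110.0, None]))   # kept, averaged
print(summarize_replicates([100.0, 200.0, 300.0]))  # CV = 50%, excluded
```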

Statistical analysis of the human plasma glycoproteome using a volcano plot

Based on a quantitative comparison of site-specific N-glycopeptides, we performed a statistical analysis of the human plasma glycoproteome using a volcano plot. Figure 5d shows the fold changes calculated from the abundances of N-glycoproteins in normal and HCC plasma. The abundance of each N-glycoprotein was determined by summing the abundances of all site-specific N-glycopeptides used for identification of that N-glycoprotein. Fourteen N-glycoproteins, including AFP, exhibited >2-fold differences in abundance in HCC relative to normal plasma. We found that the N-glycopeptide VNFTEIQK_5402 from AFP (Supplementary Fig. 12), a well-known liver cancer marker currently in clinical use (green circle in Fig. 5d), was detected only in depleted HCC plasma.
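
The protein-level roll-up and fold-change filter can be sketched in Python as below. This covers only the fold-change axis of a volcano plot; the significance axis (p-values from replicate measurements) is omitted. The function names are hypothetical and the abundance values are made up for illustration.

```python
import math
from collections import defaultdict


def protein_abundance(glycopeptides):
    """Sum site-specific glycopeptide abundances per glycoprotein.

    glycopeptides: iterable of (protein, abundance) pairs
    """
    totals = defaultdict(float)
    for protein, abundance in glycopeptides:
        totals[protein] += abundance
    return dict(totals)


def log2_fold_changes(normal, hcc):
    """log2(HCC / normal) for proteins quantitated in both groups."""
    return {
        p: math.log2(hcc[p] / normal[p])
        for p in normal.keys() & hcc.keys()
        if normal[p] > 0 and hcc[p] > 0
    }


def changed(fold_changes, min_fold=2.0):
    """Proteins whose abundance differs by more than min_fold in either direction."""
    cutoff = math.log2(min_fold)
    return sorted(p for p, fc in fold_changes.items() if abs(fc) > cutoff)


# Made-up abundances, for illustration only.
normal = protein_abundance([("A2M", 10.0), ("A2M", 30.0), ("CLU", 50.0)])
hcc = protein_abundance([("A2M", 100.0), ("CLU", 55.0)])
print(changed(log2_fold_changes(normal, hcc)))
```

Running the same filter on individual glycopeptides instead of summed proteins corresponds to the site-specific comparison, where changes at one glycoform are not averaged out.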

Several differentially presented glycoproteins in HCC plasma that were previously reported as candidate cancer biomarkers exhibited the same tendency in our analysis: levels of α2-macroglobulin (A2M)^31,32, sex hormone–binding globulin (SHBG)^31,33, and complement component C7^34,35 were elevated in HCC plasma, whereas levels of SERPINA5³⁵ and laminin (LAMC1)^36,37 were reduced. However, most N-glycoproteins including clusterin (CLU, blue circle in Fig. 5d) reported as HCC markers³⁸ showed no significant fold changes.

As our approach can identify specific changes in each N-glycopeptide of a single glycoprotein, we performed the same statistical analysis on site-specific N-glycopeptides (Fig. 5e). The results revealed 110 site-specific N-glycopeptides exhibiting >2-fold differences in abundance in HCC versus normal plasma (32 and 78 N-glycopeptides present at higher levels in normal and HCC plasma, respectively). This number was much larger than the analogous value obtained at the glycoprotein level. Some N-glycopeptides exhibited different fold changes according to the glycoforms attached at a given glycosylation site. In addition, 69% of the 78 N-glycopeptides that were elevated in HCC plasma contained more than one fucose.

Typically, N-glycoproteins exhibited significant changes not at the glycoprotein level but at the N-glycopeptide level, as in the cases of AGP (red circle in Fig. 5e and Supplementary PDF 2), α1-antichymotrypsin (AACT, purple circle in Fig. 5e), and hemopexin (HPX, cyan circle in Fig. 5e) in depleted plasma, and IgG (Supplementary PDF 3) in non-depleted plasma. Due to averaging effects at the protein level, these specific differences could not be detected without site-specific N-glycopeptide analysis. We examined site-specific N-glycosylation microheterogeneity in detail for each individual N-glycoprotein. The relative abundances of all site-specific N-glycopeptides identified in a single N-glycoprotein (IgG, AGP, or AACT) were compared between normal and HCC plasma (Supplementary Note 8 and Supplementary Fig. 13). These observations suggested that HCC was closely associated with fucosylation of branched glycoforms such as tri-antennary glycoforms. This is also consistent with previous reports on AGP³⁹ and hemopexin⁴⁰, in which hyperfucosylation and increased branching appear in liver diseases⁴¹.

Discussion

In this study, we describe the I-GPA platform for high-throughput glycoproteomics and demonstrate the efficacy of this approach in an analysis of the glycoproteome of human plasma. I-GPA has several advantages. First, using the GPA-DB we constructed, id-GPA can directly identify site-specific N-glycopeptides from complex N-glycoprotein mixtures in plasma. The GPA-DB consists of intact N-glycopeptides produced by in silico trypsin digestion, including information about isotope mass and intensity, and can be freely expanded as required according to the sample. Second, id-GPA can calculate the FDR based on a decoy method to ‘tune’ the method to detect true matches. Third, searching by id-GPA is fast, because it works with only qualified mass data; unsatisfactory mass data with low scores are filtered out before the subsequent search step. Fourth, q-GPA can easily quantitate N-glycopeptides by the 3TIQ method in a label-free manner, using the peak intensities of the major isotope ions rather than peak areas generated from the extracted ion chromatogram. Previous methods for comparative analysis were time-consuming and laborious because they were generally performed by chromatography of monoisotopic ions, manually extracted from LC-MS/MS analyses within limited MS tolerances and retention time windows. By contrast, q-GPA enables rapid quantitative analysis. Finally, I-GPA supports a variety of high-resolution MS equipment (Rs ≥ 30,000) with MS/MS fragmentation, including LTQ-FT, Orbitrap, and Q-TOF. For example, the identification of N-glycopeptides from standard AGP using the id-GPA search in Orbitrap and Q-TOF MS analyses gave similar results at an estimated FDR ≤ 1% (Supplementary Fig. 14, Supplementary Note 9, Supplementary Excels 5 and 18, Supplementary PDF 4).

Taken together, I-GPA can serve as a new versatile search engine for automated analysis of complex standard glycoproteins as well as biological/clinical samples. In a comparison of I-GPA with commercial software (Byonic) for the analysis of standard AGP, Byonic gave 27.5% false positives at the FDR of 0% reported by the tool. By contrast, I-GPA identified 95 site-specific N-glycopeptides from the standard AGP sample with 0% false positives at an estimated FDR ≤ 1% using the GPA decoy method (Supplementary Note 10, Supplementary Figs 15–17, Supplementary Table 8, Supplementary Excel 4). To explore the potential use of the I-GPA platform in biomarker discovery, we applied our strategy to the analysis of N-glycoproteins present in human plasma, a representative bio-fluid containing a mixture of various glycoproteins. The automatic spectrum annotation of 619 site-specific N-glycopeptides from 123 glycoproteins with an estimated FDR ≤ 1% marks the largest number reported to date; the identified glycoproteins span five orders of magnitude in concentration, and 598 N-glycopeptides were quantified simultaneously from human plasma samples, which are known to contain highly glycosylated proteins (Supplementary Figs 18,19, Supplementary PDF 5).

Our mapping performance was superior to that of the work recently reported by Mayampurath et al.^19,42, in which only 103 N-glycopeptides were identified (FDR < 5%) and 40 N-glycopeptides were manually quantified by label-free mass analysis of human sera. Furthermore, we confirmed the mapping performance of I-GPA for IgG, a representative serological glycoprotein, which could be quantified with a total of 46 site-specific N-glycopeptides from IgG glycoforms 1, 2, 3, and 4 in human plasma. By comparison, a recently published study by Huffman, J.E. et al.⁴³ identified only 16 N-glycopeptides from purified IgG glycoforms 1 and 2 (Supplementary Table 9). Remarkably, in the analysis of the site-specific N-glycopeptides, changes in fucosylation of highly branched glycoforms were frequently observed in HCC plasma proteins, suggesting potential utility in clinical diagnostic research.

Despite a great deal of biological interest, many approaches to glycoprotein analysis to date show limitations: lack of integration of proteomics and glycomics, a relatively small set of targeted glycoproteins, and insufficient glycoprotein DBs. By contrast, I-GPA, a newly developed search engine, covers both areas (i.e., glycoproteomics) and allows direct analysis of site-specific N-glycopeptides from complex glycoprotein mixtures using efficient glycoprotein DBs, with an analytical efficiency similar to that currently available in proteomics at FDR ≤ 1%. By fulfilling an unmet need for an automated method for high-throughput glycoproteomic analysis of a broad range of biological samples, I-GPA will also contribute to the C-HPP, which carries out comprehensive in-depth studies on cells, tissues, organs, and biological fluids.

Methods

Materials

Glycoprotein standards (RNase B (source: bovine, Cat. No. R1153), α1-acid glycoprotein (source: human, Cat. No. G9885), and IgG (source: human, Cat. No. I4505)), 1,4-dithiothreitol (DTT), iodoacetamide (IAA), trifluoroacetic acid (TFA), and formic acid (FA) were purchased from Sigma-Aldrich (St. Louis, MO). For glycopeptide enrichment, a ZIC-HILIC kit (ProteoExtract® Glycopeptide Enrichment Kit) was obtained from EMD Millipore (Cat. No. 72103-3). Trypsin Gold (mass spectrometry grade, V5280) for protein digestion was obtained from Promega (Madison, WI). HPLC-grade acetonitrile from J.T. Baker (Phillipsburg, NJ) and deionized water from a Millipore Milli-Q Advantage A10 System were used. Plasma samples collected with an appropriate concentration of K₂EDTA were obtained from Yonsei University College of Medicine (Seoul, Korea) in accordance with IRB guidelines for informed consent and approval, and were stored at −80 °C until use.

Enzyme digestion of standard and plasma samples

Each glycoprotein standard (RNase B, AGP and IgG) was prepared at a concentration of 50 μg/100 μL in 20 mM Tris-HCl (pH 8.00) and denatured at 95 °C for 5 min. After cooling to room temperature, the protein solution was reduced by adding 2.5 μL of 200 mM DTT at 60 °C for 45 min and alkylated by adding 10 μL of 200 mM IAA at room temperature for 45 min in the dark. To quench the alkylation, 5 μL of 200 mM DTT was added and the solution was incubated at room temperature for 30 min. The solution was then digested with trypsin (total protein:trypsin = 10:1, by weight) at 37 °C overnight. For validation of the M-, S- and Y-scores in the GPA algorithm, glycopeptides from digested AGP were concentrated with Amicon Ultra 3 K MWCO (molecular weight cutoff) filters (product UFC500396; Millipore Ireland Ltd). For the calibration curves of N-linked glycopeptides from RNase B, six different concentrations of RNase B digest were each spiked into the same amount of AGP digest (0.15 μg). For optimization of the tandem mass spectrometry conditions (CID and HCD), equal amounts of RNase B, AGP and IgG digests were combined. Digested samples were diluted with 0.1% FA/99.9% H₂O for UPLC/LTQ-Orbitrap Elite mass spectrometry analysis or dried in a SpeedVac for glycopeptide enrichment.
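The reagent amounts in the digestion protocol can be checked with a little arithmetic. The following is a minimal sketch, not part of the original method: it assumes that volumes are simply additive and that no liquid is lost between steps, and the helper name `final_conc_mM` is hypothetical.

```python
# Reagent arithmetic for the standard digestion protocol.
# Stock concentrations and volumes are taken from the text; the assumption
# that volumes are additive (no evaporation or transfer loss) is ours.

def final_conc_mM(stock_mM: float, added_uL: float, total_uL: float) -> float:
    """Concentration of a reagent after dilution into a total volume."""
    return stock_mM * added_uL / total_uL

protein_ug = 50.0   # 50 ug of glycoprotein standard
start_uL = 100.0    # in 100 uL of 20 mM Tris-HCl

# Reduction: 2.5 uL of 200 mM DTT
vol_after_dtt = start_uL + 2.5
dtt_mM = final_conc_mM(200.0, 2.5, vol_after_dtt)

# Alkylation: 10 uL of 200 mM IAA
vol_after_iaa = vol_after_dtt + 10.0
iaa_mM = final_conc_mM(200.0, 10.0, vol_after_iaa)

# Quenching: 5 uL of 200 mM DTT
vol_after_quench = vol_after_iaa + 5.0

# Trypsin at total protein:trypsin = 10:1 by weight
trypsin_ug = protein_ug / 10.0

print(f"DTT during reduction:  {dtt_mM:.2f} mM")
print(f"IAA during alkylation: {iaa_mM:.2f} mM")
print(f"Volume before trypsin: {vol_after_quench:.1f} uL")
print(f"Trypsin required:      {trypsin_ug:.1f} ug")
```

Under these assumptions the reduction step runs at roughly 5 mM DTT and the alkylation step at roughly 18 mM IAA, with 5 μg of trypsin per 50 μg of protein.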

A multiple affinity removal column (MARC; Agilent) with an HP1100 LC system (Agilent) was used to deplete the six most abundant proteins in plasma (albumin, transferrin, IgG, IgA, haptoglobin, and α₁-antitrypsin) according to the manufacturer’s specifications. Flow-through fractions (“depleted plasma”) were collected and stored at −20 °C until use. Depleted and non-depleted plasma samples were desalted and concentrated by centrifugal filtration using a 10,000-Da MWCO (molecular weight cutoff) filter (VIVASPIN 6, product No. VS0602; Sartorius Stedim Biotech GmbH, Göttingen, Germany). Protein in the human plasma samples was quantified by the Bradford protein assay. Ten individual plasma samples were pooled for each of four groups: non-depleted normal, non-depleted HCC, depleted normal and depleted HCC. Pooled plasma samples were diluted with 20 mM Tris-HCl buffer (pH 8.00). Diluted plasma samples (100 μg) were reduced, alkylated and digested with DTT, IAA and trypsin, respectively, following the glycoprotein standard digestion protocol described above. Digested samples were diluted with 0.1% FA/99.9% H₂O for UPLC/LTQ-Orbitrap Elite mass spectrometry analysis or dried in a SpeedVac for glycopeptide enrichment.

Enrichment of N-glycopeptides

Glycopeptides were enriched with a ZIC-HILIC kit (ProteoExtract® Glycopeptide Enrichment Kit) according to the manufacturer’s instructions (EMD Millipore). The kit includes ZIC® Glycocapture Resin, ZIC® Binding Buffer, ZIC® Wash Buffer and ZIC® Elution Buffer. Briefly, 10 μL of AGP digest or plasma digest at a concentration of 2–4 μg/μL was prepared and diluted with 50 μL of ZIC® Binding Buffer. The ZIC® Glycocapture Resin was mixed, and 50 μL of the homogeneous resin was transferred to a new microcentrifuge tube and centrifuged for 1–2 min at 2,000–2,500 × g. The supernatant was thoroughly removed and discarded. The diluted digest was transferred to the resin, mixed by pipetting up and down, and incubated at 1,200 rpm for 10–20 min. The tube was centrifuged for 1–2 min at 2,000–2,500 × g, and the supernatant was thoroughly removed and discarded. The resin was washed with 150 μL of ZIC® Wash Buffer, mixed by pipetting up and down, incubated at 1,200 rpm for 5–10 min and centrifuged for 1–2 min at 2,000–2,500 × g, after which the supernatant was thoroughly removed and discarded. This wash was performed three times in total. For glycopeptide elution, 75–100 μL of ZIC® Elution Buffer was added. The tube was mixed by pipetting up and down, incubated at 1,200 rpm for 2–5 min and centrifuged for 1–2 min at 2,000–2,500 × g. The supernatant, which contains the glycopeptides, was transferred to a new microcentrifuge tube, centrifuged again for 2 min at 10,000 × g and transferred to another new microcentrifuge tube to avoid carrying over resin particles. Eluates were dried in a SpeedVac and redissolved in 0.1% FA/99.9% H₂O for UPLC/LTQ-Orbitrap Elite mass spectrometry analysis.

Nano-LC-ESI-MS/MS analysis

Samples redissolved or diluted in 0.1% FA/99.9% H₂O were separated on a Nano Acquity UPLC system (Waters, USA) and analyzed with an LTQ Orbitrap Elite mass spectrometer (Thermo Scientific, USA) equipped with a nano-electrospray source. An autosampler was used to load each 5-μL aliquot of the peptide solution onto a C₁₈ trap column (i.d. 180 μm, length 20 mm, particle size 5 μm; Waters, USA). The peptides were desalted and concentrated on the trap column for 10 min at a flow rate of 5 μL/min. The trapped peptides were then back-flushed onto a homemade microcapillary column (i.d. 100 μm, length 200 mm; C₁₈, 3 μm particle size, 125 Å) for separation. Mobile phases A and B were 100% water containing 0.1% formic acid and 100% acetonitrile (ACN) containing 0.1% formic acid, respectively. The LC gradient was as follows: 5% B was held from 0 to 15 min; B was then ramped to 15% over 5 min, to 50% over 75 min and to 95% over 1 min; 95% B was held for 13 min; B was decreased to 5% over 1 min; and the column was finally re-equilibrated at 5% B for 10 min. For plasma sample analysis, the LC gradient time was extended to 180 min. The electrospray voltage was 2.2 kV. The LTQ Orbitrap Elite mass spectrometer was operated in data-dependent mode during the liquid chromatography separation. The MS acquisition parameters were as follows: full scans were acquired in the Orbitrap at a resolution of 120,000; five data-dependent MS/MS scans were acquired per full scan by collision-induced dissociation (CID) and/or higher-energy collision dissociation (HCD); CID scans were acquired in the linear trap quadrupole (LTQ) with a 30 ms activation time, and HCD scans were acquired in the Orbitrap at a resolution of 15,000 with a 20 ms activation time; the normalized collision energy (NCE) was 35% for both CID and HCD; and the isolation window was 5.0 Da for both CID and HCD. Previously fragmented ions were excluded for 180 s for all MS/MS scans. The MS1 scan range was m/z 400–2500 for glycoprotein standards and m/z 800–1800 for plasma samples.
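The LC gradient described above can be written as a simple timeline, which makes the total run time easy to verify. This is a minimal sketch under two assumptions that are not stated explicitly in the text: each ramp is linear, and each stated duration is the time taken to reach the target %B. The names `STEPS`, `build_timeline` and `percent_b` are hypothetical.

```python
from bisect import bisect_right

# Each step is (duration in min, %B reached at the end of the step),
# read from the gradient description for the glycoprotein standards.
STEPS = [
    (15, 5),   # hold 5% B from 0 to 15 min
    (5, 15),   # ramp to 15% B
    (75, 50),  # ramp to 50% B
    (1, 95),   # ramp to 95% B
    (13, 95),  # hold 95% B
    (1, 5),    # return to 5% B
    (10, 5),   # re-equilibrate at 5% B
]
START_B = 5.0

def build_timeline(steps, start_b):
    """Return cumulative breakpoint times and the %B value at each breakpoint."""
    times, values = [0.0], [float(start_b)]
    for duration, target in steps:
        times.append(times[-1] + duration)
        values.append(float(target))
    return times, values

def percent_b(t, times, values):
    """Linearly interpolate %B at time t (min); clamp outside the program."""
    if t <= times[0]:
        return values[0]
    if t >= times[-1]:
        return values[-1]
    i = bisect_right(times, t) - 1
    t0, t1 = times[i], times[i + 1]
    v0, v1 = values[i], values[i + 1]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)

TIMES, VALUES = build_timeline(STEPS, START_B)
total_min = TIMES[-1]

print(f"Total program length: {total_min:.0f} min")
print(f"%B at 57.5 min: {percent_b(57.5, TIMES, VALUES):.1f}")
```

With these step durations the program sums to 120 min. The extended 180-min plasma gradient is not broken down in the text, so it is not modelled here.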

The nano-LC-ESI-MS/MS conditions for Q-TOF data were as follows. The digested AGP sample was separated on an Ekspert™ nanoLC 400 (Eksigent) and analyzed with an AB SCIEX TripleTOF® 5600⁺ mass spectrometer equipped with a nano-electrospray source in information-dependent acquisition (IDA) mode. The sample was desalted and concentrated on a C₁₈ trap column (i.d. 180 μm, length 20 mm, particle size 5 μm; Waters, USA) for 10 min at a flow rate of 5 μL/min. The trapped peptides were back-flushed onto a homemade microcapillary column (i.d. 100 μm, length 200 mm; C₁₈, 3 μm particle size, 125 Å) for separation. The 120-min LC gradient was the same as that used for nano-LC-ESI-MS/MS with LTQ Orbitrap data acquisition. MS parameters were set to an MS1 scan of 250–1800 Da (250 ms accumulation time, positive ion mode) coupled to IDA criteria under which precursors with a charge state of 2–5 exceeding 5 cps triggered an MS/MS product ion scan of 100–2000 Da (100 ms accumulation time, positive ion mode).
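The IDA trigger criteria can be expressed as a small predicate. This sketch only illustrates the stated thresholds and does not reproduce the instrument vendor's acquisition logic. The `Precursor` class, the function name and the example values are hypothetical; reading "exceeding 5 cps" as a strict inequality and applying the MS1 scan range as an m/z window are assumptions.

```python
from dataclasses import dataclass

@dataclass
class Precursor:
    mz: float             # observed m/z in the MS1 scan
    charge: int           # assigned charge state
    intensity_cps: float  # signal intensity in counts per second

def triggers_msms(p: Precursor,
                  min_charge: int = 2,
                  max_charge: int = 5,
                  min_cps: float = 5.0,
                  mz_range: tuple = (250.0, 1800.0)) -> bool:
    """True if the precursor meets the charge, intensity and scan-range criteria."""
    in_range = mz_range[0] <= p.mz <= mz_range[1]
    charge_ok = min_charge <= p.charge <= max_charge
    intense_enough = p.intensity_cps > min_cps  # "exceeding" read as strictly greater
    return in_range and charge_ok and intense_enough

# Hypothetical precursors, for illustration only
candidates = [
    Precursor(mz=900.4, charge=3, intensity_cps=120.0),   # passes all criteria
    Precursor(mz=650.3, charge=1, intensity_cps=500.0),   # singly charged
    Precursor(mz=1200.6, charge=4, intensity_cps=5.0),    # not above 5 cps
    Precursor(mz=1900.0, charge=2, intensity_cps=80.0),   # outside MS1 range
]
selected = [p for p in candidates if triggers_msms(p)]
print(len(selected), "of", len(candidates), "precursors would trigger MS/MS")
```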

Additional Information

How to cite this article: Park, G. W. et al. Integrated GlycoProteome Analyzer (I-GPA) for Automated Identification and Quantitation of Site-Specific N-Glycosylation. Sci. Rep. 6, 21175; doi: 10.1038/srep21175 (2016).

References

Barrabes, S. et al. Glycosylation of serum ribonuclease 1 indicates a major endothelial origin and reveals an increase in core fucosylation in pancreatic cancer. Glycobiology 17, 388–400 (2007).
Hwang, H. et al. Glycoproteomics in neurodegenerative diseases. Mass Spectrom. Revs. 29, 79–125 (2010).
Ahn, Y. H., Kim, J. Y. & Yoo, J. S. Quantitative mass spectrometric analysis of glycoproteins combined with enrichment methods. Mass Spectrom. Revs. 34, 148–165 (2015).
Kirmiz, C. et al. A serum glycomics approach to breast cancer biomarkers. Mol. Cell. Proteomics 6, 43–55 (2007).
Toyama, A. et al. Quantitative structural characterization of local N-glycan microheterogeneity in therapeutic antibodies by energy-resolved oxonium ion monitoring. Anal. Chem. 84, 9655–9662 (2012).
Chen, R. et al. Development of glycoprotein capture-based label-free method for the high-throughput screening of differential glycoproteins in hepatocellular carcinoma. Mol. Cell. Proteomics 10, M110 006445 (2011).
Ishihara, T. et al. Development of quantitative plasma N-glycoproteomics using label-free 2-D LC-MALDI MS and its applicability for biomarker discovery in hepatocellular carcinoma. J. Proteom. 74, 2159–2168 (2011).
Zielinska, D. F., Gnad, F., Wisniewski, J. R. & Mann, M. Precision mapping of an in vivo N-glycoproteome reveals rigid topological and sequence constraints. Cell 141, 897–907 (2010).
Yang, G. et al. Selective isolation and analysis of glycoprotein fractions and their glycomes from hepatocellular carcinoma sera. Proteomics 13, 1481–1498 (2013).
Ahn, Y. H. et al. A lectin-coupled, targeted proteomic mass spectrometry (MRM MS) platform for identification of multiple liver cancer biomarkers in human plasma. J. Proteom. 75, 5507–5515 (2012).
Taylor, A. D. et al. Towards an integrated proteomic and glycomic approach to finding cancer biomarkers. Genome Medicine 1, 57 (2009).
Liu, Y. et al. Identification and confirmation of biomarkers using an integrated platform for quantitative analysis of glycoproteins and their glycosylations. J. Proteome. Res. 9, 798–805 (2010).
Doerr, A. Glycoproteomics. Nat. Methods 9, 36 (2012).
Wada, Y., Tajiri, M. & Yoshida, S. Hydrophilic affinity isolation and MALDI multiple-stage tandem mass spectrometry of glycopeptides for glycoproteomics. Anal. Chem. 76, 6560–6565 (2004).
Rebecchi, K. R., Wenke, J. L., Go, E. P. & Desaire, H. Label-free quantitation: a new glycoproteomics approach. J. Am. Soc. Mass Spectrom. 20, 1048–1059 (2009).
Pompach, P. et al. Site-specific glycoforms of haptoglobin in liver cirrhosis and hepatocellular carcinoma. Mol. Cell. Proteomics 12, 1281–1293 (2013).
Trinidad, J. C., Schoepfer, R., Burlingame, A. L. & Medzihradszky, K. F. N- and O-glycosylation in the murine synaptosome. Mol. Cell. Proteomics 12, 3474–3488 (2013).
Parker, B. L. et al. Site-specific glycan-peptide analysis for determination of N-glycoproteome heterogeneity. J. Proteome. Res. 12, 5791–5800 (2013).
Mayampurath, A. et al. Computational framework for identification of intact glycopeptides in complex samples. Anal. Chem. 86, 453–463 (2014).
Strum, J. S. et al. Automated assignments of N- and O-site specific glycosylation with extensive glycan heterogeneity of glycoprotein mixtures. Anal. Chem. 85, 5666–5675 (2013).
Wu, S. W. et al. Sweet-Heart—an integrated suite of enabling computational tools for automated MS2/MS3 sequencing and identification of glycopeptides. J. Proteom. 84, 1–16 (2013).
Chandler, K. B., Pompach, P., Goldman, R. & Edwards, N. Exploring site-specific N-glycosylation microheterogeneity of haptoglobin using glycopeptide CID tandem mass spectra and glycan database search. J. Proteome. Res. 12, 3652–3666 (2013).
Bern, M., Kil, Y. J. & Becker, C. Byonic: advanced peptide and protein identification software. Curr. Protoc. Bioinformatics. Chapter 13, Unit13 20 (2012).
Lynn, K. S. et al. MAGIC: an automated N-linked glycoprotein identification tool using a Y1-ion pattern matching algorithm and in silico MS(2) approach. Anal. Chem. 87, 2466–2473 (2015).
Wu, S. W., Pu, T. H., Viner, R. & Khoo, K. H. Novel LC-MS2 product dependent parallel data acquisition function and data analysis workflow for sequencing and identification of intact glycopeptides. Anal. Chem. 86, 5478–5486 (2014).
Paik, Y. K. et al. The Chromosome-Centric Human Proteome Project for cataloging proteins encoded in the genome. Nat. Biotechnol. 30, 221–223 (2012).
Kronewitter, S. R. et al. The development of retrosynthetic glycan libraries to profile and classify the human serum N-linked glycome. Proteomics 9, 2986–2994 (2009).
Ozohanics, O. et al. High-performance liquid chromatography coupled to mass spectrometry methodology for analyzing site-specific N-glycosylation patterns. J. Chromatogr. A 1259, 200–212 (2012).
Farrah, T. et al. A high-confidence human plasma proteome reference set with estimated concentrations in Peptide Atlas. Mol. Cell. Proteomics 10, M110 006353 (2011).
Hart-Smith, G. & Raftery, M. J. Detection and characterization of low abundance glycopeptides via higher-energy C-trap dissociation and orbitrap mass analysis. J. Am. Soc. Mass Spectrom. 23, 124–140 (2012).
Tessitore, A. et al. Serum biomarkers identification by mass spectrometry in high-mortality tumors. Int. J. Proteomics 2013, 125858 (2013).
Sukata, T. et al. alpha(2)-Macroglobulin: a novel cytochemical marker characterizing preneoplastic and neoplastic rat liver lesions negative for hitherto established cytochemical markers. Am. J. Pathology 165, 1479–1488 (2004).
Lukanova, A. et al. Prediagnostic plasma testosterone, sex hormone-binding globulin, IGF-I and hepatocellular carcinoma: etiological factors or risk markers? Int. J. Cancer 134, 164–173 (2014).
Qin, X. & Gao, B. The complement system in liver diseases. Cell. Mol. Immunol. 3, 333–340 (2006).
Poon, T. C. et al. Application of classification tree and neural network algorithms to the identification of serological liver marker profiles for the diagnosis of hepatocellular carcinoma. Oncology 61, 275–283 (2001).
Jing, Y. et al. SERPINA5 inhibits tumor cell migration by modulating the fibronectin-integrin beta1 signaling pathway in hepatocellular carcinoma. Mol. Oncology 8, 366–377 (2014).
Akhavan, A. et al. Loss of cell-surface laminin anchoring promotes tumor growth and is associated with poor clinical outcomes. Cancer Res. 72, 2578–2588 (2012).
Nafee, A. M. et al. Clinical significance of serum clusterin as a biomarker for evaluating diagnosis and metastasis potential of viral-related hepatocellular carcinoma. Clinical Biochem. 45, 1070–1074 (2012).
Asao, T. et al. Development of a novel system for mass spectrometric analysis of cancer-associated fucosylation in plasma alpha1-acid glycoprotein. BioMed Res. International 2013, 834790 (2013).
Kobayashi, S. et al. Clinical utility of serum fucosylated hemopexin in Japanese patients with hepatocellular carcinoma. Hepatology Res. 42, 1187–1195 (2012).
Blomme, B., Van Steenkiste, C., Callewaert, N. & Van Vlierberghe, H. Alteration of protein glycosylation in liver diseases. J. Hepatology 50, 592–603 (2009).
Mayampurath, A. et al. Label-Free Glycopeptide Quantification for Biomarker Discovery in Human Sera. J. Proteome. Res. 13, 4821–4832 (2014).
Huffman, J. E. et al. Comparative Performance of Four Methods for High-throughput Glycosylation Analysis of Immunoglobulin G in Genetic and Epidemiological Research. Mol. Cell. Proteomics 13, 1598–1610 (2014).


Acknowledgements

This research was supported by the National Research Council of Science and Technology (CAP-15-03-KRIBB); by the Korea Health Technology R&D Project, through the Korea Health Industry Development Institute (KHIDI) funded by the Ministry of Health & Welfare, Republic of Korea (HI13C2098); and by the research program through the Korea Basic Science Institute (G35110).

Author information

Gun Wook Park and Jin Young Kim: These authors contributed equally to this work.

Authors and Affiliations

Department of Mass Spectrometry, Korea Basic Science Institute, Ochang, Republic of Korea
Gun Wook Park, Jin Young Kim, Heeyoun Hwang, Ju Yeon Lee, Hyun Kyoung Lee, Eun Sun Ji, Kwang Hoe Kim, Hoi Keun Jeong, Ki Na Yun & Jong Shin Yoo
Graduate School of Analytical Science and Technology, Chungnam National University, Daejeon, Republic of Korea
Gun Wook Park, Hyun Kyoung Lee, Kwang Hoe Kim, Hoi Keun Jeong, Hyun Joo An & Jong Shin Yoo
Department of Biomedical Science, Cheongju University, Cheongju, Republic of Korea
Young Hee Ahn
Department of Chemistry, Hannam University, Daejeon, Republic of Korea
Eun Sun Ji
Department of Chemistry, Sogang University, Seoul, Republic of Korea
Ki Na Yun
Cancer Biomarkers Development Research Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon, Republic of Korea
Yong-Sam Kim & Jeong-Heon Ko
Department of Food Nutrition, Chungnam National University, Daejeon, Republic of Korea
Jae Han Kim
Yonsei Proteome Research Center and Department of Integrated OMICS for Biomedical Science, and Department of Biochemistry, College of Life Science and Biotechnology, Yonsei University, Seoul, Republic of Korea
Young-Ki Paik

Contributions

G.W.P., J.Y.K. and J.S.Y. designed and invented the method; Y.H.A., Y.S.K., J.H.K., H.J.A. and J.H.K. provided critical input and contributed in the developments; J.Y.L., H.K.L., H.H., E.S.J., K.H.K., H.K.J. and K.N.Y. performed the experiments; G.W.P., J.Y.K., Y.K.P. and J.S.Y. interpreted the experiments and wrote the manuscript.

Corresponding author

Correspondence to Jong Shin Yoo.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Information (PDF 6128 kb)

Supplementary Excel 1 (XLSX 36 kb)

Supplementary Excel 2 (XLSX 32 kb)

Supplementary Excel 3 (XLSX 69 kb)

Supplementary Excel 4 (XLSX 92 kb)

Supplementary Excel 5 (XLSX 39 kb)

Supplementary Excel 6 (XLSX 28 kb)

Supplementary Excel 7 (XLSX 28 kb)

Supplementary Excel 8 (XLSX 22 kb)

Supplementary Excel 9 (XLSX 46 kb)

Supplementary Excel 10 (XLSX 120 kb)

Supplementary Excel 11 (XLSX 94 kb)

Supplementary Excel 12 (XLSX 283 kb)

Supplementary Excel 13 (XLSX 47 kb)

Supplementary Excel 14 (XLSX 29 kb)

Supplementary Excel 15 (XLSX 226 kb)

Supplementary Excel 16 (XLSX 44 kb)

Supplementary Excel 17 (XLSX 31 kb)

Supplementary Excel 18 (XLSX 90 kb)

Rights and permissions

This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/

About this article

Cite this article

Park, G., Kim, J., Hwang, H. et al. Integrated GlycoProteome Analyzer (I-GPA) for Automated Identification and Quantitation of Site-Specific N-Glycosylation. Sci Rep 6, 21175 (2016). https://doi.org/10.1038/srep21175

Received: 14 June 2015
Accepted: 19 January 2016
Published: 17 February 2016
DOI: https://doi.org/10.1038/srep21175
Springer Nature Limited
