1
Čaval T, Hecht ES, Tang W, Uy‐Gomez M, Nichols A, Kil YJ, Sandoval W, Bern M, Heck AJR. The lysosomal endopeptidases Cathepsin D and L are selective and effective proteases for the middle-down characterization of antibodies. FEBS J 2021; 288:5389-5405. [PMID: 33713388 PMCID: PMC8518856 DOI: 10.1111/febs.15813] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 08/08/2020] [Revised: 01/23/2021] [Accepted: 03/08/2021] [Indexed: 01/18/2023]
Abstract
Mass spectrometry is gaining momentum as a method of choice to de novo sequence antibodies (Abs). Adequate sequence coverage of the hypervariable regions remains one of the toughest identification challenges by either bottom-up or top-down workflows. Methods that efficiently generate mid-size Ab fragments would further facilitate top-down MS and decrease data complexity. Here, we explore the proteases Cathepsins L and D for forming protein fragments from three IgG1s, one IgG2, and one bispecific, knob-and-hole IgG1. We demonstrate that high-resolution native MS provides a sensitive method for the detection of clipping sites. Both Cathepsins produced multiple, albeit specific cleavages. The Abs were cleaved immediately after the CDR3 region, yielding ~ 12 kDa fragments, that is, ideal sequencing-sized. Cathepsin D, but not Cathepsin L, also cleaved directly below the Ab hinge, releasing the F(ab')2. When constrained by the different disulfide bonds found in the IgG2 subtype or by the tertiary structure of the hole-containing bispecific IgG1, the hinge region digest product was not produced. The Cathepsin L and Cathepsin D clipping motifs were related to sequences of neutral amino acids and the tertiary structure of the Ab. A single pot (L + D) digestion protocol was optimized to achieve 100% efficiency. Nine protein fragments, corresponding to the VL, VH, CL, CH1, CH2, CH3, CL + CH1, and F(ab')2, constituted ~ 70% of the summed intensities of all deconvolved proteolytic products. Cleavage sites were confirmed by the Edman degradation and validated with top-down sequencing. The described work offers a complementary method for middle-down analysis that may be applied to top-down Ab sequencing. ENZYMES: Cathepsin L-EC 3.4.22.15, Cathepsin D-EC 3.4.23.5.
Affiliation(s)
- Tomislav Čaval
  - Biomolecular Mass Spectrometry and Proteomics, Bijvoet Center for Biomolecular Research and Utrecht Institute for Pharmaceutical Sciences, Utrecht University, The Netherlands
  - Netherlands Proteomics Centre, Utrecht, The Netherlands
- Elizabeth Sara Hecht
  - Department of Microchemistry, Proteomics, and Lipidomics & Next Generation Sequencing, Genentech, Inc., South San Francisco, CA, USA
- Maelia Uy‐Gomez
  - Department of Microchemistry, Proteomics, and Lipidomics & Next Generation Sequencing, Genentech, Inc., South San Francisco, CA, USA
- Wendy Sandoval
  - Department of Microchemistry, Proteomics, and Lipidomics & Next Generation Sequencing, Genentech, Inc., South San Francisco, CA, USA
- Albert J. R. Heck
  - Biomolecular Mass Spectrometry and Proteomics, Bijvoet Center for Biomolecular Research and Utrecht Institute for Pharmaceutical Sciences, Utrecht University, The Netherlands
  - Netherlands Proteomics Centre, Utrecht, The Netherlands
2
Roushan A, Wilson GM, Kletter D, Sen KI, Tang W, Kil YJ, Carlson E, Bern M. Peak Filtering, Peak Annotation, and Wildcard Search for Glycoproteomics. Mol Cell Proteomics 2020; 20:100011. [PMID: 33578083 PMCID: PMC8724605 DOI: 10.1074/mcp.ra120.002260] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 07/30/2020] [Revised: 09/02/2020] [Accepted: 09/03/2020] [Indexed: 12/11/2022]
Abstract
Glycopeptides in peptide or digested protein samples pose a number of analytical and bioinformatics challenges beyond those posed by unmodified peptides or peptides with smaller posttranslational modifications. Exact structural elucidation of glycans is generally beyond the capability of a single mass spectrometry experiment, so a reasonable level of identification for tandem mass spectrometry, taken by several glycopeptide software tools, is that of peptide sequence and glycan composition, meaning the number of monosaccharides of each distinct mass, e.g., HexNAc(2)Hex(5) rather than man5. Even at this level, however, glycopeptide analysis poses challenges: finding glycopeptide spectra when they are a tiny fraction of the total spectra; assigning spectra with unanticipated glycans, not in the initial glycan database; and finding, scoring, and labeling diagnostic peaks in tandem mass spectra. Here, we discuss recent improvements to Byonic, a glycoproteomics search program, that address these three issues. Byonic now supports filtering spectra by m/z peaks, so that the user can limit attention to spectra with diagnostic peaks, e.g., at least two out of three of 204.087 for HexNAc, 274.092 for NeuAc (with water loss), and 366.139 for HexNAc-Hex, all within a set mass tolerance, e.g., ± 0.01 Da. Also, new is glycan "wildcard" search, which allows an unspecified mass within a user-set mass range to be applied to N- or O-linked glycans and enables assignment of spectra with unanticipated glycans. Finally, the next release of Byonic supports user-specified peak annotations from user-defined posttranslational modifications. We demonstrate the utility of these new software features by finding previously unrecognized glycopeptides in publicly available data, including glycosylated neuropeptides from rat brain.
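The diagnostic-peak filter mentioned in this abstract can be illustrated with a short Python sketch. This is not Byonic's implementation: the function name `has_diagnostic_peaks` and its parameters are hypothetical, and only the three m/z values, the two-out-of-three rule, and the ± 0.01 Da tolerance are taken from the abstract.

```python
from typing import Sequence

# Diagnostic m/z values quoted in the abstract:
# HexNAc, NeuAc (with water loss), and HexNAc-Hex.
DIAGNOSTIC_MZ = (204.087, 274.092, 366.139)


def has_diagnostic_peaks(
    peaks_mz: Sequence[float],
    targets: Sequence[float] = DIAGNOSTIC_MZ,
    tol_da: float = 0.01,
    min_hits: int = 2,
) -> bool:
    """Return True if at least `min_hits` target m/z values are matched
    by some peak in the spectrum within `tol_da` (hypothetical helper)."""
    hits = sum(
        any(abs(mz - target) <= tol_da for mz in peaks_mz)
        for target in targets
    )
    return hits >= min_hits


# A spectrum with HexNAc and HexNAc-Hex peaks passes the filter.
glyco_like = [204.0865, 366.1402, 500.0]
# A spectrum with only one diagnostic peak does not.
plain = [204.0865, 300.0]
print(has_diagnostic_peaks(glyco_like), has_diagnostic_peaks(plain))
```

Counting matched targets rather than matched peaks keeps two nearby peaks from being counted as two separate diagnostic ions.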
Affiliation(s)
- Abhishek Roushan
  - Research and Development Group, Protein Metrics Inc, Cupertino, California, USA
- Gary M Wilson
  - Research and Development Group, Protein Metrics Inc, Cupertino, California, USA
- Doron Kletter
  - Research and Development Group, Protein Metrics Inc, Cupertino, California, USA
- K Ilker Sen
  - Research and Development Group, Protein Metrics Inc, Cupertino, California, USA
- Wilfred Tang
  - Research and Development Group, Protein Metrics Inc, Cupertino, California, USA
- Yong J Kil
  - Research and Development Group, Protein Metrics Inc, Cupertino, California, USA
- Eric Carlson
  - Research and Development Group, Protein Metrics Inc, Cupertino, California, USA
- Marshall Bern
  - Research and Development Group, Protein Metrics Inc, Cupertino, California, USA
3
Bern M, Caval T, Kil YJ, Tang W, Becker C, Carlson E, Kletter D, Sen KI, Galy N, Hagemans D, Franc V, Heck AJR. Parsimonious Charge Deconvolution for Native Mass Spectrometry. J Proteome Res 2018; 17:1216-1226. [PMID: 29376659 PMCID: PMC5838638 DOI: 10.1021/acs.jproteome.7b00839] [Citation(s) in RCA: 71] [Impact Index Per Article: 11.8] [Indexed: 12/20/2022]
Abstract
Charge deconvolution infers the mass from mass over charge (m/z) measurements in electrospray ionization mass spectra. When applied over a wide input m/z or broad target mass range, charge-deconvolution algorithms can produce artifacts, such as false masses at one-half or one-third of the correct mass. Indeed, a maximum entropy term in the objective function of MaxEnt, the most commonly used charge deconvolution algorithm, favors a deconvolved spectrum with many peaks over one with fewer peaks. Here we describe a new “parsimonious” charge deconvolution algorithm that produces fewer artifacts. The algorithm is especially well-suited to high-resolution native mass spectrometry of intact glycoproteins and protein complexes. Deconvolution of native mass spectra poses special challenges due to salt and small molecule adducts, multimers, wide mass ranges, and fewer and lower charge states. We demonstrate the performance of the new deconvolution algorithm on a range of samples. On the heavily glycosylated plasma properdin glycoprotein, the new algorithm could deconvolve monomer and dimer simultaneously and, when focused on the m/z range of the monomer, gave accurate and interpretable masses for glycoforms that had previously been analyzed manually using m/z peaks rather than deconvolved masses. On therapeutic antibodies, the new algorithm facilitated the analysis of extensions, truncations, and Fab glycosylation. The algorithm facilitates the use of native mass spectrometry for the qualitative and quantitative analysis of protein and protein assemblies.
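The half-mass artifact mentioned in this abstract follows from the basic relation between neutral mass and m/z. The Python sketch below shows only that relation, not the parsimonious algorithm itself, which the abstract does not specify. It assumes positive-ion mode with protons as the only charge carriers; the function names and the example mass of 148,000 Da are hypothetical.

```python
PROTON_MASS = 1.007276  # Da; assumes each charge comes from one added proton


def mz_of(mass: float, z: int) -> float:
    """m/z of a neutral mass carrying z protons."""
    return mass / z + PROTON_MASS


def neutral_mass(mz: float, z: int) -> float:
    """Neutral mass implied by an m/z peak if its charge is taken to be z."""
    return z * (mz - PROTON_MASS)


true_mass = 148_000.0  # illustrative antibody-sized mass
even_charges = (24, 26, 28)
peaks = [mz_of(true_mass, z) for z in even_charges]

# Correct charge assignment recovers the true mass from every peak.
correct = [neutral_mass(mz, z) for mz, z in zip(peaks, even_charges)]

# Halving every charge explains the same peaks with a mass at one-half
# of the true value, which is how a false half-mass can arise.
halved = [neutral_mass(mz, z // 2) for mz, z in zip(peaks, even_charges)]

print([round(m, 3) for m in correct])
print([round(m, 3) for m in halved])
```

Only the odd charge states of the true mass distinguish the two interpretations, so a spectrum with few charge states leaves less evidence against the false mass.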
Affiliation(s)
- Marshall Bern
  - Protein Metrics, Inc., San Carlos, California 94070, United States
- Tomislav Caval
  - Biomolecular Mass Spectrometry and Proteomics, Bijvoet Center for Biomolecular Research and Utrecht Institute for Pharmaceutical Sciences, Science4Life, Utrecht University and Netherlands Proteomics Centre, Padualaan 8, 3584 CH Utrecht, The Netherlands
- Yong J Kil
  - Protein Metrics, Inc., San Carlos, California 94070, United States
- Wilfred Tang
  - Protein Metrics, Inc., San Carlos, California 94070, United States
- Eric Carlson
  - Protein Metrics, Inc., San Carlos, California 94070, United States
- Doron Kletter
  - Protein Metrics, Inc., San Carlos, California 94070, United States
- K Ilker Sen
  - Protein Metrics, Inc., San Carlos, California 94070, United States
- Nicolas Galy
  - Biomolecular Mass Spectrometry and Proteomics, Bijvoet Center for Biomolecular Research and Utrecht Institute for Pharmaceutical Sciences, Science4Life, Utrecht University and Netherlands Proteomics Centre, Padualaan 8, 3584 CH Utrecht, The Netherlands
- Dominique Hagemans
  - Biomolecular Mass Spectrometry and Proteomics, Bijvoet Center for Biomolecular Research and Utrecht Institute for Pharmaceutical Sciences, Science4Life, Utrecht University and Netherlands Proteomics Centre, Padualaan 8, 3584 CH Utrecht, The Netherlands
- Vojtech Franc
  - Biomolecular Mass Spectrometry and Proteomics, Bijvoet Center for Biomolecular Research and Utrecht Institute for Pharmaceutical Sciences, Science4Life, Utrecht University and Netherlands Proteomics Centre, Padualaan 8, 3584 CH Utrecht, The Netherlands
- Albert J R Heck
  - Biomolecular Mass Spectrometry and Proteomics, Bijvoet Center for Biomolecular Research and Utrecht Institute for Pharmaceutical Sciences, Science4Life, Utrecht University and Netherlands Proteomics Centre, Padualaan 8, 3584 CH Utrecht, The Netherlands
4
Sen KI, Tang WH, Nayak S, Kil YJ, Bern M, Ozoglu B, Ueberheide B, Davis D, Becker C. Automated Antibody De Novo Sequencing and Its Utility in Biopharmaceutical Discovery. J Am Soc Mass Spectrom 2017; 28:803-810. [PMID: 28105549 PMCID: PMC5392168 DOI: 10.1007/s13361-016-1580-0] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Received: 08/31/2016] [Revised: 12/02/2016] [Accepted: 12/04/2016] [Indexed: 05/12/2023]
Abstract
Applications of antibody de novo sequencing in the biopharmaceutical industry range from the discovery of new antibody drug candidates to identifying reagents for research and determining the primary structure of innovator products for biosimilar development. When murine, phage display, or patient-derived monoclonal antibodies against a target of interest are available, but the cDNA or the original cell line is not, de novo protein sequencing is required to humanize and recombinantly express these antibodies, followed by in vitro and in vivo testing for functional validation. Availability of fully automated software tools for monoclonal antibody de novo sequencing enables efficient and routine analysis. Here, we present a novel method to automatically de novo sequence antibodies using mass spectrometry and the Supernovo software. The robustness of the algorithm is demonstrated through a series of stress tests.
Affiliation(s)
- K Ilker Sen
  - Protein Metrics Inc, 1622 San Carlos Ave, Suite C, San Carlos, CA, 94070, USA
- Wilfred H Tang
  - Protein Metrics Inc, 1622 San Carlos Ave, Suite C, San Carlos, CA, 94070, USA
- Shruti Nayak
  - Langone Medical Center, New York University, 430 East 29th Street, 8th floor room 860, New York, NY, 10016, USA
- Yong J Kil
  - Protein Metrics Inc, 1622 San Carlos Ave, Suite C, San Carlos, CA, 94070, USA
- Marshall Bern
  - Protein Metrics Inc, 1622 San Carlos Ave, Suite C, San Carlos, CA, 94070, USA
- Berk Ozoglu
  - Janssen Research and Development, LLC, 1400 McKean Road, Spring House, PA, 19477, USA
- Beatrix Ueberheide
  - Langone Medical Center, New York University, 430 East 29th Street, 8th floor room 860, New York, NY, 10016, USA
- Darryl Davis
  - Janssen Research and Development, LLC, 1400 McKean Road, Spring House, PA, 19477, USA
- Christopher Becker
  - Protein Metrics Inc, 1622 San Carlos Ave, Suite C, San Carlos, CA, 94070, USA
5
Kil YJ, Bern M, Crowell K, Kletter D, Bern N, Tang W, Carlson E, Becker C. Towards a Comprehensive Bioinformatic Analysis of the NIST Reference mAb. ACS Symposium Series 2015. [DOI: 10.1021/bk-2015-1202.ch014] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 12/04/2022]
Affiliation(s)
- Yong J. Kil
  - Protein Metrics, Inc., San Carlos, California 94070, United States
- Marshall Bern
  - Protein Metrics, Inc., San Carlos, California 94070, United States
- Kevin Crowell
  - Protein Metrics, Inc., San Carlos, California 94070, United States
- Doron Kletter
  - Protein Metrics, Inc., San Carlos, California 94070, United States
- Nicholas Bern
  - Protein Metrics, Inc., San Carlos, California 94070, United States
- Wilfred Tang
  - Protein Metrics, Inc., San Carlos, California 94070, United States
- Eric Carlson
  - Protein Metrics, Inc., San Carlos, California 94070, United States
7
Abstract
Byonic is the name of a software package for peptide and protein identification by tandem mass spectrometry. This software, which has only recently become commercially available, facilitates a much wider range of search possibilities than previous search software such as SEQUEST and Mascot. Byonic allows the user to define an essentially unlimited number of variable modification types. Byonic also allows the user to set a separate limit on the number of occurrences of each modification type, so that a search may consider only one or two chance modifications such as oxidations and deamidations per peptide, yet allow three or four biological modifications such as phosphorylations, which tend to cluster together. Hence, Byonic can search for tens or even hundreds of modification types simultaneously without a prohibitively large combinatorial explosion. Byonic's Wildcard Search allows the user to search for unanticipated or even unknown modifications alongside known modifications. Finally, Byonic's Glycopeptide Search allows the user to identify glycopeptides without prior knowledge of glycan masses or glycosylation sites.
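The effect of per-type occurrence limits on the size of the search space can be illustrated with a toy count in Python. This is not how Byonic enumerates candidates. It assumes, for simplicity, that each modification type has its own non-overlapping set of candidate sites on a peptide; the site counts and limits below are made-up illustrative numbers, and `count_forms` is a hypothetical helper.

```python
from math import comb, prod
from typing import Sequence, Tuple


def count_forms(types: Sequence[Tuple[int, int]]) -> int:
    """Count modified forms of one peptide.

    Each entry is (n_sites, max_occurrences) for one modification type.
    Types are assumed to target disjoint sites, so the per-type counts
    multiply.
    """
    return prod(
        sum(comb(n_sites, k) for k in range(min(n_sites, limit) + 1))
        for n_sites, limit in types
    )


# Illustrative peptide: 3 oxidation sites (limit 1), 4 deamidation
# sites (limit 1), 6 phosphorylation sites (limit 4).
peptide = [(3, 1), (4, 1), (6, 4)]

limited = count_forms(peptide)
# Without limits, every site may be modified independently.
unlimited = count_forms([(n, n) for n, _ in peptide])

print(limited, unlimited)  # 1140 8192
```

In this toy case the limits on the chance modifications cut the count the most, while the looser limit on phosphorylation keeps nearly all of its clustered forms.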
Affiliation(s)
- Marshall Bern
  - Protein Metrics Inc, San Carlos, California
  - Palo Alto Research Center, Palo Alto, California
- Yong J Kil
  - Protein Metrics Inc, San Carlos, California
8
Bhatia S, Kil YJ, Ueberheide B, Chait BT, Tayo L, Cruz L, Lu B, Yates JR, Bern M. Constrained de novo sequencing of conotoxins. J Proteome Res 2012; 11:4191-200. [PMID: 22709442 DOI: 10.1021/pr300312h] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.4] [Indexed: 11/29/2022]
Abstract
De novo peptide sequencing by mass spectrometry (MS) can determine the amino acid sequence of an unknown peptide without reference to a protein database. MS-based de novo sequencing assumes special importance in focused studies of families of biologically active peptides and proteins, such as hormones, toxins, and antibodies, for which amino acid sequences may be difficult to obtain through genomic methods. These protein families often exhibit sequence homology or characteristic amino acid content; yet, current de novo sequencing approaches do not take advantage of this prior knowledge and, hence, search an unnecessarily large space of possible sequences. Here, we describe an algorithm for de novo sequencing that incorporates sequence constraints into the core graph algorithm and thereby reduces the search space by many orders of magnitude. We demonstrate our algorithm in a study of cysteine-rich toxins from two cone snail species (Conus textile and Conus stercusmuscarum) and report 13 de novo and about 60 total toxins.
Affiliation(s)
- Swapnil Bhatia
  - Palo Alto Research Center, 3333 Coyote Hill Road, Palo Alto, California 94304, United States
9
Abstract
The target-decoy approach to estimating and controlling false discovery rate (FDR) has become a de facto standard in shotgun proteomics, and it has been applied at both the peptide-to-spectrum match (PSM) and protein levels. Current bioinformatics methods control either the PSM- or the protein-level FDR, but not both. In order to obtain the most reliable information from their data, users must employ one method when the number of tandem mass spectra exceeds the number of proteins in the database and another method when the reverse is true. Here we propose a simple variation of the standard target-decoy strategy that estimates and controls PSM and protein FDRs simultaneously, regardless of the relative numbers of spectra and proteins. We demonstrate that even if the final goal is a list of PSMs with a fixed low FDR and not a list of protein identifications, the proposed two-dimensional strategy offers advantages over a pure PSM-level strategy.
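For readers unfamiliar with the baseline this abstract builds on, the standard PSM-level target-decoy estimate can be sketched in Python as follows. The sketch covers only the conventional one-dimensional estimate (decoy matches divided by target matches above a score cutoff), not the two-dimensional strategy proposed in the paper, whose details the abstract does not give. The function names and the example scores are hypothetical.

```python
from typing import Optional, Sequence, Tuple

PSM = Tuple[float, bool]  # (score, is_decoy)


def estimate_fdr(psms: Sequence[PSM], cutoff: float) -> float:
    """Estimated FDR among target PSMs scoring at or above `cutoff`."""
    targets = sum(1 for score, is_decoy in psms if score >= cutoff and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms if score >= cutoff and is_decoy)
    return decoys / targets if targets else 0.0


def lowest_cutoff(psms: Sequence[PSM], max_fdr: float) -> Optional[float]:
    """Lowest observed score whose estimated FDR is within `max_fdr`."""
    best = None
    for cutoff in sorted({score for score, _ in psms}, reverse=True):
        if estimate_fdr(psms, cutoff) <= max_fdr:
            best = cutoff
    return best


# Illustrative scores; True marks a match to a decoy sequence.
psms = [(9.0, False), (8.0, False), (7.0, True),
        (6.0, False), (5.0, False), (4.0, True)]

print(estimate_fdr(psms, 5.0))    # 1 decoy / 4 targets = 0.25
print(lowest_cutoff(psms, 0.25))  # 5.0
```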
Affiliation(s)
- Marshall W Bern
  - Palo Alto Research Center, 3333 Coyote Hill Road, Palo Alto, California 94304, United States
10
Abstract
Database search programs for peptide identification by tandem mass spectrometry ask their users to set various parameters, including precursor and fragment mass tolerances, digestion specificity, and allowed types of modifications. Even proteomics experts with detailed knowledge of their samples may find it difficult to make these choices without significant investigation, and poor choices can lead to missed identifications and misleading results. Here we describe a program called Preview that analyzes a set of mass spectra for mass errors, digestion specificity, and known and unknown modifications, thereby facilitating parameter selection. Moreover, Preview optionally recalibrates mass over charge measurements, leading to further improvement in identification results. In a study of Bruton's tyrosine kinase, we find that the use of Preview improved the number of confidently identified mass spectra and phosphorylation sites by about 50%.
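The idea of measuring mass errors and then recalibrating m/z values can be illustrated with a minimal Python sketch. This is not Preview's method, which the abstract does not describe. It assumes a single systematic offset, estimated as the median parts-per-million (ppm) error over confident matches; the function names and example values are hypothetical.

```python
from statistics import median
from typing import Sequence, Tuple


def ppm_error(observed: float, theoretical: float) -> float:
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def median_ppm(pairs: Sequence[Tuple[float, float]]) -> float:
    """Median ppm error over (observed, theoretical) m/z pairs."""
    return median(ppm_error(obs, theo) for obs, theo in pairs)


def recalibrate(observed: float, offset_ppm: float) -> float:
    """Remove a systematic ppm offset from an observed m/z value."""
    return observed / (1.0 + offset_ppm * 1e-6)


# Illustrative confident matches, each reading about 10 ppm high.
pairs = [(500.005, 500.0), (800.008, 800.0), (1200.012, 1200.0)]
offset = median_ppm(pairs)

print(round(offset, 3))
print(round(recalibrate(500.005, offset), 6))
```

A median is used here because it is insensitive to a few incorrect matches; a real tool might instead fit an error model that varies with m/z or retention time.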
Affiliation(s)
- Yong J Kil
  - Palo Alto Research Center, 3333 Coyote Hill Road, Palo Alto, California 94304, USA
11
Abstract
Everett et al. recently reported on a statistical bias that arises in the target-decoy approach to false discovery rate estimation in two-pass proteomics search strategies as exemplified by X!Tandem. This bias can cause serious underestimation of the false discovery rate. We argue here that the "unbiased" solution proposed by Everett et al., however, is also biased and under certain circumstances can also result in a serious underestimate of the FDR, especially at the protein level.
Affiliation(s)
- Marshall Bern
  - Palo Alto Research Center, 3333 Coyote Hill Road, Palo Alto, California 94304, USA