1
Chiang Y, Welker F, Collins MJ. Spectra without stories: reporting 94% dark and unidentified ancient proteomes. Open Research Europe 2024; 4:71. [PMID: 38903702 PMCID: PMC11187534 DOI: 10.12688/openreseurope.17225.1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 03/15/2024] [Indexed: 06/22/2024]
Abstract
Background: Data-dependent, bottom-up proteomics is widely used for identifying proteins and peptides. However, one key challenge is that 70% of fragment ion spectra consistently fail to be assigned by conventional database searching. This 'dark matter' of bottom-up proteomics seems to affect fields where non-model organisms, low-abundance proteins, non-tryptic peptides, and complex modifications may be present. While palaeoproteomics may appear as a niche field, understanding and reporting unidentified ancient spectra require collaborative innovation in bioinformatics strategies. This may advance the analysis of complex datasets.
Methods: 14.97 million high-impact ancient spectra published in Nature and Science portfolios were mined from public repositories. Identification rates, defined as the proportion of assigned fragment ion spectra, were collected as part of deposited database search outputs or parsed using open-source python packages.
Results and Conclusions: We report that typically 94% of the published ancient spectra remain unidentified. This phenomenon may be caused by multiple factors, notably the limitations of database searching and the selection of user-defined reference data with advanced modification patterns. These 'spectra without stories' highlight the need for widespread data sharing to facilitate methodological development and minimise the loss of often irreplaceable ancient materials. Testing and validating alternative search strategies, such as open searching and de novo sequencing, may also improve overall identification rates. Hence, lessons learnt in palaeoproteomics may benefit other fields grappling with challenging data.
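The identification rate described in this abstract, the proportion of fragment ion spectra that receive an assignment, can be illustrated with a minimal Python sketch. The function names and the spectrum counts below are hypothetical and chosen only for illustration; they are not taken from the paper or from its deposited data.

```python
from typing import Iterable, Tuple


def identification_rate(n_assigned: int, n_total: int) -> float:
    """Proportion of fragment ion spectra assigned by a search."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    if not 0 <= n_assigned <= n_total:
        raise ValueError("n_assigned must lie between 0 and n_total")
    return n_assigned / n_total


def pooled_identification_rate(datasets: Iterable[Tuple[int, int]]) -> float:
    """Rate over several datasets given as (n_assigned, n_total) pairs."""
    pairs = list(datasets)
    assigned = sum(a for a, _ in pairs)
    total = sum(t for _, t in pairs)
    return identification_rate(assigned, total)


# Hypothetical counts, for illustration only.
rate = identification_rate(60, 1000)
unidentified = 1.0 - rate
pooled = pooled_identification_rate([(60, 1000), (40, 1000)])
```

Pooling sums the counts before dividing, so larger datasets carry more weight than they would in a simple mean of per-dataset rates; which of the two summaries is appropriate depends on the question being asked.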
Affiliation(s)
- Yun Chiang
  - Globe Institute, University of Copenhagen, Copenhagen, Denmark
  - The Nice Institute of Chemistry, Université Côte d'Azur, Nice, France
- Frido Welker
  - Globe Institute, University of Copenhagen, Copenhagen, Denmark
- Matthew James Collins
  - Globe Institute, University of Copenhagen, Copenhagen, Denmark
  - McDonald Institute for Archaeological Research, University of Cambridge, Cambridge, England, UK
2
Taniguchi K, Miyaguchi H. COL1A2 Barcoding: Bone Species Identification via Shotgun Proteomics. J Proteome Res 2024; 23:377-385. [PMID: 38091499 DOI: 10.1021/acs.jproteome.3c00615] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/06/2024]
Abstract
Species identification of fragmentary bones remains a challenging task in archeology and forensics. A species identification method for such fragmentary bones that has recently attracted interest is the use of bone collagen proteins. Here, we describe a method similar to DNA barcoding that reads collagen protein sequences in bone and automatically determines the species by performing sequence database searches. The method is almost identical to conventional shotgun proteomics analysis of bone samples, except that the database used by the SEQUEST search engine consisted only of entries for collagen type 1 alpha 2 (COL1A2) proteins from various vertebrates. Accordingly, the COL1A2 peptides that differ in sequence among species act as species marker peptides. In SEQUEST-based shotgun proteomics, the protein entries that contain more marker peptide sequences are assigned higher scores; therefore, the highest-scoring protein entry will be the COL1A2 entry for the species from which the analyzed bone was derived. We tested our method using bone samples from 30 vertebrate species and found that all species were correctly identified. In conclusion, COL1A2 can be used as a bone protein barcode and can be read through shotgun proteomics, allowing for automatic bone species identification. Data are available via ProteomeXchange with the identifier PXD045402.
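The barcoding idea in this abstract, where the database entry containing the most marker peptides ranks highest, can be sketched in Python as follows. This is a deliberately simplified stand-in that counts exact substring matches; it is not the SEQUEST scoring function, and the species labels, sequences, and peptides are hypothetical toy values rather than real COL1A2 data.

```python
from typing import Dict, Iterable, List, Tuple


def rank_species(
    observed_peptides: Iterable[str],
    database: Dict[str, str],
) -> List[Tuple[str, int]]:
    """Rank database entries by how many observed peptides they contain.

    Simplified stand-in for search-engine scoring: one point per distinct
    peptide found as an exact substring of the entry's sequence.
    """
    peptides = set(observed_peptides)
    scores = {
        species: sum(1 for peptide in peptides if peptide in sequence)
        for species, sequence in database.items()
    }
    # Highest score first; ties broken alphabetically for reproducibility.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical toy sequences differing at a single residue (T vs I).
toy_database = {
    "species_a": "GPAGPSGPAGKDGRTGQPGAV",
    "species_b": "GPAGPSGPAGKDGRIGQPGAV",
}
toy_peptides = ["GPAGPSGPAGK", "DGRIGQPGAV"]

ranking = rank_species(toy_peptides, toy_database)
best_species = ranking[0][0]
```

In this toy case the first peptide is shared by both entries and only the second discriminates between them, which mirrors the role of species marker peptides described above.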
Affiliation(s)
- Kei Taniguchi
  - National Research Institute of Police Science, 6-3-1, Kashiwanoha, Kashiwa 277-0882, Chiba, Japan
- Hajime Miyaguchi
  - National Research Institute of Police Science, 6-3-1, Kashiwanoha, Kashiwa 277-0882, Chiba, Japan
3
DIA mass spectrometry characterizes urinary proteomics in neonatal and adult donkeys. Sci Rep 2022; 12:22590. [PMID: 36585464 PMCID: PMC9803668 DOI: 10.1038/s41598-022-27245-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/27/2022] [Accepted: 12/28/2022] [Indexed: 12/31/2022] Open
Abstract
Health monitoring is critical for newborn animals due to their vulnerability to diseases. Urine, collected as free-catch samples, is a useful and non-invasive tool that not only reflects the physiological status of animals but also helps monitor the progression of diseases. Proteomics involves the study of the whole complement of proteins and peptides, including structure, quantities, functions, variations and interactions. In this study, the urinary proteomes of neonatal donkeys were characterized and compared to the profiles of adult donkeys to provide a reference database for healthy neonatal donkeys. The urine samples were collected from male neonatal donkeys on their sixth to tenth days of life (group N) and male adult donkeys aged 4-6 years (group A). Library-free data-independent acquisition (direct DIA) mass spectrometry-based proteomics was applied to analyze the urinary protein profiles. A total of 2179 urinary proteins were identified, and 411 proteins were differentially expressed (P < 0.05) between the two groups. A total of 104 proteins were exclusively expressed in group N, including alpha fetoprotein (AFP), peptidase, mitochondrial processing subunit beta (PMPCB), and upper zone of growth plate and cartilage matrix associated (UCMA), which might be used to monitor the health status of neonatal donkeys. In functional analysis, some differentially expressed proteins were related to immune system pathways, which might provide more insight into the immature immunity of neonatal donkeys. To the best of our knowledge, this is the first report of the donkey urinary proteome, and our results might provide a reference for urinary biomarker discovery to monitor and evaluate the health status of neonatal donkeys.
4
Abstract
Paleoproteomics, the study of ancient proteins, is a rapidly growing field at the intersection of molecular biology, paleontology, archaeology, paleoecology, and history. Paleoproteomics research leverages the longevity and diversity of proteins to explore fundamental questions about the past. While its origins predate the characterization of DNA, it was only with the advent of soft ionization mass spectrometry that the study of ancient proteins became truly feasible. Technological gains over the past 20 years have allowed increasing opportunities to better understand preservation, degradation, and recovery of the rich bioarchive of ancient proteins found in the archaeological and paleontological records. Growing from a handful of studies in the 1990s on individual highly abundant ancient proteins, paleoproteomics today is an expanding field with diverse applications ranging from the taxonomic identification of highly fragmented bones and shells and the phylogenetic resolution of extinct species to the exploration of past cuisines from dental calculus and pottery food crusts and the characterization of past diseases. More broadly, these studies have opened new doors in understanding past human-animal interactions, the reconstruction of past environments and environmental changes, the expansion of the hominin fossil record through large scale screening of nondiagnostic bone fragments, and the phylogenetic resolution of the vertebrate fossil record. Even with these advances, much of the ancient proteomic record still remains unexplored. Here we provide an overview of the history of the field, a summary of the major methods and applications currently in use, and a critical evaluation of current challenges. 
We conclude by looking to the future, for which innovative solutions and emerging technology will play an important role in enabling us to access the still unexplored "dark" proteome, allowing for a fuller understanding of the role ancient proteins can play in the interpretation of the past.
Affiliation(s)
- Christina Warinner
  - Department of Anthropology, Harvard University, Cambridge, Massachusetts 02138, United States
  - Department of Archaeogenetics, Max Planck Institute for Evolutionary Anthropology, Leipzig 04103, Germany
- Kristine Korzow Richter
  - Department of Anthropology, Harvard University, Cambridge, Massachusetts 02138, United States
- Matthew J. Collins
  - Department of Archaeology, Cambridge University, Cambridge CB2 3DZ, United Kingdom
  - Section for Evolutionary Genomics, Globe Institute, University of Copenhagen, Copenhagen 1350, Denmark
5
Runge AKW, Hendy J, Richter KK, Masson-MacLean E, Britton K, Mackie M, McGrath K, Collins M, Cappellini E, Speller C. Palaeoproteomic analyses of dog palaeofaeces reveal a preserved dietary and host digestive proteome. Proc Biol Sci 2021; 288:20210020. [PMID: 34229485 PMCID: PMC8261203 DOI: 10.1098/rspb.2021.0020] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 12/20/2022] Open
Abstract
The domestic dog has inhabited the anthropogenic niche for at least 15 000 years, but despite their impact on human strategies, the lives of dogs and their interactions with humans have only recently become a subject of interest to archaeologists. In the Arctic, dogs rely exclusively on humans for food during the winter, and while stable isotope analyses have revealed dietary similarities at some sites, deciphering the details of provisioning strategies has been challenging. In this study, we apply zooarchaeology by mass spectrometry (ZooMS) and liquid chromatography tandem mass spectrometry to dog palaeofaeces to investigate protein preservation in this highly degradable material and obtain information about the diet of domestic dogs at the Nunalleq site, Alaska. We identify a suite of digestive and metabolic proteins from the host species, demonstrating the utility of this material as a novel and viable substrate for the recovery of gastrointestinal proteomes. The recovered proteins revealed that the Nunalleq dogs consumed a range of Pacific salmon species (coho, chum, chinook and sockeye) and that the consumed tissues included muscle and bone as well as roe and guts. Overall, the study demonstrated the viability of permafrost-preserved palaeofaeces as a unique source of host and dietary proteomes.
Affiliation(s)
- Anne Kathrine W Runge
  - BioArCh, Department of Archaeology, University of York, Environment Building, Wentworth Way, YO10 5DD York, UK
  - Section for Evolutionary Genomics, the GLOBE Institute, University of Copenhagen, Øster Farimagsgade 5A, 1353 København K, Denmark
- Jessica Hendy
  - BioArCh, Department of Archaeology, University of York, Environment Building, Wentworth Way, YO10 5DD York, UK
  - Department of Archaeogenetics, Max Planck Institute for the Science of Human History, Kahlaische Strasse 10, 07743 Jena, Germany
- Kristine K Richter
  - Department of Archaeology, Max Planck Institute for the Science of Human History, Kahlaische Strasse 10, 07743 Jena, Germany
  - Department of Anthropology, Harvard University, Cambridge, MA 02138, USA
- Kate Britton
  - Department of Archaeology, University of Aberdeen, Aberdeen, Scotland, UK
  - Department of Human Evolution, Max Planck Institute for Evolutionary Anthropology, Deutscher Platz 6, Leipzig 04103
- Meaghan Mackie
  - Section for Evolutionary Genomics, the GLOBE Institute, University of Copenhagen, Øster Farimagsgade 5A, 1353 København K, Denmark
  - The Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Blegdamsvej 3b, 2200 København N, Denmark
- Krista McGrath
  - BioArCh, Department of Archaeology, University of York, Environment Building, Wentworth Way, YO10 5DD York, UK
  - Department of Prehistory and Institute of Environmental Science and Technology (ICTA), Universitat Autònoma de Barcelona, 08193 Bellaterra, Spain
- Matthew Collins
  - Section for Evolutionary Genomics, the GLOBE Institute, University of Copenhagen, Øster Farimagsgade 5A, 1353 København K, Denmark
  - Department of Archaeology, University of Cambridge, Cambridge CB2 3DZ, UK
- Enrico Cappellini
  - Section for Evolutionary Genomics, the GLOBE Institute, University of Copenhagen, Øster Farimagsgade 5A, 1353 København K, Denmark
- Camilla Speller
  - BioArCh, Department of Archaeology, University of York, Environment Building, Wentworth Way, YO10 5DD York, UK
  - Department of Anthropology, University of British Columbia, 6303 NW Marine Drive, Vancouver, Canada V6T 1Z1
6
Assessing the degradation of ancient milk proteins through site-specific deamidation patterns. Sci Rep 2021; 11:7795. [PMID: 33833277 PMCID: PMC8032661 DOI: 10.1038/s41598-021-87125-x] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 08/27/2020] [Accepted: 03/23/2021] [Indexed: 12/04/2022] Open
Abstract
The origins, prevalence and nature of dairying have been long debated by archaeologists. Within the last decade, new advances in high-resolution mass spectrometry have allowed for the direct detection of milk proteins from archaeological remains, including ceramic residues, dental calculus, and preserved dairy products. Proteins recovered from archaeological remains are susceptible to post-excavation and laboratory contamination, a particular concern for ancient dairying studies as milk proteins such as beta-lactoglobulin (BLG) and caseins are potential laboratory contaminants. Here, we examine how site-specific rates of deamidation (i.e., deamidation occurring in specific positions in the protein chain) can be used to elucidate patterns of peptide degradation, and authenticate ancient milk proteins. First, we characterize site-specific deamidation patterns in modern milk products and experimental samples, confirming that deamidation occurs primarily at low half-time sites. We then compare this to previously published palaeoproteomic data from six studies reporting ancient milk peptides. We confirm that site-specific deamidation rates, on average, are more advanced in BLG recovered from ancient dental calculus and pottery residues. Nevertheless, deamidation rates displayed a high degree of variability, making it challenging to authenticate samples with relatively few milk peptides. We demonstrate that site-specific deamidation is a useful tool for identifying modern contamination but highlight the need for multiple lines of evidence to authenticate ancient protein data.
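To make the notion of a site-specific deamidation rate concrete, the following Python sketch computes, for each site, the fraction of covering peptide observations that carry a deamidation. It is a count-based simplification under stated assumptions: the study itself may weight observations differently (for example by intensity), and the site labels and observations here are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def site_deamidation_fractions(
    observations: Iterable[Tuple[str, bool]],
) -> Dict[str, float]:
    """Fraction of deamidated observations per site.

    Each observation is (site_label, is_deamidated) for one peptide
    identification covering that asparagine or glutamine position.
    """
    counts = defaultdict(lambda: [0, 0])  # site -> [deamidated, total]
    for site, is_deamidated in observations:
        counts[site][1] += 1
        if is_deamidated:
            counts[site][0] += 1
    return {site: d / n for site, (d, n) in counts.items()}


# Hypothetical observations, for illustration only.
toy_observations = [
    ("N63", True),
    ("N63", True),
    ("N63", False),
    ("Q115", False),
    ("Q115", True),
]
fractions = site_deamidation_fractions(toy_observations)
```

Sites covered by only a handful of peptides yield unstable fractions, which is consistent with the abstract's caution about authenticating samples with relatively few milk peptides.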
7
Tsutaya T, Mackie M, Sawafuji R, Miyabe-Nishiwaki T, Olsen JV, Cappellini E. Faecal proteomics as a novel method to study mammalian behaviour and physiology. Mol Ecol Resour 2021; 21:1808-1819. [PMID: 33720532 PMCID: PMC8360081 DOI: 10.1111/1755-0998.13380] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 11/26/2020] [Revised: 02/28/2021] [Accepted: 03/10/2021] [Indexed: 11/30/2022]
Abstract
Mammalian faeces can be collected noninvasively during field research and provide valuable information on the ecology and evolution of the source individuals. Undigested food remains, genome/metagenome, steroid hormones, and stable isotopes obtained from faecal samples provide evidence on diet, host/symbiont genetics, and physiological status of the individuals. However, proteins in mammalian faeces have hardly been studied, which hinders the molecular investigations into the behaviour and physiology of the source individuals. Here, we apply mass spectrometry-based proteomics to faecal samples (n = 10), collected from infant, juvenile, and adult captive Japanese macaques (Macaca fuscata), to describe the proteomes of the source individual, of the food it consumed, and its intestinal microbes. The results show that faecal proteomics is a useful method to: (i) investigate dietary changes along with breastfeeding and weaning, (ii) reveal the taxonomic and histological origin of the food items consumed, and (iii) estimate physiological status inside intestinal tracts. These types of insights are difficult or impossible to obtain through other molecular approaches. Most mammalian species are facing extinction risk and there is an urgent need to obtain knowledge on their ecology and evolution for better conservation strategy. The faecal proteomics framework we present here is easily applicable to wild settings and other mammalian species, and provides direct evidence of their behaviour and physiology.
Affiliation(s)
- Takumi Tsutaya
  - Department of Evolutionary Studies of Biosystems, The Graduate University for Advanced Studies, Hayama, Japan
  - Biogeochemistry Research Center, Japan Agency for Marine-Earth Science and Technology, Yokosuka, Japan
- Meaghan Mackie
  - Evolutionary Genomics Section, The Globe Institute, University of Copenhagen, Copenhagen, Denmark
  - Proteomics Program, Novo Nordisk Foundation Center for Protein Research, Faculty of Health Science, University of Copenhagen, Copenhagen, Denmark
- Rikai Sawafuji
  - Department of Evolutionary Studies of Biosystems, The Graduate University for Advanced Studies, Hayama, Japan
- Jesper V Olsen
  - Proteomics Program, Novo Nordisk Foundation Center for Protein Research, Faculty of Health Science, University of Copenhagen, Copenhagen, Denmark
- Enrico Cappellini
  - Evolutionary Genomics Section, The Globe Institute, University of Copenhagen, Copenhagen, Denmark
8
Hendy J. Ancient protein analysis in archaeology. Science Advances 2021; 7:7/3/eabb9314. [PMID: 33523896 PMCID: PMC7810370 DOI: 10.1126/sciadv.abb9314] [Citation(s) in RCA: 33] [Impact Index Per Article: 11.0] [Received: 03/27/2020] [Accepted: 11/20/2020] [Indexed: 05/10/2023]
Abstract
The analysis of ancient proteins from paleontological, archeological, and historic materials is revealing insights into past subsistence practices, patterns of health and disease, evolution and phylogeny, and past environments. This review tracks the development of this field, discusses some of the major methodological strategies used, and synthesizes recent developments in archeological applications of ancient protein analysis. Moreover, this review highlights some of the challenges faced by the field and potential future directions, arguing that the development of minimally invasive or nondestructive techniques, strategies for protein authentication, and the integration of ancient protein analysis with other biomolecular techniques are important research strategies as this field grows.
Affiliation(s)
- Jessica Hendy
  - BioArCh, Department of Archaeology, University of York, York, UK
  - Max Planck Institute for the Science of Human History, Jena, Germany
9
Gil-Bona A, Bidlack FB. Tooth Enamel and its Dynamic Protein Matrix. Int J Mol Sci 2020; 21:ijms21124458. [PMID: 32585904 PMCID: PMC7352428 DOI: 10.3390/ijms21124458] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Received: 05/26/2020] [Revised: 06/19/2020] [Accepted: 06/20/2020] [Indexed: 12/12/2022] Open
Abstract
Tooth enamel is the outer covering of tooth crowns, the hardest material in the mammalian body, yet fracture resistant. The extremely high content of 95 wt% calcium phosphate in healthy adult teeth is achieved through mineralization of a proteinaceous matrix that changes in abundance and composition. Enamel-specific proteins and proteases are known to be critical for proper enamel formation. Recent proteomics analyses revealed many other proteins with their roles in enamel formation yet to be unraveled. Although the exact protein composition of healthy tooth enamel is still unknown, it is apparent that compromised enamel deviates in amount and composition of its organic material. Why these differences affect both the mineralization process before tooth eruption and the properties of erupted teeth will become apparent as proteomics protocols are adjusted to the variability between species, tooth size, sample size and ephemeral organic content of forming teeth. This review summarizes the current knowledge and published proteomics data of healthy and diseased tooth enamel, including advancements in forensic applications and disease models in animals. A summary and discussion of the status quo highlights how recent proteomics findings advance our understanding of the complexity and temporal changes of extracellular matrix composition during tooth enamel formation.
Affiliation(s)
- Ana Gil-Bona
  - The Forsyth Institute, Cambridge, MA 02142, USA
  - Department of Developmental Biology, Harvard School of Dental Medicine, Boston, MA 02115, USA
- Felicitas B. Bidlack
  - The Forsyth Institute, Cambridge, MA 02142, USA
  - Department of Developmental Biology, Harvard School of Dental Medicine, Boston, MA 02115, USA