1
Assael Y, Sommerschield T, Shillingford B, Bordbar M, Pavlopoulos J, Chatzipanagiotou M, Androutsopoulos I, Prag J, de Freitas N. Restoring and attributing ancient texts using deep neural networks. Nature 2022; 603:280-283. [PMID: 35264762 PMCID: PMC8907065 DOI: 10.1038/s41586-022-04448-z] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 08/16/2021] [Accepted: 01/19/2022] [Indexed: 01/12/2023]
Abstract
Ancient history relies on disciplines such as epigraphy (the study of inscribed texts known as inscriptions) for evidence of the thought, language, society and history of past civilizations [1]. However, over the centuries, many inscriptions have been damaged to the point of illegibility, transported far from their original location and their date of writing is steeped in uncertainty. Here we present Ithaca, a deep neural network for the textual restoration, geographical attribution and chronological attribution of ancient Greek inscriptions. Ithaca is designed to assist and expand the historian's workflow. The architecture of Ithaca focuses on collaboration, decision support and interpretability. While Ithaca alone achieves 62% accuracy when restoring damaged texts, the use of Ithaca by historians improved their accuracy from 25% to 72%, confirming the synergistic effect of this research tool. Ithaca can attribute inscriptions to their original location with an accuracy of 71% and can date them to less than 30 years of their ground-truth ranges, redating key texts of Classical Athens and contributing to topical debates in ancient history. This research shows how models such as Ithaca can unlock the cooperative potential between artificial intelligence and historians, transformationally impacting the way that we study and write about one of the most important periods in human history.
Affiliation(s)
- Thea Sommerschield
- Department of Humanities, Ca' Foscari University of Venice, Venice, Italy.
- Center for Hellenic Studies, Harvard University, Washington, DC, USA.
- John Pavlopoulos
- Department of Informatics, Athens University of Economics and Business, Athens, Greece
- Ion Androutsopoulos
- Department of Informatics, Athens University of Economics and Business, Athens, Greece
- Jonathan Prag
- Faculty of Classics, University of Oxford, Oxford, UK
2
Banerjee A, Das N, Dey R, Majumder S, Shit P, Banerjee A, Ghosh N, Bhadra A. Power-laws in dog behavior may pave the way to predictive models: A pattern analysis study. Heliyon 2021; 7:e07243. [PMID: 34195401 PMCID: PMC8239746 DOI: 10.1016/j.heliyon.2021.e07243] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/28/2021] [Revised: 04/21/2021] [Accepted: 06/03/2021] [Indexed: 11/26/2022]
Abstract
Apparently random events in nature often reveal hidden patterns when analyzed using diverse and robust statistical tools. Power law distributions, for example, project diverse natural phenomena, ranging from earthquakes to heartbeat dynamics, onto a common platform of self-similarity. Animal behavior in specific contexts has been shown to follow power law distributions. However, the behavioral repertoire of a species in its entirety has never been analyzed for the existence of such underlying patterns. Here we show that the frequency-rank data of randomly sighted behaviors at the population level of free-ranging dogs follow a scale-invariant power law. This suggests that, irrespective of changes in location of sightings, seasonal variations and observer bias, the datasets exhibit a conserved trend of scale invariance. The data also exhibit robust self-similarity patterns at different scales, which we extract using multifractal detrended fluctuation analysis. We observe that the probability of consecutive occurrence of behaviors of adjacent ranks is much higher than that of behaviors widely separated in rank. The findings open up the possibility of designing predictive models of behavior from correlations existing in true time series of behavioral data and exploring the general behavioral repertoire of a species for the presence of syntax.
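As a rough illustration of what a frequency-rank power law means in practice, the sketch below fits f(r) ≈ C · r^(-alpha) by ordinary least squares in log-log space. This is a minimal sketch and not necessarily the fitting procedure used in the study: the function name `fit_power_law` and the synthetic counts are assumptions made for illustration, and log-log regression is only the simplest of several possible estimators.

```python
import math

def fit_power_law(frequencies):
    """Estimate (alpha, C) in f(r) ~ C * r**(-alpha) from raw frequencies.

    Frequencies are sorted in descending order to obtain ranks 1..n,
    then a straight line is fitted to (log rank, log frequency).
    """
    ranked = sorted((f for f in frequencies if f > 0), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(f) for f in ranked]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # The slope of the log-log line is -alpha; exp(intercept) is C.
    return -slope, math.exp(intercept)

# Hypothetical behavior counts generated from an exact power law
# (alpha = 1.5, C = 1000); real sighting data would be noisier.
counts = [1000.0 / rank ** 1.5 for rank in range(1, 21)]
alpha, scale = fit_power_law(counts)
```

On noisy data, a straight line in the log-log plot is only suggestive of a power law, so a goodness-of-fit check would normally accompany an estimate like this.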
Affiliation(s)
- Arunita Banerjee
- Behavior and Ecology Lab, Department of Biological Sciences, Indian Institute of Science Education and Research Kolkata, Mohanpur, Nadia, PIN 741246, West Bengal, India
- Nandan Das
- Tissue Optics and Microcirculation Imaging, School of Physics, National University of Ireland, Galway, Ireland
- Rajib Dey
- Tissue Optics and Microcirculation Imaging, School of Physics, National University of Ireland, Galway, Ireland
- Department of Physical Sciences, Indian Institute of Science Education and Research Kolkata, Mohanpur, Nadia, PIN 741246, West Bengal, India
- Shouvik Majumder
- Department of Mathematical Sciences, Indian Institute of Science Education and Research Kolkata, Mohanpur, Nadia, PIN 741246, West Bengal, India
- Piuli Shit
- Behavior and Ecology Lab, Department of Biological Sciences, Indian Institute of Science Education and Research Kolkata, Mohanpur, Nadia, PIN 741246, West Bengal, India
- Ayan Banerjee
- Department of Physical Sciences, Indian Institute of Science Education and Research Kolkata, Mohanpur, Nadia, PIN 741246, West Bengal, India
- Nirmalya Ghosh
- Department of Physical Sciences, Indian Institute of Science Education and Research Kolkata, Mohanpur, Nadia, PIN 741246, West Bengal, India
- Anindita Bhadra
- Behavior and Ecology Lab, Department of Biological Sciences, Indian Institute of Science Education and Research Kolkata, Mohanpur, Nadia, PIN 741246, West Bengal, India
3
Restoration of fragmentary Babylonian texts using recurrent neural networks. Proc Natl Acad Sci U S A 2020; 117:22743-22751. [PMID: 32873650 DOI: 10.1073/pnas.2003794117] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 11/18/2022]
Abstract
The main sources of information regarding ancient Mesopotamian history and culture are clay cuneiform tablets. Many of these tablets are damaged, leading to missing information. Currently, the missing text is manually reconstructed by experts. We investigate the possibility of assisting scholars by modeling the language using recurrent neural networks and automatically completing the breaks in ancient Akkadian texts from Achaemenid period Babylonia.
4
Abstract
Although no historical information exists about the Indus civilization (flourished ca. 2600-1900 B.C.), archaeologists have uncovered about 3,800 short samples of a script that was used throughout the civilization. The script remains undeciphered, despite a large number of attempts and claimed decipherments over the past 80 years. Here, we propose the use of probabilistic models to analyze the structure of the Indus script. The goal is to reveal, through probabilistic analysis, syntactic patterns that could point the way to eventual decipherment. We illustrate the approach using a simple Markov chain model to capture sequential dependencies between signs in the Indus script. The trained model allows new sample texts to be generated, revealing recurring patterns of signs that could potentially form functional subunits of a possible underlying language. The model also provides a quantitative way of testing whether a particular string belongs to the putative language as captured by the Markov model. Application of this test to Indus seals found in Mesopotamia and other sites in West Asia reveals that the script may have been used to express different content in these regions. Finally, we show how missing, ambiguous, or unreadable signs on damaged objects can be filled in with most likely predictions from the model. Taken together, our results indicate that the Indus script exhibits rich syntactic structure and the ability to represent diverse content, both of which are suggestive of a linguistic writing system rather than a nonlinguistic symbol system.
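The Markov chain idea described in this abstract can be sketched in a few lines. The example below is a minimal illustration and not the authors' implementation: it assumes a first-order (bigram) chain with add-one smoothing, and the class name `BigramModel`, its methods, and the sign labels "A" to "D" are all hypothetical. It shows two of the uses named in the abstract: scoring whether a string fits the model, and filling in a single missing sign.

```python
import math
from collections import defaultdict

START, END = "<s>", "</s>"

class BigramModel:
    """First-order Markov chain over signs with additive smoothing."""

    def __init__(self, texts, smoothing=1.0):
        self.smoothing = smoothing
        self.vocab = sorted({sign for text in texts for sign in text})
        self.counts = defaultdict(lambda: defaultdict(int))
        for text in texts:
            padded = [START] + list(text) + [END]
            for prev, cur in zip(padded, padded[1:]):
                self.counts[prev][cur] += 1

    def prob(self, prev, cur):
        """Smoothed P(cur | prev); cur is a vocabulary sign or END."""
        row = self.counts[prev]
        total = sum(row.values())
        outcomes = len(self.vocab) + 1  # every sign plus END
        return (row[cur] + self.smoothing) / (total + self.smoothing * outcomes)

    def log_likelihood(self, text):
        """Log-probability of a complete text under the chain."""
        padded = [START] + list(text) + [END]
        return sum(math.log(self.prob(p, c)) for p, c in zip(padded, padded[1:]))

    def fill_missing(self, text, index):
        """Most likely sign at `index`, given its left and right neighbors."""
        prev = text[index - 1] if index > 0 else START
        nxt = text[index + 1] if index + 1 < len(text) else END
        return max(self.vocab, key=lambda s: self.prob(prev, s) * self.prob(s, nxt))

# Hypothetical training corpus; real inputs would be sign sequences
# transcribed from inscribed objects.
corpus = [["A", "B", "C"], ["A", "B", "C"], ["A", "B", "D"], ["B", "C"]]
model = BigramModel(corpus)
restored = model.fill_missing(["A", None, "C"], 1)
typical = model.log_likelihood(["A", "B", "C"])
atypical = model.log_likelihood(["C", "B", "A"])
```

Comparing `typical` with `atypical` mirrors the membership test in the abstract: a sequence whose transitions are rare in the training corpus receives a lower log-likelihood.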