1
|
Cisneros-Barroso E, Gorram F, Ribot-Sansó MA, Alarcon F, Nuel G, González-Moreno J, Rodríguez A, Hernandez-Rodriguez J, Amengual-Cladera E, Martínez-López I, Ripoll-Vera T, Losada-López I, Heine-Suñer D, Plante-Bordeneuve V. Disease risk estimates in V30M variant transthyretin amyloidosis (A-ATTRv) from Mallorca. Orphanet J Rare Dis 2023; 18:255. [PMID: 37653545 PMCID: PMC10472571 DOI: 10.1186/s13023-023-02865-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/12/2023] [Accepted: 08/20/2023] [Indexed: 09/02/2023] Open
Abstract
BACKGROUND Variant transthyretin amyloidosis (A-ATTRv) is an autosomal dominant disease caused by a range of TTR gene variants, with great heterogeneity in phenotype and penetrance. In Majorca, A-ATTRv caused by the V30M gene variant (A-ATTRV30M) is the most common form. Since asymptomatic carriers are at risk of developing the disease, estimating age of onset is vital for proper management and follow-up. Thus, the aim of this study was to estimate age-related penetrance in ATTRV30M variant carriers from Majorca. METHODS The disease risk among carriers from ATTRV30M families from Majorca was estimated by non-parametric survival estimation. Factors potentially involved in disease expression, namely gender and parent of origin, were also analysed. RESULTS A total of 48 heterozygous ATTRV30M families (147 affected patients and 123 asymptomatic carriers) were included in the analysis. Penetrance progressively increased from 6% at 30 years to 75% at 90 years of age. In contrast to other European populations, we observed a similar risk for males and females, and no difference in risk according to the parent of origin. CONCLUSIONS In this first study assessing the age-related penetrance of the ATTRV30M variant in Majorcan families, no effect of gender or parent of origin was observed. These findings will be helpful for improving management and follow-up of TTR variant carriers.
Affiliation(s)
- E Cisneros-Barroso
- Internal Medicine Department. Fundación Instituto de Investigación Sanitaria de Las Islas Baleares, Son Llàtzer University Hospital, Crta Manacor Km 4., 07198, Palma, Spain.
- Balearic Research Group in Genetic Cardiopathies, Sudden Death and TTR Amyloidosis. Health Research Institute of the Balearic Islands (IdISBa), Son Llàtzer University Hospital, Palma, Spain.
| | - F Gorram
- Department of Neurology, University Hospital Henri Mondor, 51 Avenue du Maréchal de Lattre de Tassigny, 94000, Créteil, France
- Paris Est-Créteil University, Créteil, France
- Inserm U.955, Institut Mondor de Recherche Biomédicale (IMRB), Créteil, France
| | - M A Ribot-Sansó
- Internal Medicine Department. Fundación Instituto de Investigación Sanitaria de Las Islas Baleares, Son Llàtzer University Hospital, Crta Manacor Km 4., 07198, Palma, Spain
- Balearic Research Group in Genetic Cardiopathies, Sudden Death and TTR Amyloidosis. Health Research Institute of the Balearic Islands (IdISBa), Son Llàtzer University Hospital, Palma, Spain
| | - F Alarcon
- Laboratory MAP5 UMR CNRS 8145, Paris University, Paris, France
| | - G Nuel
- Stochastics and Biology Group, Department of Probability and Statistics (LPSM, UMR CNRS 8001), Sorbonne University, Paris, France
| | - J González-Moreno
- Internal Medicine Department. Fundación Instituto de Investigación Sanitaria de Las Islas Baleares, Son Llàtzer University Hospital, Crta Manacor Km 4., 07198, Palma, Spain
- Balearic Research Group in Genetic Cardiopathies, Sudden Death and TTR Amyloidosis. Health Research Institute of the Balearic Islands (IdISBa), Son Llàtzer University Hospital, Palma, Spain
| | - A Rodríguez
- Internal Medicine Department. Fundación Instituto de Investigación Sanitaria de Las Islas Baleares, Son Llàtzer University Hospital, Crta Manacor Km 4., 07198, Palma, Spain
- Balearic Research Group in Genetic Cardiopathies, Sudden Death and TTR Amyloidosis. Health Research Institute of the Balearic Islands (IdISBa), Son Llàtzer University Hospital, Palma, Spain
| | - J Hernandez-Rodriguez
- Genomics of Health Research Group, Health Research Institute of the Balearic Islands (IdISBa), 07120, Palma, Spain
| | - E Amengual-Cladera
- Genomics of Health Research Group, Health Research Institute of the Balearic Islands (IdISBa), 07120, Palma, Spain
| | - I Martínez-López
- Genomics of Health Research Group, Health Research Institute of the Balearic Islands (IdISBa), 07120, Palma, Spain
- Molecular Diagnostics and Clinical Genetics Unit, Hospital Universitario Son Espases, 07120, Palma, Spain
| | - T Ripoll-Vera
- Balearic Research Group in Genetic Cardiopathies, Sudden Death and TTR Amyloidosis. Health Research Institute of the Balearic Islands (IdISBa), Son Llàtzer University Hospital, Palma, Spain
- Cardiology Department, Son Llàtzer University Hospital, Palma, Spain
| | - I Losada-López
- Internal Medicine Department. Fundación Instituto de Investigación Sanitaria de Las Islas Baleares, Son Llàtzer University Hospital, Crta Manacor Km 4., 07198, Palma, Spain
- Balearic Research Group in Genetic Cardiopathies, Sudden Death and TTR Amyloidosis. Health Research Institute of the Balearic Islands (IdISBa), Son Llàtzer University Hospital, Palma, Spain
| | - D Heine-Suñer
- Genomics of Health Research Group, Health Research Institute of the Balearic Islands (IdISBa), 07120, Palma, Spain
- Molecular Diagnostics and Clinical Genetics Unit, Hospital Universitario Son Espases, 07120, Palma, Spain
| | - V Plante-Bordeneuve
- Department of Neurology, University Hospital Henri Mondor, 51 Avenue du Maréchal de Lattre de Tassigny, 94000, Créteil, France.
- Paris Est-Créteil University, Créteil, France.
- Inserm U.955, Institut Mondor de Recherche Biomédicale (IMRB), Créteil, France.
|
2
|
Alarcon F, Planté-Bordeneuve V, Nuel G. Study of the parent-of-origin effect in monogenic diseases with variable age of onset. Application on ATTRv. PLoS One 2023; 18:e0288958. [PMID: 37561731 PMCID: PMC10414668 DOI: 10.1371/journal.pone.0288958] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/25/2022] [Accepted: 07/10/2023] [Indexed: 08/12/2023] Open
Abstract
In genetic diseases with variable age of onset, accurate estimation of the survival function for mutation carriers, and of the effects of modifying factors, is important for the management of asymptomatic gene carriers across life. Among the modifying factors, the gender of the parent transmitting the mutation (i.e. the parent-of-origin effect) has been shown to have a significant effect on survival curve estimation in transthyretin familial amyloid polyneuropathy (ATTRv) families. However, as most genotypes are unknown, the parent of origin must be inferred through a probability estimated from the pedigree. In this article, we propose to extend the method providing mutation carrier survival estimates in order to estimate the parent-of-origin effect. The method is both validated on simulated data and applied to family samples with ATTRv.
Affiliation(s)
- Flora Alarcon
- Laboratory MAP5 UMR CNRS 8145 Paris City University, Paris, France
| | - Violaine Planté-Bordeneuve
- Department of Neurology, Henri Mondor University Hospital, APHP, Créteil, France
- Paris Est-Créteil University, Créteil, France
- Inserm U.955, Institut Mondor de Recherche Biomédicale (IMRB), Créteil, France
| | - Grégory Nuel
- Stochastics and Biology Group, Department of Probability and Statistics (LPSM, UMR CNRS 8001), Sorbonne University, Paris, France
|
3
|
Schramm C, Charbonnier C, Zaréa A, Lacour M, Wallon D, Boland A, Deleuze JF, Olaso R, Alarcon F, Campion D, Nuel G, Nicolas G. Publisher Correction: Penetrance estimation of Alzheimer disease in SORL1 loss-of-function variant carriers using a family-based strategy and stratification by APOE genotypes. Genome Med 2022; 14:83. [PMID: 35922859 PMCID: PMC9351270 DOI: 10.1186/s13073-022-01091-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022] Open
Affiliation(s)
- Catherine Schramm
- Normandie Université, UNIROUEN, Inserm U1245, CHU Rouen, Department of Genetics and CNRMAJ, FHU-G4 Génomique, 22 boulevard Gambetta - CS 76183, F-76000, Rouen, France
| | - Camille Charbonnier
- Normandie Université, UNIROUEN, Inserm U1245, CHU Rouen, Department of Genetics and CNRMAJ, FHU-G4 Génomique, 22 boulevard Gambetta - CS 76183, F-76000, Rouen, France
| | - Aline Zaréa
- Normandie Université, UNIROUEN, Inserm U1245, CHU Rouen, Department of Neurology and CNRMAJ, FHU-G4 Génomique, F-76000, Rouen, France
| | - Morgane Lacour
- Normandie Université, UNIROUEN, Inserm U1245, CHU Rouen, Department of Neurology and CNRMAJ, FHU-G4 Génomique, F-76000, Rouen, France
| | - David Wallon
- Normandie Université, UNIROUEN, Inserm U1245, CHU Rouen, Department of Neurology and CNRMAJ, FHU-G4 Génomique, F-76000, Rouen, France
| | | | - Anne Boland
- Université Paris-Saclay, CEA, Centre National de Recherche en Génomique Humaine, 91057, Evry, France
| | - Jean-François Deleuze
- Université Paris-Saclay, CEA, Centre National de Recherche en Génomique Humaine, 91057, Evry, France
| | - Robert Olaso
- Université Paris-Saclay, CEA, Centre National de Recherche en Génomique Humaine, 91057, Evry, France
| | | | - Flora Alarcon
- MAP5, UMR-CNRS 8145, Paris University, 75270, Paris, France
| | - Dominique Campion
- Normandie Université, UNIROUEN, Inserm U1245, CHU Rouen, Department of Genetics and CNRMAJ, FHU-G4 Génomique, 22 boulevard Gambetta - CS 76183, F-76000, Rouen, France; Department of Research, Rouvray Psychiatric Hospital, 76681, Sotteville-Lès-Rouen, France
| | - Grégory Nuel
- LPSM, CNRS 8001, Sorbonne University, 75005, Paris, France
| | - Gaël Nicolas
- Normandie Université, UNIROUEN, Inserm U1245, CHU Rouen, Department of Genetics and CNRMAJ, FHU-G4 Génomique, 22 boulevard Gambetta - CS 76183, F-76000, Rouen, France.
|
4
|
Schramm C, Charbonnier C, Zaréa A, Lacour M, Wallon D, Boland A, Deleuze JF, Olaso R, Alarcon F, Campion D, Nuel G, Nicolas G. Penetrance estimation of Alzheimer disease in SORL1 loss-of-function variant carriers using a family-based strategy and stratification by APOE genotypes. Genome Med 2022; 14:69. [PMID: 35761418 PMCID: PMC9238165 DOI: 10.1186/s13073-022-01070-6] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 10/20/2021] [Accepted: 06/08/2022] [Indexed: 01/04/2023] Open
Abstract
BACKGROUND Alzheimer disease (AD) is a common complex disorder with a high genetic component. Loss-of-function (LoF) SORL1 variants are one of the strongest AD genetic risk factors. Estimating their age-related penetrance is essential before putative use for genetic counseling or preventive trials. However, relative rarity and co-occurrence with the main AD risk factor, APOE-ε4, make such estimations difficult. METHODS We proposed to estimate the age-related penetrance of SORL1-LoF variants through a survival framework by estimating the conditional instantaneous risk combining (i) a baseline for non-carriers of SORL1-LoF variants, stratified by APOE-ε4, derived from the Rotterdam study (N = 12,255), and (ii) an age-dependent proportional hazard effect for SORL1-LoF variants estimated from 27 extended pedigrees (including 307 relatives ≥ 40 years old, 45 of them having genotyping information) recruited from the French reference center for young Alzheimer patients. We embedded this model into an expectation-maximization algorithm to accommodate for missing genotypes. To correct for ascertainment bias, proband phenotypes were omitted. Then, we assessed if our penetrance curves were concordant with age distributions of APOE-ε4-stratified SORL1-LoF variant carriers detected among sequencing data of 13,007 cases and 10,182 controls from European and American case-control study consortia. RESULTS SORL1-LoF variants penetrance curves reached 100% (95% confidence interval [99-100%]) by age 70 among APOE-ε4ε4 carriers only, compared with 56% [40-72%] and 37% [26-51%] in ε4 heterozygous carriers and ε4 non-carriers, respectively. These estimates were fully consistent with observed age distributions of SORL1-LoF variant carriers in case-control study data. CONCLUSIONS We conclude that SORL1-LoF variants should be interpreted in light of APOE genotypes for future clinical applications.
Affiliation(s)
- Catherine Schramm
- Normandie Université, UNIROUEN, Inserm U1245, CHU Rouen, Department of Genetics and CNRMAJ, FHU-G4 Génomique, 22 boulevard Gambetta - CS 76183, Rouen, F-76000, France
| | - Camille Charbonnier
- Normandie Université, UNIROUEN, Inserm U1245, CHU Rouen, Department of Genetics and CNRMAJ, FHU-G4 Génomique, 22 boulevard Gambetta - CS 76183, Rouen, F-76000, France
| | - Aline Zaréa
- Normandie Université, UNIROUEN, Inserm U1245, CHU Rouen, Department of Neurology and CNRMAJ, FHU-G4 Génomique, Rouen, F-76000, France
| | - Morgane Lacour
- Normandie Université, UNIROUEN, Inserm U1245, CHU Rouen, Department of Neurology and CNRMAJ, FHU-G4 Génomique, Rouen, F-76000, France
| | - David Wallon
- Normandie Université, UNIROUEN, Inserm U1245, CHU Rouen, Department of Neurology and CNRMAJ, FHU-G4 Génomique, Rouen, F-76000, France
| | | | - Anne Boland
- Université Paris-Saclay, CEA, Centre National de Recherche en Génomique Humaine, 91057, Evry, France
| | - Jean-François Deleuze
- Université Paris-Saclay, CEA, Centre National de Recherche en Génomique Humaine, 91057, Evry, France
| | - Robert Olaso
- Université Paris-Saclay, CEA, Centre National de Recherche en Génomique Humaine, 91057, Evry, France
| | | | - Flora Alarcon
- MAP5, UMR-CNRS 8145, Paris University, 75270, Paris, France
| | - Dominique Campion
- Normandie Université, UNIROUEN, Inserm U1245, CHU Rouen, Department of Genetics and CNRMAJ, FHU-G4 Génomique, 22 boulevard Gambetta - CS 76183, Rouen, F-76000, France; Department of Research, Rouvray Psychiatric Hospital, 76681, Sotteville-Lès-Rouen, France
| | - Grégory Nuel
- LPSM, CNRS 8001, Sorbonne University, 75005, Paris, France
| | - Gaël Nicolas
- Normandie Université, UNIROUEN, Inserm U1245, CHU Rouen, Department of Genetics and CNRMAJ, FHU-G4 Génomique, 22 boulevard Gambetta - CS 76183, Rouen, F-76000, France.
|
5
|
Cluzel N, Courbariaux M, Wang S, Moulin L, Wurtzer S, Bertrand I, Laurent K, Monfort P, Gantzer C, Guyader SL, Boni M, Mouchel JM, Maréchal V, Nuel G, Maday Y. A nationwide indicator to smooth and normalize heterogeneous SARS-CoV-2 RNA data in wastewater. Environ Int 2022; 158:106998. [PMID: 34991258 PMCID: PMC8608586 DOI: 10.1016/j.envint.2021.106998] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 08/30/2021] [Revised: 11/20/2021] [Accepted: 11/21/2021] [Indexed: 05/18/2023]
Abstract
Since many infected people experience no or few symptoms, the SARS-CoV-2 epidemic is frequently monitored through massive virus testing of the population, an approach that may be biased and may be difficult to sustain in low-income countries. Since SARS-CoV-2 RNA can be detected in stool samples, quantifying SARS-CoV-2 genome by RT-qPCR in wastewater treatment plants (WWTPs) has been carried out as a complementary tool to monitor virus circulation among human populations. However, measuring SARS-CoV-2 viral load in WWTPs can be affected by many experimental and environmental factors. To circumvent these limits, we propose here a novel indicator, the wastewater indicator (WWI), that partly reduces and corrects the noise associated with the SARS-CoV-2 genome quantification in wastewater (average noise reduction of 19%). All data processing results in an average correlation gain of 18% with the incidence rate. The WWI can take into account the censorship linked to the limit of quantification (LOQ), allows the automatic detection of outliers to be integrated into the smoothing algorithm, estimates the average measurement error committed on the samples and proposes a solution for inter-laboratory normalization in the absence of inter-laboratory assays (ILA). This method has been successfully applied in the context of Obépine, a French national network that has been quantifying SARS-CoV-2 genome in a representative sample of French WWTPs since March 5th, 2020. By August 26th, 2021, 168 WWTPs were monitored in metropolitan France and the French overseas territories. We detail the process of elaboration of this indicator, show that it is strongly correlated to the incidence rate and that the optimal time lag between these two signals is only a few days, making our indicator an efficient complement to the incidence rate. This alternative approach may be especially important to evaluate SARS-CoV-2 dynamics in human populations when the testing rate is low.
Affiliation(s)
- Nicolas Cluzel
- Sorbonne Université, Maison des Modélisations Ingénieries et Technologies (SUMMIT), 75005 Paris, France.
| | - Marie Courbariaux
- Sorbonne Université, Maison des Modélisations Ingénieries et Technologies (SUMMIT), 75005 Paris, France
| | - Siyun Wang
- Sorbonne Université, Maison des Modélisations Ingénieries et Technologies (SUMMIT), 75005 Paris, France
| | - Laurent Moulin
- Eau de Paris, Département de Recherche, Développement et Qualité de l'Eau, 33 avenue Jean Jaurès, F-94200 Ivry sur Seine, France
| | - Sébastien Wurtzer
- Eau de Paris, Département de Recherche, Développement et Qualité de l'Eau, 33 avenue Jean Jaurès, F-94200 Ivry sur Seine, France
| | | | - Karine Laurent
- Sorbonne Université, Maison des Modélisations Ingénieries et Technologies (SUMMIT), 75005 Paris, France
| | - Patrick Monfort
- HydroSciences Montpellier, UMR 5151, Université de Montpellier, CNRS, IRD, F-34093 Montpellier, France
| | | | - Soizick Le Guyader
- Ifremer, laboratoire de Microbiologie, SG2M/LSEM, BP 21105, 44311 Nantes, France
| | - Mickaël Boni
- Institut de Recherche Biomédicale des Armées, 1 place Valérie André, F-91220 Brétigny-sur-Orge, France
| | - Jean-Marie Mouchel
- Sorbonne Université, CNRS, EPHE, UMR 7619 Metis, e-LTER Zone Atelier Seine, F-75005 Paris, France
| | - Vincent Maréchal
- Sorbonne Université, INSERM, Centre de Recherche Saint-Antoine, F-75012 Paris, France
| | - Grégory Nuel
- Stochastics and Biology Group, Probability and Statistics (LPSM, CNRS 8001), Sorbonne University, Campus Pierre et Marie Curie, 4 Place Jussieu, 75005 Paris, France
| | - Yvon Maday
- Sorbonne Université, CNRS, Université de Paris, Laboratoire Jacques-Louis Lions (LJLL), F-75005 Paris, France; Institut Universitaire de France, France.
|
6
|
Schramm C, Charbonnier C, Zarea A, Wallon D, Lacour M, Alarcon F, Génin E, Campion D, Nuel G, Nicolas G. Penetrance estimation of SORL1 loss‐of‐function variants adjusted on APOE genotypes suggest a non‐monogenic inheritance. Alzheimers Dement 2021. [DOI: 10.1002/alz.056172] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022]
Affiliation(s)
- Catherine Schramm
- Normandie Univ, UNIROUEN, Inserm U1245 and Rouen University Hospital, Department of Genetics and CNR‐MAJ, F 76000, Normandy Center for Genomic and Personalized Medicine Rouen France
| | - Camille Charbonnier
- Normandie Univ, UNIROUEN, Inserm U1245 and Rouen University Hospital, Department of Genetics and CNR‐MAJ, F 76000, Normandy Center for Genomic and Personalized Medicine Rouen France
| | - Aline Zarea
- Normandie Univ, UNIROUEN, Inserm U1245 and Rouen University Hospital, Department of Neurology and CNR‐MAJ, F 76000, Normandy Center for Genomic and Personalized Medicine Rouen France
| | - David Wallon
- Normandie Univ, UNIROUEN, Inserm U1245 and Rouen University Hospital, Department of Neurology and CNR‐MAJ, F 76000, Normandy Center for Genomic and Personalized Medicine Rouen France
- CNR‐MAJ & Neurology, Rouen University Hospital Rouen France
| | - Morgane Lacour
- Normandie Univ, UNIROUEN, Inserm U1245 and Rouen University Hospital, Department of Neurology and CNR‐MAJ, F 76000, Normandy Center for Genomic and Personalized Medicine Rouen France
| | - Flora Alarcon
- Laboratory MAP5 UMR CNRS 8145, Paris Descartes University Paris France
| | - Emmanuelle Génin
- Université Bretagne Occidentale Brest France
- Inserm UMR1078 / CHU Brest Brest France
| | - Dominique Campion
- Normandie Univ, UNIROUEN, Inserm U1245 and Rouen University Hospital, Department of Genetics and CNR‐MAJ, F 76000, Normandy Center for Genomic and Personalized Medicine Rouen France
- CNR‐MAJ / Rouen University Hospital Rouen France
| | - Grégory Nuel
- LPSM, CNRS 8001 Sorbonne University Paris France
| | - Gaël Nicolas
- Normandie Univ, UNIROUEN, Inserm U1245 and Rouen University Hospital, Department of Genetics and CNR‐MAJ, F 76000, Normandy Center for Genomic and Personalized Medicine Rouen France
|
7
|
Affiliation(s)
| | - Eva Lauridsen
- Resource Center for Rare Oral Diseases, Copenhagen University Hospital, Copenhagen, Denmark
|
8
|
Goepp V, Thalabard JC, Nuel G, Bouaziz O. Regularized bidimensional estimation of the hazard rate. Int J Biostat 2021; 18:263-277. [PMID: 33768761 DOI: 10.1515/ijb-2019-0003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/07/2019] [Accepted: 02/26/2021] [Indexed: 11/15/2022]
Abstract
In epidemiological or demographic studies with variable age at onset, a typical quantity of interest is the incidence of a disease (for example, cancer incidence). In these studies, the individuals are usually highly heterogeneous in terms of dates of birth (the cohort) and with respect to the calendar time (the period), and appropriate estimation methods are needed. In this article, a new estimation method is presented which extends classical age-period-cohort analysis by allowing interactions between age, period and cohort effects. We introduce a bidimensional regularized estimate of the hazard rate where a penalty is introduced on the likelihood of the model. This penalty can be designed either to smooth the hazard rate or to enforce consecutive values of the hazard to be equal, leading to a parsimonious representation of the hazard rate. In the latter case, we make use of an iterative penalized likelihood scheme to approximate the L0 norm, which makes the computation tractable. The method is evaluated on simulated data and applied to breast cancer survival data from the SEER program.
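As a rough illustration of the quantity being regularized, the sketch below computes the unpenalized piecewise-constant hazard on an age-by-cohort grid, where each cell's maximum-likelihood estimate is the number of events divided by the person-years of exposure. This is not the authors' penalized estimator; the function name `hazard_grid`, the record layout and the toy data are assumptions made for illustration only.

```python
from bisect import bisect_right


def hazard_grid(records, age_edges, cohort_edges):
    """Unpenalized hazard per (age interval, birth cohort) cell.

    records: iterable of (birth_year, exit_age, event), with event = 1 if the
    disease occurred at exit_age and 0 if follow-up was censored there.
    Hypothetical layout; follow-up is assumed to start at age_edges[0].
    """
    n_age = len(age_edges) - 1
    n_coh = len(cohort_edges) - 1
    events = [[0] * n_coh for _ in range(n_age)]
    exposure = [[0.0] * n_coh for _ in range(n_age)]
    for birth_year, exit_age, event in records:
        k = bisect_right(cohort_edges, birth_year) - 1
        if not 0 <= k < n_coh:
            continue  # birth year outside the grid
        # Person-years contributed to each age interval.
        for j in range(n_age):
            lo, hi = age_edges[j], age_edges[j + 1]
            exposure[j][k] += max(0.0, min(exit_age, hi) - lo)
        # The event, if any, falls in the interval containing exit_age.
        j = bisect_right(age_edges, exit_age) - 1
        if event and 0 <= j < n_age:
            events[j][k] += 1
    # Cells with no exposure have no estimate (None).
    return [
        [events[j][k] / exposure[j][k] if exposure[j][k] > 0 else None
         for k in range(n_coh)]
        for j in range(n_age)
    ]


# Toy data (invented): two cohorts, two age intervals.
toy = [(1920, 60, 1), (1930, 40, 0), (1960, 30, 1), (1970, 45, 0)]
h = hazard_grid(toy, age_edges=[0, 50, 100], cohort_edges=[1900, 1950, 2000])
```

A penalty of the kind described in the abstract would then act on differences between neighbouring cells of this grid, either smoothing them or forcing them to be equal.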
Affiliation(s)
- Vivien Goepp
- MAP5, CNRS UMR 8145, 45, rue des Saints-Pères, 75006, Paris, France; MINES ParisTech, CBIO-Centre for Computational Biology, PSL Research University, 75006, Paris, France; Institut Curie, PSL Research University, 75005, Paris, France; Inserm, U900, Paris, France
| | | | - Grégory Nuel
- LPSM, CNRS UMR 8001, 4, Place Jussieu, 75005, Paris, France
| | - Olivier Bouaziz
- MAP5, CNRS UMR 8145, 45, rue des Saints-Pères, 75006, Paris, France
|
9
|
Nuel G. Moments of the Count of a Regular Expression in a Heterogeneous Random Sequence. Methodol Comput Appl Probab 2019. [DOI: 10.1007/s11009-019-09700-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/27/2022]
|
10
|
Adjakossa EH, Hounkonnou NM, Nuel G. Computationally Stable Estimation Procedure for the Multivariate Linear Mixed-Effect Model and Application to Malaria Public Health Problem. Int J Biostat 2019; 15:ijb-2017-0076. [PMID: 31226099 DOI: 10.1515/ijb-2017-0076] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/03/2017] [Accepted: 05/05/2019] [Indexed: 11/15/2022]
Abstract
In this paper, we provide the ML (Maximum Likelihood) and the REML (REstricted ML) criteria for consistently estimating multivariate linear mixed-effects models with arbitrary correlation structure between the random effects across dimensions, but independent (and possibly heteroscedastic) residuals. By factorizing the random effects covariance matrix, we provide an explicit expression of the profiled deviance through a reparameterization of the model. This strategy can be viewed as the generalization of the estimation procedure used by Douglas Bates and his co-authors in the context of the fitting of one-dimensional linear mixed-effects models. Besides its robustness regarding the starting points, the approach enables a numerically consistent estimate of the random effects covariance matrix, while classical alternatives such as the EM algorithm are usually non-consistent. In a simulation study, we compare the estimates obtained from the present method with the EM algorithm-based estimates. We finally apply the method to a study of an immune response to malaria in Benin.
Affiliation(s)
- Eric Houngla Adjakossa
- International Chair in Mathematical Physics and Applications (ICMPA-UNESCO Chair), Université d'Abomey-Calavi, Cotonou, Benin; Laboratoire de Probabilités, Statistique et Modélisation (UMR 8001), Sorbonne Université, Paris, France
| | - Norbert Mahouton Hounkonnou
- International Chair in Mathematical Physics and Applications (ICMPA-UNESCO Chair), Université d'Abomey-Calavi, Cotonou, Benin
| | - Grégory Nuel
- Laboratoire de Probabilités, Statistique et Modélisation (UMR 8001), Sorbonne Université, Paris, France
|
11
|
Alarcon F, Planté-Bordeneuve V, Olsson M, Nuel G. Non-parametric estimation of survival in age-dependent genetic disease and application to the transthyretin-related hereditary amyloidosis. PLoS One 2018; 13:e0203860. [PMID: 30252892 PMCID: PMC6155453 DOI: 10.1371/journal.pone.0203860] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 07/31/2017] [Accepted: 08/29/2018] [Indexed: 11/30/2022] Open
Abstract
In genetic diseases with variable age of onset, estimating the survival function for mutation carriers, as well as the effects of modifying factors, is essential to provide individual risk assessment, both for mutation carrier management and for prevention strategies. In practice, this survival function is classically estimated from pedigree data where most genotypes are unobserved. In this article, we present a unifying Expectation-Maximization (EM) framework combining probabilistic computations in Bayesian networks with standard statistical survival procedures in order to provide mutation carrier survival estimates. The proposed approach recovers previously published parametric estimates (e.g. Weibull survival) as particular cases, as well as more general Kaplan-Meier non-parametric estimates, which is the main contribution. Note that covariates can also be taken into account using a proportional hazard model. The whole methodology is both validated on simulated data and applied to family samples with transthyretin-related hereditary amyloidosis (a rare autosomal dominant disease with highly variable age of onset), showing very promising results.
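For readers unfamiliar with the Kaplan-Meier estimate mentioned above, a minimal sketch follows. It assumes every individual is a known carrier with fully observed status, so it omits the EM step over unobserved genotypes that is the article's actual contribution; the function name `kaplan_meier` and the toy ages are illustrative assumptions, not material from the paper.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate at each distinct event time.

    times: age at onset or at censoring; events: 1 if onset was observed,
    0 if the individual was censored (still unaffected at that age).
    Returns a list of (time, S(time)) pairs.
    """
    onsets = Counter(t for t, e in zip(times, events) if e)
    exits = Counter(times)  # everyone leaving the risk set at each time
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(exits):
        d = onsets.get(t, 0)
        if d:
            surv *= 1.0 - d / at_risk
            curve.append((t, surv))
        at_risk -= exits[t]
    return curve


# Toy data (invented): ages in years, 1 = onset observed, 0 = censored.
ages = [35, 42, 42, 50, 61, 70]
onset = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(ages, onset)
# Age-related penetrance is the complement of carrier survival.
penetrance = [(t, 1.0 - s) for t, s in curve]
```

In the setting described in the abstract, the counts of onsets and individuals at risk would be replaced by their expected values given the pedigree, weighted by each relative's posterior probability of being a carrier.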
Affiliation(s)
- Flora Alarcon
- Mathématiques appliquées Paris 5 (MAP5) CNRS: UMR8145 – Université Paris Descartes – Sorbonne Paris Cité, Paris, France
| | - Violaine Planté-Bordeneuve
- Hôpital Universitaire Henri Mondor, Département de Neurologie Créteil, France
- Inserm, U955-E10, Créteil, France
| | - Malin Olsson
- Umeå University, Norrlands University Hospital, NUS M31, Umeå, Sweden
| | - Grégory Nuel
- Institute of Mathematics (INSMI), National Center for French Research (CNRS), Paris, France
- Laboratory of Probability (LPMA), Université Pierre et Marie Curie, Sorbonne Université, Paris, France
|
12
|
Abstract
In this article, we suggest a new statistical approach considering survival heterogeneity as a breakpoint model in an ordered sequence of time-to-event variables. The survival responses need to be ordered according to a numerical covariate. Our estimation method aims at detecting heterogeneity that could arise through the ordering covariate. We formally introduce our model as a constrained Hidden Markov Model, where the hidden states are the unknown segmentation (breakpoint locations) and the observed states are the survival responses. We derive an efficient Expectation-Maximization framework for maximizing the likelihood of this model for a wide range of baseline hazard forms (parametric or nonparametric). The posterior distribution of the breakpoints is also derived, and the selection of the number of segments using a penalized likelihood criterion is discussed. The performance of our survival breakpoint model is finally illustrated on a diabetes dataset where the observed survival times are ordered according to the calendar time of disease onset.
Affiliation(s)
- Olivier Bouaziz
- Laboratory MAP5, University Paris Descartes and CNRS, Sorbonne Paris Cité, Paris, France
| | - Grégory Nuel
- LPMA, CNRS 7599, University Pierre et Marie Curie, Sorbonne University, Paris, France
|
13
|
Monneret G, Jaffrézic F, Rau A, Zerjal T, Nuel G. Identification of marginal causal relationships in gene networks from observational and interventional expression data. PLoS One 2017; 12:e0171142. [PMID: 28301504 PMCID: PMC5354375 DOI: 10.1371/journal.pone.0171142] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 10/17/2016] [Accepted: 01/01/2017] [Indexed: 11/29/2022] Open
Abstract
Causal network inference is an important methodological challenge in biology as well as other areas of application. Although several causal network inference methods have been proposed in recent years, they are typically applicable for only a small number of genes, due to the large number of parameters to be estimated and the limited number of biological replicates available. In this work, we consider the specific case of transcriptomic studies made up of both observational and interventional data in which a single gene of biological interest is knocked out. We focus on a marginal causal estimation approach, based on the framework of Gaussian directed acyclic graphs, to infer causal relationships between the knocked-out gene and a large set of other genes. In a simulation study, we found that our proposed method accurately differentiates between downstream causal relationships and those that are upstream or simply associative. It also enables an estimation of the total causal effects between the gene of interest and the remaining genes. Our method performed very similarly to a classical differential analysis for experiments with a relatively large number of biological replicates, but has the advantage of providing a formal causal interpretation. Our proposed marginal causal approach is computationally efficient and may be applied to several thousands of genes simultaneously. In addition, it may help highlight subsets of genes of interest for a more thorough subsequent causal network inference. The method is implemented in an R package called MarginalCausality (available on GitHub).
Affiliation(s)
- Gilles Monneret
- UMR GABI, AgroParisTech, INRA, Université Paris-Saclay, 78350 Jouy-en-Josas, France
- LPMA, UMR CNRS 7599, UPMC, Sorbonne Universités, 4 place Jussieu, 75005 Paris, France
| | - Florence Jaffrézic
- UMR GABI, AgroParisTech, INRA, Université Paris-Saclay, 78350 Jouy-en-Josas, France
| | - Andrea Rau
- UMR GABI, AgroParisTech, INRA, Université Paris-Saclay, 78350 Jouy-en-Josas, France
| | - Tatiana Zerjal
- UMR GABI, AgroParisTech, INRA, Université Paris-Saclay, 78350 Jouy-en-Josas, France
| | - Grégory Nuel
- LPMA, UMR CNRS 7599, UPMC, Sorbonne Universités, 4 place Jussieu, 75005 Paris, France
|
14
|
Hartmann AK, Nuel G. Using Triplet Ordering Preferences for Estimating Causal Effects in the Analysis of Gene Expression Data. PLoS One 2017; 12:e0170514. [PMID: 28141825 PMCID: PMC5283676 DOI: 10.1371/journal.pone.0170514] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 09/22/2016] [Accepted: 01/05/2017] [Indexed: 12/04/2022] Open
Abstract
Triplet ordering preferences are used to perform Monte Carlo sampling of the posterior causal orderings arising from the analysis of gene-expression experiments that combine observations with a (usually small) number of interventions, such as knock-outs. The performance of this sampling approach is compared with a previously used sampling scheme based on pairwise ordering preferences, as well as with sampling of the full posterior distribution. For a fair comparison, the latter approach is restricted to twice the numerical effort of the triplet-based approach. The comparison is carried out on artificially generated causal graphs, i.e., directed acyclic graphs (DAGs), and on actual experimental data taken from the ROSETTA challenge. Sampling based on triplet orderings turns out to be superior to both other approaches.
Affiliation(s)
| | - Grégory Nuel
- LPMA, CNRS 7599, Université Pierre et Marie Curie, Paris, France
|
15
|
Witkowski B, Duru V, Khim N, Ross LS, Saintpierre B, Beghain J, Chy S, Kim S, Ke S, Kloeung N, Eam R, Khean C, Ken M, Loch K, Bouillon A, Domergue A, Ma L, Bouchier C, Leang R, Huy R, Nuel G, Barale JC, Legrand E, Ringwald P, Fidock DA, Mercereau-Puijalon O, Ariey F, Ménard D. A surrogate marker of piperaquine-resistant Plasmodium falciparum malaria: a phenotype-genotype association study. Lancet Infect Dis 2016; 17:174-183. [PMID: 27818097 PMCID: PMC5266792 DOI: 10.1016/s1473-3099(16)30415-7] [Citation(s) in RCA: 239] [Impact Index Per Article: 29.9] [Received: 08/12/2016] [Revised: 09/26/2016] [Accepted: 09/30/2016] [Indexed: 11/30/2022]
Abstract
Background Western Cambodia is the epicentre of Plasmodium falciparum multidrug resistance and is facing high rates of dihydroartemisinin–piperaquine treatment failures. Genetic tools to detect the multidrug-resistant parasites are needed. Artemisinin resistance can be tracked using the K13 molecular marker, but no marker exists for piperaquine resistance. We aimed to identify genetic markers of piperaquine resistance and study their association with dihydroartemisinin–piperaquine treatment failures. Methods We obtained blood samples from Cambodian patients infected with P falciparum and treated with dihydroartemisinin–piperaquine. Patients were followed up for 42 days during the years 2009–15. We established in-vitro and ex-vivo susceptibility profiles for a subset using piperaquine survival assays. We determined whole-genome sequences by Illumina paired-reads sequencing, copy number variations by qPCR, RNA concentrations by qRT-PCR, and protein concentrations by immunoblotting. Fisher’s exact and non-parametric Wilcoxon rank-sum tests were used to identify significant differences in single-nucleotide polymorphisms or copy number variants, respectively, for differential distribution between piperaquine-resistant and piperaquine-sensitive parasite lines. Findings Whole-genome exon sequence analysis of 31 culture-adapted parasite lines associated amplification of the plasmepsin 2–plasmepsin 3 gene cluster with in-vitro piperaquine resistance. Ex-vivo piperaquine survival assay profiles of 134 isolates correlated with plasmepsin 2 gene copy number. In 725 patients treated with dihydroartemisinin–piperaquine, multicopy plasmepsin 2 in the sample collected before treatment was associated with an adjusted hazard ratio (aHR) for treatment failure of 20·4 (95% CI 9·1–45·5, p<0·0001). Multicopy plasmepsin 2 predicted dihydroartemisinin–piperaquine failures with 0·94 (95% CI 0·88–0·98) sensitivity and 0·77 (0·74–0·81) specificity. 
Analysis of samples collected across the country from 2002 to 2015 showed that the geographical and temporal increase of the proportion of multicopy plasmepsin 2 parasites was highly correlated with increasing dihydroartemisinin–piperaquine treatment failure rates (r=0·89 [95% CI 0·77–0·95], p<0·0001, Spearman’s coefficient of rank correlation). Dihydroartemisinin–piperaquine efficacy at day 42 fell below 90% when the proportion of multicopy plasmepsin 2 parasites exceeded 22%. Interpretation Piperaquine resistance in Cambodia is strongly associated with amplification of plasmepsin 2–3, encoding haemoglobin-digesting proteases, regardless of the location. Multicopy plasmepsin 2 constitutes a surrogate molecular marker to track piperaquine resistance. A molecular toolkit combining plasmepsin 2 with K13 and mdr1 monitoring should provide timely information for antimalarial treatment and containment policies. Funding Institut Pasteur in Cambodia, Institut Pasteur Paris, National Institutes of Health, WHO, Agence Nationale de la Recherche, Investissement d’Avenir programme, Laboratoire d’Excellence Integrative “Biology of Emerging Infectious Diseases”.
Affiliation(s)
- Benoit Witkowski
- Malaria Molecular Epidemiology Unit, Institut Pasteur in Cambodia, Phnom Penh, Cambodia; Malaria Translational Research Unit, Institut Pasteur, Paris, France; Institut Pasteur in Cambodia, Phnom Penh, Cambodia
| | - Valentine Duru
- Malaria Molecular Epidemiology Unit, Institut Pasteur in Cambodia, Phnom Penh, Cambodia
| | - Nimol Khim
- Malaria Molecular Epidemiology Unit, Institut Pasteur in Cambodia, Phnom Penh, Cambodia; Malaria Translational Research Unit, Institut Pasteur, Paris, France; Institut Pasteur in Cambodia, Phnom Penh, Cambodia
| | - Leila S Ross
- Department of Microbiology and Immunology and Division of Infectious Diseases, Department of Medicine, Columbia University Medical Center, New York, NY, USA
| | | | - Johann Beghain
- Department of Parasites and Insect Vectors, Institut Pasteur, Paris, France
| | - Sophy Chy
- Malaria Molecular Epidemiology Unit, Institut Pasteur in Cambodia, Phnom Penh, Cambodia
| | - Saorin Kim
- Malaria Molecular Epidemiology Unit, Institut Pasteur in Cambodia, Phnom Penh, Cambodia
| | - Sopheakvatey Ke
- Malaria Molecular Epidemiology Unit, Institut Pasteur in Cambodia, Phnom Penh, Cambodia
| | - Nimol Kloeung
- Malaria Molecular Epidemiology Unit, Institut Pasteur in Cambodia, Phnom Penh, Cambodia
| | - Rotha Eam
- Malaria Molecular Epidemiology Unit, Institut Pasteur in Cambodia, Phnom Penh, Cambodia
| | - Chanra Khean
- Malaria Molecular Epidemiology Unit, Institut Pasteur in Cambodia, Phnom Penh, Cambodia
| | - Malen Ken
- Malaria Molecular Epidemiology Unit, Institut Pasteur in Cambodia, Phnom Penh, Cambodia
| | - Kaknika Loch
- Malaria Molecular Epidemiology Unit, Institut Pasteur in Cambodia, Phnom Penh, Cambodia
| | - Anthony Bouillon
- Malaria Translational Research Unit, Institut Pasteur, Paris, France; Institut Pasteur in Cambodia, Phnom Penh, Cambodia; Structural Microbiology Unit, Biology of Malaria Targets Group, Department of Structural Biology and Chemistry and CNRS, UMR3528, Institut Pasteur, Paris, France
| | - Anais Domergue
- Malaria Molecular Epidemiology Unit, Institut Pasteur in Cambodia, Phnom Penh, Cambodia
| | - Laurence Ma
- Plate-forme Génomique, Département Génomes et Génétique, Institut Pasteur, Paris, France
| | - Christiane Bouchier
- Plate-forme Génomique, Département Génomes et Génétique, Institut Pasteur, Paris, France
| | - Rithea Leang
- National Center for Parasitology, Entomology and Malaria Control, Phnom Penh, Cambodia
| | - Rekol Huy
- National Center for Parasitology, Entomology and Malaria Control, Phnom Penh, Cambodia
| | - Grégory Nuel
- Laboratoire de Mathématiques Appliquées (MAP5) UMR CNRS 8145, Université Paris Descartes, Paris, France
| | - Jean-Christophe Barale
- Malaria Translational Research Unit, Institut Pasteur, Paris, France; Institut Pasteur in Cambodia, Phnom Penh, Cambodia; Structural Microbiology Unit, Biology of Malaria Targets Group, Department of Structural Biology and Chemistry and CNRS, UMR3528, Institut Pasteur, Paris, France
| | - Eric Legrand
- Malaria Translational Research Unit, Institut Pasteur, Paris, France; Institut Pasteur in Cambodia, Phnom Penh, Cambodia; Department of Parasites and Insect Vectors, Institut Pasteur, Paris, France
| | - Pascal Ringwald
- Global Malaria Programme, World Health Organization, Geneva, Switzerland
| | - David A Fidock
- Department of Microbiology and Immunology and Division of Infectious Diseases, Department of Medicine, Columbia University Medical Center, New York, NY, USA
| | | | - Frédéric Ariey
- Department of Parasites and Insect Vectors, Institut Pasteur, Paris, France; Institut Cochin Inserm U1016, Université Paris-Descartes, Sorbonne Paris Cité, and Laboratoire de Parasitologie-Mycologie, Hôpital Cochin, Paris, France
| | - Didier Ménard
- Malaria Molecular Epidemiology Unit, Institut Pasteur in Cambodia, Phnom Penh, Cambodia; Malaria Translational Research Unit, Institut Pasteur, Paris, France; Institut Pasteur in Cambodia, Phnom Penh, Cambodia.
|
16
|
Zerjal T, Monneret G, Moroldo M, Coville JL, Tixier-Boichard M, Rau A, Nuel G, Jaffrezic F. P3005 Genome-wide transcriptomic analysis of liver in sex-linked dwarf and wild-type chickens. J Anim Sci 2016. [DOI: 10.2527/jas2016.94supplement453x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022] Open
|
17
|
Abstract
In the framework of patterns in random texts, Markov chain embedding techniques consist of turning the occurrences of a pattern over an order-m Markov sequence into those of a subset of states in an order-1 Markov chain. In this paper we use the theory of languages and automata to provide space-optimal Markov chain embedding using the new notion of pattern Markov chains (PMCs), and we give explicit constructive algorithms to build the PMC associated with any given pattern problem. The interest of PMCs is then illustrated through the exact computation of P-values, whose complexity is discussed and compared with that of classical asymptotic approximations. Finally, we consider two illustrative examples of highly degenerate pattern problems (structured motifs and PROSITE signatures), which further illustrate the usefulness of our approach.
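The embedding idea can be sketched in a much simplified setting. The example below is an illustration only, not the space-optimal PMC construction of the paper: it assumes a single word rather than a general pattern, independent letters (an order-0 model) rather than an order-m Markov source, and the function names `build_automaton` and `prob_occurrence` are hypothetical. A deterministic automaton tracks how much of the word has been matched, and pushing a probability vector through its transitions gives the exact probability that the word occurs at least once in a random text of length n.

```python
def build_automaton(word, alphabet):
    """Return the transition table of a KMP-style automaton for `word`.

    State k means "the last k letters read equal word[:k]"; state len(word)
    is made absorbing so that it records "the word has occurred".
    """
    m = len(word)
    delta = [dict() for _ in range(m + 1)]
    for k in range(m + 1):
        for a in alphabet:
            if k == m:
                delta[k][a] = m  # absorbing final state
                continue
            s = word[:k] + a
            # Longest suffix of s that is also a prefix of word.
            j = len(s)
            while j > 0 and s[-j:] != word[:j]:
                j -= 1
            delta[k][a] = j
    return delta


def prob_occurrence(word, n, letter_probs):
    """Exact P(word occurs at least once in n independent letters)."""
    delta = build_automaton(word, sorted(letter_probs))
    m = len(word)
    dist = [0.0] * (m + 1)
    dist[0] = 1.0
    for _ in range(n):
        new = [0.0] * (m + 1)
        for k, p in enumerate(dist):
            if p == 0.0:
                continue
            for a, pa in letter_probs.items():
                new[delta[k][a]] += p * pa
        dist = new
    return dist[m]


uniform = {"a": 0.5, "b": 0.5}
p_aa = prob_occurrence("aa", 3, uniform)  # aaa, aab, baa out of 8 texts
p_ab = prob_occurrence("ab", 3, uniform)  # aab, aba, abb, bab out of 8 texts
```

The cost is linear in the text length and in the number of automaton states, which is why keeping the state space small matters for degenerate patterns.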
|
18
|
Abstract
In this paper we develop an explicit formula that allows us to compute the first k moments of the random count of a pattern in a multistate sequence generated by a Markov source. We derive efficient algorithms that allow us to deal with any pattern (low or high complexity) in any Markov model (homogeneous or not). We then apply these results to the distribution of DNA patterns in genomic sequences, and we show that moment-based developments (namely Edgeworth's expansion and Gram-Charlier type-B series) allow us to improve the reliability of common asymptotic approximations, such as Gaussian or Poisson approximations.
|
19
|
Abstract
Penalized selection criteria like AIC or BIC are among the most popular methods for variable selection. Their theoretical properties have been studied intensively and are well understood, but making use of them in the case of high-dimensional data is difficult due to the non-convex optimization problem induced by L0 penalties. In this paper we introduce an adaptive ridge procedure (AR), in which iteratively weighted ridge problems are solved whose weights are updated in such a way that the procedure converges towards selection with L0 penalties. After introducing AR, its specific shrinkage properties are studied in the particular case of orthogonal linear regression. Based on extensive simulations for the non-orthogonal case as well as for Poisson regression, the performance of AR is studied and compared with SCAD and adaptive LASSO. Furthermore, an efficient implementation of AR in the context of least-squares segmentation is presented. The paper ends with an illustrative example of applying AR to analyze GWAS data.
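The iteratively reweighted ridge idea can be sketched for the orthogonal linear regression case mentioned above. This is a minimal illustration rather than the authors' implementation: the weight update w_j = 1/(beta_j^2 + delta^2), the values of `lam` and `delta`, and the function name `adaptive_ridge_orthogonal` are assumptions made for the example. With an orthonormal design, minimising the residual sum of squares plus lam * sum_j w_j * beta_j^2 has the closed form beta_j = z_j / (1 + lam * w_j), where z_j are the ordinary least-squares coefficients.

```python
def adaptive_ridge_orthogonal(z, lam, delta=1e-5, n_iter=200):
    """Iteratively reweighted ridge for an orthonormal design.

    z      : ordinary least-squares coefficients (X^T y when X^T X = I)
    lam    : penalty level (assumed value, plays the role of an L0 penalty)
    delta  : small constant keeping the weights finite (assumed value)
    Returns the coefficient vector after n_iter reweighting steps.
    """
    beta = list(z)  # start from the unpenalised solution
    for _ in range(n_iter):
        # Weight update: a large weight for a small coefficient pushes it to 0.
        weights = [1.0 / (b * b + delta * delta) for b in beta]
        # Closed-form weighted ridge solution in the orthonormal case.
        beta = [zj / (1.0 + lam * w) for zj, w in zip(z, weights)]
    return beta


ols = [5.0, 0.3, -4.0, 0.1]
ar = adaptive_ridge_orthogonal(ols, lam=1.0)
```

In this sketch, coefficients with small |z_j| are driven to (numerically) zero, while large ones are kept with mild shrinkage, so the fixed point behaves like a thresholding rule rather than like ordinary ridge shrinkage.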
Affiliation(s)
- Florian Frommlet
- Department of Medical Statistics (CEMSIIS), Medical University of Vienna, Spitalgasse 23, A-1090 Vienna, Austria
| | - Grégory Nuel
- National Institute for Mathematical Sciences (INSMI), CNRS, Stochastics and Biology Group (PSB), LPMA UMR CNRS 7599, Université Pierre et Marie Curie, 4 place Jussieu, 75005 Paris, France
|
20
|
Almelli T, Nuel G, Bischoff E, Aubouy A, Elati M, Wang CW, Dillies MA, Coppée JY, Ayissi GN, Basco LK, Rogier C, Ndam NT, Deloron P, Tahar R. Differences in gene transcriptomic pattern of Plasmodium falciparum in children with cerebral malaria and asymptomatic carriers. PLoS One 2014; 9:e114401. [PMID: 25479608 PMCID: PMC4257676 DOI: 10.1371/journal.pone.0114401] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Received: 03/19/2014] [Accepted: 11/10/2014] [Indexed: 11/24/2022] Open
Abstract
The mechanisms underlying the heterogeneity of clinical malaria remain largely unknown. We hypothesized that differential gene expression contributes to phenotypic variation of parasites which results in a specific interaction with the host, leading to different clinical features of malaria. In this study, we analyzed the transcriptomes of isolates obtained from asymptomatic carriers and patients with uncomplicated or cerebral malaria. We also investigated the transcriptomes of the 3D7 clone and of 3D7-Lib, which expresses severe malaria-associated variant surface antigen. Our findings revealed a specific up-regulation of genes involved in pathogenesis, adhesion to host cell, and erythrocyte aggregation in parasites from patients with cerebral malaria and 3D7-Lib, compared to parasites from asymptomatic carriers and 3D7, respectively. However, we did not find any significant difference between the transcriptomes of parasites from cerebral malaria and uncomplicated malaria, suggesting a similar transcriptomic pattern in these two parasite populations. The difference between isolates from asymptomatic children and cerebral malaria concerned genes coding for exported proteins, Maurer's cleft proteins, transcriptional factor proteins, proteins implicated in protein transport, as well as Plasmodium conserved and hypothetical proteins. Interestingly, UPs A1, A2, A3 and UPs B1 of var genes were predominantly found in cerebral malaria-associated isolates and those containing architectural domains of DC4, DC5, DC13 and their neighboring rif genes in 3D7-Lib. Therefore, more investigations are needed to analyze the effective role of these genes during malaria infection and to provide new knowledge on malaria pathology. In addition, concomitant regulation of genes within the chromosomal neighborhood suggests a common mechanism of gene regulation in P. falciparum.
Affiliation(s)
- Talleh Almelli
- Institut de Recherche pour le Développement (IRD), UMR 216 Mère et Enfant Face aux Infections Tropicales, Université Paris-Descartes, PRES Sorbonne Paris-Cité, Paris, France
- PRES Sorbonne Paris Cité, Université Paris Descartes, Faculté de Pharmacie, Paris, France
| | - Grégory Nuel
- PRES Sorbonne Paris Cité, Université Paris Descartes, Faculté de Pharmacie, Paris, France
| | - Emmanuel Bischoff
- Institut Pasteur, Unit of Molecular Immunology of Parasites, Unit of Insect Vector Genetics and Genomics, Department of Parasitology and Mycology, Paris, France
- Centre National de la Recherche Scientifique (CNRS), URA 3012, Paris, France
| | - Agnès Aubouy
- Institut de Recherche pour le Développement (IRD), UMR 152 Pharmacochimie et pharmacologie pour le développement - (PHARMA-DEV), Université Paul Sabatier, Toulouse, France
| | - Mohamed Elati
- Institute of Systems and Synthetic Biology, CNRS, University of Evry, Genopole, Evry, France
| | - Christian William Wang
- Centre for Medical Parasitology at Department of International Health, Immunology, and Microbiology, University of Copenhagen and at Department of Infectious Diseases, Copenhagen University Hospital (Rigshospitalet), Copenhagen, Denmark
| | - Marie-Agnès Dillies
- Plate-forme Transcriptome et Epigénome, Departement Génomes et Génétique, Institut Pasteur, Paris, France
| | - Jean-Yves Coppée
- Plate-forme Transcriptome et Epigénome, Departement Génomes et Génétique, Institut Pasteur, Paris, France
| | | | - Leonardo Kishi Basco
- Organisation de Coordination pour la lutte contre les Endémies en Afrique Centrale (OCEAC), Laboratoire de Recherche sur le Paludisme, B. P. 288, Yaoundé, Cameroon
- Institut de Recherche pour le Développement (IRD), UMR 198 Unité de Recherche des Maladies Infectieuses et Tropicales Emergentes, Faculté de Médecine La Timone, Aix-Marseille Université, Marseille, France
| | - Christophe Rogier
- Institut Pasteur de Madagascar, B.P. 1274, Ambatofotsikely, Antananarivo, Madagascar
| | - Nicaise Tuikue Ndam
- Institut de Recherche pour le Développement (IRD), UMR 216 Mère et Enfant Face aux Infections Tropicales, Université Paris-Descartes, PRES Sorbonne Paris-Cité, Paris, France
- PRES Sorbonne Paris Cité, Université Paris Descartes, Faculté de Pharmacie, Paris, France
| | - Philippe Deloron
- Institut de Recherche pour le Développement (IRD), UMR 216 Mère et Enfant Face aux Infections Tropicales, Université Paris-Descartes, PRES Sorbonne Paris-Cité, Paris, France
- PRES Sorbonne Paris Cité, Université Paris Descartes, Faculté de Pharmacie, Paris, France
| | - Rachida Tahar
- Institut de Recherche pour le Développement (IRD), UMR 216 Mère et Enfant Face aux Infections Tropicales, Université Paris-Descartes, PRES Sorbonne Paris-Cité, Paris, France
- PRES Sorbonne Paris Cité, Université Paris Descartes, Faculté de Pharmacie, Paris, France
- Organisation de Coordination pour la lutte contre les Endémies en Afrique Centrale (OCEAC), Laboratoire de Recherche sur le Paludisme, B. P. 288, Yaoundé, Cameroon
|
21
|
Thiam C, Le Quan Sang K, Aegerter P, Ou P, Nuel G, Thalabard JC. Évaluation comparative de deux procédures diagnostiques multi-dimensionnelles, corrélées, en absence de gold standard : application au suivi d’enfants opérés du cœur (TGVs). Rev Epidemiol Sante Publique 2014. [DOI: 10.1016/j.respe.2013.12.082] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022] Open
|
22
|
Hammami I, Garcia A, Nuel G. Evidence for overdispersion in the distribution of malaria parasites and leukocytes in thick blood smears. Malar J 2013; 12:398. [PMID: 24195469 PMCID: PMC3831262 DOI: 10.1186/1475-2875-12-398] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 05/07/2013] [Accepted: 10/22/2013] [Indexed: 11/24/2022] Open
Abstract
Background Microscopic examination of stained thick blood smears (TBS) is the gold standard for routine malaria diagnosis. Parasites and leukocytes are counted in a predetermined number of high power fields (HPFs). Data on parasite and leukocyte counts per HPF are of broad scientific value. However, in published studies, most of the information on parasite density (PD) is presented as summary statistics (e.g. PD per microlitre, prevalence, absolute/assumed white blood cell counts), but original data sets are not readily available. Besides, the number of parasites and the number of leukocytes per HPF are assumed to be Poisson-distributed. However, count data rarely fit the restrictive assumptions of the Poisson distribution. The violation of these assumptions commonly results in overdispersion. The objectives of this paper are to investigate and handle overdispersion in field-collected data. Methods The data comprise the records of three TBSs of 12-month-old children from a field study of Plasmodium falciparum malaria in Tori Bossito, Benin. All HPFs were examined systematically by visually scanning the film horizontally from edge to edge. The numbers of parasites and leukocytes per HPF were recorded and formed the first dataset on parasite and leukocyte counts per HPF. The full dataset is published in this study. Two sources of overdispersion in data are investigated: latent heterogeneity and spatial dependence. Unobserved heterogeneity in data is accounted for by considering more flexible models that allow for overdispersion. Of particular interest were the negative binomial model (NB) and mixture models. The dependent structure in data was modelled with hidden Markov models (HMMs). Results The Poisson assumptions are inconsistent with parasite and leukocyte distributions per HPF. Among simple parametric models, the NB model is the closest to the unknown distribution that generates the data.
On the basis of model selection criteria AIC and BIC, HMMs provided a better fit to data than mixtures. Ordinary pseudo-residuals confirmed the validity of HMMs. Conclusion Failure to take overdispersion into account in parasite and leukocyte counts may entail important misleading inferences when these data are related to other explanatory variables (malariometric or environmental). Its detection is therefore essential. In addition, an alternative PD estimation method that accounts for heterogeneity and spatial dependence should be seriously considered in epidemiological studies with field-collected parasite and leukocyte data.
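A first check for overdispersion in per-HPF counts is the variance-to-mean ratio, which equals 1 for a Poisson distribution. The sketch below is an illustration only: the counts are made up for the example (they are not the Tori Bossito data), the method-of-moments formula for the negative binomial size parameter is a standard textbook estimator rather than the fitting procedure of the paper, and the function names are hypothetical.

```python
import statistics


def dispersion_index(counts):
    """Sample variance divided by sample mean (about 1 under a Poisson model)."""
    return statistics.variance(counts) / statistics.mean(counts)


def nb_size_moments(counts):
    """Method-of-moments estimate of the negative binomial size parameter r.

    Uses var = mean + mean**2 / r. Returns None when the data are not
    overdispersed (variance <= mean), where the estimator is undefined.
    """
    m = statistics.mean(counts)
    v = statistics.variance(counts)
    if v <= m:
        return None
    return m * m / (v - m)


# Hypothetical parasite counts in ten high power fields.
parasites_per_hpf = [0, 0, 1, 0, 7, 2, 0, 12, 0, 3]
index = dispersion_index(parasites_per_hpf)
size = nb_size_moments(parasites_per_hpf)
```

A ratio well above 1, together with a small estimated size parameter, points to heterogeneity that a Poisson model cannot capture; spatial dependence between neighbouring fields would need a model such as an HMM and is not addressed by this check.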
Affiliation(s)
- Imen Hammami
- Laboratoire de Mathématiques Appliquées (MAP5), UMR CNRS 8145, Université Paris Descartes, Paris, France.
|
23
|
Rau A, Jaffrézic F, Nuel G. Joint estimation of causal effects from observational and intervention gene expression data. BMC Syst Biol 2013; 7:111. [PMID: 24172639 PMCID: PMC3834107 DOI: 10.1186/1752-0509-7-111] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Received: 07/30/2013] [Accepted: 10/07/2013] [Indexed: 11/22/2022]
Abstract
Background In recent years, there has been great interest in using transcriptomic data to infer gene regulatory networks. For the time being, methodological development in this area has primarily made use of graphical Gaussian models for observational wild-type data, resulting in undirected graphs that are not able to accurately highlight causal relationships among genes. In the present work, we seek to improve the estimation of causal effects among genes by jointly modeling observational transcriptomic data with arbitrarily complex intervention data obtained by performing partial, single, or multiple gene knock-outs or knock-downs. Results Using the framework of causal Gaussian Bayesian networks, we propose a Markov chain Monte Carlo algorithm with a Mallows proposal model and analytical likelihood maximization to sample from the posterior distribution of causal node orderings, and in turn, to estimate causal effects. The main advantage of the proposed algorithm over previously proposed methods is its flexibility to accommodate any kind of intervention design, including partial or multiple knock-out experiments. Using simulated data as well as data from the Dialogue for Reverse Engineering Assessments and Methods (DREAM) 2007 challenge, the proposed method was compared to two alternative approaches: one requiring a complete, single knock-out design, and one able to model only observational data. Conclusions The proposed algorithm was found to perform as well as, and in most cases better than, the alternative methods in terms of accuracy for the estimation of causal effects. In addition, multiple knock-outs proved to contribute valuable additional information compared to single knock-outs. Finally, the simulation study confirmed that it is not possible to estimate the causal ordering of genes from observational data alone.
In all cases, we found that the inclusion of intervention experiments enabled more accurate estimation of causal regulatory relationships than the use of wild-type data alone.
Affiliation(s)
- Andrea Rau
- INRA, UMR1313 Génétique animale et biologie intégrative, 78352 Jouy-en-Josas, France.
|
24
|
Abstract
Malaria is a global health problem responsible for nearly one million deaths every year, around 85% of which concern children younger than five years old in Sub-Saharan Africa. In addition, around 300 million clinical cases are declared every year. The level of infection, expressed as parasite density, is classically defined as the number of asexual parasites relative to a microliter of blood. Microscopy of Giemsa-stained thick blood films is the gold standard for parasite enumeration. Parasite density estimation methods usually involve threshold values: either the number of white blood cells counted or the number of high power fields read. However, the statistical properties of parasite density estimators generated by these methods have largely been overlooked. Here, we studied the statistical properties (mean error, coefficient of variation, false negative rates) of parasite density estimators of commonly used threshold-based counting techniques depending on variable threshold values. We also assessed the influence of the thresholds on the cost-effectiveness of parasite density estimation methods. In addition, we gave more insight into the behavior of measurement errors according to varying threshold values, and into the optimal threshold values that minimize this variability.
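A threshold-based estimator of the kind discussed above can be sketched as follows. This is an illustrative sketch, not the procedure evaluated in the paper: it assumes the common convention of converting the parasite-to-leukocyte ratio to parasites per microliter with an assumed leukocyte count of 8000 per microliter, the field counts are made up, and the function name `density_wbc_threshold` is hypothetical.

```python
def density_wbc_threshold(fields, wbc_threshold=200, wbc_per_ul=8000):
    """Estimate parasite density (parasites per microliter of blood).

    fields        : iterable of (parasites, leukocytes) counts, one per HPF
    wbc_threshold : stop reading once this many leukocytes have been counted
    wbc_per_ul    : assumed leukocyte density (8000 is a common convention)
    If the threshold is never reached, all fields are used.
    """
    parasites = 0
    leukocytes = 0
    for p, w in fields:
        parasites += p
        leukocytes += w
        if leukocytes >= wbc_threshold:
            break
    if leukocytes == 0:
        raise ValueError("no leukocytes counted; density is undefined")
    return parasites / leukocytes * wbc_per_ul


# Hypothetical (parasites, leukocytes) counts for four successive fields.
hpf_counts = [(3, 50), (5, 60), (2, 100), (10, 40)]
pd_200 = density_wbc_threshold(hpf_counts, wbc_threshold=200)
pd_100 = density_wbc_threshold(hpf_counts, wbc_threshold=100)
```

On the same slide the two thresholds give noticeably different estimates, which is the kind of threshold-dependent variability whose statistical properties the abstract sets out to quantify.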
Affiliation(s)
- Imen Hammami
- Department of Applied Mathematics (MAP5), UMR CNRS 8145, Paris Descartes University, Paris, France.
|
25
|
Perduca V, Sinoquet C, Mourad R, Nuel G. Alternative methods for H1 simulations in genome-wide association studies. Hum Hered 2012; 73:95-104. [PMID: 22472690 DOI: 10.1159/000336194] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 07/12/2011] [Accepted: 12/23/2011] [Indexed: 11/19/2022] Open
Abstract
OBJECTIVE Assessing the statistical power to detect susceptibility variants plays a critical role in genome-wide association (GWA) studies both from the prospective and retrospective point of view. Power is empirically estimated by simulating phenotypes under a disease model H1. For this purpose, the gold standard consists of simulating genotypes given the phenotypes (e.g. Hapgen). We introduce here an alternative approach for simulating phenotypes under H1 that does not require generating new genotypes for each simulation. METHODS In order to simulate phenotypes with a fixed total number of cases and under a given disease model, we suggest 3 algorithms: (1) a simple rejection algorithm; (2) a numerical Markov chain Monte-Carlo (MCMC) approach, and (3) an exact and efficient backward sampling algorithm. In our study, we validated the 3 algorithms both on a simulated dataset and by comparing them with Hapgen on a more realistic dataset. For an application, we then conducted a simulation study on a 1000 Genomes Project dataset consisting of 629 individuals (314 cases) and 8,048 SNPs from chromosome X. We arbitrarily defined an additive disease model with two susceptibility SNPs and an epistatic effect. RESULTS The 3 algorithms are consistent, but backward sampling is dramatically faster than the other two. Our approach also gives consistent results with Hapgen. Using our application data, we showed that our limited design requires a biological a priori to limit the investigated region. We also proved that epistatic effects can play a significant role even when simple marker statistics (e.g. trend) are used. We finally showed that the overall performance of a GWA study strongly depends on the prevalence of the disease: the larger the prevalence, the better the power. CONCLUSIONS Our approach is a valid alternative to Hapgen-type methods; it is not only dramatically faster but also has 2 main advantages: (1) there is no need for sophisticated genotype models (e.g. haplotype frequencies, or recombination rates), and (2) the choice of the disease model is completely unconstrained (number of SNPs involved, gene-environment interactions, hybrid genetic models, etc.). Our 3 algorithms are available in an R package called 'waffect' ('double-u affect', for weighted affectations).
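The backward sampling idea, drawing phenotypes with a fixed total number of cases, can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the code of the 'waffect' package: it assumes each individual has an independent case probability given by the disease model, and the function names `backward_table` and `sample_fixed_cases` and the example probabilities are hypothetical. A backward pass computes the probability that the remaining individuals contain exactly k cases; a forward pass then samples each status conditionally on the number of cases still to be placed.

```python
import random


def backward_table(probs, n_cases):
    """B[i][k] = P(individuals i, i+1, ..., n-1 contain exactly k cases)."""
    n = len(probs)
    B = [[0.0] * (n_cases + 1) for _ in range(n + 1)]
    B[n][0] = 1.0
    for i in range(n - 1, -1, -1):
        p = probs[i]
        for k in range(n_cases + 1):
            B[i][k] = (1.0 - p) * B[i + 1][k]
            if k > 0:
                B[i][k] += p * B[i + 1][k - 1]
    return B


def sample_fixed_cases(probs, n_cases, rng):
    """Draw 0/1 statuses from independent Bernoulli(probs) given sum == n_cases."""
    B = backward_table(probs, n_cases)
    if B[0][n_cases] == 0.0:
        raise ValueError("the requested number of cases has probability zero")
    remaining = n_cases
    statuses = []
    for i, p in enumerate(probs):
        if remaining > 0:
            # P(status_i = 1 | exactly `remaining` cases among i..n-1)
            p_case = p * B[i + 1][remaining - 1] / B[i][remaining]
        else:
            p_case = 0.0
        status = 1 if rng.random() < p_case else 0
        remaining -= status
        statuses.append(status)
    return statuses


rng = random.Random(1)
case_probs = [0.1, 0.5, 0.9, 0.3, 0.7]  # hypothetical disease-model risks
draws = [sample_fixed_cases(case_probs, 2, rng) for _ in range(200)]
```

Unlike rejection sampling, every draw satisfies the constraint on the number of cases, at a cost proportional to the number of individuals times the number of cases for the backward table.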
Affiliation(s)
- V Perduca
- MAP5 - UMR CNRS 8145, Université Paris Descartes, Paris, France. vittorio.perduca@parisdescartes.fr
|
26
|
Abifadel M, Bernier L, Dubuc G, Nuel G, Rabes JP, Bonneau J, Marques A, Marduel M, Devillers M, Munnich A, Erlich D, Varret M, Roy M, Davignon J, Boileau C. A PCSK9 variant and familial combined hyperlipidaemia. J Med Genet 2008; 45:780-6. [DOI: 10.1136/jmg.2008.059980] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.3] [Indexed: 11/04/2022]
|
27
|
|
28
|
|
29
|
Guedj M, Della-Chiesa E, Picard F, Nuel G. Computing power in case-control association studies through the use of quadratic approximations: application to meta-statistics. Ann Hum Genet 2007; 71:262-70. [PMID: 17032289 DOI: 10.1111/j.1469-1809.2006.00316.x] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Indexed: 12/01/2022]
Abstract
In the framework of case-control studies, many different test statistics are available to measure the association of a marker with a given disease. Nevertheless, choosing one particular statistic can lead to very different conclusions. In the absence of a consensus on this choice, a tempting option is to evaluate the power of these different statistics before making any decision. We review the available methods dedicated to power computation and assess their respective reliability in treating a wide range of tests on a wide range of alternative models. Considering Monte-Carlo, non-central chi-square and Delta-Method estimates, we evaluate empirical, asymptotic and numerical approaches. Additionally, we introduce the use of the Delta-Method extended to order 2, intended to provide better results than the traditional order-1 Delta-Method. Supplementary data can be found at: http://stat.genopole.cnrs.fr/software/dm2.
Affiliation(s)
- M Guedj
- Laboratoire Statistique et Genome, 523 place des terrasses de l'Agora, 91000 Evry, France.
|
30
|
Abstract
Background: In order to compute pattern statistics in computational biology a Markov model is commonly used to take into account the sequence composition. Usually its parameters must be estimated. The aim of this paper is to determine how sensitive these statistics are to parameter estimation, and what the consequences of this variability are for pattern studies (finding the most over-represented words in a genome, the most significant common words to a set of sequences, ...). Results: In the particular case where pattern statistics (overlap counting only) are computed through binomial approximations, we use the delta-method to give an explicit expression of σ, the standard deviation of a pattern statistic. This result is validated using simulations and a simple pattern study is also considered. Conclusion: We establish that the use of high-order Markov models could easily lead to major mistakes due to the high sensitivity of pattern statistics to parameter estimation.
Affiliation(s)
- Grégory Nuel
- Laboratoire Statistique et Génome, University of Evry, CNRS (8071), INRA (1152), 523, place des terrasses de l'Agora, 91034 Evry CEDEX, France.
|
31
|
Guedj M, Wojcik J, Della-Chiesa E, Nuel G, Forner K. A fast, unbiased and exact allelic test for case-control association studies. Hum Hered 2006; 61:210-21. [PMID: 16877868 DOI: 10.1159/000094776] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.3] [Received: 02/12/2006] [Accepted: 06/01/2006] [Indexed: 11/19/2022] Open
Abstract
Association studies are traditionally performed in the case-control framework. As a first step in the analysis, allele frequencies are often compared using Pearson's chi-square statistic. However, such an approach assumes the independence of alleles under the hypothesis of no association, which may not always be the case. Consequently, this method introduces a bias that shifts the type I error rate away from its expected value. In this article we first propose an unbiased and exact test as an alternative to the biased allelic test. Because available data require thousands of such tests to be performed, we focused on fast execution. Since the biased allelic test is still widely used in the community, we illustrate its pitfalls in the context of genome-wide association studies and particularly in the case of low-level tests. Finally, we compare the unbiased and exact test with the Cochran-Armitage test for trend and show that it performs similarly in terms of power. The fast, unbiased and exact allelic test code is available in R, C++ and Perl at: http://stat.genopole.cnrs.fr/software/fueatest.
Affiliation(s)
- M Guedj
- Statistique et Genome Laboratory, CNRS UMR 8071, Evry, France
|
32
|
Nuel G. Effective p-value computations using Finite Markov Chain Imbedding (FMCI): application to local score and to pattern statistics. Algorithms Mol Biol 2006; 1:5. [PMID: 16722531 PMCID: PMC1479348 DOI: 10.1186/1748-7188-1-5] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Received: 02/15/2006] [Accepted: 04/07/2006] [Indexed: 11/21/2022] Open
Abstract
The technique of Finite Markov Chain Imbedding (FMCI) is a classical approach to complex combinatorial problems related to sequences. To obtain efficient algorithms, such approaches are known to require a first rewriting in terms of recursive relations. We propose here general recursive algorithms that compute, in a numerically stable manner, the exact Cumulative Distribution Function (CDF) or complementary CDF (CCDF). These algorithms are then applied in two particular cases: the local score of one sequence and pattern statistics. In both cases, asymptotic developments are derived. For the local score, our new approach makes it possible, for the very first time, to compute exact p-values for a practical study (finding hydrophobic segments in a protein database) where only approximations were available before. In this study, the asymptotic approximations appear to be completely unreliable for 99.5% of the considered sequences. Concerning the pattern statistics, the new FMCI algorithms dramatically outperform the previous ones as they are more reliable, easier to implement, faster and have lower memory requirements.
Affiliation(s)
- Grégory Nuel
- Laboratoire Statistique et Génome, UEVE, CNRS (8071), INRA (1152), Evry, France.
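The Markov chain embedding idea behind exact local-score p-values can be illustrated with a minimal Python sketch. It assumes independent, identically distributed integer scores and uses a plain forward recursion on an absorbing chain; it is not the recursive FMCI algorithm of the article, and the function names and the toy score distribution are hypothetical.

```python
from itertools import product


def local_score(scores):
    """Local score: the maximum segment sum, floored at 0."""
    best = h = 0
    for x in scores:
        h = max(0, h + x)      # running score, reset to 0 when negative
        best = max(best, h)
    return best


def local_score_pvalue(dist, n, a):
    """P(local score of n iid scores >= a).

    dist maps an integer score to its probability. The running score is
    embedded in a Markov chain on states 0..a, where state a is absorbing.
    """
    if a <= 0:
        return 1.0
    state = [0.0] * (a + 1)
    state[0] = 1.0
    for _ in range(n):
        nxt = [0.0] * (a + 1)
        nxt[a] = state[a]                      # absorbing state keeps its mass
        for h in range(a):
            if state[h] == 0.0:
                continue
            for x, p in dist.items():
                j = min(a, max(0, h + x))      # floor at 0, cap at a
                nxt[j] += state[h] * p
        state = nxt
    return state[a]


def brute_force_pvalue(dist, n, a):
    """Same probability by full enumeration (only feasible for small n)."""
    total = 0.0
    for combo in product(dist.items(), repeat=n):
        prob = 1.0
        for _, p in combo:
            prob *= p
        if local_score([x for x, _ in combo]) >= a:
            total += prob
    return total


if __name__ == "__main__":
    toy = {-1: 0.6, 1: 0.3, 2: 0.1}            # hypothetical score distribution
    print(local_score_pvalue(toy, 50, 6))
```

The cost of the recursion is proportional to n times a times the number of distinct scores, whereas the enumeration grows exponentially in n and serves only as a cross-check on short sequences.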
|
33
|
Abstract
Statistics on Markov chains are widely used for the study of patterns in biological sequences. Statistics on these models can be computed through several approaches. Gaussian approximations derived from the central limit theorem (CLT) are among the most popular. Unfortunately, in order to find a pattern of interest, these methods have to deal with tail distribution events, where the CLT is especially inaccurate. In this paper, we propose a new approach based on large deviations theory to assess pattern statistics. We first recall theoretical results for empirical mean (level 1) as well as empirical distribution (level 2) large deviations on Markov chains. Then, we present the applications of these results, focusing on numerical issues. LD-SPatt is the name of the GPL software implementing these algorithms. We compare this approach to several existing ones in terms of complexity and reliability and show that the large deviations are more reliable than the Gaussian approximations in absolute values as well as in terms of ranking and are at least as reliable as compound Poisson approximations. Finally, we discuss some further possible improvements and applications of this new method.
Affiliation(s)
- G Nuel
- Laboratoire Statistique et Génome, Tour Evry 2, 523 place des terrasses, 91034 Evry, France.
|
34
|
Abstract
SUMMARY S-SPatt allows the counting of pattern occurrences in text files and, assuming these texts are generated from a random Markovian source, the computation of the P-value of a given observation using a simple binomial approximation.
Affiliation(s)
- Grégory Nuel
- Laboratoire Statistique et Génome, 523 place des terrasses de l'Agora, 91000 Evry, France.
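The kind of computation summarised in this abstract (overlapping pattern counts, a Markov source, a binomial approximation of the P-value) can be sketched in Python as follows. This is an illustration under stated assumptions, not S-SPatt's actual code or interface: the order-1 model, the use of letter frequencies for the first letter, and all function names are hypothetical choices.

```python
from collections import Counter
from math import exp, lgamma, log


def count_overlapping(text, word):
    """Number of (possibly overlapping) occurrences of word in text."""
    return sum(1 for i in range(len(text) - len(word) + 1)
               if text.startswith(word, i))


def fit_markov1(text):
    """Estimate letter frequencies and order-1 transition probabilities."""
    n = len(text)
    start = {a: c / n for a, c in Counter(text).items()}
    pairs = Counter(zip(text, text[1:]))
    from_counts = Counter(text[:-1])
    trans = {(a, b): c / from_counts[a] for (a, b), c in pairs.items()}
    return start, trans


def word_probability(word, start, trans):
    """Probability that word occurs at a given position under the model."""
    p = start.get(word[0], 0.0)
    for a, b in zip(word, word[1:]):
        p *= trans.get((a, b), 0.0)
    return p


def binomial_upper_tail(n_obs, n_pos, p):
    """P(X >= n_obs) for X ~ Binomial(n_pos, p), summed in log space."""
    if n_obs <= 0:
        return 1.0
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    total = 0.0
    for k in range(n_obs, n_pos + 1):
        log_pmf = (lgamma(n_pos + 1) - lgamma(k + 1) - lgamma(n_pos - k + 1)
                   + k * log(p) + (n_pos - k) * log(1.0 - p))
        total += exp(log_pmf)
    return min(total, 1.0)


def pattern_pvalue(text, word):
    """Binomial approximation of the over-representation P-value of word."""
    start, trans = fit_markov1(text)
    n_pos = len(text) - len(word) + 1          # candidate start positions
    return binomial_upper_tail(count_overlapping(text, word), n_pos,
                               word_probability(word, start, trans))


if __name__ == "__main__":
    print(pattern_pvalue("acgtacgtacgaacgt", "acg"))
```

The binomial approximation treats the candidate positions as independent trials, which ignores the dependence created by overlapping occurrences; it is therefore a rough guide for self-overlapping words.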
|
35
|
Abstract
SUMMARY The seq++ package offers a reference set of programs and an extensible library to biologists and developers working on sequence statistics. Its generality arises from the ability to handle sequences described with any alphabet (nucleotides, amino acids, codons and others). seq++ enables sequence modelling with various types of Markov models, including variable length Markov models and the newly developed parsimonious Markov models, all of them potentially phased. Simulation modules are supplied for Monte Carlo methods. Hence, this toolbox allows the study of any biological process which can be described by a series of states taken from a finite set.
Affiliation(s)
- Vincent Miele
- UMR CNRS 8071 Statistique et Génome, 523 place des Terrasses, 91000 Evry, France.
|
36
|
Abstract
AMIGene (Annotation of MIcrobial Genes) is an application for automatically identifying the most likely coding sequences (CDSs) in a large contig or a complete bacterial genome sequence. The first step in AMIGene is dedicated to the construction of Markov models that fit the input genomic data (i.e. the gene model), followed by the combination of well-known gene-finding methods and a heuristic approach for the selection of the most likely CDSs. The web interface allows the user to select one or several gene models applied to the analysis of the input sequence by the AMIGene program and to visualize the list of predicted CDSs graphically and in a downloadable text format. The AMIGene web site is accessible at the following address: http://www.genoscope.cns.fr/agc/tools/amigene/index.html (contact: sbocs@genoscope.cns.fr).
Affiliation(s)
- Stéphanie Bocs
- Génoscope/UMR-CNRS 8030, Atelier de Génomique Comparative, 2 rue Gaston Crémieux, F-91006 Evry, France
|
37
|
Abstract
Many statistical methods and programs are available to compute the significance of a given DNA pattern in a genome sequence. In this paper, after outlining the mathematical background of this problem, we present SPA (Statistic for PAtterns), an expert system with a simple web interface designed to be applied to two of these methods (large deviation approximations and exact computations using simple recurrences). A few results are presented, leading to a comparison between the two methods and to a simple decision rule for choosing which one to use. Finally, future developments of SPA are discussed. This tool is available at the following address: http://stat.genopole.cnrs.fr/SPA/.
Affiliation(s)
- H Richard
- Laboratoire Statistique et Genome, CNRS, INRA, Genopole, Université d'Evry Val d'Essonne, 523 place des terrasses, 91000 Evry, France
|
38
|
|
39
|
|