1
Kooti G, Dabir B, Butscher C, Taherdangkoo R. A constrained machine learning surrogate model to predict the distribution of water-in-oil emulsions in electrostatic fields. Sci Rep 2024; 14:11142. [PMID: 38750144 PMCID: PMC11096166 DOI: 10.1038/s41598-024-61535-z] [Received: 12/24/2023] [Accepted: 05/07/2024] [Indexed: 05/18/2024]
Abstract
Accurately describing the evolution of the water droplet size distribution in crude oil is fundamental for evaluating the water separation efficiency of dehydration systems. The separation of an aqueous phase dispersed in a dielectric oil phase, which has a significantly lower dielectric constant than the dispersed phase, can be enhanced by increasing the water droplet size through the application of an electrostatic field in the pipeline. Mathematical models, while accurate, are computationally expensive. Herein, we introduce a constrained machine learning (ML) surrogate model developed from a population balance model. This model serves as a practical alternative that provides fast and accurate predictions. The constrained ML model uses an extreme gradient boosting (XGBoost) algorithm tuned with a genetic algorithm (GA). Its input variables are the key parameters of the electrostatic dehydration process (droplet diameter, voltage, crude oil properties, temperature, and residence time), and its output is the number of water droplets per unit volume. Furthermore, we modified the objective function of the XGBoost algorithm by incorporating two penalty terms to ensure that the model's predictions adhere to physical principles. The constrained model achieved a mean squared error of 0.005 and a coefficient of determination of 0.998 on the test set. The model was validated by comparison with experimental data and with the results of the population balance model. The analysis shows that the initial droplet diameter and the voltage have the highest influence on the model, which agrees with the behaviour observed in the real-world process.
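The abstract does not state which two penalty terms were added, so the following is only a minimal sketch of the general idea, not the authors' formulation. It assumes a squared-error loss plus a single hypothetical penalty that discourages negative droplet counts, and returns the per-sample gradient and Hessian that a gradient-boosting objective needs. The function name `penalized_squared_error`, the penalty form, and the weight `lam` are all assumptions made for illustration.

```python
def penalized_squared_error(preds, labels, lam=10.0):
    """Per-sample gradient and Hessian of
    0.5 * (p - y)**2 + 0.5 * lam * min(p, 0)**2.

    The second term is a hypothetical physics penalty: it is active only
    when a predicted droplet count p is negative, which is unphysical.
    """
    grad, hess = [], []
    for p, y in zip(preds, labels):
        g = p - y          # derivative of the squared-error term
        h = 1.0            # second derivative of the squared-error term
        if p < 0.0:        # penalty contributes only for negative predictions
            g += lam * p
            h += lam
        grad.append(g)
        hess.append(h)
    return grad, hess


# Illustrative values only: one physical and one unphysical prediction.
grad, hess = penalized_squared_error([2.0, -1.0], [1.0, 0.0], lam=10.0)
```

With XGBoost, a thin wrapper that accepts `(preds, dtrain)`, reads the labels with `dtrain.get_label()`, and returns these two arrays could be passed as the `obj` argument of `xgb.train`; that wiring is left out here to keep the sketch dependency-free.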
Affiliation(s)
- Ghazal Kooti
  - Department of Petroleum Engineering, Amirkabir University of Technology, Tehran, Iran
- Bahram Dabir
  - Department of Chemical Engineering, Amirkabir University of Technology, Tehran, Iran
- Christoph Butscher
  - Chair of Engineering Geology and Environmental Geotechnics, TU Bergakademie Freiberg, Freiberg, Germany
- Reza Taherdangkoo
  - Chair of Engineering Geology and Environmental Geotechnics, TU Bergakademie Freiberg, Freiberg, Germany
2
Nafouanti MB, Li J, Nyakilla EE, Mwakipunda GC, Mulashani A. A novel hybrid random forest linear model approach for forecasting groundwater fluoride contamination. Environmental Science and Pollution Research International 2023; 30:50661-50674. [PMID: 36800089 DOI: 10.1007/s11356-023-25886-w] [Received: 11/14/2022] [Accepted: 02/07/2023] [Indexed: 02/18/2023]
Abstract
Groundwater quality in the Datong basin is threatened by high fluoride contamination. Laboratory analysis is the standard method for estimating groundwater quality parameters, but it is expensive and time-consuming. Therefore, this paper proposes a hybrid random forest linear model (HRFLM) as a novel approach for estimating groundwater fluoride contamination. Light gradient boosting (LightGBM), random forest (RF), and extreme gradient boosting (XGBoost) were also employed for comparison with HRFLM in predicting fluoride contamination in groundwater. A total of 202 groundwater samples were collected to assess the performance of the models in forecasting groundwater fluoride contamination. Model performance was assessed using the area under the receiver operating characteristic (ROC) curve (AUC) and the confusion matrix (CM). The CM results reveal that, with nine predictor variables, the HRFLM achieved an accuracy of 95%, outperforming the XGBoost, LightGBM, and RF models, which attained 88%, 88%, and 85%, respectively. Likewise, the HRFLM achieved an AUC of 0.98, compared with 0.95, 0.90, and 0.88 for XGBoost, LightGBM, and RF, respectively. The study demonstrates that the HRFLM can be applied as an advanced approach for predicting groundwater fluoride contamination in the Datong basin and could be adopted in other areas facing a similar challenge.
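The two evaluation measures named in this abstract can be sketched in a few lines. The block below is a minimal, dependency-free illustration, not the authors' evaluation code: accuracy is computed from the four confusion-matrix counts, and AUC is computed as the fraction of positive-negative pairs in which the positive sample receives the higher score (ties count as half). The function names and the sample values are assumptions chosen for illustration.

```python
def accuracy_from_confusion(tp, fp, fn, tn):
    """Accuracy = correctly classified samples / all samples."""
    return (tp + tn) / (tp + fp + fn + tn)


def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    labels: 0/1 class labels; scores: predicted scores for class 1.
    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up example data, not taken from the study.
acc = accuracy_from_confusion(tp=50, fp=5, fn=5, tn=40)
auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

The pairwise form is quadratic in the number of samples, which is acceptable for a data set of a few hundred samples; a rank-based computation would be preferable for large data sets.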
Affiliation(s)
- Mouigni Baraka Nafouanti
  - State Key Laboratory of Biogeology and Environmental Geology, China University of Geosciences, Wuhan, 430074, China
- Junxia Li
  - State Key Laboratory of Biogeology and Environmental Geology, China University of Geosciences, Wuhan, 430074, China; China Laboratory of Basin Hydrology and Wetland Eco-restoration, China University of Geosciences, Wuhan, 430074, China
- Edwin E Nyakilla
  - Department of Petroleum Engineering, Faculty of Earth Resources, China University of Geosciences, Wuhan, 430074, China
- Grant Charles Mwakipunda
  - Department of Petroleum Engineering, Faculty of Earth Resources, China University of Geosciences, Wuhan, 430074, China
- Alvin Mulashani
  - Department of Geosciences and Mining Technology, College of Engineering and Technology, Mbeya University of Science and Technology, Box 131, Mbeya, Tanzania
3
Sun B, He H, Sun X, Li X, Wang Z. Prediction method of solubility of carbon dioxide and methane during gas invasion in deep-water drilling. Journal of Contaminant Hydrology 2022; 251:104081. [PMID: 36272377 DOI: 10.1016/j.jconhyd.2022.104081] [Received: 07/24/2022] [Revised: 09/07/2022] [Accepted: 09/19/2022] [Indexed: 06/16/2023]
Abstract
Gases that invade the wellbore during deep-water oil and gas drilling may go undetected because they dissolve, which increases well control risks. Accurate and rapid prediction of carbon dioxide and methane dissolution is therefore important for predicting and controlling wellbore pressure during gas invasion. In this study, 316 sets of carbon dioxide solubility data at 288.15 to 423.15 K and 0.1 to 100 MPa, and 266 sets of methane solubility data at 275.15 to 444.3 K and 0.1 to 68 MPa, were used to train a machine learning algorithm. The machine learning prediction method for gas solubility was built on a support vector regression machine and a particle swarm optimisation algorithm. The kernel function and penalty parameters of the support vector regression machine were optimised using the experimental dataset. The solubility of CO2 and CH4 in water was measured using a gas solubility measurement device. The experimental and model analyses showed that the solubility of CO2 varied with its phase state. At a given pressure, the solubility of CO2 was highest in the liquid state, followed by the supercritical state, and then the gaseous state. The average absolute relative deviations between the values calculated by the CO2 and CH4 solubility models and the experimental values were 2.57% and 8.20%, respectively. The machine learning method is consistent with the high-precision Duan thermodynamic model for predicting the solubility of CO2 and CH4 in water and can be used to predict gas solubility in deep-water and deep oil and gas drilling.
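The error measure quoted in this abstract, the average absolute relative deviation in percent, is straightforward to compute. The sketch below assumes the common definition, the mean of |calculated - experimental| / |experimental| multiplied by 100; the abstract itself does not spell out the formula, and the function name and sample values are illustrative assumptions.

```python
def aard_percent(experimental, calculated):
    """Average absolute relative deviation, in percent.

    Assumes every experimental value is non-zero, since each deviation
    is taken relative to the experimental value.
    """
    if not experimental or len(experimental) != len(calculated):
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(c - e) / abs(e) for e, c in zip(experimental, calculated))
    return 100.0 * total / len(experimental)


# Made-up solubility values, not data from the study.
deviation = aard_percent([1.0, 2.0], [1.1, 1.9])
```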
Affiliation(s)
- Baojiang Sun
  - School of Petroleum Engineering, China University of Petroleum (East China), Qingdao 266580, China
- Haikang He
  - School of Petroleum Engineering, China University of Petroleum (East China), Qingdao 266580, China
- Xiaohui Sun
  - School of Petroleum Engineering, China University of Petroleum (East China), Qingdao 266580, China
- Xuefeng Li
  - School of Petroleum Engineering, China University of Petroleum (East China), Qingdao 266580, China
- Zhiyuan Wang
  - School of Petroleum Engineering, China University of Petroleum (East China), Qingdao 266580, China
4
Taherdangkoo R, Yang H, Akbariforouz M, Sun Y, Liu Q, Butscher C. Gaussian process regression to determine water content of methane: Application to methane transport modeling. Journal of Contaminant Hydrology 2021; 243:103910. [PMID: 34695717 DOI: 10.1016/j.jconhyd.2021.103910] [Received: 05/24/2021] [Revised: 09/28/2021] [Accepted: 10/13/2021] [Indexed: 06/13/2023]
Abstract
The uncontrolled release of methane from natural gas wells may pose risks to shallow groundwater resources. Numerical modeling of methane migration from deep hydrocarbon formations towards shallow systems requires knowledge of the phase behavior of the water-methane system, which is usually calculated by classic thermodynamic approaches. This study presents a Gaussian process regression (GPR) model to estimate the water content of methane gas using pressure and temperature as input parameters. A Bayesian optimization algorithm was used to tune the hyper-parameters of the GPR model. The GPR predictions were evaluated against experimental data as well as four thermodynamic models. The results revealed that the GPR predictions agree well with the experimental data, with an MSE of 3.127 × 10⁻⁷ and an R² of 0.981. Furthermore, the analysis showed that the GPR model performs acceptably compared with the well-known thermodynamic models. The GPR predicts the water content of methane over wide ranges of pressure and temperature with the accuracy needed for typical subsurface engineering applications.
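The two goodness-of-fit measures reported here, MSE and R², can be computed as in the following minimal sketch. It uses the standard definitions (mean of squared residuals, and one minus the ratio of residual to total sum of squares); the function names and sample values are illustrative assumptions and the block is not taken from the study.

```python
def mse(y_true, y_pred):
    """Mean squared error between measured and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Assumes y_true is not constant, otherwise SS_tot is zero.
    """
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up values for illustration only.
measured = [1.0, 2.0, 3.0]
predicted = [1.0, 2.0, 4.0]
error = mse(measured, predicted)
fit = r_squared(measured, predicted)
```

Because MSE carries the squared units of the target, a very small value such as the one quoted above mainly reflects the small magnitude of the water content values; R² is the scale-free measure of the two.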
Affiliation(s)
- Reza Taherdangkoo
  - TU Bergakademie Freiberg, Institute of Geotechnics, Gustav-Zeuner-Str. 1, 09599 Freiberg, Germany
- Huichen Yang
  - Department of Applied Geology, Geosciences Center, University of Göttingen, Goldschmidtstr. 3, 37077 Göttingen, Germany
- Yuantian Sun
  - School of Mines, China University of Mining and Technology, 221116 Xuzhou, China
- Quan Liu
  - Department of Applied Geology, Geosciences Center, University of Göttingen, Goldschmidtstr. 3, 37077 Göttingen, Germany
- Christoph Butscher
  - TU Bergakademie Freiberg, Institute of Geotechnics, Gustav-Zeuner-Str. 1, 09599 Freiberg, Germany
5
Prediction of Refracturing Timing of Horizontal Wells in Tight Oil Reservoirs Based on an Integrated Learning Algorithm. Energies 2021. [DOI: 10.3390/en14206524] [Indexed: 11/17/2022]
Abstract
Refracturing technology can effectively improve the estimated ultimate recovery (EUR) of horizontal wells in tight reservoirs, and choosing the refracturing time is the key to ensuring that refracturing is effective. For the different types of tight oil reservoirs in the Songliao Basin, a library of 1896 sets of learning samples, with 11 geological and engineering parameters and the corresponding refracturing times as characteristic variables, was constructed by combining numerical simulation with field statistics. After comparing the performance of an artificial neural network, a support vector machine, and the XGBoost algorithm, the support vector machine and the XGBoost algorithm were chosen as the base models and fused by the stacking method of ensemble learning. A method for predicting the refracturing timing of tight oil horizontal wells was then established on the basis of this ensemble learning algorithm. For the refracturing timing of 257 groups of test data, the predictions were in good agreement with the actual values, with an R² of 0.945. The established prediction method can quickly and accurately predict the refracturing time and effectively guide refracturing practice in the tight oil test area of the Songliao Basin.
6
Comparison between Deep Learning and Tree-Based Machine Learning Approaches for Landslide Susceptibility Mapping. Water 2021. [DOI: 10.3390/w13192664] [Indexed: 01/18/2023]
Abstract
Deep learning and tree-based machine learning approaches have gained immense popularity in various fields because of their efficiency. One deep learning model, viz. a convolutional neural network (CNN), an artificial neural network (ANN), and four tree-based machine learning models, namely, alternative decision tree (ADTree), classification and regression tree (CART), functional tree (FTree), and logistic model tree (LMT), were used for landslide susceptibility mapping in the East Sikkim Himalaya region of India, and the results were compared. Landslide areas were delimited and mapped as a landslide inventory map (LIM) after gathering information from historical records and periodic field investigations. In the LIM, 91 landslides were plotted and randomly divided into training (64 landslides) and testing (27 landslides) subsets to train and validate the models. A total of 21 landslide conditioning factors (LCFs) were considered as model inputs, and the results of each model were categorised into five susceptibility classes. The receiver operating characteristic curve and 21 statistical measures were used to evaluate and prioritise the models. The CNN deep learning model achieved priority rank 1, with an area under the curve of 0.918 and 0.933 for the training and testing data, classifying 23.02% and 14.40% of the area as very highly and highly susceptible, respectively, followed by the ANN, ADTree, CART, FTree, and LMT models. This research might be useful in landslide studies, especially in locations with comparable geophysical and climatological characteristics, to aid decision making in land use planning.