Neural network-based correlation and statistical identification of data outliers in H2S-alkanolamine-H2O and CO2-alkanolamine-H2O datasets.

Authors :
Imai, Bruno
Nasir, Qazi
Maulud, Abdulhalim Shah
Nawaz, Muhammad
Nasir, Rizwan
Suleman, Humbul
Source :
Neural Computing & Applications; Feb2023, Vol. 35 Issue 4, p3395-3412, 18p
Publication Year :
2023

Abstract

Throughout the published literature for phase equilibrium data of CO2-alkanolamine-H2O and H2S-alkanolamine-H2O systems, it is common to find some discrepant data, called data outliers. The presence of these erroneous values induces inaccuracies and prediction errors in the models and simulation studies developed using such experimental datasets. Hence, it is important that the data outliers are identified and later corrected or removed before developing a model or simulation. This study proposes a modified approach to identifying data outliers present in the phase equilibrium data of CO2-alkanolamine-H2O and H2S-alkanolamine-H2O systems using an artificial neural network and data outlier identification methods. Firstly, the suggested approach correlates the experimental phase equilibrium data (2152 data points) of CO2- and H2S-loaded monoethanolamine, diethanolamine, and N-methyldiethanolamine solutions by developing an artificial neural network. Following this, the data outliers are identified by applying a modified IQR method and compared graphically with those found by the 2.5 standard deviation method. The identified data outliers can then be truncated or winsorised for developing reliable and accurate models/simulations. The modified IQR method coupled with a neural network (based on the normalised data values) can robustly identify data outliers within a large experimental dataset. The proposed approach is superior to the previous data outlier identification techniques that used the 2.5 standard deviation method, as it alleviates the need for a human decision in determining the congruence of experimental values. The results also indicate that the developed method can be reliably extended to other/larger non-linear experimental datasets having similar correlative complexity. [ABSTRACT FROM AUTHOR]
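
The two outlier rules named in the abstract can be illustrated with a minimal Python sketch. It rests on several assumptions: the abstract does not say how the authors modified the IQR method, so the code below uses the standard Tukey fences (1.5 x IQR) rather than their variant. It also assumes both rules are applied to residuals, taken here as experimental minus network-predicted normalised values. The residual values, function names, and the multiplier k are hypothetical and chosen only for illustration.

```python
import statistics


def iqr_fences(values, k=1.5):
    """Return the standard Tukey fences (Q1 - k*IQR, Q3 + k*IQR).

    Assumption: this is the textbook IQR rule, not the authors' modified one.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def iqr_flags(values, k=1.5):
    """Flag every value that lies outside the IQR fences."""
    low, high = iqr_fences(values, k)
    return [v < low or v > high for v in values]


def sd_flags(values, n_sd=2.5):
    """Flag every value more than n_sd sample standard deviations from the mean."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [abs(v - mean) > n_sd * sd for v in values]


def winsorise(values, k=1.5):
    """Clip flagged values to the nearest IQR fence instead of removing them."""
    low, high = iqr_fences(values, k)
    return [min(max(v, low), high) for v in values]


# Hypothetical residuals (experimental minus predicted, normalised units).
residuals = [-0.02, 0.01, 0.00, 0.03, -0.01, 0.02, -0.03, 0.01, 0.00, 0.45]

print("IQR flags:   ", iqr_flags(residuals))
print("2.5 SD flags:", sd_flags(residuals))
print("Winsorised:  ", winsorise(residuals))
```

Truncation would drop the flagged points, whereas winsorising keeps the dataset size unchanged by pulling each flagged point back to the nearest fence. The IQR fences depend only on quartiles, so a single extreme value moves them less than it moves a mean and standard deviation.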

Details

Language :
English
ISSN :
09410643
Volume :
35
Issue :
4
Database :
Complementary Index
Journal :
Neural Computing & Applications
Publication Type :
Academic Journal
Accession number :
161516474
Full Text :
https://doi.org/10.1007/s00521-022-07904-z