119 results on '"classification and regression tree"'
Search Results
2. Analisis Faktor-Faktor yang Menjelaskan Pengimplementasian Nilai-Nilai Utama (Corevalues) AKHLAK pada Karyawan di PT TASPEN (Persero)
- Author
-
Mohammad Zahran Pratomo, Yekti Widyaningsih, and Dian Lestari
- Subjects
Human Resources, Classification and Regression Tree, Partial Least Square (PLS)
- Abstract
The global economy is entering the era of Industry 4.0, which cannot be met with technological development alone but must also account for social dynamics. Every company and agency must create a strategy for dealing with this era, including Indonesia's state-owned enterprises (Badan Usaha Milik Negara, BUMN), which have established core values that serve as the reference for the behavior of all BUMN human resources. These core values are Trustworthy, Competent, Harmonious, Loyal, Adaptive, and Collaborative (AKHLAK). In practice, AKHLAK has not been implemented properly, even though its core values need to be implemented by all human resources in BUMN. This study examines the significant factors explaining the implementation of AKHLAK core values among PT TASPEN (Persero) employees and profiles the employees with high and low levels of implementation based on those factors. The factors used in this study are work motivation, work environment, employee welfare, socialization, employee commitment, religiosity, work stress, age, gender, education level, and years of service. The methods used are the Partial Least Square (PLS) method and the Classification and Regression Tree (CART) method. The data are primary data from 209 PT TASPEN (Persero) employees, collected using purposive sampling. The results showed that work motivation, socialization, religiosity, and education level can significantly explain the implementation of AKHLAK. Employees with a high level of AKHLAK implementation have high religiosity and high work motivation, across all categories of education level and work stress level. Employees with a low level of AKHLAK implementation have low religiosity and low work motivation.
- Published
- 2023
3. Depression Detection on Twitter Social Media Using Decision Tree
- Author
-
Marcello Rasel Hidayatullah and Warih Maharani
- Subjects
tweet, depression anxiety and stress scale 42, classification and regression tree, depression
- Abstract
Depression is a major mood illness that causes patients to experience significant symptoms that interfere with their daily activities. As technology has developed, people now frequently express themselves through social media, especially Twitter. Twitter is a social media platform that allows users to post tweets and communicate with each other. Therefore, detecting depression based on social media can help in early treatment for sufferers before further treatment. This study created a system to detect whether a person shows signs of depression based on the Depression Anxiety and Stress Scale - 42 (DASS-42) and their tweets, using the Classification and Regression Tree (CART) method with TF-IDF feature extraction. The results show that the best model achieved an accuracy of 81.25% and an F1 score of 85.71%, which are higher than the baseline results of 62.50% accuracy and an F1 score of 66.66%. In addition, we found that changing the maximum number of features in TF-IDF and the maximum depth of the tree had significant effects on model performance.
- Published
- 2022
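The CART method named in this record chooses, at each tree node, the binary split that most reduces class impurity. The following is a minimal sketch of that split search for a single numeric feature, assuming Gini impurity as the criterion (the abstract does not state which criterion the authors used). The function names `gini` and `best_split` and the sample values are hypothetical, chosen only for illustration.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best binary split.

    Candidate thresholds are midpoints between consecutive distinct
    feature values. If no split improves on the unsplit node, the
    threshold is None.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, gini(labels))
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # no threshold can separate equal values
        threshold = (lo + hi) / 2
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best


# Hypothetical feature values (not data from the study).
threshold, score = best_split([1, 2, 8, 9], [0, 0, 1, 1])
```

A full CART implementation applies this search to every feature, recurses on the two child nodes, and stops at a limit such as the maximum tree depth that the abstract reports tuning.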
4. Tree-Structured Model with Unbiased Variable Selection and Interaction Detection for Ranking Data
- Author
-
Yu-Shan Shih and Yi-Hung Kung
- Subjects
General Medicine, classification and regression tree, distance-based model, independence test, selection bias
- Abstract
In this article, we propose a tree-structured method for either complete or partial rank data that incorporates covariate information into the analysis. We use conditional independence tests based on hierarchical log-linear models for three-way contingency tables to select split variables and cut points, and apply a simple Bonferroni rule to decide whether a node is worth splitting. Through simulations, we also demonstrate that the proposed method is unbiased and effective in selecting informative split variables. Our proposed method can be applied across various fields to provide a flexible and robust framework for analyzing rank data and understanding how various factors affect individual judgments on ranking. This can help improve the quality of products or services and assist with informed decision making.
- Published
- 2023
5. Picking Winners: Identifying Features of High-Performing Special Purpose Acquisition Companies (SPACs) with Machine Learning
- Author
-
Caleb J. Williams
- Subjects
Economics and Econometrics, Accounting, Business, Management and Accounting (miscellaneous), Special Purpose Acquisition Company (SPAC), private equity, venture capital, Initial Public Offering (IPO), feature engineering, machine learning, artificial intelligence, classification and regression tree, logistic regression, LASSO regression, out-of-sample performance, econometrics, predictive analytics, Finance
- Abstract
Special Purpose Acquisition Companies (SPACs) are publicly listed “blank check” firms with a sole purpose: to merge with a private company and take it public. Selecting a target to take public via SPACs is a complex affair led by SPAC sponsors who seek to deliver investor value by effectively “picking winners” from the private sector. A key question for all sponsors is what they should be searching for. This paper aims to identify the characteristics of SPACs and their target companies that are relevant to market performance at sponsor lock-up windows. To achieve this goal, the study breaks market performance into a binary classification problem and uses a machine learning approach comprised of decision trees, logistic regression, and LASSO regression to identify features that exhibit a distinct relationship with market performance. The obtained results demonstrate that corporate or private equity backing in target firms greatly improves the odds of market outperformance one-year post-merger. This finding is novel in indicating that characteristics of target firms may also be deterministic of SPAC performance, in addition to SPACs, transaction, and the market features identified in the prior literature. It further suggests that a viable sponsor strategy could be constructed for generating outsized market returns at share lock-up windows by simply “following the money” and choosing target firms with prior involvement from corporate or private equity investors.
- Published
- 2023
6. Predictive Model for Adverse Events and Immune Response Based on the Production of Antibodies After the Second-Dose of the BNT162b2 mRNA Vaccine
- Author
-
Shinichi Okada, Katsuyuki Tomita, Genki Inui, Tomoyuki Ikeuchi, Hirokazu Touge, Junichi Hasegawa, and Akira Yamasaki
- Subjects
BNT162b2 vaccine, classification and regression tree, adverse effect, antibody, Original Article, General Medicine
- Abstract
BACKGROUND: The BNT162b2 mRNA vaccine for coronavirus disease 2019, which is caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), mimics the immune response to natural infection. Few studies have predicted the adverse effects (AEs) after the second-dose vaccination. We present a predictive model for AEs and immune response after the second dose of the BNT162b2 mRNA vaccine. METHODS: To predict AEs, 282 healthcare workers (HCWs) were enrolled in this prospective observational study. The classification and regression tree (CART) model was established, and its predictive efficacy was assessed. To predict immune response, 282 HCWs were included in the analysis. Moreover, the factors affected by anti-SARS-CoV-2 spike protein RBD antibody (s-IgG) were evaluated using serum samples collected 2 months after the second-dose vaccination. The s-IgG level was assessed using Lumipulse G1200. Multiple regression analyses were conducted to evaluate variables associated with anti-s-IgG titer levels. RESULTS: The most common AEs after the second-dose vaccination were pain (87.6%), redness (17.0%) at the injection site, fatigue (68.8%), headache (53.5%), and fever (37.5%). Based on the CART model, headache after the first-dose vaccination and age < 30 years were identified as the first and second discriminators, respectively, for predicting headache after the second-dose vaccination. In the multiple linear regression model, anti-s-IgG titer levels were associated with age, female sex, and AEs including headache and induration at the injection site after the second-dose vaccination. CONCLUSION: Headache after the first-dose vaccination can be a predictor of headache after the second-dose vaccination, and AEs are indicators of immune response.
- Published
- 2022
7. Detection of Water Hyacinth (Eichhornia crassipes) in Lake Tana, Ethiopia, Using Machine Learning Algorithms
- Author
-
Getachew Bayable, Ji Cai, Mulatie Mekonnen, Solomon Addisu Legesse, Kanako Ishikawa, Hiroki Imamura, and Victor S. Kuwahara
- Subjects
classification and regression tree, water hyacinth detection, Geography, Planning and Development, support vector machine, Aquatic Science, Google Earth Engine, Biochemistry, random forest, Water Science and Technology
- Abstract
Lake Tana is Ethiopia’s largest lake and is infested with invasive water hyacinth (E. crassipes), which endangers the lake’s biodiversity and habitat. Using appropriate remote sensing detection methods and determining the seasonal distribution of the weed is important for decision-making, water resource management, and environmental protection. As the demand for reliable E. crassipes mapping from satellite data grows, comparing the performance of different machine learning algorithms could help in identifying the most effective method for E. crassipes detection in the lake. Therefore, this study aimed to examine the ability of random forest (RF), support vector machine (SVM), and classification and regression tree (CART) machine learning algorithms to detect E. crassipes and estimate the seasonal spatial coverage of the weed on the Google Earth Engine (GEE) platform using Landsat 8 and Sentinel 2 images. Cloud-masked monthly median composite Landsat 8 and Sentinel 2 data from October 2021 and 2022, January 2022 and 2023, March 2022, and June 2022 were used to represent autumn, winter, spring, and summer, respectively. Four spectral indices were derived and used in combination with spectral bands to improve the E. crassipes detection accuracy. All methods achieved greater than 95% and 90% overall accuracy when using Sentinel 2 and Landsat 8 images, respectively. Using both data sets, all methods achieved a greater than 93% F1 score for E. crassipes detection. Though the difference in performance between the methods was small, the RF was the most accurate, while the SVM and CART methods had the same accuracy. The maximum E. crassipes coverage area was observed in autumn (22.4 km²), while the minimum (2.2 km²) was observed in summer. Based on Sentinel 2 data, the E. crassipes area coverage decreased significantly by 62.5% from winter to spring and increased significantly by 81.7% from summer to autumn. 
The findings suggested that the RF classifier was the most accurate E. crassipes detection algorithm, and autumn was an appropriate season for E. crassipes detection in Lake Tana.
- Published
- 2023
8. Body weight prediction using different data mining algorithms in Thalli sheep: A comparative study
- Author
-
Ansar Abbas, Abdul Waheed, and Muhammad Aman Ullah
- Subjects
CART, Chi-square automatic interaction detector, Coefficient of determination, General Veterinary, Rump, classification and regression tree, Veterinary medicine, Thalli sheep, Biology, Body weight, CHAID, Standard deviation, Breed, Animal culture, exhaustive Chi-square automatic interaction detector, Animal science, artificial neural network, Research Article
- Abstract
Background and Aim: The Thalli sheep are the main breed of sheep in Pakistan, and an effective method to predict their body weight (BW) using linear body measurements has not yet been determined. Therefore, this study aims to identify the algorithm with the best predictive capability, among the Chi-square automatic interaction detector (CHAID), exhaustive CHAID, artificial neural network, and classification and regression tree (CART) algorithms, for live BW prediction using selected body measurements in female Pakistani Thalli sheep. Materials and Methods: A total of 152 BW records, including nine continuous predictors (wither height, body length [BL], head length, rump length, tail length, head width, rump width, heart girth [HG], and barrel depth), were utilized. The coefficient of determination (R²), standard deviation ratio, root-mean-square error (RMSE), etc., were calculated for each algorithm. Results: The R² (%) values ranged from 49.28 (CART) to 64.48 (CHAID). The lowest RMSE was found for CHAID (2.61), and the highest for CART (3.12). HG was the most significant predictor of live BW for all algorithms. The heaviest average BW (41.12 kg) was observed in the subgroup having a BL of >73.91 cm (Adjusted p=0.045). Conclusion: Among the algorithms, CHAID provided the most appropriate predictive capability in the prediction of live BW for female Thalli sheep. In general, the applied algorithms accurately predicted the BW of Thalli sheep, which can be very helpful in deciding on the standards, available drug doses, and required feed amount for animals.
- Published
- 2021
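The two fit statistics this record uses to compare algorithms, R² and RMSE, can be computed as in the sketch below. It assumes the standard definitions (R² = 1 − SS_res/SS_tot, RMSE = square root of the mean squared error); the abstract does not say whether the study used an adjusted variant. The weights in the example are hypothetical and are not data from the study.

```python
import math


def rmse(actual, predicted):
    """Root-mean-square error between observed and predicted values."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical body weights in kg (observed vs. model output).
observed = [30, 35, 40, 45]
modelled = [32, 34, 41, 43]
fit_r2 = r_squared(observed, modelled)   # multiply by 100 for the % form
fit_rmse = rmse(observed, modelled)
```

A lower RMSE and a higher R² both indicate a closer fit, which is the sense in which the abstract ranks CHAID above CART.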
9. Reconstruction of Historical Land Use and Urban Flood Simulation in Xi’an, Shannxi, China
- Author
-
Shuangtao Wang, Pingping Luo, Chengyi Xu, Wei Zhu, Zhe Cao, and Steven Ly
- Subjects
General Earth and Planetary Sciences, historical land use, reconstruction, remote sensing, classification and regression tree
- Abstract
Reconstruction of historical land uses helps to understand patterns, drivers, and impacts of land-use change, and is essential for finding solutions to land-use sustainability. In order to analyze the relationship between land-use change and urban flooding, this study used the Classification and Regression Tree (CART) method to extract modern (2017) land-use data based on remote sensing images. Then, the Paleo-Land-Use Reconstruction (PLUR) program was used to reconstruct the land-use maps of Xi’an during the Ming (1582) and Qing (1766) dynasties by consulting and collecting records of land-use change in historical documents. Finally, the Flo-2D model was used to simulate urban flooding under different land-use scenarios. Over the past 435 years (1582–2017), the urban construction land area showed a trend of increasing, while the unused land area and water bodies were continuously decreasing. The increase in urban green space and buildings was 20.49% and 19.85% respectively, and the unused land area changed from 0.32 km² to 0. Urban flooding in the modern land-use scenario is the most serious. In addition to the increase in impervious areas, the increase in building density and the decrease in water areas are also important factors that aggravate urban flooding. This study can provide a reference for future land-use planning and urban flooding control policy formulation and revision in the study area.
- Published
- 2022
10. Determination of Glaucoma Disease with Gray Level Co-occurrence Matrix Features
- Author
-
Evin Şahin Sadık
- Subjects
Artificial Neural Network, Fundus, Classification and Regression Tree, Glaucoma, Gray level Co-occurrence matrix, K-nearest neighbor, Engineering
- Abstract
Glaucoma is a disease that causes an abnormal increase in intraocular pressure and therefore causes permanent damage to the optic nerves. Early and accurate diagnosis of the disease, known as the most "insidious" of the eye diseases, is important. In this study, glaucoma prediction was performed on high-resolution fundus photographs taken from an open-source database. Correlation, energy, homogeneity, contrast and entropy features were extracted from the segmented photographs using the gray-level co-occurrence matrix. After their average values were taken, the extracted features were divided into 66% training and 33% test data. A 3-fold cross-validation was applied to the data, and a feedback artificial neural network, a classification and regression trees algorithm and a k nearest neighbor algorithm were trained using 66% of the data. Classification success was then tested with the 33% test data. As a result, glaucoma and healthy individuals were classified with an average of 86.7% accuracy with the k nearest neighbor algorithm, an average of 87.8% accuracy with the decision trees, and an average of 96.7% accuracy with the artificial neural network algorithm. The results show that glaucoma can be detected with high accuracy using gray-level co-occurrence matrix features.
- Published
- 2022
11. Study on Quantitative Expression of Cycling Workload
- Author
-
Shangwen Qu, Ronghua Wang, Jiangbi Hu, and Li Yang
- Subjects
Fluid Flow and Transfer Processes, Process Chemistry and Technology, General Engineering, General Materials Science, Instrumentation, Computer Science Applications, bikeway, heart rate variability, classification and regression tree, bikeway characteristics, human factor
- Abstract
Improper design of the geometric elements and facilities of bikeway systems could endanger cyclists’ safety and comfort, resulting in an increased risk of bicycle accidents; such accidents sometimes have severe consequences, namely casualties. The method of expression for cyclists’ safety and comfort and the question of how the correlation of these factors with bikeway characteristics—such as the design of geometry and facilities—can be quantitatively described are the key problems facing a reduction in accident risk. Cycling workload can be employed to assess cyclists’ safety and comfort. However, there has been little quantitative expression research on this topic, with no clear definition of cycling workload. The quantitative expression of cycling workload is important for developing guidance for the safe design and operational management of bikeways; this is necessary for controlling conditions that might induce overworking and discomfort among users. In this paper, the concept of cycling workload is clearly defined based on cyclists’ comfort and safety formation mechanisms. Through a literature review and a comparative analysis, it is inferred that heart rate variability (HRV) can be used as a quantitative measure and the low-frequency–high-frequency ratio (LF/HF) can be used as a physiological signal to quantify cycling workload. A subjective scale was found to effectively express cyclists’ feelings of safety and comfort, with the performance assessed according to a human factor engineering research paradigm that classified cycling status into three qualitative levels—comfortable; a little stressful; and stressful. In order to form various cycling workload states and to obtain the relationship between LF/HF data and various bikeway characteristics, we designed a field cycling experiment. 
This was conducted by 24 participants who wore a physiological measuring apparatus under three different bikeway characteristic scenario types including variations in cycling width, direction, and bikeway edges at four cycling speeds in the 10–25 km/h range. Statistical analysis was used to address the collected LF/HF values and the subjective scale results, and a quantitative model for assessing cycling workload was established. By adopting a classification and regression tree (CART) algorithm as a data-mining method, the classification threshold values (ΔHRV) of three cycling workload levels were obtained: 19 indicated a level between comfortable and a little stressful; and 79 indicated a level between a little stressful and stressful.
- Published
- 2022
12. Decision Tree-Based Foot Orthosis Prescription for Patients with Pes Planus
- Author
-
Ji-Yong Jung, Chang-Min Yang, and Jung-Ja Kim
- Subjects
Prescriptions, pes planus, foot orthosis, decision tree, classification and regression tree, machine learning, Health, Toxicology and Mutagenesis, Decision Trees, Public Health, Environmental and Occupational Health, Foot Orthoses, Humans, Flatfoot, Biomechanical Phenomena
- Abstract
Pes planus, one of the most common foot deformities, includes the loss of the medial arch, misalignment of the rearfoot, and abduction of the forefoot, which negatively affects posture and gait. Foot orthosis, which is effective in normalizing the arch and providing stability during walking, is prescribed for the purpose of treatment and correction. Currently, machine learning technology for classifying and diagnosing foot types is being developed, but it has not yet been applied to the prescription of foot orthosis for the treatment and management of pes planus. Thus, the aim of this study is to propose a model that can prescribe a customized foot orthosis to patients with pes planus by learning from and analyzing various clinical data based on a decision tree algorithm called classification and regression tree (CART). A total of 8 parameters were selected based on the feature importance, and 15 rules for the prescription of foot orthosis were generated. The proposed model based on the CART algorithm achieved an accuracy of 80.16%. This result suggests that the CART model developed in this study can provide adequate help to clinicians in prescribing foot orthosis easily and accurately for patients with pes planus. In the future, we plan to acquire more clinical data and develop a model that can prescribe more accurate and stable foot orthosis using various machine learning technologies.
- Published
- 2022
13. Application of intelligent algorithms in Down syndrome screening during second trimester pregnancy
- Author
-
Hongguo Zhang, Ruizhi Liu, Si-Da Dai, Yuting Jiang, Xiaonan Hu, and Ling Li
- Subjects
Down syndrome, Down syndrome screening, Support vector machine, Classification and regression tree, Obstetrics, Prenatal screening, General Medicine, Intelligent algorithms, Retrospective Study, Second trimester pregnancy, Algorithms, Risk cutoff value
- Abstract
BACKGROUND Down syndrome (DS) is one of the most common chromosomal aneuploidy diseases. Prenatal screening and diagnostic tests can aid the early diagnosis, appropriate management of these fetuses, and give parents an informed choice about whether or not to terminate a pregnancy. In recent years, investigations have been conducted to achieve a high detection rate (DR) and reduce the false positive rate (FPR). Hospitals have accumulated large numbers of screened cases. However, artificial intelligence methods are rarely used in the risk assessment of prenatal screening for DS. AIM To use a support vector machine algorithm, classification and regression tree algorithm, and AdaBoost algorithm in machine learning for modeling and analysis of prenatal DS screening. METHODS The dataset was from the Center for Prenatal Diagnosis at the First Hospital of Jilin University. We designed and developed intelligent algorithms based on the synthetic minority over-sampling technique (SMOTE)-Tomek and adaptive synthetic sampling over-sampling techniques to preprocess the dataset of prenatal screening information. The machine learning model was then established. Finally, the feasibility of artificial intelligence algorithms in DS screening evaluation is discussed. RESULTS The database contained 31 DS diagnosed cases, accounting for 0.03% of all patients. The dataset showed a large difference between the numbers of DS affected and non-affected cases. A combination of over-sampling and under-sampling techniques can greatly increase the performance of the algorithm at processing non-balanced datasets. As the number of iterations increases, the combination of the classification and regression tree algorithm and the SMOTE-Tomek over-sampling technique can obtain a high DR while keeping the FPR to a minimum. CONCLUSION The support vector machine algorithm and the classification and regression tree algorithm achieved good results on the DS screening dataset. 
When the T21 risk cutoff value was set to 270, machine learning methods had a higher DR and a lower FPR than statistical methods.
- Published
- 2021
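The detection rate (DR) and false positive rate (FPR) that this record uses to judge screening performance follow from confusion-matrix counts. The sketch below assumes the usual definitions (DR = TP / (TP + FN), i.e. sensitivity; FPR = FP / (FP + TN)). The function names and the counts are hypothetical and are not figures from the study.

```python
def detection_rate(true_pos, false_neg):
    """Share of affected cases that the screen flags (sensitivity)."""
    return true_pos / (true_pos + false_neg)


def false_positive_rate(false_pos, true_neg):
    """Share of unaffected cases that the screen flags in error."""
    return false_pos / (false_pos + true_neg)


# Hypothetical screening outcome: 10 affected and 1000 unaffected cases.
dr = detection_rate(true_pos=9, false_neg=1)
fpr = false_positive_rate(false_pos=50, true_neg=950)
```

With classes as imbalanced as the abstract describes, overall accuracy says little, which is why the record reports DR and FPR separately.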
14. The Influence of Data Density and Integration on Forest Canopy Cover Mapping Using Sentinel-1 and Sentinel-2 Time Series in Mediterranean Oak Forests
- Author
-
Vahid Nasiri, Seyed Mohammad Moein Sadeghi, Fardin Moradi, Samaneh Afshari, Azade Deljouei, Verena C. Griess, Carmen Maftei, and Stelian Alexandru Borz
- Subjects
classification and regression tree, Geography, Planning and Development, Sentinel time series, forest canopy cover, Google Earth Engine, machine learning, random forest, support vector machine, Quercus brantii, Iran, Earth and Planetary Sciences (miscellaneous), Computers in Earth Sciences
- Abstract
Forest canopy cover (FCC) is one of the most important forest inventory parameters and plays a critical role in evaluating forest functions. This study examines the potential of integrating Sentinel-1 (S-1) and Sentinel-2 (S-2) data to map FCC in the heterogeneous Mediterranean oak forests of western Iran at different data densities (one-year datasets vs. three-year datasets). This study used very high-resolution satellite images from Google Earth, gridded points, and field inventory plots to generate a reference dataset. Based on it, four FCC classes were defined, namely non-forest, sparse forest (FCC = 1–30%), medium-density forest (FCC = 31–60%), and dense forest (FCC > 60%). Three machine learning (ML) models, Random Forest (RF), Support Vector Machine (SVM), and Classification and Regression Tree (CART), were used in the Google Earth Engine and their classification performance was compared. Results showed that the SVM produced the highest accuracy on FCC mapping. The three-year time series increased the ability of all ML models to classify FCC classes, in particular the sparse forest class, which was not distinguished well by the one-year dataset. Class-level accuracy assessment results showed a remarkable increase in F-1 scores for sparse forest classification by integrating S-1 and S-2 (increases of 10.4% and 18.2% for the CART and SVM models, respectively). In conclusion, the synergetic use of S-1 and S-2 spectral temporal metrics improved the classification accuracy compared to that obtained using only S-2. The study relied on open data and freely available tools and can be integrated into national monitoring systems of FCC in Mediterranean oak forests of Iran and neighboring countries with similar forest attributes. ISPRS International Journal of Geo-Information, 11 (S 8), ISSN:2220-9964
- Published
- 2022
15. Novel Prognostic Models for Predicting the 180-day Outcome for Patients with Hepatitis-B Virus-related Acute-on-chronic Liver Failure
- Author
-
Jun Yang, Zhong-Ying Wang, Jing Wu, Ran Xue, and Qinghua Meng
- Subjects
Hepatitis B virus, Logistic regression model, Hepatology, Classification and regression tree, Liver failure, Hepatitis B, Logistic regression, Acute-on-chronic hepatitis B liver failure, Outcome (game theory), Gastroenterology, MELD scores, Internal medicine, Acute on chronic liver failure, Original Article, Prognostic models
- Abstract
Background and Aims: It remains difficult to forecast the 180-day prognosis of patients with hepatitis B virus-related acute-on-chronic liver failure (HBV-ACLF) using existing prognostic models. The present study aimed to derive novel models to improve the prediction of 180-day mortality in HBV-ACLF. Methods: The present cohort study examined 171 HBV-ACLF patients (non-survivors, n=62; survivors, n=109). The 27 retrospectively collected parameters included the basic demographic characteristics, clinical comorbidities, and laboratory values. Backward stepwise logistic regression (LR) and classification and regression tree (CART) analysis were used to derive two predictive models. Meanwhile, a nomogram was created based on the LR analysis. The accuracy of the LR and CART models was assessed through the area under the receiver operating characteristic curve (AUROC) and compared with that of Model for End-Stage Liver Disease (MELD) scores. Results: Among 171 HBV-ACLF patients, the mean age was 45.17 years, and 11.7% of the patients were female. The LR model was constructed with six independent factors: age, total bilirubin, prothrombin activity, lymphocytes, monocytes and hepatic encephalopathy. The following seven variables were the prognostic factors for HBV-ACLF in the CART model: age, total bilirubin, prothrombin time, lymphocytes, neutrophils, monocytes, and blood urea nitrogen. The AUROC for the CART model (0.878) was similar to that for the LR model (0.878, p=0.898), and this exceeded that for the MELD scores (0.728, p
- Published
- 2021
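The abstract above compares models by AUROC. As a minimal sketch of what that number measures, the function below computes AUROC as the fraction of (event, non-event) pairs in which the event case received the higher predicted risk, counting ties as half. The function name `auroc` and the toy data are illustrative assumptions and are not taken from the study.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise ranking (Mann-Whitney U).

    labels: iterable of 0/1, where 1 marks the event (e.g. death within 180 days)
    scores: predicted risk for each case; higher means more likely to be an event
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    # Count event/non-event pairs ranked correctly; a tie counts as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up toy data for illustration only (hypothetical, not from the cohort).
toy_labels = [1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.6, 0.4, 0.2, 0.3, 0.6]
print(auroc(toy_labels, toy_scores))
```

An AUROC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the reported values of 0.878 and 0.728 should be read.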
16. Dynamic Bayesian Network-Based Escape Probability Estimation for Coach Fire Accidents
- Author
-
Rong Huang, Xuan Zhao, Chenyu Zhou, and Qiang Yu
- Subjects
classification and regression tree ,Computer science ,Reliability (computer networking) ,escape probability estimation ,lcsh:TA1001-1280 ,Decision tree ,Analytic hierarchy process ,coach fire escape safety ,Ocean Engineering ,Vehicle fire ,dynamic bayesian network ,Aisle ,Moment (mathematics) ,escape behavior experiment ,dynamic Bayesian network ,lcsh:Transportation engineering ,Engineering (miscellaneous) ,Rotation (mathematics) ,Dynamic Bayesian network ,Simulation ,Civil and Structural Engineering - Abstract
Research on coach emergency escape is an effective way to reduce casualties in serious vehicle fire accidents. A novel experimental method employing a wireless transducer was implemented, and head rotation speed, rotation moment and rotation duration were collected as the input variables for the classification and regression tree (CART) model. The classification result from this model showed that exit searching efficiency evolved over time. After dropping the three least important factors identified by the Analytic Hierarchy Process (AHP), the final Dynamic Bayesian Network (DBN) was built with the temporal part taken from the CART output and the time-independent part taken from the vehicle characteristics. Simulation showed that the most efficient exit searching period is the middle escape stage, which is 10 seconds after the emergency signal is triggered, and that the escape probability clearly increases with efficient exit searching. Furthermore, receiving emergency escape training improves the escape probability by more than 10%. Among the failure modes compared, the emergency hammer layout and door reliability have a more significant influence on the escape probability than aisle condition. Based on the simulation results, the escape probability drops below 0.55 if the emergency hammers, door, and aisle are all in a failure state.
- Published
- 2021
17. Utilizing Different Machine Learning Techniques to Examine Speeding Violations
- Author
-
Ahmad H. Alomari, Bara’ W. Al-Mistarehi, Tasneem K. Alnaasan, and Motasem S. Obeidat
- Subjects
Fluid Flow and Transfer Processes ,Process Chemistry and Technology ,General Engineering ,speeding violations ,machine learning ,Classification and Regression Tree ,Random Forest ,Multi-Layer Perceptron ,General Materials Science ,Instrumentation ,Computer Science Applications - Abstract
This study investigated the potential impacts on speeding violations in the United States, including the top ten states in terms of crashes: California, Florida, Georgia, Illinois, Michigan, North Carolina, Ohio, Pennsylvania, Tennessee, and Texas. Several variables connected to the driver, surroundings, vehicle, road, and weather were investigated. Three different machine learning algorithms—Random Forest (RF), Classification and Regression Tree (CART), and Multi-Layer Perceptron (MLP)—were applied to predict speeding violations. Accuracy, F-measure, Kappa statistic, Root Mean Squared Error (RMSE), Area Under Curve (AUC), and Receiver Operating Characteristic (ROC) were used to evaluate the algorithms’ performance. Findings showed that age, accident year, road alignment, weather, accident time, and speed limits are the most significant variables. The algorithms used showed excellent ability in analyzing and predicting speeding violations. The RF was the best method for analyzing and predicting speeding violations. Understanding how these factors affect speeding violations helps decision-makers devise ways to cut down on these violations and make the roads safer.
- Published
- 2023
18. An Electricity Consumption Disaggregation Method for HVAC Terminal Units in Sub-Metered Buildings Based on CART Algorithm
- Author
-
Xinyu Yang, Ying Ji, Jiefan Gu, and Menghan Niu
- Subjects
Architecture ,electricity sub-metering ,HVAC terminal units ,hourly energy use ,classification and regression tree ,Building and Construction ,Civil and Structural Engineering - Abstract
Obtaining reliable and detailed energy consumption information about building service (BS) systems is an essential prerequisite for identifying energy-saving potential and improving the energy efficiency of a building. Therefore, in recent years, energy sub-metering systems have been widely implemented in public buildings in China. Most electrical systems and equipment can be metered directly. However, in actual sub-metering systems, the terminal units of heating, ventilation and air conditioning (HVAC) systems, such as fan coils and air handling units, are often mixed with the lighting-plug circuit. This mismatch between theoretical sub-metering systems and actual electricity supply circuits poses many challenges for BS system management and control optimization. This study proposed an indirect method, based on the CART algorithm, to disaggregate the energy consumption of HVAC terminal units from mixed sub-metering data. The method was demonstrated in two buildings in Shanghai. The case study results show that the weighted mean absolute percentage errors (WMAPE) are within 5% and 15% during working hours in the cooling and heating seasons, respectively.
- Published
- 2023
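The abstract above reports accuracy as WMAPE but does not define it. A minimal sketch follows, assuming the commonly used definition (sum of absolute errors divided by the sum of absolute actual values); the paper may weight differently. The function name `wmape` and the hourly values are hypothetical.

```python
def wmape(actual, predicted):
    """Weighted mean absolute percentage error, returned as a fraction.

    Assumed definition: sum(|actual - predicted|) / sum(|actual|).
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    total = sum(abs(a) for a in actual)
    if total == 0:
        raise ValueError("WMAPE is undefined when all actual values are zero")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / total


# Hypothetical hourly HVAC terminal energy use (kWh), for illustration only.
metered = [10.0, 20.0, 30.0, 40.0]
estimated = [12.0, 18.0, 33.0, 37.0]
print(f"WMAPE = {wmape(metered, estimated):.1%}")
```

Because the errors are weighted by the actual load, hours with near-zero consumption do not inflate the score the way they would with a plain MAPE.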
19. Classification of Malaria Complication Using CART (Classification and Regression Tree) and Naïve Bayes
- Author
-
Rachmadania Irmanita, Sri Suryani Prasetiyowati, and Yuliant Sibaroni
- Subjects
Cart ,Pediatrics ,medicine.medical_specialty ,biology ,classification and regression tree ,lcsh:T58.5-58.64 ,lcsh:Information technology ,Anopheles ,malaria ,Disease ,naive bayes ,Hypoglycemia ,biology.organism_classification ,medicine.disease ,lcsh:TA168 ,Naive Bayes classifier ,Malaria, Classification and Regression Tree, Naïve Bayes ,lcsh:Systems engineering ,parasitic diseases ,medicine ,Precision and recall ,Complication ,Malaria - Abstract
Malaria is a disease caused by the Plasmodium parasite and transmitted by female Anopheles mosquitoes. It can become dangerous if medical treatment is delayed. Delayed treatment results from misdiagnosis and a lack of medical staff, especially in rural areas, and can lead to severe malaria with complications. This study builds a prediction system that classifies severe malaria using the Classification and Regression Tree (CART) method and estimates the probability of malaria complications using the Naïve Bayes method. The first step classifies whether patients with symptoms have severe malaria, based on the model that has been built. In the next step, if a patient is classified as having severe malaria, the data are used to predict the probability of a complication. There are 8 possible complication outcomes, which are convulsion, hypoglycemia, hyperpyrexia, and the combinations of these four. The first step is evaluated using F-score, precision and recall, while the second step is evaluated using accuracy. The highest F-score, precision and recall are 0.551, 0.471 and 0.717, respectively. The highest accuracy, 81.2%, was obtained when predicting the complication hypoglycemia.
- Published
- 2021
20. Optimal Bayesian design for model discrimination via classification
- Author
-
Markus Hainy [0000-0003-4834-0250], David J. Price [0000-0003-0076-3123], Olivier Restif [0000-0001-9158-853X], and Christopher Drovandi [0000-0001-9222-8763]
- Subjects
Statistics and Probability ,FOS: Computer and information sciences ,Random Forest ,62K99, 62P10, 62P12 ,Statistics - Computation ,Article ,Theoretical Computer Science ,Statistics::Computation ,Methodology (stat.ME) ,ComputingMethodologies_PATTERNRECOGNITION ,Computational Theory and Mathematics ,Approximate Bayesian Computation ,Bayesian Model Selection ,Simulation-Based Bayesian Experimental Design ,Statistics, Probability and Uncertainty ,Continuous-time Markov Process ,Classification And Regression Tree ,Statistics - Methodology ,Computation (stat.CO) - Abstract
Performing optimal Bayesian design for discriminating between competing models is computationally intensive as it involves estimating posterior model probabilities for thousands of simulated data sets. This issue is compounded further when the likelihood functions for the rival models are computationally expensive. A new approach using supervised classification methods is developed to perform Bayesian optimal model discrimination design. This approach requires considerably fewer simulations from the candidate models than previous approaches using approximate Bayesian computation. Further, it is easy to assess the performance of the optimal design through the misclassification error rate. The approach is particularly useful in the presence of models with intractable likelihoods but can also provide computational advantages when the likelihoods are manageable. Funding: Biotechnology and Biological Sciences Research Council grant BB/M020193/1.
- Published
- 2022
21. Classification of Driver Injury Severity for Accidents Involving Heavy Vehicles with Decision Tree and Random Forest
- Author
-
Aziemah Azhar, Noratiqah Mohd Ariff, Mohd Aftar Abu Bakar, and Azzuhana Roslan
- Subjects
Renewable Energy, Sustainability and the Environment ,Geography, Planning and Development ,classification and regression tree ,driver injury severity ,heavy vehicles accident ,machine learning ,random forest ,Management, Monitoring, Policy and Law ,human activities - Abstract
Accidents involving heavy vehicles are of significant concern as they pose a higher risk of fatality to both heavy vehicle drivers and other road users. This study is based on heavy vehicle crash data from 2014, extracted from the MIROS Road Accident and Analysis and Database System (M-ROADS). The main objective of this study is to identify significant variables associated with categories of injury severity and to classify and predict heavy vehicle drivers’ injury severity in Malaysia using the classification and regression tree (CART) and random forest (RF) methods. Both CART and RF found that type of collision, driver errors, number of vehicles involved, driver’s age, lighting condition and type of heavy vehicle are significant factors in predicting the severity of heavy vehicle drivers’ injuries. The two models are comparable, but the RF classifier achieved slightly better accuracy. This study implies that road safety practitioners can refer to the variables associated with categories of injury severity when planning the measures needed to reduce road fatalities, especially among heavy vehicle drivers.
- Published
- 2022
22. Geographical Disparity and Associated Factors of COPD Prevalence in China: A Spatial Analysis of National Cross-Sectional Study
- Author
-
Wang N, Cong S, Fan J, Bao H, Wang B, Yang T, Feng Y, Liu Y, Wang L, Wang C, Hu W, and Fang L
- Subjects
lcsh:RC705-779 ,classification and regression tree ,spatial clusters ,copd ,kriging ,lcsh:Diseases of the respiratory system - Abstract
Ning Wang,1,2 Shu Cong,1 Jing Fan,1 Heling Bao,1 Baohua Wang,1 Ting Yang,3 Yajing Feng,1 Yang Liu,4 Linhong Wang,1 Chen Wang,3,5 Wenbiao Hu,2 Liwen Fang1. 1National Center for Chronic Non-Communicable Disease Control and Prevention, Chinese Center for Disease Control and Prevention, Beijing 100050, People’s Republic of China; 2School of Public Health and Social Work, Institute of Health and Biomedical Innovation, Queensland University of Technology, Brisbane, QLD 4059, Australia; 3Center of Respiratory Medicine, China–Japan Friendship Hospital, Beijing, People’s Republic of China; 4Department of Environmental Health, Rollins School of Public Health, Emory University, Atlanta, GA, USA; 5Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, People’s Republic of China.
Correspondence: Liwen Fang, National Center for Chronic Non-Communicable Disease Control and Prevention, Chinese Center for Disease Control and Prevention, 27 Nanwei Road, Xicheng District, Beijing 100050, People’s Republic of China; Tel +86 135 5239 3376; Fax +86 010 6304 2350; Email fangliwen@ncncd.chinacdc.cn. Wenbiao Hu, School of Public Health and Social Work, Institute of Health and Biomedical Innovation, Queensland University of Technology, Brisbane, Queensland 4059, Australia; Tel/Fax +61 7 3138 5724; Email w2.hu@qut.edu.au.
Purpose: COPD prevalence has rapidly increased in China, but the geographical disparities in COPD prevalence remain largely unknown. This study aimed to assess city-level disparities in COPD prevalence and identify the relative importance of COPD-related risk factors in mainland China. Patients and Methods: A nationwide cross-sectional study of COPD recruited 66,752 adults across mainland China between 2014 and 2015. Patients with COPD were ascertained by a post-bronchodilator pulmonary function test. We estimated the city-specific prevalence of COPD by the spatial kriging interpolation method. We detected spatial clusters with a significantly higher prevalence of COPD by spatial scan statistics. We determined the relative importance of COPD-associated risk factors by a nonparametric and nonlinear classification and regression tree (CART) model. Results: The three spatial clusters with the highest prevalence of COPD were located in parts of Sichuan, Gansu, and Shaanxi, etc. (relative risks (RRs) ranging from 1.55 (95% CI 1.55–1.56) to 1.33 (95% CI 1.33–1.33)). CART showed that advanced age (≥ 60 years) was the most important factor associated with COPD in the overall population, followed by smoking. We estimated that there were about 28.5 million potentially avoidable cases of COPD among people aged 40 or older if they never smoked. PM2.5 was an important associated risk factor for COPD in the north, northeast, and southwest of China. After adjusting for age and smoking, the spatial cluster with the highest prevalence shifted to most of Sichuan, Gansu, Qinghai, and Ningxia, etc. (RR 1.65 (95% CI 1.63–1.67)). Conclusion: The spatial clusters of COPD at the city level and the regionally varied important risk factors for COPD would help develop tailored interventions for COPD in China. After adjusting for the main risk factors, the spatial clusters of COPD shifted, indicating that there would be other potential risk factors for the remaining clusters, which call for further studies. Keywords: COPD, spatial clusters, kriging, classification and regression tree
- Published
- 2020
23. Geographical Disparity and Associated Factors of COPD Prevalence in China: A Spatial Analysis of National Cross-Sectional Study
- Author
-
Wang, Ning, Cong, Shu, Fan, Jing, Bao, Heling, Wang, Baohua, Yang, Ting, Feng, Yajing, Liu, Yang, Wang, Linhong, Wang, Chen, Hu, Wenbiao, and Fang, Liwen
- Subjects
Adult ,Male ,China ,classification and regression tree ,International Journal of Chronic Obstructive Pulmonary Disease ,Risk Assessment ,Pulmonary Disease, Chronic Obstructive ,Risk Factors ,Prevalence ,COPD ,Cluster Analysis ,Humans ,kriging ,Original Research ,Aged ,Spatial Analysis ,Smoking ,spatial clusters ,Age Factors ,Urban Health ,Health Status Disparities ,Middle Aged ,Health Surveys ,respiratory tract diseases ,Cross-Sectional Studies ,Female - Abstract
Purpose: COPD prevalence has rapidly increased in China, but the geographical disparities in COPD prevalence remain largely unknown. This study aimed to assess city-level disparities in COPD prevalence and identify the relative importance of COPD-related risk factors in mainland China. Patients and Methods: A nationwide cross-sectional study of COPD recruited 66,752 adults across mainland China between 2014 and 2015. Patients with COPD were ascertained by a post-bronchodilator pulmonary function test. We estimated the city-specific prevalence of COPD by the spatial kriging interpolation method. We detected spatial clusters with a significantly higher prevalence of COPD by spatial scan statistics. We determined the relative importance of COPD-associated risk factors by a nonparametric and nonlinear classification and regression tree (CART) model. Results: The three spatial clusters with the highest prevalence of COPD were located in parts of Sichuan, Gansu, and Shaanxi, etc. (relative risks (RRs) ranging from 1.55 (95% CI 1.55–1.56) to 1.33 (95% CI 1.33–1.33)). CART showed that advanced age (≥ 60 years) was the most important factor associated with COPD in the overall population, followed by smoking. We estimated that there were about 28.5 million potentially avoidable cases of COPD among people aged 40 or older if they never smoked. PM2.5 was an important associated risk factor for COPD in the north, northeast, and southwest of China. After adjusting for age and smoking, the spatial cluster with the highest prevalence shifted to most of Sichuan, Gansu, Qinghai, and Ningxia, etc. (RR 1.65 (95% CI 1.63–1.67)). Conclusion: The spatial clusters of COPD at the city level and the regionally varied important risk factors for COPD would help develop tailored interventions for COPD in China. After adjusting for the main risk factors, the spatial clusters of COPD shifted, indicating that there would be other potential risk factors for the remaining clusters, which call for further studies. Keywords: COPD, spatial clusters, kriging, classification and regression tree
- Published
- 2020
24. Wake-up Stroke Outcome Prediction by Interpretable Decision Tree Model
- Author
-
Miloš Ajčević, Aleksandar Miladinović, Giovanni Furlanis, Alex Buoite Stella, Marcello Naccarato, Paola Caruso, Paolo Manganotti, and Agostino Accardo
- Subjects
Prognosi ,Clinical outcome ,Decision Tree ,Decision Trees ,Classification and Regression Tree ,Prognosis ,Predictive modeling ,Stroke ,Treatment Outcome ,Wake-up stroke ,Humans ,Ischemic Stroke ,Human - Abstract
Outcome prediction in wake-up ischemic stroke (WUS) is important for guiding treatment strategies, in order to improve recovery and minimize disability. We aimed to produce an interpretable model to predict a good outcome (NIHSS 7-day
- Published
- 2022
25. Statistics and Computing / Optimal Bayesian design for model discrimination via classification
- Author
-
Hainy, Markus, Price, David J., Restif, Olivier, and Drovandi, Christopher
- Subjects
Continuous-time Markov process ,Classification and regression tree ,Approximate Bayesian computation ,Bayesian model selection ,Simulation-based Bayesian experimental design ,Random forest - Abstract
Performing optimal Bayesian design for discriminating between competing models is computationally intensive as it involves estimating posterior model probabilities for thousands of simulated data sets. This issue is compounded further when the likelihood functions for the rival models are computationally expensive. A new approach using supervised classification methods is developed to perform Bayesian optimal model discrimination design. This approach requires considerably fewer simulations from the candidate models than previous approaches using approximate Bayesian computation. Further, it is easy to assess the performance of the optimal design through the misclassification error rate. The approach is particularly useful in the presence of models with intractable likelihoods but can also provide computational advantages when the likelihoods are manageable. Funding: Fonds zur Förderung der Wissenschaftlichen Forschung, grant J3959-N32. Version of record.
- Published
- 2022
26. Assessment of Students’ Achievements and Competencies in Mathematics Using CART and CART Ensembles and Bagging with Combined Model Improvement by MARS
- Author
-
Hristina Kulina, A. Ivanov, and Snezhana Georgieva Gocheva-Ilieva
- Subjects
Cart ,classification and regression tree ,General Mathematics ,assessment ,CART ensembles and bagging ,02 engineering and technology ,Machine learning ,computer.software_genre ,cross-validation ,Cross-validation ,Software ,0202 electrical engineering, electronic engineering, information engineering ,Computer Science (miscellaneous) ,ensemble model ,Engineering (miscellaneous) ,Mathematics ,Multivariate adaptive regression splines ,Ensemble forecasting ,business.industry ,lcsh:Mathematics ,05 social sciences ,050301 education ,Mars Exploration Program ,lcsh:QA1-939 ,Regression ,multivariate adaptive regression splines ,Test (assessment) ,ComputingMethodologies_PATTERNRECOGNITION ,machine learning ,020201 artificial intelligence & image processing ,Artificial intelligence ,business ,0503 education ,computer ,mathematical competency - Abstract
The aim of this study is to evaluate students’ achievements in mathematics using three machine learning regression methods: classification and regression trees (CART), CART ensembles and bagging (CART-EB) and multivariate adaptive regression splines (MARS). A novel ensemble methodology is proposed based on the combination of CART and CART-EB models in a new ensemble to regress the actual data using MARS. Results of a final exam test, control and home assignments, and other learning activities to assess students’ knowledge and competencies in applied mathematics are examined. The exam test combines problems on elements of mathematical analysis, statistics and a small practical project. The project is the new competence-oriented element, which requires students to formulate problems themselves, to choose different solutions and to use or not use specialized software. Initially, empirical data are statistically modeled using six CART and six CART-EB competing models. The models achieve a goodness-of-fit up to 96% to actual data. The impact of the examined factors on the students’ success at the final exam is determined. Using the best of these models and proposed novel ensemble procedure, final MARS models are built that outperform the other models for predicting the achievements of students in applied mathematics.
- Published
- 2021
27. Parcel-Level Flood and Drought Detection for Insurance Using Sentinel-2A, Sentinel-1 SAR GRD and Mobile Images
- Author
-
Bipul Neupane, Aakash Thapa, and Teerayut Horanont
- Subjects
General Earth and Planetary Sciences ,normalized difference vegetation index ,normalized difference water index ,classification and regression tree ,PlacesCNN ,cloud mask - Abstract
Floods and droughts cause catastrophic damage in paddy fields, and farmers need to be compensated for their loss. Mobile applications have allowed farmers to claim losses by providing mobile photos and polygons of their land plots drawn on satellite base maps. This paper studies diverse methods to verify those claims at a parcel level by employing (i) Normalized Difference Vegetation Index (NDVI) and (ii) Normalized Difference Water Index (NDWI) on Sentinel-2A images, (iii) Classification and Regression Tree (CART) on Sentinel-1 SAR GRD images, and (iv) a convolutional neural network (CNN) on mobile photos. To address the disturbance from clouds, we study the combination of multi-modal methods (NDVI+CNN and NDWI+CNN), which achieve 86.21% and 83.79% accuracy in flood detection and 73.40% and 81.91% in drought detection, respectively. The SAR-based method outperforms the other methods in terms of accuracy in flood (98.77%) and drought (99.44%) detection, data acquisition, parcel coverage, cloud disturbance, and observing the area proportion of disasters in the field. The experiments conclude that CART on SAR images is the most reliable method for verifying farmers’ claims for compensation. In addition, the CNN-based method’s performance on mobile photos is adequate, providing an alternative to the CART method when SAR images are unavailable.
- Published
- 2022
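The abstract above relies on NDVI and NDWI without stating their formulas. The sketch below uses the standard normalized-difference form. It assumes the green/NIR variant of NDWI (other variants use a shortwave-infrared band, and the abstract does not say which one the authors used) and the usual Sentinel-2 band mapping of B8 for NIR, B4 for red and B3 for green. The function names and reflectance values are hypothetical.

```python
def normalized_difference(a, b):
    """Generic normalized difference (a - b) / (a + b), with 0.0 for a zero sum."""
    denom = a + b
    return 0.0 if denom == 0 else (a - b) / denom


def ndvi(nir, red):
    """NDVI = (NIR - Red) / (NIR + Red); higher values suggest denser vegetation."""
    return normalized_difference(nir, red)


def ndwi(green, nir):
    """NDWI, green/NIR variant (assumed here): (Green - NIR) / (Green + NIR)."""
    return normalized_difference(green, nir)


# Hypothetical surface reflectances for a single pixel, for illustration only.
pixel = {"green": 0.30, "red": 0.10, "nir": 0.50}
print("NDVI:", ndvi(pixel["nir"], pixel["red"]))
print("NDWI:", ndwi(pixel["green"], pixel["nir"]))
```

Both indices fall in the range -1 to 1. In practice they would be computed per pixel over the parcel polygon and then thresholded, with thresholds that depend on the crop and season.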
28. Haplotype heterogeneity and low linkage disequilibrium reduce reliable prediction of genotypes for the ‑α3.7I form of α-thalassaemia using genome-wide microarray data
- Author
-
Christina Hubbart, Vysaul B. Nyirongo, George B.J. Busby, Thomas N. Williams, Kirk A. Rockett, Anna E. Jeffreys, Gavin Band, Kate Rowlands, Rosalind M. Harding, Carolyne M. Ndila, Alexander Macharia, and Consortium, MalariaGEN
- Subjects
0301 basic medicine ,Linkage disequilibrium ,haplotypes ,viruses ,Medicine (miscellaneous) ,Genome-wide association study ,Computational biology ,Biology ,Genome ,General Biochemistry, Genetics and Molecular Biology ,03 medical and health sciences ,0302 clinical medicine ,hemic and lymphatic diseases ,Predictive Models ,Genotype ,GWAS ,Typing ,Genotyping ,Confounding ,Haplotype ,virus diseases ,α-thalassaemia ,Classification and Regression Tree ,Articles ,biochemical phenomena, metabolism, and nutrition ,digestive system diseases ,Malaria ,030104 developmental biology ,030220 oncology & carcinogenesis ,multinomial regression-model ,Research Article - Abstract
Background: The -α3.7I-thalassaemia deletion is very common throughout Africa because it protects against malaria. When undertaking studies to investigate human genetic adaptations to malaria or other diseases, it is important to account for any confounding effects of α-thalassaemia to rule out spurious associations. Methods: In this study, we have used direct α-thalassaemia genotyping to understand why GWAS data from a large malaria association study in Kilifi Kenya did not identify the α-thalassaemia signal. We then explored the potential use of a number of new approaches to using GWAS data for imputing α-thalassaemia as an alternative to direct genotyping by PCR. Results: We found very low linkage-disequilibrium of the directly typed data with the GWAS SNP markers around α-thalassaemia and across the haemoglobin-alpha (HBA) gene region, which along with a complex haplotype structure, could explain the lack of an association signal from the GWAS SNP data. Some indirect typing methods gave results that were in broad agreement with those derived from direct genotyping and could identify an association signal, but none were sufficiently accurate to allow correct interpretation compared with direct typing, leading to confusing or erroneous results. Conclusions: We conclude that going forwards, direct typing methods such as PCR will still be required to account for α-thalassaemia in GWAS studies.
- Published
- 2021
29. Diagnosis of Problems in Truck Ore Transport Operations in Underground Mines Using Various Machine Learning Models and Data Collected by Internet of Things Systems
- Author
-
Hoang Nguyen, Dahee Jung, Sebeom Park, and Yosoon Choi
- Subjects
classification and regression tree ,Computer science ,Decision tree ,gaussian naïve bayes ,Machine learning ,computer.software_genre ,Field (computer science) ,Set (abstract data type) ,Naive Bayes classifier ,support vector machine ,business.industry ,k-nearest neighbors ,Geology ,transport time ,Geotechnical Engineering and Engineering Geology ,Mineralogy ,Support vector machine ,underground mine ,Hyperparameter optimization ,bluetooth beacon ,Artificial intelligence ,transport route ,F1 score ,Precision and recall ,business ,computer ,QE351-399.2 - Abstract
This study proposes a method for diagnosing problems in truck ore transport operations in underground mines using four machine learning models (i.e., Gaussian naïve Bayes (GNB), k-nearest neighbor (kNN), support vector machine (SVM), and classification and regression tree (CART)) and data collected by an Internet of Things system. A limestone underground mine with an applied mine production management system (using a tablet computer and Bluetooth beacon) is selected as the research area, and log data related to the truck travel time are collected. The machine learning models are trained and verified using the collected data, and grid search through 5-fold cross-validation is performed to improve the prediction accuracy of the models. The accuracy of CART is highest when the parameters leaf and split are set to 1 and 4, respectively (94.1%). In the validation of the machine learning models performed using the validation dataset (1500), the accuracy of the CART was 94.6%, and the precision and recall were 93.5% and 95.7%, respectively. In addition, it is confirmed that the F1 score reaches values as high as 94.6%. Through field application and analysis, it is confirmed that the proposed CART model can be utilized as a tool for monitoring and diagnosing the status of truck ore transport operations.
- Published
- 2021
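The precision, recall and F1 figures reported in the abstract above can be cross-checked, since F1 is the harmonic mean of precision and recall. The short sketch below performs that arithmetic; the function name `f1_score` is an illustrative choice, not a reference to any library.

```python
def f1_score(precision, recall):
    """F1 is the harmonic mean of precision and recall (both as fractions)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported for the CART model on the validation dataset.
f1 = f1_score(0.935, 0.957)
print(round(f1 * 100, 1))  # → 94.6
```

The result agrees with the 94.6% F1 score stated in the abstract.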
30. Warming Increases the Carbon Sequestration Capacity of Picea schrenkiana in the Tianshan Mountains, China
- Author
-
Yaning Chen, Zhu Chenggang, Honghua Zhou, Yapeng Chen, Shifeng Chen, Yuhai Yang, and Wei-hong Li
- Subjects
biology ,classification and regression tree ,dendrochronology ,Diameter at breast height ,Climate change ,Forestry ,Carbon sequestration ,Atmospheric sciences ,biology.organism_classification ,carbon sequestration ,Carbon cycle ,tree rings ,climate change ,Ecosystem carbon ,Dendrochronology ,Environmental science ,Terrestrial ecosystem ,QK900-989 ,Plant ecology ,Picea schrenkiana - Abstract
Forests are an essential part of terrestrial ecosystems, and convenient and accurate reconstruction of their past carbon sequestration capacity is critical for assessing future trends in aboveground carbon storage and ecosystem carbon cycles. In addition, the relationship between climate change and carbon sequestration of forests has been vigorously debated. In this study, dynamic changes in the carbon sequestration capacity of aboveground biomass of Picea schrenkiana (hereinafter abbreviated as P. schrenkiana) in the Tianshan Mountains, northwestern China, from 1850–2017 were reconstructed using dendrochronology. The main climate drivers that affected carbon sequestration capacity in aboveground biomass of P. schrenkiana were then investigated. The results showed that: (1) tree-ring width and diameter at breast height (DBH) of P. schrenkiana obtained from different altitudes and ages were an effective and convenient estimation index for reconstructing the carbon sequestration capacity of P. schrenkiana. The carbon storage of P. schrenkiana forest in 2016 in the Tianshan Mountains was 50.08 Tg C calculated using tree-ring width and DBH, which was very close to the value determined by direct field investigation data. (2) The annual carbon sequestration potential capacity of P. schrenkiana exhibited an increasing trend from 1850–2017. Temperature, especially minimum temperature, constituted the key climatic driver resulting in increased carbon sequestration capacity. The contribution rates of temperature and minimum temperature to the change of P. schrenkiana carbon sequestration capacity were 75% and 44%, respectively. (3) The significant increase of winter temperature and minimum temperature led to warming in the Tianshan Mountains, resulting in a significant increase in carbon sequestration capacity of P. schrenkiana. The results indicate that, with the continuous increase of winter temperature and minimum temperature, carbon sequestration of P. schrenkiana in the Tianshan Mountains is predicted to increase markedly in the future. The findings of this study provide a useful basis for evaluating future aboveground carbon storage and carbon cycles in mountain systems with characteristics similar to those of the Tianshan Mountains.
- Published
- 2021
31. Development and validation of a Brief Diet Quality Assessment Tool in the French-speaking adults from Quebec
- Author
-
Jacynthe Lafrenière, Patrick Couture, Chantal Brisson, Simone Lemieux, Danielle Laurin, Benoît Lamarche, Stéphanie Harrison, and Denis Talbot
- Subjects
0301 basic medicine ,Cart ,Adult ,Male ,Waist ,Brief diet quality assessment tool ,Medicine (miscellaneous) ,Physical Therapy, Sports Therapy and Rehabilitation ,Healthy eating ,Clinical nutrition ,Diet Surveys ,Fasting insulin ,External validity ,03 medical and health sciences ,0302 clinical medicine ,Surveys and Questionnaires ,Medicine ,Humans ,030212 general & internal medicine ,lcsh:RC620-627 ,Aged ,030109 nutrition & dietetics ,Nutrition and Dietetics ,Receiver operating characteristic ,business.industry ,Classification and regression tree ,Research ,lcsh:Public aspects of medicine ,Quebec ,lcsh:RA1-1270 ,Middle Aged ,Diet ,lcsh:Nutritional diseases. Deficiency diseases ,Diet quality ,Alternative healthy eating index ,Female ,business ,Nutritive Value ,Demography - Abstract
Background The objective of this study was to develop and validate a short, self-administered questionnaire to assess diet quality in clinical settings, using the Alternative Healthy Eating Index (AHEI) as reference. Methods A total of 1040 men and women (aged 44.6 ± 14.4 y) completed a validated web-based food frequency questionnaire (webFFQ) and had their height and weight measured (development sample). Participants were categorized arbitrarily according to diet quality (high: AHEI score ≥ 65/110, low: AHEI score
- Published
- 2019
32. Enhancing prognosis prediction using pre-treatment nodal SUVmax and HPV status in cervical squamous cell carcinoma
- Author
-
Sang-Woo Lee, Jaetae Lee, Shin-Hyung Park, Shin Young Jeong, Gun Oh Chong, Byeong-Cheol Ahn, Chae Moon Hong, Yoon Hee Lee, and Ju Hye Jeong
- Subjects
Adult ,Oncology ,lcsh:Medical physics. Medical radiology. Nuclear medicine ,medicine.medical_specialty ,Prognostic variable ,Multivariate analysis ,Cervical Squamous Cell Carcinoma ,Prognosis prediction ,lcsh:R895-920 ,Uterine Cervical Neoplasms ,Human papilloma virus ,Alphapapillomavirus ,lcsh:RC254-282 ,Group B ,030218 nuclear medicine & medical imaging ,03 medical and health sciences ,0302 clinical medicine ,Fluorodeoxyglucose F18 ,Positron Emission Tomography Computed Tomography ,Internal medicine ,medicine ,Humans ,Radiology, Nuclear Medicine and imaging ,Hpv status ,Aged ,Aged, 80 and over ,Cervical cancer ,Radiological and Ultrasound Technology ,business.industry ,Classification and regression tree ,HPV infection ,General Medicine ,Middle Aged ,Prognosis ,medicine.disease ,lcsh:Neoplasms. Tumors. Oncology. Including cancer and carcinogens ,FDG PET/CT ,Concurrent chemoradiotherapy ,030220 oncology & carcinogenesis ,Carcinoma, Squamous Cell ,Female ,Lymph node ,Radiopharmaceuticals ,business ,NODAL ,Research Article - Abstract
Background This study aimed to evaluate the prognostic value of metabolic parameters on F-18-FDG PET/CT, the status of human papillomavirus (HPV) infection, and known prognostic variables for predicting tumor recurrence, and to investigate a prognostic model in patients with locally advanced cervical cancer treated with concurrent chemoradiotherapy (CCRT). Methods A total of 129 patients with cervical squamous cell carcinoma who underwent initial CCRT were eligible for this study. Univariate and multivariate analyses were performed using traditional prognostic factors, metabolic parameters, and HPV infection. Classification and regression decision tree (CART) analysis was used to establish a new classification. Results Among 129 patients, 29 patients (22.5%) had recurrence after a median follow-up of 60 months (range, 3–125 months). Tumor size, para-aortic lymph node metastasis, nodal SUVmax, and HPV infection status were identified as independent prognostic factors by multivariate analysis. The CART analysis classified the patients into three groups. The first node was nodal SUVmax, and HPV status was the second node for patients with nodal SUVmax ≤7.49; Group A (nodal SUVmax ≤7.49 and HPV positive, HR 1.0), Group B (nodal SUVmax ≤7.49 and HPV negative, HR 3.56), and Group C (nodal SUVmax > 7.49, HR 10.13). Disease-free survival was significantly different among the three groups (p
- Published
- 2019
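The three-group split reported in the abstract of result 32 can be read as a two-level decision rule. A minimal sketch of that rule follows. The function and parameter names are hypothetical, and the only values taken from the abstract are the 7.49 threshold and the group definitions; it illustrates the reported tree structure and is not the authors' fitted model.

```python
def cart_risk_group(nodal_suvmax: float, hpv_positive: bool) -> str:
    """Assign the risk group (A, B, or C) described in the abstract.

    Hypothetical helper: the first split is on nodal SUVmax at 7.49,
    and the second split (HPV status) applies only when SUVmax <= 7.49.
    """
    if nodal_suvmax > 7.49:
        return "C"  # nodal SUVmax > 7.49, regardless of HPV status
    # nodal SUVmax <= 7.49: split on HPV status
    return "A" if hpv_positive else "B"
```

For instance, under this reading a patient with nodal SUVmax of 5.0 and a negative HPV status would fall into Group B.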
33. Exploring the optimum pattern for knowledge workers selection using DEA and CART compilation approach
- Author
-
Maryam Akhavan Kharazian, Mohammad Mahdi Shahbazi, and Mohammad Fatehi
- Subjects
employee selection (personnel selection) ,human resource management ,classification and regression tree ,data envelopment analysis ,data mining ,lcsh:Production management. Operations management ,lcsh:TS155-194 - Abstract
The success or failure of any organization is directly linked to the quality of its human resources selection (recruitment, measurement, and selection). By reviewing the data of a knowledge job, this paper aims to help improve the selection process of that job. Consequently, the selection of appropriate employees’ rate will increase, and the rate of human resource turnover will decrease. The approach of this paper is Applied Research and the strategy is Case Study. This paper combines two computational techniques (DEA and CART). Data Envelopment Analysis (DEA) is a non-parametric technique that determines the efficiency of individuals, but it does not provide information on the details of factors affecting performance (especially non-numerical factors). In the present study, this deficiency has been resolved using the Classification and Regression Tree (CART) (as a data mining technique). The result of this study has provided a framework for combining DEA and CART in order to discover rules on the recruitment of knowledge workers in a specific job (a knowledge job) and in a specific organization (HFJ Institute). The results indicate that ‘work experience’, ‘average score in the last degree’ and ‘age’ are related to the employee performance, and therefore it is necessary to be considered in the process of future recruitment of that job. Introduction: By reviewing the data of a knowledge job, this paper aims to help improve the selection process of that job. Consequently, the selection of appropriate employees’ rate will increase, and the rate of human resource turnover will decrease. The approach of this paper is Applied Research and the strategy is Case Study. In the literature review section, the definition of recruitment and selection (Azar et al. 2013), the definition of knowledge workers (Drucker 1994; Horwitz et al. 2006; Li et al. 2015), tasks of human resource management (Osman et al. 
2011) and a background of the use of data mining in the field of human resources (Hajiheydari et al. 2017) have been reviewed. Materials and Methods: This paper combines two computational techniques (DEA and CART). Data Envelopment Analysis (DEA) is a non-parametric technique that determines the efficiency of individuals, but it does not provide information on the details of factors affecting performance (especially non-numerical factors). In the present study, this deficiency has been resolved using the Classification and Regression Tree (CART) (as a data mining technique). Results and Discussion: In this research, we tried to develop the previous models and present a new model. The result of this study has provided a framework for combining DEA and CART in order to discover rules on the recruitment of knowledge workers in a specific job (knowledge job) and in a specific organization (Hedayat-e Farhighteghgan-e Javan (HFJ) Institute). The combination of data envelopment analysis and data mining approaches (and considering qualitative and implicit variables in the estimation of efficiency) is one of the most important innovations in this research. In the proposed framework, organizations can identify and recruit talents and appropriate individuals in a short time based on data mining and discovery of success patterns (resulted from their past experiences). This action avoids costs of frequent recruitment, and decreases turnover rate and improves performance. By analyzing the outputs of the designed model for the stage of recruitment and selection in this specific job (a knowledge job) and specific organization (HFJ Institute), six rules were extracted, and based on that, suggestions were given. At the stage of recruitment, it is better for this organization to take into consideration these rules (the status of jobseekers in ‘work experience’, ‘average score in the last degree’, and ‘age’) and then to decrease costs and failure rate of the recruitment process. 
Conclusion: The results are somewhat consistent with the results of previous studies. The proposed approach can be planned and implemented in various jobs and organizations to extract specific rules for these jobs and organizations in order to increase productivity in the process of human resource selection and recruitment. References Zhu, X., Seaver, W., Sawhney, R., Ji, S., Holt, B., Sanil, G. B., & Upreti, G. (2017). Employee turnover forecasting for human resource management based on time series analysis. Journal of Applied Statistics, 44 (8), 1421-1440. Lukovac, V., Pamučar, D., Popović, M., & Đorović, B. (2017). Portfolio model for analyzing human resources: An approach based on neuro-fuzzy modeling and the simulated annealing algorithm. Expert Systems with Applications, 90, 318-331. Osman, I. H., Berbary, L. N., Sidani, Y., Al-Ayoubi, B., & Emrouznejad, A. (2011). Data envelopment analysis model for the appraisal and relative performance evaluation of nurses at an intensive care unit. Journal of medical systems, 35 (5), 1039-1062.
- Published
- 2019
34. Integrated Learning via Randomized Forests and Localized Regression With Application to Medical Diagnosis
- Author
-
Qing-Guo Wang, Adeola Ogunleye, and Tshilidzi Marwala
- Subjects
hybrid expert system ,General Computer Science ,Computer science ,02 engineering and technology ,Machine learning ,computer.software_genre ,01 natural sciences ,Node (computer science) ,decision tree ,0202 electrical engineering, electronic engineering, information engineering ,General Materials Science ,Medical diagnosis ,business.industry ,Classification and regression tree ,010401 analytical chemistry ,General Engineering ,Local regression ,Variance (accounting) ,Expert system ,Regression ,0104 chemical sciences ,Tree (data structure) ,020201 artificial intelligence & image processing ,Artificial intelligence ,lcsh:Electrical engineering. Electronics. Nuclear engineering ,business ,computer ,lcsh:TK1-9971 ,Test data - Abstract
Tree-based machine learning operates on the divide-and-conquer principle and is known to perform well in certain applications. In this paper, we first give a new data partitioning rule that uses the mean of the data columns to grow the tree until the child nodes are small in size. Then, local regression is applied to the leaf nodes to enhance the resolution of the node outputs. Randomization is introduced at tree growth and forest creation. The local prediction accuracies on the leaves are used to select a subset of the test data for actual predictions. A case study on the diagnosis of autistic spectrum disorder shows that the proposed method achieves an ensemble prediction accuracy above 96% with reduced variance, which is much better than those reported in the literature.
- Published
- 2019
35. Analyzing the Impact of High-Speed Rail on Tourism with Parametric and Non-Parametric Methods: The Case Study of China
- Author
-
Yin Ping, Filomena Mauriello, and Francesca Pagliara
- Subjects
classification and regression tree ,Computer science ,Geography, Planning and Development ,0211 other engineering and technologies ,Decision tree ,TJ807-830 ,02 engineering and technology ,Management, Monitoring, Policy and Law ,TD194-195 ,Renewable energy sources ,Tourism market ,0502 economics and business ,Econometrics ,GE1-350 ,HIGH SPEED RAIL, TOURISM, PARAMETRIC AND NON PARAMETRIC METHODS ,China ,Generalized estimating equation ,Parametric statistics ,Estimation ,tourism market ,050210 logistics & transportation ,Environmental effects of industries and plants ,Renewable Energy, Sustainability and the Environment ,05 social sciences ,Nonparametric statistics ,021107 urban & regional planning ,Environmental sciences ,generalized estimating equation ,high-speed rail ,Chinese provinces ,Tourism - Abstract
High-speed rail (HSR) and tourism are closely related activities since improved mobility is perceived to facilitate tourist behavioral changes. Research interest in this topic is very high, and this contribution provides insight by comparing estimates from the parametric Generalized Estimating Equation (GEE) approach with those from the non-parametric Classification and Regression Tree (CART). A dataset containing information on both tourism and transport for thirty Chinese provinces during the 2001–2017 period was collected. The findings show that the presence of HSR has value in explaining tourist arrivals.
- Published
- 2021
36. Measuring the Response Performance of U.S. States against COVID-19 Using an Integrated DEA, CART, and Logistic Regression Approach
- Author
-
Yong Shin Park, Yuan Xu, and Ju Dong Park
- Subjects
Cart ,classification and regression tree ,Leadership and Management ,Population ,Decision tree ,lcsh:Medicine ,Health Informatics ,02 engineering and technology ,Logistic regression ,Article ,03 medical and health sciences ,Health Information Management ,DEA ,Health care ,Statistics ,0202 electrical engineering, electronic engineering, information engineering ,Data envelopment analysis ,education ,Health policy ,Mathematics ,education.field_of_study ,business.industry ,030503 health policy & services ,Health Policy ,logistic regression ,lcsh:R ,COVID-19 ,Random forest ,machine learning ,020201 artificial intelligence & image processing ,0305 other medical science ,business - Abstract
Measuring the U.S.’s COVID-19 response performance is an extremely important challenge for health care policymakers. This study integrates Data Envelopment Analysis (DEA) with four different machine learning (ML) techniques to assess the efficiency and evaluate the U.S.’s COVID-19 response performance. First, DEA is applied to measure the efficiency of fifty U.S. states considering four inputs: number of people tested, public funding, number of health care employees, and number of hospital beds. The number of people recovered from COVID-19 is considered a desirable output and the number of confirmed COVID-19 cases an undesirable output. In the second stage, Classification and Regression Tree (CART), Boosted Tree (BT), Random Forest (RF), and Logistic Regression (LR) were applied to predict the COVID-19 response performance based on fifteen environmental factors, which were classified into social distancing, health policy, and socioeconomic measures. The results showed that 23 states were efficient, with an average efficiency score of 0.97. Furthermore, the BT and RF models produced the best prediction results, and CART performed better than LR. Lastly, urban population, physical inactivity, number of people tested per population, population density, and total hospital beds per population were the most influential factors on efficiency.
- Published
- 2021
37. A Study on Re-Engagement and Stabilization Time on Take-Over Transition in a Highly Automated Driving System
- Author
-
Seung Jun Lee, Daesub Yoon, Junghee Jo, Hyun-Suk Kim, Jungsook Kim, and Woo Jin Kim
- Subjects
Cart ,classification and regression tree ,Computer Networks and Communications ,Computer science ,media_common.quotation_subject ,Control (management) ,lcsh:TK7800-8360 ,02 engineering and technology ,0202 electrical engineering, electronic engineering, information engineering ,0501 psychology and cognitive sciences ,Quality (business) ,Electrical and Electronic Engineering ,050107 human factors ,Simulation ,media_common ,take-over request ,Transition (fiction) ,05 social sciences ,lcsh:Electronics ,020206 networking & telecommunications ,Take over ,re-engagement ,stabilization ,control authority transition ,Hardware and Architecture ,Control and Systems Engineering ,Obstacle ,Signal Processing ,automated driving - Abstract
In the case of level 3 automated vehicles, in order to safely and quickly transfer control authority rights to manual driving, it is necessary that a study be conducted on the characteristics of human factors affecting the transition of manual driving. In this study, we conducted three experiments to compare the characteristics of human factors that influence the driver’s quality of response when re-engaging and stabilizing manual driving. The three experiments were conducted sequentially by dividing them into a normal driving situation, an obstacle occurrence situation in front, and an obstacle and congestion on surrounding roads. We performed a statistical analysis and classification and regression tree (CART) analysis using experimental data. We found that as the number of trials increased, there was a learning effect that shortened re-engagement times and increased the proportion of drivers with good response times. We found that the stabilization time increased as the experiment progressed, as obstacles appeared in front and traffic density increased in the surrounding lanes. The results of the analysis are useful for vehicle developers designing safer human–machine interfaces and for governments developing guidelines for automated driving systems.
- Published
- 2021
38. Comparison of different machine learning models for mass appraisal of real estate
- Author
-
Süleyman Sefa Bilgilioğlu, Hacı Murat Yilmaz, and Mühendislik Fakültesi
- Subjects
Artificial Neural Network ,Support Vector Machine ,Chi-square Automatic Interaction Detection ,Random Forest ,business.industry ,Computer science ,Mass appraisal ,Classification and Regression Tree ,Real estate ,Machine learning ,computer.software_genre ,Machine Learning ,Mass Appraisal ,Earth and Planetary Sciences (miscellaneous) ,Artificial intelligence ,Computers in Earth Sciences ,business ,computer ,Civil and Structural Engineering - Abstract
The present study aimed to compare five machine learning techniques, namely artificial neural network (ANN), support vector machine (SVM), chi-square automatic interaction detection (CHAID), classification and regression tree (CART), and random forest (RF), for the mass appraisal of real estate. First, 1982 precedent records were collected throughout the entire study area to train and test the models. Second, a total of 68 variables were considered for the mass appraisal. Subsequently, the five machine learning techniques were applied. Finally, the receiver operating characteristic (ROC) and various statistical methods were used to compare the five techniques.
- Published
- 2021
39. Öğretmen adaylarının akademik okuryazarlıklarını açıklayan değişkenlerin sınıflandırma ve regresyon ağacı ile belirlenmesi
- Author
-
Sütçü, Neşe Dokumacı, Dicle Üniversitesi, Ziya Gökalp Eğitim Fakültesi, Matematik ve Fen Bilimleri Eğitimi Bölümü, and Sütçü, Neşe Dokumacı
- Subjects
Akademik okuryazarlık ,C&RT analizi ,CRT analizi ,CART analizi ,Classification and regression tree ,Karar ağacı ,Decision tree ,Academic literacy ,CRT analysis ,Sınıflandırma ve regresyon ağacı ,C&RT analysis ,CART analysis - Abstract
This study aimed to determine the variables explaining the academic literacy of preservice teachers. The participants were 541 preservice teachers enrolled in the education faculty of a state university during the spring term of the 2020-2021 academic year. Of the preservice teachers, 156 were in mathematics, 162 in science, 103 in classroom teaching, and 120 in social studies. The study used a relational survey model, and the data were collected using the “Academic Literacy Scale”. The scale consists of 23 items and three dimensions: 11 items in the “Academic Tendency” dimension, 8 items in the “Research Process” dimension, and 4 items in the “Information Use” dimension. Classification and Regression Tree (C&RT/CART/CRT) analysis was used to analyze the data. The academic literacy of the preservice teachers was found to be high with respect to the academic tendency dimension. The variables explaining this dimension were the wish for graduate education, department, and grade level. The most important variable explaining the mean scores of the preservice teachers on the academic tendency dimension was “wish for graduate education”: the mean scores of those wishing to receive a graduate education were higher than the mean scores of those who did not wish to receive one or who were undecided. The academic literacy of the preservice teachers was also found to be high with respect to the research process dimension. The variables explaining this dimension were department, grade level, age range, and whether the scientific/educational research methods course had been taken. The most important variable explaining the mean scores on the research process dimension was “department”: the mean scores of the mathematics, science, and classroom preservice teachers were higher than the mean score of the social studies preservice teachers. Finally, the academic literacy of the preservice teachers was found to be high with respect to the information use dimension. The variables explaining this dimension were grade level, the wish for graduate education, and whether the scientific/educational research methods course had been taken. The most important variable explaining the mean scores on the information use dimension was “grade level”: the mean scores of the preservice teachers enrolled in the 1st, 2nd, and 3rd grades were lower than those of the preservice teachers in the 4th grade.
- Published
- 2021
40. Response of European grayling Thymallus thymallus to multiple stressors in hydropeaking rivers
- Author
-
Bernhard Zeiringer, Stefan Schmutz, Norbert Höller, Franz Greimel, Erwin Lautsch, Daniel S. Hayes, and Günther Unfer
- Subjects
Environmental Engineering ,classification and regression tree ,configuration frequency analysis ,0208 environmental biotechnology ,sub-daily flow fluctuations ,02 engineering and technology ,010501 environmental sciences ,Management, Monitoring, Policy and Law ,01 natural sciences ,Rivers ,Animals ,hydro-morphology ,Waste Management and Disposal ,Population dynamics of fisheries ,Ecosystem ,0105 earth and related environmental sciences ,salmonidae ,biology ,Ecology ,Grayling ,General Medicine ,biology.organism_classification ,Thymallus ,020801 environmental engineering ,European grayling ,hydropower ,Geography ,Habitat ,Indicator species ,Threatened species ,River morphology ,Hydrology ,Salmonidae - Abstract
Rivers of the large Alpine valleys constitute iconic ecosystems that are highly threatened by multiple anthropogenic stressors. This stressor mix, however, makes it difficult to develop and refine conservation and restoration strategies. It is, therefore, urgent to acquire more detailed knowledge on the consequences and interactions of prevalent stressors on fish populations, in particular, on indicator species such as the European grayling Thymallus thymallus. Here, we conducted a multi-river, multi-stressor investigation to analyze the population status of grayling. Using explorative decision-tree approaches, we disentangled the main and interaction effects of four prevalent stressor groups: flow modification (i.e., hydropeaking), channelization, fragmentation, and water quality alteration. Moreover, using a modified variant of the bootstrapping method, pooled bootstrapping, we determined the optimal number of characteristics that adequately describe fish population status. In our dataset, hydropeaking had the strongest single effect on grayling populations. Grayling biomass at hydrological control sites was around eight times higher than at sites affected by hydropeaking. The primary parameters for predicting population status were downramping rate and peak amplitude, with critical ranges of 0.2–0.4 cm min⁻¹ and 10–25 cm. In hydropeaking rivers, river morphology and connectivity were the preceding subordinated parameters. Repeating the procedure with pooled bootstrapping datasets strengthened the hypothesis that the identified parameters are most relevant in predicting grayling population status. Hence, hydropeaking mitigation based on ecological thresholds is key to protecting and restoring already threatened grayling populations. In hydropeaking rivers, high river network connectivity and heterogeneous habitat features can dampen the adverse effects of pulsed-flow releases by offering shelter and habitats for all life cycle stages of fish. 
The presented approach of explorative tree analysis followed by post-hoc tests of identified effects, as well as the pooled bootstrapping method, offers a simple framework for researchers and managers to analyze multi-factorial datasets and draw solid management conclusions.
- Published
- 2021
41. Dong Ra Nang National Forest Change Detection using Multi-temporal LANDSAT 7 ETM+ Imagery by Using CART Classification: Object-Oriented Approach
- Author
-
Sopholwit Khamphilung
- Subjects
remote sensing ,OBIA ,classification and regression tree ,CART - Abstract
Current Applied Science and Technology, 21, 2, 318-336
- Published
- 2021
42. Prediction of Innovation Values of Countries Using Data Mining Decision Trees and a Comparative Application with Linear Regression Model
- Author
-
DOĞRUEL, Merve and ÜMİT FIRAT, Seniye
- Subjects
İşletme ,ddc:650 ,Networked Readiness Index,Innovation,Decision Tree Learning,Global Innovation Index,Classification and RegressionTree ,Classification and Regression Tree ,Global Innovation Index ,Ağ Yapılara Hazır Olma Endeksi,İnovasyon,Karar Ağacı Öğrenmesi,Küresel İnovasyon Endeksi,Sınıflandırma ve RegresyonAğacı ,Innovation ,Decision Tree Learning ,Management ,Networked Readiness Index - Abstract
The innovation levels and capacities of countries are two very important factors for competitiveness as well as for the current Industry 4.0 revolution. In this context, capacity and level are relative concepts, and there is a great need for a common measurement system for global comparisons. The Network Readiness Index (NRI) and the Global Innovation Index (GII), which meet this need to a significant extent, are globally important indices with an effective academic infrastructure for determining the innovation levels of countries. This study applies and compares regression tree analysis, a data mining technique based on supervised machine learning, and linear regression analysis, using the indicators within the dimensions of the sub-indices of the GII score and the NRI. The regression tree application aimed to estimate the GII from the NRI indicators and to determine the best discriminating GII indicators. The Classification and Regression Tree (CART) algorithm was used for the analysis. The analysis determined which indicators within the scope of the NRI can be used in estimating GII scores and country rankings. Linear regression analysis was performed with the same data set, and the regression tree obtained by the CART algorithm was compared with the linear regression model.
- Published
- 2021
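As a rough illustration of the kind of comparison this abstract describes, the Python sketch below fits a one-predictor least-squares line and a depth-1 regression tree (a single CART-style split that minimizes the sum of squared errors) to the same data and compares their mean squared errors. The data and the helper names (`fit_line`, `fit_stump`, `mse`) are hypothetical and are not taken from the study, which used many NRI indicators and a full tree.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for one predictor; returns (intercept, slope)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def fit_stump(x, y):
    """Depth-1 regression tree: returns (threshold, left_mean, right_mean)
    for the split 'x < threshold' that minimizes the sum of squared errors."""
    best = None
    for t in sorted(set(x))[1:]:  # thresholds that leave both sides non-empty
        left = [yi for xi, yi in zip(x, y) if xi < t]
        right = [yi for xi, yi in zip(x, y) if xi >= t]
        ml, mr = mean(left), mean(right)
        sse = sum((v - ml) ** 2 for v in left) + sum((v - mr) ** 2 for v in right)
        if best is None or sse < best[0]:
            best = (sse, t, ml, mr)
    return best[1], best[2], best[3]


def mse(y, pred):
    """Mean squared error between observed and predicted values."""
    return sum((a - b) ** 2 for a, b in zip(y, pred)) / len(y)


# Hypothetical data with a step-like relationship (not from the study).
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [1.0, 1.0, 1.0, 1.0, 5.0, 5.0, 5.0, 5.0]

b0, b1 = fit_line(x, y)
t, ml, mr = fit_stump(x, y)

line_mse = mse(y, [b0 + b1 * xi for xi in x])
stump_mse = mse(y, [ml if xi < t else mr for xi in x])
print(f"linear MSE = {line_mse:.3f}, stump MSE = {stump_mse:.3f} (split at x < {t})")
```

On step-like data such as this, the single split fits better than the line. On data with a smooth linear trend the ranking would typically reverse, which is why the two models are worth comparing on the same data set.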
43. Structural and operational management of Turkish airports: a bootstrap data envelopment analysis of efficiency
- Author
-
H. Hasan Örkcü, Volkan Soner Özsoy, and Ortaköy Meslek Yüksekokulu
- Subjects
Sociology and Political Science ,Turkish ,Computer science ,020209 energy ,Decision tree ,ComputerApplications_COMPUTERSINOTHERSYSTEMS ,02 engineering and technology ,010501 environmental sciences ,Management, Monitoring, Policy and Law ,Development ,01 natural sciences ,ComputerApplications_MISCELLANEOUS ,0202 electrical engineering, electronic engineering, information engineering ,Data envelopment analysis ,Operational efficiency ,Operations management ,Business and International Management ,0105 earth and related environmental sciences ,Structural and Operational Efficiency ,Bootstrap Data Envelopment Analysis ,Classification and Regression Tree ,Regression analysis ,Turkish Airports ,language.human_language ,Homogeneous ,language - Abstract
*Özsoy, Volkan Soner (Aksaray, Author). This paper explores the structural and operational dimensions of the efficiencies of airports. A two-stage procedure is suggested to assess the efficiencies of airports. In the first stage, Classification and Regression Tree, a machine-learning approach, is used to divide the airports into homogeneous and thus comparable sub-groups. In the second stage, the bootstrap data envelopment analysis approach is used to obtain more precise structural and operational efficiency scores. To illustrate the use of the proposed framework, we applied it to a real case associated with Turkish airports. The results demonstrate that this framework presents a more comprehensive assessment of airport performance than conventional data envelopment analysis models. Moreover, it reveals the deficiencies of the structural and operational management of airports. The findings can help airport authorities anywhere, as well as Turkish airport authorities.
- Published
- 2021
44. Analysis of Factors Influencing Employee Competence at SOS Children’s Villages Indonesia (Analisis Faktor-Faktor yang Memengaruhi Kompetensi Karyawan SOS Children’s Villages Indonesia)
- Author
-
Rianti Setiadi, Anis Dela Desela, and Yekti Widyaningsih
- Subjects
Computer science ,Classification and Regression Tree ,Partial Least Square ,Humanities ,Non-Governmental Organization - Abstract
Economic and social growth in a country cannot be separated from the role of Non-Governmental Organizations (NGOs). However, several problems arise within NGOs. To address these problems and improve service quality, an NGO is expected to have highly competent employees. This study was conducted at SOS Children’s Villages Indonesia, an NGO in Indonesia dedicated to children who have lost, or are at risk of losing, parental care. The purpose of this study is to identify the factors that influence employee competence at SOS Children’s Villages Indonesia and to determine the profile of employees with high competence. The analysis methods used are Partial Least Square and Classification and Regression Tree. The results show that the variables that significantly influence competence are the interpersonal and creativity variables. The profile analysis for highly competent employees shows that employees who work at the Fund Development and Communication office located in Bandung or DKI Jakarta, or employees who work at the National Office located in Bandung and have a high level of creativity, tend to have high competence. In addition, employees who work at the Fund Development and Communication office located in Bandung tend to have high competence even when their interpersonal and creativity levels are low.
- Published
- 2020
45. Association between polypharmacy and the persistence of delirium: a retrospective cohort study
- Author
-
Osamu Shibayama, Daisuke Miyabe, Yoshiko Furukawa, Ken Kurisu, and Kazuhiro Yoshiuchi
- Subjects
medicine.medical_specialty ,Propensity score ,Social Psychology ,Logistic regression ,behavioral disciplines and activities ,lcsh:RC321-571 ,03 medical and health sciences ,0302 clinical medicine ,Internal medicine ,mental disorders ,medicine ,030212 general & internal medicine ,lcsh:Neurosciences. Biological psychiatry. Neuropsychiatry ,Receiver operating characteristic curve ,Biological Psychiatry ,General Psychology ,Polypharmacy ,Classification and regression tree ,business.industry ,Research ,Medical record ,Delirium ,Retrospective cohort study ,Odds ratio ,Confidence interval ,nervous system diseases ,Psychiatry and Mental health ,Propensity score matching ,medicine.symptom ,business ,030217 neurology & neurosurgery - Abstract
Background: Although the association between polypharmacy and the occurrence of delirium has been well studied, the influence of polypharmacy on the persistence of delirium remains unclear. We aimed to explore the effect of polypharmacy on the persistence of delirium. Methods: This retrospective cohort study was conducted at a tertiary hospital. The medical records of patients diagnosed with delirium who were referred to the Department of Psychosomatic Medicine were reviewed. Presentation with delirium on day 3 was set as the outcome in this study. We counted the number of drugs prescribed on the date of referral, excluding general infusion fluids, nutritional or electrolytic products, and psychotropics. To define polypharmacy, we developed a classification and regression tree (CART) model and drew a receiver operating characteristic (ROC) curve. The odds ratio (OR) of polypharmacy for the persistence of delirium on day 3 was calculated using a logistic regression model with the propensity score as a covariate. Results: We reviewed the data of 113 patients. The CART model and ROC curve indicated an optimal polypharmacy cutoff of six drugs. Polypharmacy was significantly associated with the persistence of delirium both before [OR, 3.02; 95% confidence interval (CI), 1.39–6.81; P = 0.0062] and after (OR, 3.19; 95% CI, 1.32–8.03; P = 0.011) propensity score adjustment. Conclusion: We discovered an association between polypharmacy and worsening courses of delirium and hypothesize that polypharmacy might be a prognostic factor for delirium.
- Published
- 2020
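To make the idea of a CART-derived cutoff more concrete, the following Python sketch searches for the single threshold on a drug count that minimizes the size-weighted Gini impurity of the two resulting groups, which is the criterion a classification tree commonly uses for its first split. The data, the function names (`gini`, `best_cutoff`), and the resulting threshold are made up for illustration and do not reproduce the study's data or its reported cutoff.

```python
from typing import List, Tuple


def gini(labels: List[int]) -> float:
    """Gini impurity of a list of binary labels (0 or 1)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_cutoff(x: List[int], y: List[int]) -> Tuple[int, float]:
    """Return (cutoff, impurity) for the split 'x >= cutoff' that
    minimizes the size-weighted Gini impurity of the two children."""
    n = len(x)
    best = (min(x), gini(y))  # baseline: no split, parent impurity
    for c in sorted(set(x))[1:]:  # candidates that leave both sides non-empty
        left = [yi for xi, yi in zip(x, y) if xi < c]
        right = [yi for xi, yi in zip(x, y) if xi >= c]
        w = (len(left) * gini(left) + len(right) * gini(right)) / n
        if w < best[1]:
            best = (c, w)
    return best


# Hypothetical toy data: number of drugs, and whether delirium persisted (1) or not (0).
drugs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
persisted = [0, 0, 0, 1, 0, 1, 1, 1, 0, 1]

cutoff, impurity = best_cutoff(drugs, persisted)
print(f"split at drugs >= {cutoff}, weighted Gini = {impurity:.3f}")
```

A full CART model would apply this search recursively and over several predictors. A single split is enough to show how a numeric cutoff can fall out of the procedure.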
46. Long-Term-Based Road Blackspot Screening Procedures by Machine Learning Algorithms
- Author
-
Nicholas Fiorentini and Massimo Losa
- Subjects
Naïve bayes ,Computer science ,Geography, Planning and Development ,Decision tree ,TJ807-830 ,Crash ,Management, Monitoring, Policy and Law ,Overfitting ,Machine learning ,computer.software_genre ,TD194-195 ,Renewable energy sources ,Naive Bayes classifier ,road blackspot screening procedures ,machine learning algorithms ,0502 economics and business ,0501 psychology and cognitive sciences ,GE1-350 ,Logistic Regression ,Categorical variable ,050107 human factors ,Screening procedures ,050210 logistics & transportation ,Random Forest ,Environmental effects of industries and plants ,Renewable Energy, Sustainability and the Environment ,business.industry ,K-Nearest Neighbor ,05 social sciences ,Confusion matrix ,Classification and Regression Tree ,Classification and regression tree ,K-nearest neighbor ,Logistic regression ,Machine learning algorithms ,Random forest ,Road blackspot screening procedures ,Environmental sciences ,Artificial intelligence ,business ,computer ,Algorithm - Abstract
Screening procedures in road blackspot detection are essential tools for road authorities for quickly gathering insights on the safety level of each road site they manage. This paper suggests a road blackspot screening procedure for two-lane rural roads, relying on five different machine learning algorithms (MLAs) and real long-term traffic data. The network analyzed is the one managed by the Tuscany Region Road Administration, mainly composed of two-lane rural roads. A total of 995 road sites, where at least one accident occurred in 2012–2016, have been labeled as “Accident Case”. Accordingly, an equal number of sites where no accident occurred in the same period have been randomly selected and labeled as “Non-Accident Case”. Five different MLAs, namely Logistic Regression, Classification and Regression Tree, Random Forest, K-Nearest Neighbor, and Naïve Bayes, have been trained and validated. The output response of the MLAs, i.e., crash occurrence susceptibility, is a binary categorical variable. Therefore, such algorithms aim to classify a road site as likely safe (“Non-Accident Case”) or potentially susceptible to an accident occurrence (“Accident Case”) over five years. Finally, the algorithms have been compared by a set of performance metrics, including precision, recall, F1-score, overall accuracy, confusion matrix, and the Area Under the Receiver Operating Characteristic. Outcomes show that the Random Forest outperforms the other MLAs with an overall accuracy of 73.53%. Furthermore, none of the MLAs shows overfitting issues. Road authorities could consider MLAs to draw up a priority list of on-site inspections and maintenance interventions.
- Published
- 2020
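The performance metrics named in this abstract all derive from the counts of a binary confusion matrix. The Python sketch below shows the standard definitions of precision, recall, F1-score, and overall accuracy. The `Confusion` class, the `metrics` function, and the counts are hypothetical and are not results from the paper; the sketch also assumes every denominator is non-zero.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    tp: int  # accident sites classified as accident
    fp: int  # non-accident sites classified as accident
    fn: int  # accident sites classified as non-accident
    tn: int  # non-accident sites classified as non-accident


def metrics(c: Confusion) -> dict:
    """Standard binary classification metrics from confusion-matrix counts."""
    precision = c.tp / (c.tp + c.fp)
    recall = c.tp / (c.tp + c.fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (c.tp + c.tn) / (c.tp + c.fp + c.fn + c.tn)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Hypothetical counts for a balanced validation set of 200 sites.
result = metrics(Confusion(tp=70, fp=30, fn=20, tn=80))
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

Comparing these values on the training and validation sets is one simple way to check for the overfitting the abstract mentions: a large gap between the two suggests the model has memorized the training data.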
47. Context matters: Agronomic field monitoring and participatory research to identify criteria of farming system sustainability in South-East Asia
- Author
-
Damien Jourdain, Santiago Lopez-Ridaura, Chanthaly Syfongxay, Gatien N. Falconnier, Bruno Striffler, Juliette Lairez, Pascal Lienhard, François Affholder, Agroécologie et Intensification Durables des cultures annuelles (UPR AIDA), Centre de Coopération Internationale en Recherche Agronomique pour le Développement (Cirad), Département Performances des systèmes de production et de transformation tropicaux (Cirad-PERSYST), International Maize and Wheat Improvement Center (CIMMYT), Consultative Group on International Agricultural Research [CGIAR] (CGIAR), Gestion de l'Eau, Acteurs, Usages (UMR G-EAU), Centre de Coopération Internationale en Recherche Agronomique pour le Développement (Cirad)-Institut de Recherche pour le Développement (IRD)-AgroParisTech-Institut national d’études supérieures agronomiques de Montpellier (Montpellier SupAgro), Institut national d'enseignement supérieur pour l'agriculture, l'alimentation et l'environnement (Institut Agro)-Institut national d'enseignement supérieur pour l'agriculture, l'alimentation et l'environnement (Institut Agro)-Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement (INRAE), Département Environnements et Sociétés (Cirad-ES), Provincial Agriculture and Forestry Office (PAFO), Directorate-General for Development and Cooperation-EuropeAid - EuropeAid/132-657/L/ACT/LA, Agence Francaise de Developpement (Conservation Agriculture within the Northern Upland Development Programme, NUDP) - AFD CLA1077.01K, Centre de Coopération Internationale en Recherche Agronomique pour le Développement (Cirad)-Institut de Recherche pour le Développement (IRD)-AgroParisTech-Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement (INRAE)-Institut Agro - Montpellier SupAgro, and Institut national d'enseignement supérieur pour l'agriculture, l'alimentation et l'environnement (Institut Agro)
- Subjects
Serious games ,agriculture familiale ,010504 meteorology & atmospheric sciences ,Exploitation agricole familiale ,F08 - Systèmes et modes de culture ,[SDV]Life Sciences [q-bio] ,Farm income ,Biodiversity ,01 natural sciences ,Recherche sur les systèmes agraires ,recherche participative ,Agriculture durable ,Multi-criteria assessment ,Agroecology ,Durabilité ,0105 earth and related environmental sciences ,2. Zero hunger ,Agroforestry ,business.industry ,Classification and regression tree ,Subsistence agriculture ,04 agricultural and veterinary sciences ,15. Life on land ,Geography ,Sustainability ,Agriculture ,Laos ,Système d'exploitation agricole ,040103 agronomy & agriculture ,Maize yield gap ,0401 agriculture, forestry, and fisheries ,Animal Science and Zoology ,Soil fertility ,business ,Agronomy and Crop Science ,Cropping - Abstract
In the mountainous areas of South-East Asia, family farms have shifted from subsistence to input-intensified and market-oriented maize-based farming systems, resulting in a substantial increase in farm income, but also in new environmental threats: deforestation, biodiversity loss, soil erosion, herbicide leaching and soil fertility degradation. In this typical case study of cash-strapped farms, where the balance between socio-economic and environmental dimensions of sustainability is complex, we used participatory methods (serious games and Q-methodology), combined with agronomic field monitoring, to identify relevant farm and field-level criteria for sustainability assessment. Serious games at farm level showed that short-term socio-economic dimensions prevailed over environmental dimensions in farmers' objectives. However, farmers also greatly valued their capacity to transfer a viable farm to the next generation and avoid herbicide use. Serious games at field level showed that some farmers were willing to preserve soil fertility for future generations. The agronomic field monitoring showed that maize yield deviations from potential water-limited yield were primarily due to weed infestation favoured by low sowing density, due to uncontrolled moto-mechanized crop establishment. This technical failure at the beginning of the maize cycle led to herbicide overuse, poor returns on investment for fertilizer, and increased exposure to soil erosion. Combining the perspectives of scientists and farmers led to the following set of locally-relevant criteria: i) at farm level: farm income, diversity of activities, farmer autonomy, farmer health, workload peaks, soil fertility transfer between agroecological zones in the landscape, rice and forage self-sufficiency; ii) at field level: resource use efficiency, soil fertility, erosion and herbicide risks, susceptibility to pests, weeds and climate variability, biodiversity, land productivity, economic performance, labour productivity and work drudgery. Our approach helped to identify key relevant sustainability criteria and could be useful for designing alternatives to current maize-based cropping systems, and contributed to informing priority-setting for institutional development and agricultural policies in the region.
- Published
- 2020
48. Development of a Group Method of Data Handling Technique to Forecast Iron Ore Price
- Author
-
Masoud Monjezi, Mohammad Reza R. Moghaddam, Diyuan Li, Amirhossein Mehrdanesh, and Danial Jahed Armaghani
- Subjects
Mean squared error ,classification and regression tree ,Group method of data handling ,0211 other engineering and technologies ,Decision tree ,02 engineering and technology ,engineering.material ,lcsh:Technology ,iron ore price prediction ,lcsh:Chemistry ,Statistics ,0202 electrical engineering, electronic engineering, information engineering ,General Materials Science ,Autoregressive integrated moving average ,support vector regression ,Instrumentation ,lcsh:QH301-705.5 ,021101 geological & geomatics engineering ,Mathematics ,group method of data handling ,Fluid Flow and Transfer Processes ,Artificial neural network ,lcsh:T ,Process Chemistry and Technology ,General Engineering ,lcsh:QC1-999 ,Computer Science Applications ,Support vector machine ,Mean absolute percentage error ,Iron ore ,lcsh:Biology (General) ,lcsh:QD1-999 ,lcsh:TA1-2040 ,engineering ,020201 artificial intelligence & image processing ,lcsh:Engineering (General). Civil engineering (General) ,lcsh:Physics ,autoregressive integrated moving average - Abstract
Iron is one of the most widely used metals in the world. The global price of iron ore is determined by supply and demand. Numerous parameters (e.g., price of steel, steel production, oil price, gold price, interest rate, inflation rate, iron production, and aluminum price) affect the global iron ore price. Given the large number of influential parameters and the complex relationships among them, artificial intelligence-based approaches can be employed to predict the iron ore price. In this paper, a new intelligent system, namely the group method of data handling (GMDH), was developed and introduced to predict the price of iron ore. For comparison purposes, four other techniques, i.e., autoregressive integrated moving average (ARIMA), support vector regression (SVR), artificial neural network (ANN), and classification and regression tree (CART), were developed to predict the monthly iron ore price. Then, using testing datasets, the developed models were validated and their performance was compared. The results showed that the prediction performance of the GMDH model is significantly better than that of the other predictive models based on four performance indices, i.e., root mean square error, variance accounted for (VAF), mean absolute error, and mean absolute percentage error. The VAF results (97.89%, 90.81%, 80.95%, 55.02%, and 23.87% for the GMDH, SVR, ANN, CART, and ARIMA models, respectively) revealed that the GMDH technique is able to predict the iron ore price with a higher degree of accuracy than the other techniques.
- Published
- 2020
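The four performance indices listed in this abstract can be sketched in a few lines of Python. The abstract does not give the formulas, so this sketch assumes the definitions commonly used in this literature, in particular VAF = 100 × (1 − var(y − ŷ) / var(y)). The function names and the price series are hypothetical and are not data from the paper.

```python
import math
from statistics import pvariance


def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (assumes no actual value is zero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def vaf(actual, predicted):
    """Variance accounted for, in percent (assumed definition, see lead-in)."""
    residuals = [a - p for a, p in zip(actual, predicted)]
    return 100.0 * (1.0 - pvariance(residuals) / pvariance(actual))


# Hypothetical monthly prices and model predictions.
actual = [100.0, 110.0, 120.0, 130.0]
predicted = [102.0, 108.0, 123.0, 129.0]

print(f"RMSE = {rmse(actual, predicted):.3f}")
print(f"MAE  = {mae(actual, predicted):.3f}")
print(f"MAPE = {mape(actual, predicted):.3f}%")
print(f"VAF  = {vaf(actual, predicted):.2f}%")
```

Lower RMSE, MAE, and MAPE and higher VAF indicate a better fit, which is the sense in which the abstract ranks the five models.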
49. A big-data classification tree for decision support system in the detection of dilated cardiomyopathy using heart rate variability
- Author
-
Giulia Silveri, Marco Merlo, Agostino Accardo, Luca Restivo, Miloš Ajčević, and Gianfranco Sinagra
- Subjects
medicine.medical_specialty ,Classification and Regression Tree ,Dilated cardiomyopathy ,HRV parameters ,Computer science ,Volume overload ,02 engineering and technology ,Coronary artery disease ,Internal medicine ,0202 electrical engineering, electronic engineering, information engineering ,medicine ,Heart rate variability ,cardiovascular diseases ,General Environmental Science ,Ejection fraction ,Area under the curve ,020206 networking & telecommunications ,Stepwise regression ,medicine.disease ,Regression ,cardiovascular system ,Cardiology ,General Earth and Planetary Sciences ,020201 artificial intelligence & image processing - Abstract
Dilated cardiomyopathy (DCM) is a heart muscle disease characterized by left ventricular (LV) or biventricular dilatation and systolic dysfunction in the absence of either pressure or volume overload or coronary artery disease sufficient to explain the dysfunction. The use of heart rate variability (HRV) analysis, together with some machine learning algorithms, has proved to be a valuable support in the diagnosis of cardiovascular disease. However, until now, only single beats or electrocardiogram segments of subjects affected by DCM have been identified using machine learning techniques applied to HRV parameters. In this study, we used linear and non-linear HRV parameters and some clinical parameters (age, sex and left ventricular ejection fraction) evaluated on a large cohort of 972 subjects to identify early the subjects suffering from DCM and to find which features could be selected as predictors for a correct diagnosis. By using principal component analysis and stepwise regression, we reduced the original parameters used as inputs for a series of classification and regression trees (CART). The highest accuracy of 97% and Area Under the Curve (AUC) of 95% were achieved using the ratio between low frequency and high frequency (LF/HF), sex and left ventricular ejection fraction (LVEF) parameters as inputs of the classifier.
- Published
- 2020
50. Is age an additional factor in the treatment of elderly patients with glioblastoma? A new stratification model: an Italian Multicenter Study
- Author
-
Diego Garbossa, Fabio Cofano, Tamara Ius, Alessandro Olivi, Roberto Altieri, Filippo Flavio Angileri, Pier Paolo Panciani, Teresa Somma, Alessandro D'Elia, Fabrizio Pignotti, Giuseppe La Rocca, Giuseppe Barbagallo, Francesco Maiuri, Miriam Isola, Antonino Germanò, Vincenzo Esposito, Giovanni Sabatino, Marco Maria Fontanella, Giannantonio Spena, Francesco Certo, Paolo Cappabianca, Miran Skrap, and Giuseppe Maria Della Pepa
- Subjects
Oncology ,OS = overall survival ,Multivariate analysis ,classification and regression tree ,glioblastoma surgery ,Settore MED/27 - NEUROCHIRURGIA ,Neurosurgical Procedures ,030218 nuclear medicine & medical imaging ,0302 clinical medicine ,CART model ,decision tree diagram ,elderly ,extent of resection ,prognostic score ,Medicine ,GBM = glioblastoma ,EOR = extent of resection ,PFS = progression-free survival ,education.field_of_study ,Brain Neoplasms ,Hazard ratio ,CART = classification and regression tree ,CCI = Charlson Comorbidity Index ,EGBM = elderly GBM ,HR = hazard ratio ,KPS = Karnofsky Performance Scale ,RHR = relative HR ,General Medicine ,Prognosis ,Treatment Outcome ,Italy ,Radiological weapon ,Cart ,medicine.medical_specialty ,Population ,03 medical and health sciences ,Internal medicine ,Humans ,education ,Survival analysis ,Aged ,Retrospective Studies ,business.industry ,Univariate ,medicine.disease ,Surgery ,Neurology (clinical) ,business ,Glioblastoma ,030217 neurology & neurosurgery - Abstract
OBJECTIVE: Approximately half of glioblastoma (GBM) cases develop in geriatric patients, and this trend is destined to increase with the aging of the population. The optimal strategy for management of GBM in elderly patients remains controversial. The aim of this study was to assess the role of surgery in the elderly (≥ 65 years old) based on clinical, molecular, and imaging data routinely available in neurosurgical departments and to assess a prognostic survival score that could be helpful in stratifying the prognosis for elderly GBM patients. METHODS: Clinical, radiological, surgical, and molecular data were retrospectively analyzed in 322 patients with GBM from 9 neurosurgical centers. Univariate and multivariate analyses were performed to identify predictors of survival. A random forest approach (classification and regression tree [CART] analysis) was utilized to create the prognostic survival score. RESULTS: Survival analysis showed that overall survival (OS) was influenced by age as a continuous variable (p = 0.018), MGMT (p = 0.012), extent of resection (EOR; p = 0.002), and preoperative tumor growth pattern (evaluated with the preoperative T1/T2 MRI index; p = 0.002). CART analysis was used to create the prognostic survival score, forming six different survival groups on the basis of tumor volumetric, surgical, and molecular features. Terminal nodes with similar hazard ratios were grouped together to form a final diagram composed of five classes with different OSs (p < 0.0001). EOR was the most robust influencing factor in the algorithm hierarchy, while age appeared at the third node of the CART algorithm. The ability of the prognostic survival score to predict death was determined by a Harrell’s c-index of 0.75 (95% CI 0.76–0.81). CONCLUSIONS: The CART algorithm provided a promising, thorough, and new clinical prognostic survival score for elderly surgical patients with GBM. The prognostic survival score can be useful to stratify survival risk in elderly GBM patients with different surgical, radiological, and molecular profiles, thus assisting physicians in daily clinical management. The preliminary model, however, requires validation with future prospective investigations. Practical recommendations for clinicians/surgeons would strengthen the quality of the study; e.g., surgery can be considered as a first therapeutic option in the workflow of elderly patients with GBM, especially when the preoperative estimated EOR is greater than 80%.
- Published
- 2020
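Harrell's c-index, the discrimination measure cited in this abstract, is the fraction of comparable patient pairs in which the patient with the higher predicted risk also had the shorter observed survival. The Python sketch below is a minimal version for right-censored data. The function name `harrell_c` and the toy data are hypothetical, tied survival times are simply skipped, and this is not the software or data used in the study.

```python
from typing import List


def harrell_c(times: List[float], events: List[int], risks: List[float]) -> float:
    """Harrell's concordance index for right-censored survival data.

    A pair (i, j) is comparable when subject i had the event (events[i] == 1)
    and a strictly shorter follow-up time than subject j. The pair is
    concordant when subject i also has the higher risk score; ties in the
    risk score count as half-concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    return concordant / comparable


# Hypothetical toy data: survival in months, event flag (1 = death, 0 = censored),
# and a risk score where higher means worse predicted prognosis.
times = [5, 8, 12, 20, 25]
events = [1, 1, 0, 1, 0]
risks = [0.9, 0.4, 0.6, 0.3, 0.1]

c_index = harrell_c(times, events, risks)
print(f"Harrell's c-index = {c_index:.3f}")
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of patients by risk.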
Discovery Service for Jio Institute Digital Library