Optimum Feature Subset for Optimizing Crop Yield Prediction Using Filter and Wrapper Approaches

Authors :
R. Bhargavi
P. S. Maya Gopal
Source :
Applied Engineering in Agriculture. 35:9-14
Publication Year :
2019
Publisher :
American Society of Agricultural and Biological Engineers (ASABE), 2019.

Abstract

In agriculture, crop yield prediction is critical. Crop yield depends on various features, which can be categorized as geographical, climatic, and biological. Geographical features consist of cultivable land in hectares, canal length to cover the cultivable land, and the number of tanks and tube wells available for irrigation. Climatic features consist of rainfall, temperature, and radiation. Biological features consist of seeds, minerals, and nutrients. In total, 15 features were considered in this study to understand the impact of each feature on paddy crop yield for all seasons of each year. To select the vital features, five filter and wrapper approaches were applied. A Multiple Linear Regression (MLR) model was used to assess the predictive accuracy of each feature selection algorithm. The RMSE, MAE, R, and RRMSE metrics were used to evaluate the performance of the feature selection algorithms. The data used for the analysis, covering more than 30 years, were drawn from secondary sources of the state Agriculture Department, Government of Tamil Nadu, India. Seventy-five percent of the data was used for training and 25% for testing. Low computational time was also considered in selecting the best feature subset. All feature selection algorithms gave similar results for the RMSE, RRMSE, R, and MAE values. The adjusted R² value was used to find the optimum feature subset despite the deviations. The evaluation on the dataset used in this work shows that total area of cultivation, number of tanks and open wells used for irrigation, length of canals used for irrigation, and average maximum temperature during the crop season are the best features for crop yield prediction in the study area. The MLR model gives 85% accuracy for the selected features with low computational time.

Keywords: Feature selection algorithm, Model validation, Multiple linear regression, Performance metrics.
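
The abstract names RMSE, MAE, R, RRMSE, and adjusted R² but does not give their formulas. The following is a minimal sketch under common textbook definitions, which are assumptions here and may differ from those used in the paper: RRMSE is taken as RMSE divided by the mean observed value, R as the Pearson correlation between observed and predicted values, and adjusted R² as 1 - (1 - R²)(n - 1)/(n - p - 1) for n samples and p selected features. All function names are hypothetical.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rrmse(y_true, y_pred):
    """Relative RMSE: RMSE divided by the mean observed value (assumed definition)."""
    mean_true = sum(y_true) / len(y_true)
    return rmse(y_true, y_pred) / mean_true


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between observed and predicted values."""
    mean_t = sum(y_true) / len(y_true)
    mean_p = sum(y_pred) / len(y_pred)
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(y_true, y_pred, n_features):
    """Adjusted R^2, penalizing the number of features in the subset."""
    n = len(y_true)
    r2 = r_squared(y_true, y_pred)
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_features - 1)
```

Because adjusted R² penalizes each additional predictor, it allows feature subsets of different sizes to be compared on the same test data, which plain R² does not.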

Details

ISSN :
19437838
Volume :
35
Database :
OpenAIRE
Journal :
Applied Engineering in Agriculture
Accession number :
edsair.doi...........8909c0fd63da82ebb5001f509011ab71
Full Text :
https://doi.org/10.13031/aea.12938