Crop pests and diseases are a growing threat to global food security, and manual diagnosis remains a serious bottleneck for disease recognition in modern agriculture. With the development of deep learning, Convolutional Neural Network (CNN) models have opened a new way to monitor and control crop diseases. However, the complex real-world environment of the field poses a great challenge to general recognition models, because most leaf disease images are taken in the laboratory against a single, uniform background. In this study, an improved MobileNet-V2 was proposed to recognize crop leaf diseases in the field, with its parameters optimized for higher accuracy under complex backgrounds. The specific procedures were as follows. Firstly, an image dataset was collected in the field for disease classification, covering 11 kinds of diseased leaves and 4 kinds of healthy leaves from four crops. A series of augmentation operations, including random brightness changes and added noise, was then applied to the disease images. Secondly, a coordinate attention mechanism was added to layers 3-18 of the basic MobileNet-V2 model, so that the Region of Interest (ROI) could be effectively positioned on the disease regions in the pixel coordinate system, helping the network separate foreground targets from the background. Since disease spots vary in size, relying only on high-level features easily misses fine details of the lesions. A feature pyramid module was therefore added to the model for multi-scale feature fusion: low-level features were combined with high-level features to provide richer target information and better recognition. Specifically, the 7×7 feature map was up-sampled to 14×14 and fused with the feature map of the same size. Finally, the unnecessary classification layer was removed and group convolution was adopted to reduce the parameter memory of the improved model. Compared with the original model, the classification accuracy of the improved model rose by 2.91 percentage points with only a small increase in parameter memory, indicating superior performance. The number of up-sampling operations was markedly reduced to avoid feature overlap, and all evaluation indicators improved over the baseline. The improved model also distinguished similar target features and differing lesion areas in finer detail. Its recognition accuracy was 0.65 percentage points higher than that of the EfficientNet-b0 CNN model while using less than half as many parameters, and it required far fewer parameters than the classical ResNet-50 architecture, making it suitable for mobile terminals. Consequently, the improved model can be expected to identify crop leaf diseases more reliably under complex backgrounds, with more stable convergence and less parameter memory. These findings provide solid theoretical support for porting the new CNN model to mobile terminals for disease classification.
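
To make the coordinate attention step concrete, the following PyTorch sketch (not the authors' code) implements a standard coordinate attention module in the spirit of Hou et al. (2021) and wraps it around blocks 3-18 of torchvision's MobileNet-V2. The reduction ratio, torchvision's block indexing, and the `out_channels` attribute lookup are illustrative assumptions, not details confirmed by the abstract.

```python
import torch
import torch.nn as nn
from torchvision.models import mobilenet_v2

class CoordinateAttention(nn.Module):
    """Standard coordinate attention: two 1-D global pools keep positional
    information along height and width, so the attention map can localize
    lesion regions instead of collapsing them to a single scalar."""
    def __init__(self, channels: int, reduction: int = 32):
        super().__init__()
        mid = max(8, channels // reduction)            # bottleneck width (assumed ratio)
        self.pool_h = nn.AdaptiveAvgPool2d((None, 1))  # pool over width  -> (N, C, H, 1)
        self.pool_w = nn.AdaptiveAvgPool2d((1, None))  # pool over height -> (N, C, 1, W)
        self.conv1 = nn.Conv2d(channels, mid, kernel_size=1)
        self.bn1 = nn.BatchNorm2d(mid)
        self.act = nn.Hardswish()
        self.conv_h = nn.Conv2d(mid, channels, kernel_size=1)
        self.conv_w = nn.Conv2d(mid, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        n, c, h, w = x.shape
        x_h = self.pool_h(x)                           # (N, C, H, 1)
        x_w = self.pool_w(x).permute(0, 1, 3, 2)       # (N, C, W, 1), aligned with x_h
        y = self.act(self.bn1(self.conv1(torch.cat([x_h, x_w], dim=2))))
        y_h, y_w = torch.split(y, [h, w], dim=2)
        a_h = torch.sigmoid(self.conv_h(y_h))                      # (N, C, H, 1)
        a_w = torch.sigmoid(self.conv_w(y_w.permute(0, 1, 3, 2)))  # (N, C, 1, W)
        return x * a_h * a_w                           # re-weight every pixel

# Attach attention after blocks 3-18 of torchvision's MobileNet-V2
# (torchvision's blocks expose an `out_channels` attribute).
model = mobilenet_v2(weights=None)
for i in range(3, 19):
    block = model.features[i]
    model.features[i] = nn.Sequential(block, CoordinateAttention(block.out_channels))
```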
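
Likewise, a minimal sketch of the multi-scale fusion and the grouped-convolution head: the 7×7 high-level map is up-sampled to 14×14 and fused with the same-sized mid-level map, as the abstract describes. The channel widths (96 at 14×14 and 1280 at 7×7, following torchvision's MobileNet-V2), the additive fusion, the number of groups, and the class count of 15 (11 diseased + 4 healthy) are assumptions made for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FusionHead(nn.Module):
    """Feature-pyramid style head: fuse a 14x14 mid-level map with the
    up-sampled 7x7 high-level map, refine with a grouped convolution
    (fewer parameters than a dense layer), then pool and classify."""
    def __init__(self, c_mid: int = 96, c_high: int = 1280,
                 num_classes: int = 15, groups: int = 8):
        super().__init__()
        self.lateral = nn.Conv2d(c_mid, c_high, kernel_size=1)  # match channel widths
        self.refine = nn.Sequential(
            nn.Conv2d(c_high, c_high, kernel_size=3, padding=1, groups=groups),
            nn.BatchNorm2d(c_high),
            nn.Hardswish(),
        )
        self.fc = nn.Linear(c_high, num_classes)

    def forward(self, f_mid: torch.Tensor, f_high: torch.Tensor) -> torch.Tensor:
        # f_mid: (N, 96, 14, 14); f_high: (N, 1280, 7, 7)
        up = F.interpolate(f_high, size=f_mid.shape[-2:],
                           mode="bilinear", align_corners=False)
        fused = self.refine(self.lateral(f_mid) + up)  # same-size fusion
        return self.fc(fused.mean(dim=(2, 3)))         # global average pool + classify

# Shape check with dummy feature maps:
head = FusionHead()
logits = head(torch.randn(1, 96, 14, 14), torch.randn(1, 1280, 7, 7))
print(logits.shape)  # torch.Size([1, 15])
```

A grouped 3×3 convolution with 8 groups uses one eighth of the parameters of its dense counterpart, which is one plausible reading of how the improved model keeps parameter memory low enough for mobile deployment.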