ObjectiveThe width of maize stalks is an important indicator affecting the lodging resistance of maize. The measurement of maize stalk width has many problems, such as cumbersome manual collection process and large errors in the accuracy of automatic equipment collection and recognition, and it is of great application value to study a method for in-situ detection and high-precision identification of maize stalk width.MethodsThe ZED2i binocular camera was used and fixed in the field to obtain real-time pictures from the left and right sides of maize stalks together. The picture acquisition system was based on the NVIDIA Jetson TX2 NX development board, which could achieve timed shooting of both sides view of the maize by setting up the program. A total of maize original images were collected and a dataset was established. In order to observe more features in the target area from the image and provide assistance to improve model training generalization ability, the original images were processed by five processing methods: image saturation, brightness, contrast, sharpness and horizontal flipping, and the dataset was expanded to 3500 images. YOLOv8 was used as the original model for identifying maize stalks from a complex background. The coordinate attention (CA) attention mechanism can bring huge gains to downstream tasks on the basis of lightweight networks, so that the attention block can capture long-distance relationships in one direction while retaining spatial information in the other direction, so that the position information can be saved in the generated attention map to focus on the area of interest and help the network locate the target better and more accurately. By adding the CA module multiple times, the CA module was fused with the C2f module in the original Backbone, and the Bottleneck in the original C2f module was replaced by the CA module, and the C2fCA network module was redesigned. Replacing the loss function Efficient IoU Loss(EIoU) splits the loss term of the aspect ratio into the difference between the predicted width and height and the width and height of the minimum outer frame, which accelerated the convergence of the prediction box, improved the regression accuracy of the prediction box, and further improved the recognition accuracy of maize stalks. The binocular camera was then calibrated so that the left and right cameras were on the same three-dimensional plane. Then the three-dimensional reconstruction of maize stalks, and the matching of left and right cameras recognition frames was realized through the algorithm, first determine whether the detection number of recognition frames in the two images was equal, if not, re-enter the binocular image. If they were equal, continue to judge the coordinate information of the left and right images, the width and height of the bounding box, and determine whether the difference was less than the given Ta. If greater than the given Ta, the image was re-imported; If it was less than the given Ta, the confidence level of the recognition frame of the image was determined whether it was less than the given Tb. If greater than the given Tb, the image is re-imported; If it is less than the given Tb, it indicates that the recognition frame is the same maize identified in the left and right images. If the above conditions were met, the corresponding point matching in the binocular image was completed. After the three-dimensional reconstruction of the binocular image, the three-dimensional coordinates (Ax, Ay, Az) and (Bx, By, Bz) in the upper left and upper right corners of the recognition box under the world coordinate system were obtained, and the distance between the two points was the width of the maize stalk. Finally, a comparative analysis was conducted among the improved YOLOv8 model, the original YOLOv8 model, faster region convolutional neural networks (Faster R-CNN), and single shot multiBox detector (SSD)to verify the recognition accuracy and recognition accuracy of the model.Results and DiscussionsThe precision rate (P)、recall rate (R)、average accuracy mAP0.5、average accuracy mAP0.5:0.95 of the improved YOLOv8 model reached 96.8%、94.1%、96.6% and 77.0%. Compared with YOLOv7, increased by 1.3%、1.3%、1.0% and 11.6%, compared with YOLOv5, increased by 1.8%、2.1%、1.2% and 15.8%, compared with Faster R-CNN, increased by 31.1%、40.3%、46.2%、and 37.6%, and compared with SSD, increased by 20.6%、23.8%、20.9% and 20.1%, respectively. Respectively, and the linear regression coefficient of determination R2, root mean square error RMSE and mean absolute error MAE were 0.373, 0.265 cm and 0.244 cm, respectively. The method proposed in the research can meet the requirements of actual production for the measurement accuracy of maize stalk width.ConclusionsIn this study, the in-situ recognition method of maize stalk width based on the improved YOLOv8 model can realize the accurate in-situ identification of maize stalks, which solves the problems of time-consuming and laborious manual measurement and poor machine vision recognition accuracy, and provides a theoretical basis for practical production applications.