How do sweet potato harvesting robots locate grasp points in complex environments?
Higher Education Press
Credit: Higher Education Press
Sweet potato is a globally important food crop with rich nutritional value, widely used in food processing, feed production, and the pharmaceutical industry. Hainan Province in China has a history of sweet potato cultivation spanning over 300 years and has long been one of the major sweet potato-producing areas in southern China. However, Hainan's terrain is dominated by central mountains, gradually transitioning to hills, terraces, and plains, forming a ring-shaped landform. Large-scale agricultural machinery struggles to operate in these regions, leading to long-term reliance on manual harvesting of sweet potatoes, which is inefficient and labor-intensive. The development of agricultural robots offers a promising solution to this problem. Nevertheless, existing robotic vision systems often face challenges such as unclear segmentation between sweet potatoes and backgrounds (e.g., soil, weeds, and stones) and insufficient adaptability in grasp point localization in complex harvesting environments. How to enable robots to accurately recognize sweet potato contours and find optimal grasp points in complex backgrounds has become a key issue in improving the mechanization level of sweet potato harvesting.
Associate Professor Jian Zhang and his team at the College of Mechanical and Electrical Engineering, Hainan University, have proposed SPECNet, a sweet potato contour recognition model built on the BASNet network. The model improves on BASNet through three enhancements. First, standard convolutions are replaced with dynamic convolutions, which let the network adapt its filters to the image content and better capture the diverse shapes of sweet potatoes. Second, a Haar wavelet downsampling (HWD) module retains more detail while reducing the resolution of feature maps, strengthening multi-scale feature extraction. Third, the standard squeeze-and-excitation (SE) attention mechanism is extended with an edge-guided module to form the Edge-Enhanced SE (ESE) attention mechanism, which captures and emphasizes the edge information of sweet potatoes more effectively and improves contour detection accuracy. The research has been published in Frontiers of Agricultural Science and Engineering (DOI: 10.15302/J-FASE-2025653).
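To illustrate why Haar wavelet downsampling can halve the resolution without discarding detail, the following is a minimal NumPy sketch of a single-level 2D Haar transform applied to a feature map. It is not the SPECNet implementation: the function name `haar_downsample`, the channel-first layout, and the choice to stack the four sub-bands along the channel axis are assumptions made for illustration, and any learned layer that would follow inside a real network is omitted.

```python
import numpy as np


def haar_downsample(x: np.ndarray) -> np.ndarray:
    """Single-level 2D Haar transform of a (C, H, W) feature map.

    Hypothetical helper for illustration. Returns an array of shape
    (4*C, H/2, W/2): the low-pass band followed by three detail bands.
    """
    _, h, w = x.shape
    if h % 2 or w % 2:
        raise ValueError("height and width must both be even")

    # The four pixels of every non-overlapping 2x2 block.
    tl = x[:, 0::2, 0::2]  # top-left
    tr = x[:, 0::2, 1::2]  # top-right
    bl = x[:, 1::2, 0::2]  # bottom-left
    br = x[:, 1::2, 1::2]  # bottom-right

    # Orthonormal Haar combinations (the factor 1/2 preserves total energy).
    low = (tl + tr + bl + br) / 2.0         # local average (coarse content)
    left_right = (tl - tr + bl - br) / 2.0  # left column minus right column
    top_bottom = (tl + tr - bl - br) / 2.0  # top row minus bottom row
    diagonal = (tl - tr - bl + br) / 2.0    # diagonal difference

    return np.concatenate([low, left_right, top_bottom, diagonal], axis=0)
```

Unlike max pooling or strided convolution, this transform is invertible: the spatial size is halved, but the detail that would otherwise be thrown away is moved into the extra channels, which is the property the release attributes to the HWD module.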
Experimental results show that SPECNet outperforms six mainstream comparison models, including ACCoNet, U2-Net, and BASNet. Increasing the input image resolution to 512 × 512 improves its performance further. Together, these results indicate that the model recognizes sweet potato contours against complex backgrounds significantly more accurately than existing methods.
To verify its practical value, the research team deployed the model on a sweet potato picking robot platform. The robot carries a depth camera that captures color and depth images. The model extracts the sweet potato contour, takes its centroid as the grasp point, and uses principal component analysis (PCA) to determine the grasp direction. Experiments were conducted in three environments (indoor sandy soil, field soil, and grassland), covered sweet potatoes of different shapes (slender, sturdy, and compact), and were repeated under varying illumination conditions (morning, noon, and evening). The model ran in real time at 35 frames per second, and the overall grasp success rate was 78%. Slender sweet potatoes had the highest grasp success rate, while sturdy and compact ones had lower rates because their shapes matched the manipulator poorly. Although weeds, withered leaves, and similar interference in the field can reduce recognition accuracy, the model performed stably across illumination conditions, demonstrating strong environmental adaptability.
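The centroid-plus-PCA step described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the team's code: the function name `grasp_pose_from_mask` is hypothetical, the input is assumed to be a binary segmentation mask, the centroid is taken as the mean of the mask pixels, and the returned angle is that of the long axis in image coordinates. How this axis maps to the gripper's closing direction is not specified in the release.

```python
import numpy as np


def grasp_pose_from_mask(mask: np.ndarray):
    """Estimate a grasp point and the long-axis angle from a binary mask.

    Hypothetical helper for illustration. Returns (centroid, angle_deg),
    where centroid is (x, y) in pixels and angle_deg lies in [0, 180),
    measured from the image x-axis towards the image y-axis.
    """
    ys, xs = np.nonzero(mask)
    if xs.size < 2:
        raise ValueError("mask must contain at least two foreground pixels")

    pts = np.stack([xs, ys], axis=1).astype(float)  # (N, 2) as (x, y)
    centroid = pts.mean(axis=0)                     # candidate grasp point

    # PCA: eigen-decomposition of the 2x2 covariance of the pixel coordinates.
    cov = np.cov((pts - centroid).T)
    eigvals, eigvecs = np.linalg.eigh(cov)          # eigenvalues ascending
    major = eigvecs[:, np.argmax(eigvals)]          # direction of largest spread

    # An axis has no sign, so fold the angle into [0, 180).
    angle_deg = np.degrees(np.arctan2(major[1], major[0])) % 180.0
    return centroid, angle_deg


# Usage: a synthetic, vertically elongated blob standing in for a slender tuber.
demo_mask = np.zeros((40, 20), dtype=bool)
demo_mask[5:35, 8:12] = True
demo_centroid, demo_angle = grasp_pose_from_mask(demo_mask)
```

On a real robot the pixel centroid would still need to be combined with the depth image and the camera calibration to obtain a 3D grasp position; that step is outside the scope of this sketch.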
This study provides an efficient solution for the vision system of sweet potato harvesting robots, helping to improve the mechanization level of sweet potato harvesting and reduce labor intensity. It is particularly suitable for the needs of regions with complex terrain such as Hainan.