ChatGPT and general-purpose AI count fruits in pictures surprisingly well without programming or training
- Author
Konlavach Mengsuwan, Juan C. Rivera-Palacio, and Masahiro Ryo
- Subjects
Foundation model, General-purpose AI, ChatGPT, Large language model, Large vision language model, Agriculture, Agriculture (General), S1-972, Agricultural industries, HD9000-9495
- Abstract
General-purpose artificial intelligence (AI) can facilitate agricultural digitalization because many tools do not require coding. Yet it remains unclear how well emerging general-purpose AI technologies perform object counting, a fundamental task in agricultural digitalization, compared with the current standard practice. We show that ChatGPT (GPT-4V) demonstrated moderate performance in counting coffee cherries from images, while T-Rex, a foundation model for object counting, performed with high accuracy. Testing on one hundred images, we found that ChatGPT can count cherries and that its performance improves with human feedback (R² = 0.36 without feedback and 0.46 with feedback). The T-Rex foundation model required only a few samples for training but outperformed YOLOv8, the conventional best-practice model (R² = 0.92 for T-Rex and 0.90 for YOLOv8). Obtaining results with these models took 100 times less time than with the conventional best practice. These results bring two surprises for deep learning users in applied domains: a foundation model can drastically reduce effort while achieving higher accuracy than a conventional approach, and ChatGPT can achieve relatively good performance, especially when guided with some examples and feedback. Because no coding skills are required, these tools can influence education, outreach, and real-world implementation of generative AI for supporting farmers.
- Published
- 2024
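The R² values quoted in the abstract compare model counts against reference counts. The abstract does not say which definition of R² was used, so the sketch below assumes the coefficient of determination (1 minus the ratio of residual to total sum of squares). The function name `r_squared` and the sample counts are hypothetical and serve only to illustrate the calculation; they are not data from the study.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Assumption: this is the R^2 definition behind the reported scores;
    the abstract itself does not specify it.
    """
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("observed and predicted must be non-empty and equal in length")
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: squared error between reference and model counts.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: spread of the reference counts around their mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed counts have zero variance; R^2 is undefined")
    return 1 - ss_res / ss_tot


# Hypothetical example: manual cherry counts vs. counts returned by a model.
manual_counts = [12, 30, 45, 8, 27]
model_counts = [10, 33, 40, 9, 25]
score = r_squared(manual_counts, model_counts)
print(f"R^2 = {score:.2f}")
```

Under this definition, a score near 1 means the model counts track the manual counts closely, and a score can fall below 0 when the model does worse than always predicting the mean count.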