Automated extraction of domain knowledge in the dairy industry.

Authors :: Zhu, Junsheng
Lacroix, René
Wade, Kevin M.
Source :: Computers & Electronics in Agriculture. Nov2023, Vol. 214, pN.PAG-N.PAG. 1p.
Publication Year :: 2023
Abstract:
• Knowledge graph theory was applied to extract and compile information from domain textual resources.
• Deep-learning models were applied to extract entities and relationships from dairy literature.
• The use of domain dictionaries can accelerate the named entity recognition corpus preparation process.
• Natural language can be parsed into database query language by semantic parsing methods to retrieve information directly from the database.

The transition period, which runs from three weeks before calving to three weeks after calving, poses challenges for dairy cattle and farmers. Vast changes in housing, feeding, and reproduction may result in a drop in milk production and in metabolic and reproductive diseases. Moreover, most of the metabolic processes are intricately linked, and many conditions can coexist. As a result, dairy producers and their advisors have difficulty drawing concise conclusions from the many aspects and relationships involved in transition cow management. Herein, machine-learning techniques and knowledge-graph theory were explored with a view to creating a decision-support system that could provide producers and their advisors with knowledge from domain literature. Specifically, knowledge is modelled as entities and relationships in knowledge graph theory, and natural language models were developed to extract information as knowledge graphs. A dataset comprising 1,152 sentences from 20 papers was created and split into 922 sentences for training and 230 sentences for testing. Two deep learning models were then trained in sequence to extract entities and relationships, respectively. A Bi-directional Long Short-Term Memory model was applied to the entity extraction task and obtained an F1 score of 80 %. For relationship extraction, a Transformer-based model was applied but yielded a low F1 score of 23 %, so another pre-trained Transformer model with 89 % accuracy was deployed in the system instead. After the domain literature was fed into the deep-learning models, a knowledge graph of 1,576 nodes and 3,456 edges was constructed and stored in the graph database Neo4j. A semantic parsing method was then used to let users query the knowledge graph in natural language. In addition, to assess the quality of the answers derived from the knowledge built from the papers, answers were sampled and evaluated by human judgment. On average, answers scored 7.5 out of 10 and proved informative with respect to the original literature. Although the final interactive results demonstrated a high degree of visualization and scalability, this study primarily sought to demonstrate feasibility. For tailored commercial applications, further improvements could be made in knowledge graph expansion and reasoning. [ABSTRACT FROM AUTHOR]
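
The semantic parsing step described in the abstract, in which a natural-language question is turned into a database query, might be sketched roughly as follows. This is a minimal illustration and not the authors' method: the question templates, the `Entity` node label, the `name` property, and the `CAUSES` and `PREVENTS` relationship types are all hypothetical, since the abstract does not describe the graph schema or the parser.

```python
import re
from typing import NamedTuple, Optional


class CypherQuery(NamedTuple):
    text: str     # Cypher query string with a $name placeholder
    params: dict  # parameter values to send alongside the query


# Hypothetical question templates (assumption, not from the paper).
# Each maps a question pattern to a relationship type and a direction.
TEMPLATES = [
    (re.compile(r"^what causes (?P<name>.+?)\??$", re.I), "CAUSES", "incoming"),
    (re.compile(r"^what does (?P<name>.+?) cause\??$", re.I), "CAUSES", "outgoing"),
    (re.compile(r"^what prevents (?P<name>.+?)\??$", re.I), "PREVENTS", "incoming"),
]


def parse_question(question: str) -> Optional[CypherQuery]:
    """Map a question onto a parameterized Cypher query, or None if no template fits."""
    q = question.strip()
    for pattern, rel, direction in TEMPLATES:
        m = pattern.match(q)
        if m is None:
            continue
        # The entity name is passed as a query parameter, never interpolated.
        name = m.group("name").strip().lower()
        if direction == "incoming":
            text = (
                f"MATCH (a:Entity)-[:{rel}]->(b:Entity {{name: $name}}) "
                "RETURN a.name AS answer"
            )
        else:
            text = (
                f"MATCH (a:Entity {{name: $name}})-[:{rel}]->(b:Entity) "
                "RETURN b.name AS answer"
            )
        return CypherQuery(text, {"name": name})
    return None


if __name__ == "__main__":
    print(parse_question("What causes ketosis?"))
```

With a running Neo4j instance, the resulting `text` and `params` could be passed to the official Python driver's `session.run(query.text, query.params)`. Only the relationship type, which comes from the fixed template list, is interpolated into the query string; user-supplied text always travels as a parameter.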