1. A Framework for Information Retrieval and Knowledge Discovery from Online Healthcare Forums
- Author
- Sampathkumar, Hariprasad
- Abstract
Information used to assist biomedical and clinical research has largely consisted of data available in published sources like scientific papers and journals, or in clinical sources like patient health records, lab reports and discharge summaries. Information from such sources, though extensive and organized, is often not readily available due to its proprietary and/or privacy-sensitive nature. Collecting such information through clinical studies is expensive, and the information is often limited by the diversity of the people involved in the study. With the growth of online social networks, more and more people openly share their health experiences with similar patients through online healthcare forums. The data from these forum messages can act as an alternate source of the unrestricted, high-volume, highly diverse and up-to-date information needed for assisting and guiding biomedical and pharmaceutical research. However, this data is often unstructured, noisy and scattered, making it unsuitable for use in its current form. This dissertation presents an Information Retrieval and Knowledge Discovery Framework that is capable of collecting data from online healthcare forums, extracting useful information and storing it in a structured form that facilitates knowledge discovery. A Healthcare Forum Mining Ontology developed as a part of this work is used to organize and capture the semantic relationships between patient-related data like age, gender, ethnicity and habits, and health-related data like drugs, side effects, diseases and symptoms, all of which are extracted from the forum messages.
The utility of this framework is demonstrated through two applications: an Adverse Drug Reaction discovery tool that assists pharmacovigilance by extracting adverse effects of drugs from forum messages, and an ontology-based visualization tool for exploring and analyzing associations between patient-related and health-related data extracted from forum messages. [The dissertation citations contained here are published with the permission of ProQuest LLC. Further reproduction is prohibited without permission. Copies of dissertations may be obtained by telephone: 1-800-521-0600. Web page: http://www.proquest.com/en-US/products/dissertations/individuals.shtml.]
- Published
- 2016