Speaker

Zhichao Yang

Abstract

The integration of Artificial Intelligence (AI) into healthcare promises unprecedented improvements in patient care, yet its full potential, especially in precision health, remains unrealized because of significant challenges in transforming real-world data into real-world evidence. This dissertation explores the development and application of clinical foundation models (FMs), specifically Clinical Language Models (CLaMs) and Foundation Models for Electronic Health Records (FEHRs), which are pretrained on extensive healthcare data and expert-curated knowledge graphs. The research presents innovations in three key areas:

Accurate information extraction from clinical notes: We analyzed the performance of baseline CLaMs on the task of extracting diagnostic code information from clinical notes. We identified three key issues in these CLaMs: imprecision in extracting rare diseases due to a lack of training data, difficulty recognizing synonyms due to the models' inadequate medical knowledge, and an inability to handle extended text or notes from multiple patient visits. To address these issues, we developed a generative, knowledge-injected, prompt-based fine-tuned Mamba model that achieves state-of-the-art accuracy. Higher accuracy in clinical information extraction allows clinicians to make quicker and more accurate decisions, enhancing the efficiency and effectiveness of patient care.

Enhanced quality of patient health assessments: Inferring a clinical diagnosis to generate an assessment is a crucial step during the patient encounter. However, there is limited research on generating clinical diagnoses in a free-text format. Hence, we propose a new task of generating full-length patient health assessments. We applied CLaMs to this task and found that they tend to generate factually incorrect responses. To improve the quality of the generated assessments, we combined the CLaM with a medical knowledge graph. By reducing the incidence of misleading information generated during the assessment process, our CLaM supports clinicians in making better-informed decisions.

Predictive modeling of complex disease interrelations: We developed TransformEHR, a generative transformer FEHR pretrained on a vast dataset of 6.5 million patient electronic health records. TransformEHR excels at predicting complex interrelations among diseases. Its high performance in predicting intentional self-harm shows its potential for building effective clinical intervention systems. TransformEHR is also generalizable and can be easily fine-tuned for clinical prediction tasks with limited data.

By training advanced generative clinical FMs on large-scale healthcare data, this dissertation demonstrates AI's role in enhancing precision health for more personalized and effective healthcare solutions. The findings underscore the potential of AI to transform medical data analysis and patient care, setting a path towards a future where healthcare is increasingly driven by intelligent and automated systems to support healthcare providers.

Hybrid event posted in PhD Thesis Defense for Faculty and Alumni