This thesis by Jingxuan Wang investigates the application of machine learning and deep learning techniques to predict the disappearance of lung nodules using real-world imaging data. Chapter 1 introduces the research background and provides an overview of the thesis structure. Chapter 2 presents a practical, step-by-step AI workflow for working with a lung nodule dataset, offering coding examples and guidance for researchers, clinicians, and technicians.
Chapters 3 through 5 develop predictive models for the disappearance of indeterminate pulmonary nodules (IPNs). Chapter 3 explores machine learning approaches using demographic and radiological features from the NELSON dataset. Among the models tested, a random forest achieved the best performance with an AUC of 0.865. Feature importance analysis identified volume, maximum diameter, and minimum diameter as the most influential variables.
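For readers less familiar with the metric, the AUC values quoted in this summary can be read as the probability that a model assigns a higher score to a randomly chosen nodule that disappeared than to a randomly chosen nodule that persisted. The following is a minimal sketch of that definition in plain Python. It is not the evaluation code used in the thesis; the function name `roc_auc` and the toy labels and scores are hypothetical and serve only as illustration.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (not from NELSON or ImaLife):
# label 1 = nodule disappeared, label 0 = nodule persisted.
toy_labels = [1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]

toy_auc = roc_auc(toy_labels, toy_scores)
print(f"AUC = {toy_auc:.3f}")
```

In this toy example, 11 of the 12 positive-negative pairs are ordered correctly, so the AUC is about 0.917. The pairwise loop is quadratic in the number of cases and is meant for clarity; on a real dataset a rank-based library routine would normally be used instead.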
Chapter 4 focuses on deep learning models trained with imaging and non-imaging features from the ImaLife dataset. Results showed that image-only models performed comparably to those integrating demographic information, suggesting limited added value from non-imaging data. Explainability tools confirmed that imaging features were the primary drivers of model performance.
Chapter 5 introduces a multi-view deep learning model that incorporates multiple spatial perspectives of new IPNs. This model outperformed all single-view approaches on the NELSON dataset, achieving an AUC of 0.81. Explainability heatmaps further highlighted the most predictive image regions.
Finally, Chapter 6 summarizes the findings, discusses their clinical relevance, and outlines future research directions for advancing predictive modeling in pulmonary imaging.