Machine Learning: Multi-site Evidence-based Best Practice Discovery
This study establishes interoperability among electronic medical records from 737 healthcare sites and applies machine learning to best practice discovery. A novel mapping algorithm is designed to disambiguate free-text entries and to link their content to structured medical concepts in a unique and unified way, despite the extreme variation that can occur in clinical diagnosis documentation. Concept mapping also reduces redundancy. A SNOMED-CT graph database is created to allow rapid data access and queries, and the integrated data can be accessed through a secured web-based portal. A machine learning classification model (DAMIP) is then designed to uncover discriminatory characteristics that can predict the quality of treatment outcomes. We demonstrate system usability by analyzing Type II diabetic patients among the 2.7 million patients in the integrated data. DAMIP establishes a classification rule on a training set that achieves greater than 80% blind predictive accuracy on an independent set of patients. Including features obtained from structured concept mapping improves the predictive accuracy to over 88%. The results facilitate evidence-based treatment and optimization of site performance through best practice dissemination and knowledge transfer.
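To illustrate the idea of linking varied free-text diagnosis entries to a single structured concept, the following is a minimal sketch and not the mapping algorithm developed in this study. It assumes a small hand-built synonym table and abbreviation list. The names `SYNONYMS`, `ABBREVIATIONS`, `normalize`, and `map_entries` are hypothetical, and the concept identifiers are placeholders rather than real SNOMED-CT codes.

```python
import re

# Hypothetical synonym table: normalized phrase -> placeholder concept ID.
# The IDs are illustrative placeholders, not real SNOMED-CT codes.
SYNONYMS = {
    "type 2 diabetes mellitus": "CONCEPT_T2DM",
    "diabetes mellitus type 2": "CONCEPT_T2DM",
    "hypertension": "CONCEPT_HTN",
    "high blood pressure": "CONCEPT_HTN",
}

# Hypothetical abbreviation expansions, applied token by token.
ABBREVIATIONS = {
    "t2dm": "type 2 diabetes mellitus",
    "dm": "diabetes mellitus",
    "htn": "hypertension",
    "ii": "2",
}


def normalize(entry: str) -> str:
    """Lowercase, drop punctuation, and expand known abbreviations."""
    tokens = re.findall(r"[a-z0-9]+", entry.lower())
    expanded = [ABBREVIATIONS.get(tok, tok) for tok in tokens]
    return " ".join(expanded)


def map_entries(entries):
    """Group free-text entries by concept ID; collect entries with no match."""
    mapped, unmapped = {}, []
    for entry in entries:
        concept = SYNONYMS.get(normalize(entry))
        if concept is None:
            unmapped.append(entry)
        else:
            mapped.setdefault(concept, []).append(entry)
    return mapped, unmapped


if __name__ == "__main__":
    notes = ["Type II Diabetes Mellitus", "T2DM", "DM, type 2",
             "HTN", "High blood pressure.", "chest pain"]
    mapped, unmapped = map_entries(notes)
    for concept, variants in mapped.items():
        print(concept, "<-", variants)
    print("unmapped:", unmapped)
```

In this sketch, several spellings of the same diagnosis collapse onto one concept identifier, which is the sense in which concept mapping reduces redundancy. Entries with no match are kept separately rather than discarded, so they remain available for review.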