A member of the RWA Team, Stephen Lavery addresses the challenges of leveraging the capabilities of machine learning and outlines a practical example of how it can be used to deliver better patient care in pharmacy.
Interest in machine learning and its applications has gained considerable traction in recent years (Figure 1). But finding a practical business use case for machine learning can be difficult, and projects often result in wasted effort. Models don’t always work out as you had envisaged, and the interpretation and communication of model results can be challenging. Success requires the right people and, ultimately, buy-in from decision makers who can see the value in such predictive models. Moreover, as there is already lots of low-hanging fruit and day-to-day analytics to be performed, machine learning projects can seem far-fetched and a waste of valuable company resources.
Figure 1: Google trends: Machine learning interest over time
Taking all this into consideration, I recently undertook a machine learning project in the hope of developing something that could deliver real business value. The goal was to create a diabetes prediction model using pharmacy data, an approach that (to my knowledge) hadn’t been tried before. Diabetes prediction models aren’t all that new. In fact, diabetes is one of the most commonly tackled healthcare problems in machine learning. This is because the risk factors (predictors) for diabetes are well understood. Indicators such as age, body mass index, cholesterol and blood pressure all contribute to one’s risk of developing the disease. But the issue with existing models is that they generally use clinical healthcare datasets. These datasets are expensive to capture and require patients to visit a hospital. Pharmacy data, on the other hand, is far more abundant and, although not as detailed as clinical datasets, could offer a more cost-effective and scalable solution for identifying at-risk diabetes patients.
The idea with this project was that risk indicators such as blood pressure and cholesterol level could be derived from the medications that a patient was taking. For example, if a patient is taking a blood pressure medication, it could be inferred that this patient suffers from high blood pressure. Using this logic, over 80 features were derived for each patient. These were then narrowed down to 9 for training the machine learning models. The training data consisted of labelled patients who were either diabetic or non-diabetic. These data points were then used to train a random forest and an artificial neural network (Figure 2). As the training data is already labelled, the models “learn” the combinations of features that result in either a diabetic or non-diabetic classification. The trained models can then be used to classify new unlabelled (undiagnosed) patients.
Figure 2: Diabetes prediction artificial neural network
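To make the feature derivation idea concrete, here is a minimal sketch in Python of how binary risk indicators might be inferred from dispensing records. The medication class names, the indicator names and the `derive_features` helper are all hypothetical and chosen purely for illustration; they are not the coding scheme or the 80+ features used in the project.

```python
from collections import defaultdict

# Hypothetical mapping from a medication class to the risk indicator it implies.
# These names are illustrative assumptions, not the project's actual scheme.
MEDICATION_CLASS_TO_INDICATOR = {
    "antihypertensive": "high_blood_pressure",
    "statin": "high_cholesterol",
}


def derive_features(dispensing_records):
    """Turn (patient_id, medication_class) rows into binary indicators per patient."""
    indicator_names = sorted(set(MEDICATION_CLASS_TO_INDICATOR.values()))
    # Every patient starts with all indicators set to 0.
    features = defaultdict(lambda: {name: 0 for name in indicator_names})
    for patient_id, medication_class in dispensing_records:
        row = features[patient_id]  # creates the row even if nothing maps
        indicator = MEDICATION_CLASS_TO_INDICATOR.get(medication_class)
        if indicator is not None:
            row[indicator] = 1
    return dict(features)


# Made-up example records: (patient_id, medication_class)
records = [
    ("p1", "antihypertensive"),
    ("p1", "statin"),
    ("p2", "statin"),
    ("p3", "antihistamine"),
]
patient_features = derive_features(records)
```

Each patient ends up with one row of 0/1 indicators, which is the kind of labelled feature table a random forest or neural network could then be trained on.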
Once the models had been trained, they were tested against a validation dataset to see how well they performed. Ultimately, the random forest outperformed the artificial neural network, achieving a classification accuracy of 77%. This is an encouraging result and points to a valuable application of machine learning to a real-world problem. Such a model could be used to identify at-risk patients, who could then be invited to their local pharmacy for further diabetes screening. This could allow patient testing at scale and could even result in earlier diagnosis, which could ultimately lead to better patient outcomes.
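For readers less familiar with the metric, classification accuracy is simply the share of validation patients whose predicted label matches their known label. The sketch below shows the calculation in plain Python; the labels and predictions are made-up values for illustration and are not the project's data or results.

```python
def classification_accuracy(true_labels, predicted_labels):
    """Return the fraction of predictions that match the known labels."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("true_labels and predicted_labels must have equal length")
    if not true_labels:
        raise ValueError("cannot compute accuracy on an empty validation set")
    correct = sum(
        1 for truth, guess in zip(true_labels, predicted_labels) if truth == guess
    )
    return correct / len(true_labels)


# Made-up validation set: 1 = diabetic, 0 = non-diabetic
validation_labels = [1, 0, 1, 1, 0, 0, 1, 0]
model_predictions = [1, 0, 0, 1, 0, 1, 1, 0]

accuracy = classification_accuracy(validation_labels, model_predictions)
```

In this toy example six of the eight predictions match, giving an accuracy of 0.75.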
With Real World Analytics, we allow pharmacies to see which patients are at high risk of diabetes. This is based on the data of diabetic patients and the other ailments that commonly accompany diabetes, such as hypertension, high cholesterol, and heart disease or stroke. Patients who have two or more of these ailments are deemed high risk for developing diabetes.
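The "two or more ailments" rule can be expressed in a few lines of Python. This is only a sketch of the rule as described above: the ailment labels, the `is_high_risk` function and the way ailments are recorded per patient are assumptions made for illustration, not how Real World Analytics implements it.

```python
# Ailments associated with diabetes, as listed in the text.
# The label strings themselves are hypothetical.
RELATED_AILMENTS = ("hypertension", "high_cholesterol", "heart_disease_or_stroke")


def is_high_risk(patient_ailments, threshold=2):
    """Flag a patient as high risk if they have `threshold` or more related ailments."""
    matching = sum(1 for ailment in RELATED_AILMENTS if ailment in patient_ailments)
    return matching >= threshold


# Made-up patients, each described by a set of ailments
patients = {
    "p1": {"hypertension", "high_cholesterol"},
    "p2": {"hypertension"},
    "p3": {"hypertension", "asthma"},
}
high_risk_patients = sorted(
    patient_id for patient_id, ailments in patients.items() if is_high_risk(ailments)
)
```

Here only `p1` is flagged, since ailments outside the related list (such as asthma) do not count towards the threshold.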