Modifiable risk factors and interventions for individual patients: The role of machine learning

The current standard in medical science is to take results obtained from groups of patients, including those published in the scientific literature, and assume that they apply fully to the patient in front of us. In other words, we assume that group-level results can solve the problems of individuals, even though individuals are not groups. Take Mary Smith, for instance: she is a unique individual, but when we apply results from traditional statistical models we are forced to treat her as a member of a group described by a mean age and a gender percentage.


Machine learning and individual patient predictions

Unlike traditional statistical analyses, which focus on risk factors for groups of patients, machine learning allows outcomes to be predicted for individual patients. Rather than learning that uncontrolled diabetes increases the average risk of postoperative infection for the average patient, we obtain a risk estimate for a specific patient, Mary Smith. This prediction takes into account all the interacting factors that make Mary the patient she is: her age, gender, comorbidities, previous procedures, and any other features that might be associated with her outcomes.

Importantly, predicting outcomes for individual patients is also more sophisticated than using risk stratification scales. With a stratification scale, patients are first placed into categories of patients who are, on average, similar to them. For example, based on a certain number of characteristics, each patient is assigned to one of four categories. While this approach is better than simply providing a risk estimate for a single component such as the presence of diabetes, it still treats the patient as the average of that smaller group.
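To make the contrast concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the feature names, the coefficients, the point thresholds, and the second patient (John) are invented, and a hand-written logistic function stands in for whatever trained model would be used in practice. It shows two patients who land in the same one-of-four category yet receive clearly different individual estimates.

```python
import math

# Hypothetical coefficients, invented for illustration; not from any fitted model.
INTERCEPT = -4.0
COEFS = {"age_decades": 0.30, "uncontrolled_diabetes": 0.90, "prior_procedures": 0.25}


def individual_risk(patient):
    """Per-patient probability computed from the patient's own feature values."""
    z = INTERCEPT + sum(COEFS[name] * patient[name] for name in COEFS)
    return 1.0 / (1.0 + math.exp(-z))


def risk_category(patient):
    """Crude stratification scale: count risk points, map to one of four categories."""
    points = (
        int(patient["age_decades"] >= 7)
        + int(patient["uncontrolled_diabetes"] == 1)
        + int(patient["prior_procedures"] >= 2)
    )
    return ["low", "moderate", "high", "very high"][points]


# Two hypothetical patients who differ in age and surgical history.
mary = {"age_decades": 6.8, "uncontrolled_diabetes": 1, "prior_procedures": 1}
john = {"age_decades": 4.0, "uncontrolled_diabetes": 1, "prior_procedures": 0}

for name, patient in [("Mary", mary), ("John", john)]:
    print(f"{name}: category={risk_category(patient)}, "
          f"individual risk={individual_risk(patient):.2f}")
```

Both patients fall into the same category because the scale only counts thresholds crossed, while the individual estimate uses the actual values of every feature.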

Addressing the "black box" problem

Despite the advantage of generating individual predictions, one of the main criticisms of machine learning is that it is a "black box", meaning that it is not apparent which specific factors led to the patient-specific prediction. Hence, a patient with a particular group of clinical characteristics might be more likely to have a complication, but the machine learning model will not tell you which attributes for that specific patient were responsible for that prediction.

Recent developments in machine learning have addressed this limitation. One such method, LIME (Local Interpretable Model-agnostic Explanations), makes it possible to generate a graphic that displays the main factors contributing toward a given outcome prediction for a specific patient.


In such a graphic, risk factors shown in red increase the risk, while those shown in green reduce the risk for the patient under evaluation.
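The idea behind this kind of patient-specific explanation can be sketched in a few lines of Python. The version below is a simplified, hand-rolled illustration of the local-surrogate idea, not the LIME library itself: it perturbs one patient's features, asks the model for a prediction on each perturbed copy, weights those copies by how close they are to the original patient, and fits a weighted linear model whose coefficients indicate which factors push the prediction up or down. The `black_box` function, its coefficients, the feature names, and the perturbation scales are all hypothetical; an established implementation would normally be preferred in practice.

```python
import math
import random


def black_box(x):
    """Hypothetical stand-in for a trained model; coefficients are invented."""
    age_decades, diabetes, smoker, albumin = x
    z = (-2.0 + 0.3 * age_decades + 0.9 * diabetes + 0.6 * smoker
         - 0.8 * albumin + 0.5 * diabetes * smoker)
    return 1.0 / (1.0 + math.exp(-z))


def solve(a, b):
    """Solve the linear system a w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    w = [0.0] * n
    for i in reversed(range(n)):
        w[i] = (m[i][n] - sum(m[i][j] * w[j] for j in range(i + 1, n))) / m[i][i]
    return w


def local_explanation(model, instance, scales, n_samples=2000, kernel_width=1.0, seed=0):
    """Fit a proximity-weighted linear surrogate around one patient.

    Returns one coefficient per feature, expressed per `scales[j]` units of change.
    """
    rng = random.Random(seed)
    d = len(instance)
    xtx = [[0.0] * (d + 1) for _ in range(d + 1)]  # weighted X^T X
    xty = [0.0] * (d + 1)                          # weighted X^T y
    for _ in range(n_samples):
        offsets = [rng.gauss(0.0, 1.0) for _ in range(d)]
        perturbed = [instance[j] + offsets[j] * scales[j] for j in range(d)]
        # Samples closer to the patient get exponentially more weight.
        weight = math.exp(-sum(v * v for v in offsets) / kernel_width ** 2)
        row = [1.0] + offsets  # leading 1.0 is the intercept column
        y = model(perturbed)
        for i in range(d + 1):
            xty[i] += weight * row[i] * y
            for j in range(d + 1):
                xtx[i][j] += weight * row[i] * row[j]
    return solve(xtx, xty)[1:]  # drop the intercept


names = ["age_decades", "uncontrolled_diabetes", "smoker", "albumin_g_dl"]
mary = [6.8, 1.0, 0.0, 3.5]
scales = [1.0, 0.5, 0.5, 0.5]  # assumed perturbation size for each feature

weights = local_explanation(black_box, mary, scales)
for name, w in sorted(zip(names, weights), key=lambda pair: -abs(pair[1])):
    direction = "increases risk" if w > 0 else "reduces risk"
    print(f"{name:>22}: {w:+.3f} ({direction})")
```

Positive coefficients correspond to the factors that would be drawn in red and negative ones to those drawn in green. Note that the binary features are perturbed as continuous values here purely to keep the sketch short.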

Focusing on modifiable factors and running "what if" scenarios during shared decision-making sessions

Although predicting the risk of an outcome and identifying patient-specific risk factors are certainly interesting, machine learning becomes clinically meaningful only when clinicians and patients focus on modifiable risk factors. The workflow to achieve this goal while using machine learning models is as follows:


  • The clinician uses patient data both to generate a prediction for a given outcome, say a postoperative complication, and to detect the factors that might be increasing or mitigating the risk.
  • The clinician and the patient discuss a plan to act upon any modifiable risk factors while "what if" simulations are run in real time. In other words, for each hypothetical scenario in which a risk factor is reduced or a protective factor is enhanced, the model compares the predicted risk with and without the intervention.
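
The "what if" step above can be sketched as follows. This is a minimal illustration under stated assumptions: `predicted_risk` is a hypothetical stand-in for a trained model, and the feature names, coefficients, and the set of factors treated as modifiable are invented. The difference it reports is a change in the model's prediction, which is based on associations in the data and is not a guaranteed effect of the intervention.

```python
import math


def predicted_risk(patient):
    """Hypothetical stand-in for a trained model's per-patient prediction."""
    z = (-2.0
         + 0.3 * patient["age_decades"]
         + 0.9 * patient["uncontrolled_diabetes"]
         + 0.6 * patient["smoker"]
         - 0.8 * patient["albumin_g_dl"]
         + 0.5 * patient["uncontrolled_diabetes"] * patient["smoker"])
    return 1.0 / (1.0 + math.exp(-z))


# Assumed for illustration: which features a clinician and patient can act upon.
MODIFIABLE = {"uncontrolled_diabetes", "smoker", "albumin_g_dl"}


def what_if(patient, changes):
    """Compare predicted risk with and without a hypothetical intervention."""
    not_modifiable = set(changes) - MODIFIABLE
    if not_modifiable:
        raise ValueError(f"not modifiable: {sorted(not_modifiable)}")
    baseline = predicted_risk(patient)
    scenario = predicted_risk({**patient, **changes})  # original record is left untouched
    return baseline, scenario, scenario - baseline


mary = {"age_decades": 6.8, "uncontrolled_diabetes": 1, "smoker": 1, "albumin_g_dl": 3.5}

scenarios = {
    "diabetes brought under control": {"uncontrolled_diabetes": 0},
    "smoking stopped": {"smoker": 0},
    "both interventions": {"uncontrolled_diabetes": 0, "smoker": 0},
}
for label, changes in scenarios.items():
    baseline, scenario, delta = what_if(mary, changes)
    print(f"{label}: {baseline:.2f} -> {scenario:.2f} (change {delta:+.2f})")
```

Restricting the simulation to an explicit list of modifiable factors keeps the conversation on changes the patient can actually make, rather than on features such as age.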
