Published on Faculty of Medicine News
In the first published guidelines for responsible machine learning in healthcare, experts from around the world – including faculty at U of T and the Vector Institute – are calling for interdisciplinary teams as a starting point.
“The majority of [machine learning] solutions are currently being developed in silos, away from the real-world clinical problems and settings that these [machine learning] models will actually impact,” says Anna Goldenberg, an assistant professor of computer science at U of T and associate research director of health at the Vector Institute.
“Our guidelines provide a framework within which many issues stemming from the complexity of adopting [machine learning] in health care in particular can be avoided.”
Goldenberg is senior author of “Do no harm: a roadmap for responsible machine learning for health care,” published in Nature Medicine this week. She is also senior scientist in genetics and genome biology at the Hospital for Sick Children, and co-chair of Artificial Intelligence in Medicine for Kids at the hospital.
The paper recommends that deployments of machine learning in health care involve interdisciplinary teams, including knowledge experts such as clinicians and machine-learning researchers. Decision-makers such as hospital administrators and regulatory agencies should also be involved, as well as users of machine learning, such as nurses, physicians, patients, and the friends and family of those affected.
“Health care is not immune to pernicious bias. The health data on which algorithms are trained are likely to be influenced by many facets of social inequality, including bias toward those who contribute the most data,” the paper states.
Co-author Marzyeh Ghassemi, assistant professor in the departments of computer science and medicine, points out that machine learning work can be presented as if “a model on its own is a solution – and most problems in human health are not really solvable by a model.”
To successfully identify a solution to a problem, researchers must recognize that health care delivery is a process, not a fixed point.
“You have to engage in the fact that health care is a process, it’s not a static data set that you can pull once, train a model on, and deploy,” says Ghassemi, who also holds the CIFAR AI Chair at the Vector Institute.
“It’s an ongoing process where labels and definitions of clinical conditions can and do change. Populations can shift, treatments and different locations for different groups can vary. I think there is a lot of careful thought that needs to go into deployable solutions, which is very separate from creating an interesting machine learning model.”
A machine learning model can be promising from a technical perspective; however, she says a wider set of objectives must be met for a solution to ultimately succeed.
“We tried to focus on things you might not think about initially: choosing the right problems, making sure the solution is useful, really rigorously thinking through the ethical implications of deployment, and evaluation. Evaluation is particularly challenging because you have to thoughtfully report your results, and then think through the caveats for responsible deployments.”
Ghassemi says it’s important to think through the ethical impact of any machine learning that’s developed, though a developer’s approach will vary depending on their background.
“In those with a really strong technical background, what I often try to emphasize is the thoughtful reporting of results, and the ethical implications of a deployment,” she says. “In a technical setting, often we already emphasize really rigorous evaluation and choosing an appropriate problem.”
That emphasis can shift with a developer’s background.
“If somebody has a more clinical background, and already lives and breathes the ethical implications of what they’re doing, I would emphasize the other facets,” she says. “Especially with the availability of downloadable models, the goal should be to ensure that the technical solution you come up with is useful across different patients, and that it’s possible to generalize it to your setting and problem.”