AI Transparency Rule Has Benefits, but Is Limited, Experts Suggest
The final rule from HHS on transparency in artificial intelligence (AI) has some positive points, but it won’t solve all the challenges posed by this new technology, experts say.
“I’m glad to see that there is rulemaking that is timely,” Meghan O’Connor, JD, a partner in the Quarles & Brady law firm in Milwaukee, said in a phone interview. “So often the rulemaking process doesn’t allow for it to be timely or up to date with current security and technology practices. So I like to see that.”
The rule was developed because there is currently “very little transparency in the marketplace even for paying customers of predictive models,” Jeff Smith, MPP, deputy director of certification and testing at HHS’s Office of the National Coordinator for Health Information Technology (ONC), said in a phone interview at which a press person was present.
He noted that while predictive AI can be used in many areas of healthcare — “models that look at the chances of a patient having a heart attack, or falling in a hospital, or acquiring sepsis,” for example — “there’s very little information on how these models were designed, developed, tested, trained, and evaluated. And over the last several years, there have been really well-documented harms [from these models] that have been far-reaching, impacting millions of Americans.”
Smith gave several examples of how biases and other problems in AI predictive algorithms have led to adverse outcomes for patients. One 2019 study by Ziad Obermeyer, MD, of the University of California, Berkeley, and colleagues found that an algorithm widely used by commercial health plans to identify patients with complex health needs requiring extra help was racially biased against Black patients. Eliminating the resulting disparity, the authors estimated, would increase the percentage of Black patients who received the extra services from 17.7% to 46.5%.
“This bias arose because the algorithm was using past spending as a proxy for future spending, and as the literature bears out, Black Americans are less likely to receive care and they spend as a group much less on care than white Americans, so this use of a proxy for healthcare costs really kind of showed a pretty bright light on the ways in which this can go bad,” Smith said.
Another study in 2021 by Andrew Wong, MD, of the University of Michigan in Ann Arbor, and colleagues found that an algorithm designed by Epic, a large electronic health record (EHR) vendor, to predict which hospitalized patients would develop sepsis failed to identify two-thirds (67%) of the patients who developed the condition.
“It performed much more poorly than advertised,” Smith said. “And when you start to think at scale, you know that Epic, by their own numbers, includes medical records for 180 million individuals … That has real consequences when you think about something like sepsis.”
Under the rule, known as HTI-1, ONC “finalized two big buckets of policy and requirements,” Smith noted. The rule requires “that information on how the predictive DSIs [decision support interventions] were designed, developed, trained, evaluated, and should be used, needs to be available to users of the predictive algorithm. And then we said that risk needs to be managed for these predictive DSIs and that governance needs to play a role in how these predictive DSIs are designed and deployed.”
The intent behind some AI models is not always transparent, said Mandar Karhade, MD, PhD, leader of data and analytics at Avalere Health, a consulting firm in Washington, D.C. “Either I want to diagnose patients or I want to save some money or I want to do something else for which the model was created,” he said during a phone interview at which a public relations person was present.
EHRs are one area that’s ripe for possible problems related to AI, he added. For example, Oracle recently introduced an “autocomplete” feature into its EHR, and although that can work very well for documenting patients who are healthy, “sometimes you’re going to miss something, or it is going to ‘hallucinate’ something and add something that your eyes won’t be able to trace, but it is now part of the electronic health record that you didn’t want to happen. So there is good and bad that comes with it.”
The idea behind the rule is similar to a nutrition label for food, said Smith. “For the first time, the rule requires nine categories of information and 31 measures, metrics, and descriptions associated with a pretty important swath of technology that’s out there. And this is going to create a nationwide baseline set of information upon which medical societies, individual hospitals, technology companies, patients, and others can develop newer measures to help people understand the relative quality of the algorithm that they’re using.”
O’Connor disagreed with the comparison to a nutrition label. “This information is not objective or measurable,” she said. “It’s not like we can prove there’s ‘2% bias.’ So how will that information be communicated by the health IT developers, and then how will providers use that information to do their risk analysis?”
Niam Yaraghi, PhD, a nonresident senior fellow at the Brookings Institution in Washington, D.C., said in a phone interview that he found the rule to be a little bit “reactionary, in the sense that rather than setting policies or recommendations that would ensure rapid advancements in AI or at least be guiding them, it seems to be a response to the community’s concerns about fairness in AI.” And although fairness is a noble goal, “if you’re ultimately after fairness, you first have to [improve] performance.”
To do that, it would be much better for the government to think about the hurdles that are in place that hinder performance improvement, he said. “One thing AI systems need is data, which is still very siloed in the healthcare system despite millions spent to ensure interoperability … If we remove [barriers] to lack of good data, it would achieve the goal of fairness and performance much faster.”