LLMs include some of the most rapidly expanding platforms, such as ChatGPT, Bard, Bert and many others, that imitate understanding, processing and producing human communication. When LLMs are used to improve access to health information, as a decision-support tool, or even to enhance diagnostic capacity in under-resourced settings, it is imperative that the risks be examined carefully in order to protect people’s health and reduce inequity.
While WHO is enthusiastic about the appropriate use of technologies, including LLMs, to support health-care professionals, patients, researchers and scientists, there is concern that the caution normally exercised for any new technology is not being applied consistently to LLMs.
Precipitous adoption of untested systems could lead to errors by health-care workers, cause harm to patients, erode trust in AI and thereby undermine (or delay) the potential long-term benefits and uses of such technologies around the world.
Concerns that call for rigorous oversight, needed for the technologies to be used in safe, effective and ethical ways, include the risk that the data used to train AI may be biased, generating misleading or inaccurate information that could pose risks to health, equity and inclusiveness.
In addition, LLMs generate responses that can appear authoritative and plausible to an end user; however, these responses may be completely incorrect or contain serious errors, especially for health-related responses.
WHO proposes that these concerns be addressed, and clear evidence of benefit be measured, before LLMs are used widely in routine health care and medicine – whether by individuals, care providers or health system administrators and policy-makers.