Azeem Azhar's Exponential View / Season 6, Episode 58

The Challenges and Benefits of Generative AI in Health Care

How will AI change health care?

January 17, 2024

Artificial Intelligence is on every business leader’s agenda. How do we make sense of the fast-moving new developments in AI over the past year? Azeem Azhar returns to bring clarity to leaders who face a complicated information landscape.

Generative AI has a lot to offer health care professionals and medical scientists. This week, Azeem speaks with renowned cardiologist, scientist, and author Eric Topol about the change he’s observed among his colleagues in the last two years, as generative AI developments have accelerated in medicine.

They discuss:

The challenges and benefits of AI in health care.
The pros and cons of different open-source and closed-source models for health care use.
The medical technology that has been even more transformative than AI in the past year.

@azeem
@erictopol

Further resources:

When AI Meets Medicine (Exponential View Podcast, 2019)
Can AI Catch What Doctors Miss? (Eric Topol, TED, 2023)

AZEEM AZHAR: Hi, I’m Azeem Azhar, founder of Exponential View and your host on the Exponential View podcast. When ChatGPT launched back in November, 2022, it became the fastest growing consumer product ever, and it catapulted artificial intelligence to the top of business priorities. It’s a vivid reminder of the transformative potential of the technology. And like many of you, I’ve woven generative AI into the fabric of my daily work. It’s indispensable for my research and analysis. And I know there’s a sense of urgency out there. In my conversations with industry leaders, the common thread is that urgency. How do they bring clarity to this fast moving noisy arena? What is real and what isn’t? What, in short, matters? If you follow my newsletter, Exponential View, you’ll know that we’ve done a lot of work in the past year equipping our members to understand the strengths and limitations of this technology and how it might progress. We’ve helped them understand how they can apply it to their careers and to their teams and what it means for their organizations. And that’s what we’re going to do here on this podcast. Once a week, I’ll bring you a conversation from the frontiers of AI to help you cut through that noise. We record each conversation in depth for 60 to 90 minutes, but you’ll hear the most vital parts distilled for clarity and impact on this podcast. If you want to listen to the full unedited conversations as soon as they’re available, head to exponentialview.co. Today I’m speaking with another peripatetic mind. Dr. Eric Topol is a renowned American cardiologist, scientist and author. He writes Ground Truths, a newsletter on the latest scientific developments in medicine, and he was a guest on my podcast back in 2019. Now, some four years later, we’ve decided to come back together and compare our notes. 
This conversation was recorded over more than an hour, and we started off with an update on the science of aging, the new approaches to expanding one’s lifespan and what’s possible today with personalized medicine. If you want to listen to that part of the conversation with Eric, head over to www.exponentialview.co where we’ve published the whole recording. For my listeners here, we start some 20 minutes into our discussion. So, you’ll hear me jump straight into asking Dr. Topol about the impact of AI on healthcare. When we met a few years ago, I think one of your lines was that AI could restore the care in healthcare. And as you look on from four years, of course, we had COVID and it came in and it sped up a lot of things and it slowed down some other things. Do you still believe that? Are we still on that path? Are we ahead of where you thought we might be?

ERIC TOPOL: Well, we’re always behind where we could be. That’s one of the themes I’ve seen over many decades now, unfortunately, because the medical community is resistant to change. COVID didn’t help, if anything, well, in some ways it pushed forward things like telehealth and hospital at home because of desperate need. But in terms of what we’re talking about independent of the pandemic, it was really the transformer model of AI and the eventuality of ChatGPT’s development and now a lot more of these. The term large language model will soon be obsolete because it’s not just about language anymore, it’s about any form of data, any mode. But this is, I think, the accelerant, which the medical community will resist as it does with every change. But it could be… We’re on a rescue path if you look at where this can take us as aligned with our unmet and desperate needs in medicine. So that’s why I think this is an inevitability and hopefully we can figure out how to embrace it and get our arms around the potential downsides, which of course there are many. And recently I had a great opportunity to interview Geoffrey Hinton, who many refer to as the Godfather of AI. And he said with all the negatives of AI, that… He sounded the alarms. And I think he’s highly regarded as someone who has been a great proponent of deep neural networks for quite a long time. But here he is saying, “I’m seeing things I’ve never seen before and I’m worried.” But interestingly, he compartmentalized healthcare and medicine as being the one area which is so extraordinary and relatively safe. It’s not the kind of thing where the doomsdayers and the real negativism comes out. So, I think there’s that perspective that shouldn’t be missed here is that we’re not talking about all the worries of artificial general intelligence and somehow the AI taking over the world. 
We’re talking about fixing medicine. Nonetheless, there are worries about all the mistakes, and everyone who has used ChatGPT or any of the other models will know that sometimes you get really great stuff and sometimes you get just total BS. It’s fabricated or completely off track. So this is what worries, I think, people in the medical community. I don’t know about your thoughts on that as we-

AZEEM AZHAR: Well, I think that one of the challenges with GPT-4 and ChatGPT is that they are, they’re so kind of persuasive and seductive out of the box. And the thing is that really what they are is they’re part of a final product. They’re like, if you’ve got a gasoline car, they’re the engine, but you still need a drive shaft and a steering wheel and indicator lights and seat belts and wheels and a brake. And we’ve been presented with these technologies and there’s something quite curious about them. I’ve talked to a lot of the computer scientists who build them and they say, we don’t really know how they ended up as they are. I mean, we know the physical steps we went through and we knew what we did to the computers. We don’t know how they got to where they are. So, we have this sort of uncertain thing that is quite seductive and persuasive, but in reality, it’s actually only a component of a finished final product. And I think that what we will start to do to address the weaknesses, the strange hallucinations that go on or the things that it makes up will be essentially the same sorts of things that we did to cars to make internal combustion engines useful. So, we connected them to drive shafts, we gave them brakes. I mean, brakes are wonderful. People think of brakes as slowing down the car, but actually brakes enable you to drive faster. As someone who once had to drive his car, a rental car with shot brake pads, to a garage, I know how slowly you drive when the brakes don’t work. So, what we have to do is we have to manage the excitement that we’ve had of this component technology, which seems to be a fully fledged thing, and recognize that actually it’s part of a more complex software architecture, Eric, that will make this thing factually robust. It will make it very, very much, much more reliable than the thing that we currently deal with. 
I mean, what we have today is it’s a stochastic technology, which means you are rolling a set of dice every time you ask it a question. And humans are variable, but they’re not as variable as all of that. And doctors are not as variable as all of that when they’re in the clinical setting. So I think that for your readers in Ground Truths, one of the key things is to recognize that when technologies come out and you’ve seen lots of them come out over your career and I have over mine, the scientists and technologists get excited because they see the potential, and that excitement can get picked up as something we can use today widely, which isn’t necessarily the case. And so, I would expect over the next year or two for us to actually see really interesting, really powerful products, proper products at the heart of which will be GPT-4 or some other multimodal system that, I don’t want to get too technical here, but will have retrieval-augmented generation. They will have the anchoring of the responses back to highly reliable databases of curated information. And that’s when you’ll start to get to the point where we can even test these things and say, are they actually good enough to be put in the hands of professionals, let alone clinicians.

ERIC TOPOL: Perspective is essential. Most would agree that this, well, your metaphor is perfect about the engine without the steering wheel and the brakes and whatnot, but most would agree that we will have a far better way to prevent hallucinations, confabulations, and errors over time. And that, we should be planning on that because that’s where it is headed, indeed. They’ll never be eliminated. But the ability to double check, whether it be to re-enter or, there’s lots of ways to get around this problem. The most important thing in medicine is we’re never going to have a situation where humans are not in the loop. They are not going to make a critical decision about a diagnosis or a treatment to someone, that’s important, that’s not something lightweight, without some kind of oversight by a clinician. So this is why it’s different. It’s one thing if you’re writing a paper and you get references from GPT-4, they’re made up and you can check, you can look and say, oh no, they don’t even exist. What’s going on here? Another thing is you get some really unusual output and you say, no, this doesn’t compute, but you’re the doctor and you’re looking after this patient. So this is another reason why I am very optimistic that we’re still in the early era of these transformer multimodal models that are just going to keep getting better and that’s something that instead of waiting for that, there’s many things that we can do right now that are extraordinary, transformative improvements that don’t bring up this worry about the errors.

AZEEM AZHAR: Yeah, that’s true. I think one example is just assisting note-taking. And I think that there are already products that are out there that they’ll transcribe alongside a doctor’s own notes. Seems like it’s low hanging fruit. One of the things that I think is quite challenging is that we don’t really know what the performance of these systems is. I don’t know if you saw this paper recently, but Microsoft showed that if you can do good prompting of GPT-4, it would do better on medical questions than these specialized transformer models like BioGPT and MedPaLM. And my point there, I guess is less about, oh, GPT-4’s so good. It’s more about the fact that this was surprising because we don’t have a good theory of these underlying transformer models as to why they work as well as they work. I mean, we know more now than we did two years ago because people have been looking at it, but we still don’t have that good theory. And that feels a little bit slippery, especially in this sort of sophisticated world that we now have of when you want to try and get a therapy approved in Europe or in the US and the kind of levels of proof you need to go on. We used lots of medicines like salicylic acid before we knew what their mechanisms were. But today-

ERIC TOPOL: And we still don’t know their mechanism.

AZEEM AZHAR: We still don’t know them and anesthetics and so on. But today, I don’t think it’s really possible to put out a new sophisticated biologic and say to the FDA, we have no idea how it works, we kind of just knew it, it did, it just did. It doesn’t make sense to our common standards. And I wonder whether that’s going to be a challenge for actually scaling out these transformers given their sort of inexplicability at the moment.

ERIC TOPOL: Well, you just brought up a lot there, Azeem.

AZEEM AZHAR: Sorry, I do that.

ERIC TOPOL: Not surprising. No, there were three major fronts that you just brought up. One was how do these frontier models like OpenAI’s GPT-4 outperform medically tweaked models like MedPaLM? And we don’t know, but also as I think we need to emphasize, the transparency is lacking of the frontier. That’s the term for these closed source models. We don’t even know the content that’s in them, but the medically tweaked are obviously only the de minimis. It’s not the whole corpus of medical knowledge. It’s not up-to-date. It’s, who knows? There hasn’t really been the supervised fine-tuning medically of any model that we would say is comprehensive and up to the moment. So that’s still kind of open as to why this frontier model exceeded, and I think it’s really fascinating. Now, another thing you brought up, I don’t want to lose this, is that this ambient conversation between the patient and doctor to have that note, a synthetic note, which is far better than any notes that previously existed in medicine generally. And the only difference is that the doctor has to articulate the physical exam, which otherwise might not have occurred. Now what happens with this note, it isn’t just that you have this great note. It’s now this automated note, there’s so many different levels of benefit. For one, there’s now an audio file that’s indexed to the note linked so that when the patient goes home and forgets what was being discussed or was confused, they can go look exactly at what was said. And also it can be put into whatever level of education or language that you would like through the large language model. But that’s just the beginning. What it also does is it does all the work of the clinician to preempt the data clerk stuff like pre-authorization, which is a big issue here in the US, it’s the follow-up appointments, the tests, procedures, billing, another problem that may not be in the UK, but it does all this stuff. 
And even will go to the patient to nudge them about, did you check your blood pressure? Whatever was discussed. Now, one other thing which I did not anticipate, and when we spoke years ago, which is promoting empathy. So this is getting really fast pickup in many health systems in the US because there are many different versions of it. But what’s fascinating is that the note is now used by the large language model to detect the level of empathy of the doctor. Why did you interrupt the patient so quickly? Why didn’t you express some sensitivity to the patient or listen to their concern? So we’re having a whole nother… I didn’t ever expect direct promotion of empathy even though the large language model doesn’t even know what empathy is. It’s just obviously been trained by human content. So this is a biggie, and it’s the near term. It doesn’t require a regulatory review. The physicians who’ve used the good systems are saying, “I’m not going back.” They’re saving hours a day of not having to work at keyboards and be a data slave entry. So this is the near term bonus of AI that a lot of people are not, in medicine, that a lot of people are not aware of.

AZEEM AZHAR: What’s fascinating about both of those examples is that that has already been widely consumerized. When ChatGPT came out back in November, one of the things that people would do would be to say, here is a really boring extract from a statute, write it so a five-year-old can understand it or make it funnier, but keep the meaning, turn it into rap. And that was use case number one for ChatGPT. And as you say, we have already solved that and have figured that out. And the tone evaluation, which I hadn’t come across of course, has been a longstanding generative AI capability. Even before ChatGPT, there were these products aimed at the other end of human utility, at the social media copywriters to help them write their copy as if they were an expert or as if they were a comedian. And it’s the same set of technologies that is cueing up the doctors to whether they’re being empathetic or not. I mean, that’s really, really interesting. And I think that also is a surprise going back to our conversation four years ago because four years ago what we were saying is the patient is going to get a better experience because the doctor will have more time, and the machine will always be this cold clinical thing, and that’s why doctors won’t get replaced because the machine will be cold and clinical and humans need hugs. And what we’ve discovered is that even for the squishiest and loveliest of physicians, the large language model can still coach them to give them greater empathy. Yes, so I think there is a surprise that we, sort of a black swan actually that we probably could never have predicted.

ERIC TOPOL: Yeah, I certainly never thought of it. Just as you say, the gift of time was the theme of what we could get, which we’re still going to have. But this component of machine promoting empathy was one that I had never envisioned, and it’s exciting to see it in action, and I wouldn’t be surprised if, instead of just education requirements for clinicians in the future, there’ll be a coaching requirement of having your visits reviewed that you’re sufficiently empathetic and communicative. So we’ll see where this goes, but it’s in the early stages as well.

AZEEM AZHAR: So, my model for that is essentially to say that’s obviously desirable, right? It’s desirable for any professional to have some kind of ongoing professional review. It was just too expensive, right? Pre-ChatGPT, where you couldn’t go off and send your audio recordings to humans to listen to and give you feedback. And so what’s happened is the AI has dropped the cost of doing that, and so we’ll now do it really, really extensively and it’ll become a norm. It does remind me a little bit of what is happening in autonomous driving as well. So within autonomous driving, the dream was the Robocar that will take you from New York to LA and you can just… I’m sure as you would peruse the New England Journal of Medicine and the BMJ and Lancet for those hours. And, of course, we’re quite far away from autonomous vehicles, but the AI systems that have been built for them with the sensors are like better versions of antilock brakes. And so you are starting to see regulations change for future vehicles that they will increasingly need to have, not self-driving, but technology that is in a similar path to make them safer, better driver situational awareness, better braking, better road handling, and so on. And I think that that is another regulated market. So there are two little trends here. One is the declining cost of the AI, and the second is an analogous market, we’re starting to see that these things as they become economically and technologically feasible, start to become mandated. I want to just turn off from AI for a second. I want to share what for me has been the most amazing technology over the last year or so, even probably more than AI, which has been GLP-1 agonists, Ozempic and its cousins and generation two. They’re kind of remarkable. So as my understanding is that what they did was they tackled the circuitry within our systems that made us kind of desire more. So they made us feel a bit full, so we’d stop eating. 
But it turned out that one of the things that we’ve lost as we’ve become modern humans living in capitalist societies over the last 500 years has been temperance. And that idea that there are feast days, and the rest of the time we kind of run around pretty hungry, and it seems to reverse that as well. So, it seems to be that it’s pushing down cravings for nicotine and alcohol and nail-biting and gambling and all these things on addiction pathways, and it seems to have these impacts around things that doctors really, really care about, like polycystic ovary syndrome. And I think you talked about NAFLD, the sort of spongy liver issue, snoring and sleep apnea, even hair loss, what on earth is going on there?

ERIC TOPOL: Yeah. Well, that actually was the other point that you raised that I didn’t get to in that lot-to-unpack segment of this conversation. That’s okay because your mind is moving at 150 miles an hour. But the issue here is that explainability, you touched on anesthetics, aspirin, metformin, the list is long of all these drugs that we don’t know. And the GLP-1 family of drugs, which Ozempic, pardon the pun, is a lightweight drug compared to the ones that are also already, tirzepatide and the triple receptors and the ones that we’re moving on to now. These drugs are so potent, and it’s not just as you say about working on our limbic system and decreasing cravings and have our GI system like our stomach moving at a much slower pace to give that sense of fullness through vagal nerve occurrence, but rather, there’s also another amazing part of the story, which is decreasing inflammation. And this is mediated through the brain. And these drugs are not even the best for small molecules to get into the brain because of the blood-brain barrier. But nonetheless, all the things that you touched on, we have evidence that there’s benefit, like you said, fertility with polycystic ovary and liver, but there’s going to be large trials coming out in the next year for prevention of Alzheimer’s disease. And we’re not talking about people who are obese now, we’re talking about using it in people to decrease inflammation. There’s another drug class, which is the largest drug class up until now, until this one, which will be bigger than any in history, and that is statins. Now, statins also decrease inflammation. The famous medical term, pleiotropic, it is that they have many different effects. Well, what it really means is we don’t know how the hell they’re really working. Partly it’s because they reduce the bad LDL cholesterol, but they too have this anti-inflammatory effect, but they don’t prevent Alzheimer’s. 
They don’t do a lot of the things where GLP-1 drugs are looking like they can have a pronounced impact. So the way I look at this, this is the biggest breakthrough drug class in history, and certainly we’ll get into perhaps the worries that this could… Right now we’re talking about injectables, not the pills yet, that might be a commitment to potentially lifelong treatment because these companies don’t have a particular interest in getting people off the drugs. And so there are issues here, but the discovery, which is fascinating, we knew about these drugs for 20 years. The first one of these drugs was approved in 2005, and we didn’t figure out how they could be used as they are now. It took 20 years to figure that out, which is kind of amazing how dumb we were when you think about it. But no, it’s very exciting.

AZEEM AZHAR: Was that because we didn’t understand the mechanism of satiety back 20 years ago?

ERIC TOPOL: No, we had a pretty good handle on that, but what we didn’t have is that the first one approved was one that was very short-acting, so you had to take multiple injections per day. So the push to get long-acting like what we have now wasn’t there. But the second thing was, what if we increase the dose, which is what happened here, to go from glucose control to 20, 25% body weight loss, which was unprecedented.

AZEEM AZHAR: Can I just stop you? Because I’m curious about what was different in the time. So back in 2005, those first, that first GLP drug was, I think, targeted for type two diabetes, right?

ERIC TOPOL: Yes. Byetta, yes.

AZEEM AZHAR: Right. And so I’m just wondering about back then, the way in which the whole of the diabetes family was dealt with, it was always seen as a little bit of a headache, and perhaps it wasn’t as prevalent as it is 20 years later. And I sort of wonder about the extent to which the expansion of the obesity problem, the expansion of type two diabetes constructs, not just bigger incentives, but actually a bigger sense of care of how do we go about addressing this than perhaps we would’ve had 20 years ago in 2005?

ERIC TOPOL: Well, you’re bringing up some great points, and that probably contributes, but one of the things that we still don’t understand, which is a big part of this story is that diabetics, type two diabetes, the people that take [inaudible 00:29:17] ones, they don’t have that much weight loss. It’s the people with pure obesity without type two diabetes that have this massive weight loss. And that was another thing we still don’t understand today why that happens. So, the idea of going to the big frontier of obesity where there had never been a drug that was safe and had anything close to this amount of potential weight loss equating to bariatric surgery, gastric bypass, right? But because the fixation was that it was a gut hormone related to glucose, there was not, and the people didn’t lose that much weight, they lost five pounds, or cut a few kilos, whatever, but they didn’t lose like this. So the idea of testing it, which only occurred in recent years, in the last few years in obesity, really took a big jump in the concept of, oh, well, we didn’t see this in diabetes, but maybe it was something different in obesity, the recognition that they’re not the same condition.

AZEEM AZHAR: And the danger, I think one of the challenges that we have with this, the success of this drug is that it is tackling diabetes, it is tackling obesity, it is making… And the prescription rates are running into the millions of adults, maybe the tens of millions in the US. I mean so much so, I think that last year, Denmark, which is the home of Novo Nordisk, which makes Ozempic, would’ve gone into recession had American physicians not been prescribing as much, I guess, is it called semaglutide in the US? Is that?

ERIC TOPOL: Yeah, semaglutide, right. Yeah, you’re right. Novo Nordisk and Lilly are the two most valued pharmaceutical companies in the world. That’s right.

AZEEM AZHAR: Off the back of this, right? And it’s extremely alluring because it has all of these other impacts that it’s not just the waistline, which is out of control in the US, but it is anyone who’s feeling a little bit addicted to anything, even shopping apparently, and these other medical conditions. So one question I suppose is this is a very rapid diffusion of this technology into a population. It’s the kind of speed of diffusion you normally only see actually with internet software, right? The speed with which TikTok spread, the speed with which Instagram spread.

ERIC TOPOL: ChatGPT, yeah.

AZEEM AZHAR: And ChatGPT. And so that also does beg a question because we’ve done testing, but we haven’t done testing on a hundred million people for a decade. Are there potential dragons there? Are there things that may come back to bite us because we actually hadn’t stuck people on these things for seven or eight years before we put a hundred million people on them?

ERIC TOPOL: Oh, another really good one, Azeem. And I think some of the concerns that were raised early, we’re starting to see data that there is no increase in suicide. There is no increase in pancreatic cancer. These were some of the concerns that were raised, but there’s always the known unknowns, like you’re getting at, which is what happens. We do know in some people there can be some substantial loss of muscle mass or bone density. And does that make people much more frail and prone to falling and hip fractures? I mean, who knows where this is headed, especially the longest follow-up we have in any clinical trial is 40 months. That’s nothing if you think about people taking it for decades and just think that a lot of the people who could benefit are even teens and children with morbid obesity, which we have more of today than ever before. So, this is a big problem, which has to be looked at and considered a priority. There are substantial risks. And the other thing I just want to mention is there’s been terrible exuberance that all these problems like sleep apnea, type two diabetes and the list, they’re all going to go away. That is ridiculous. And let me say why. First of all, the people that need these the most are the ones least likely to ever get them because of access and costs, especially in the United States. But moreover, look at statins. They’ve been around for 40 years and still they’re not used in lots of people who would benefit from them. So the idea that we’re going to see all these conditions go away from these drugs, I mean, the potential is there to make a big dent in them, but will people change? Will these drugs become low cost, which they’re certainly not? Will they become more accessible? Will they be pills rather than injections, which are going to be much more of a way to cultivate their use? So, this craziness that all these conditions are just going to melt away, I just don’t get it.

AZEEM AZHAR: Well, thanks for listening. What you heard was an excerpt of a much longer conversation. To hear the rest of it, go to exponentialview.co. Members of Exponential View and the community get access to the full recording as soon as it is available, and they’re invited to continue the conversation with me and other experts. I do hope you join us. In the meantime, you can follow me on LinkedIn, Threads, and Substack for daily updates, just search for Azeem, A-Z-E-E-M, or if you’re in the US and Canada, A-Z-E-E-M. Thanks.

This article is about AI AND MACHINE LEARNING