Just Calm Down About GPT-4 Already
And stop confusing performance with competence, says Rodney Brooks
“What the large language models are good at is saying what an answer should sound like, which is different from what an answer should be.”
—Rodney Brooks
“It gives an answer with complete confidence, and I sort of believe it. And half the time, it’s completely wrong.”
—Rodney Brooks, Robust.AI
See: https://spectrum.ieee.org/gpt-4-calm-down
He gives the example of a Google program labeling an image of people playing Frisbee in the park: “If a person says, ‘Oh, that’s a person playing Frisbee in the park,’ you would assume you could ask him a question, like, ‘Can you eat a Frisbee?’ And they would know, of course not; it’s made of plastic. You’d just expect they’d have that competence. That they would know the answer to the question, ‘Can you play Frisbee in a snowstorm? Or, how far can a person throw a Frisbee? Can they throw it 10 miles? Can they only throw it 10 centimeters?’ You’d expect all that competence from that one piece of performance: a person saying, ‘That’s a picture of people playing Frisbee in the park.’”
Brooks summarizes: “We don’t get that same level of competence from the performance of a large language model. When you poke it, you find that it doesn’t have the logical inference that it may have seemed to have in its first answer.”
IEEE Spectrum asks whether we will ever “see anything come out of this application that will justify” OpenAI’s $30 billion valuation.
Brooks: “Probably not. My understanding is that Microsoft’s initial investment was in cloud-computing time rather than cold, hard cash. OpenAI certainly needed [cloud computing time] to build these models because they’re enormously expensive in terms of the computing needed.”
“I think they’re going to be better than the Watson Jeopardy! program, which IBM said, ‘It’s going to solve medicine.’ Didn’t at all. It was a total flop. I think it’s going to be better than that.”
—Rodney Brooks
Spectrum: “Full self-driving, or level 5, still seems really far away. Or am I missing something?”
Brooks: “No. It is far away.”
Spectrum notes that GM reportedly lost $561 million on Cruise in just the first three months of this year.
Brooks: “That’s how much it cost them to run that effort. Yeah. It’s a long way from breakeven. A long, long way from breakeven.”
Brooks points to the sensors once being embedded in roads as one example of “changing the infrastructure of roads to facilitate the introduction of self-driving vehicles. But then the hype came: ‘Oh no, we don’t even need that. It’s just going to be a few years and the cars will drive themselves. You don’t need to change infrastructure.’ So we stopped changing infrastructure. And I think that slowed the whole autonomous vehicles for commuting down by at least 10, maybe 20 years. There’s a few companies starting to do it now again. It takes a long time to make these things real.”