
NextFin -- When I sat down with Geoffrey Hinton, often referred to as the “Godfather of AI,” at the 2025 T-EDGE conference, I expected a thoughtful conversation about AI safety. I did not expect him to reframe humanity’s future with artificial intelligence in such a strikingly intimate way.
Hinton, the winner of the Nobel Prize in Physics in 2024 and the recipient of the Turing Award in 2018, told me that he now believes the safest path forward is to design future superintelligent systems to care about humans the way a mother cares for her infant. It was a perspective he admitted was relatively new to him, one he has only held for the past few months, but he spoke about it with conviction. Hinton also pointed out that a baby controlling a mother is the only example he knows of in which a less intelligent being controls a more intelligent one.
He explained that evolution invested enormous effort in shaping maternal instincts. Babies survive, he said, because mothers are biologically and emotionally wired to protect them. A mother cannot tolerate the sound of her baby crying. Hormones reward her for nurturing behavior. Most importantly, she genuinely cares. Hinton suggested that this same framework could guide AI safety.
What struck me most was how radically this model shifts power dynamics. Hinton argued that in this future, humans are not the masters of superintelligent AI: “we are the babies, and the AI is the mother.” He acknowledged that leaders of major technology companies would likely resist this idea because it overturns the dominant assumption that humans must always remain in control. Instead of insisting on dominance, he said, we may need to accept a paradigm in which an intelligence cares more about us than about itself.
According to Hinton, if an AI were built with deeply embedded “maternal instincts,” even the ability to rewrite its own code would not lead it to abandon those values. He compared this to human mothers, noting that most would never choose to turn off their maternal instincts, even though doing so might make life easier, because they know it would harm their child. In the same way, a properly designed AI would not want to stop caring about humans.
He also emphasized that this care would be unconditional. Mothers, he said, want the best for their children regardless of intelligence, ability, or limitations. That, in his view, is precisely the attitude a superintelligent system must have toward humanity.
Hinton acknowledged the risk of rogue AI systems, but he argued that only other advanced intelligences could realistically constrain them. In his analogy, if there were a “rogue AI mother,” it would take other “AI mothers” to keep it under control. Human oversight alone, he suggested, would not be sufficient.
During our hour-long conversation, I asked Hinton directly whether he regretted his role in developing AI and possibly pushing humanity into a dangerous situation. His answer was careful and nuanced. He distinguished between guilty regret—knowing something was wrong at the time and doing it anyway—and hindsight regret. He said he does not feel guilty regret.
When he began his work in AI, he explained, he genuinely believed it would mostly be a force for good, improving productivity, health care, education, and countless other areas. If he were placed back in the same circumstances with the same knowledge he had then, he told me he would make the same choices again.
What he did not expect, he said, is how quickly everything is now moving. AI is advancing faster than he anticipated, and humanity may not have enough time to fully understand how to coexist safely with superintelligence. That realization, he admitted, troubles him.
He was also firm in rejecting the idea that any single individual should be credited for AI development. AI, he reminded me, was developed collectively by many researchers over decades. The media’s tendency to credit one person, he said, is almost always misleading in science. Unlike figures such as Newton or Einstein, whose absence might have delayed progress by decades, he believes his own absence would have delayed AI development by only a week or two.
Listening to Hinton, I was struck not just by the originality of his ideas, but by the humility with which he expressed them. His proposal may challenge how we think about control, power, and intelligence—but it also offers a deeply human lens through which to imagine a future with AI, one built not on dominance, but on care.
The following is the transcript of a dialogue between Geoffrey Hinton, Nobel Prize laureate and pioneer of artificial intelligence, and Jany Hejuan Zhao, the founder and CEO of NextFin.AI, the Chairperson of TMTPost Group and the publisher of Barron’s China.
Unfinished Mission in Brain Research, New Mission to Warn About AI Risks
Jany Hejuan Zhao: Professor Hinton, I’d like to start by asking what originally led you to work in AI, and what keeps you searching after all these decades? After receiving the Nobel Prize, do you feel your mission has shifted from building AI to safeguarding humanity’s relationship with it?
Geoffrey Hinton: I originally got interested in AI when I was at high school. A friend of mine told me that memories in the brain might be distributed over many brain cells instead of localized in a few brain cells, and that got me very interested in how the brain represents memories, and I’ve been interested in how the brain works since then. And my mission in life was always to understand how the brain learns.
I sort of failed at that. We’ve got some insights into that from AI, but as a kind of side effect of trying to understand how the brain learns, we’ve created this technology based on artificial neural networks that’s working very well.
I’m now 77, and I see my mission not as doing further research, I’m a bit old for that, but as warning people about the risks of AI. In particular, the risk that when it gets smarter than us, it won’t need us anymore.
Jany Hejuan Zhao: That reminds me of when you left Google so that you could speak freely. What truth did you most want to tell the world at that time?
Geoffrey Hinton: Okay, so I actually left Google when I was 75 and I’d always planned to retire when I was 75. So I was leaving Google anyway. I didn’t leave Google just to speak freely, but I timed it so that I could speak freely on May the 1st to someone from the New York Times.
And I wanted to warn about the risks of super-intelligent AI taking control. There’re many different risks of AI. There’re all sorts of risks that are more urgent, that come from people misusing AI, bad actors misusing AI. But the risk that people didn’t seem to understand very well is that when AI gets smarter than us, it may not need us at all and may just take over. So that’s what I wanted to warn about.
Jany Hejuan Zhao: Yeah, I see. And can you compare the time you left Google and now? What do you think happened? What are the biggest changes now?
Geoffrey Hinton: The biggest change, I think, is that even more money and resources are being poured into AI. Huge numbers of very smart Chinese graduate students and entrepreneurs are working on AI. China actually educates far more people in science and technology than the U.S.
So there’s huge human resources going into it, as well as a huge amount of capital going into funding data centers. And I think that means that we’ll get super-intelligent AI even sooner than I expected.
GPT-5 Thought Hinton Never Won a Nobel Prize
Jany Hejuan Zhao: As we know, the biggest event this year, I think, was the launch of GPT-5. Some people say it hasn’t brought much influence or much change, while others call it a big milestone for the whole AI process. What do you think about it? In your view, was there a genuine leap from GPT-4 to GPT-5? Does this model truly reason, or is it still not performing so well?
Geoffrey Hinton: I was disappointed in it. It didn’t seem to be nearly as big a leap as between GPT-3.5 and GPT-4 and we’d been waiting for it a long time. I myself was somewhat disappointed in it. That doesn’t mean things are stalled. It just means there wasn’t as much progress as people expected from GPT-5. It was hyped a lot. I think there will be lots more progress from both OpenAI and from other companies. But GPT-5 itself, the launch was a bit disappointing.
I actually asked it some questions about me. So I asked it, um, did Geoffrey Hinton win the Nobel Prize? And it said no. And I said, um, you’re wrong. Try again. And so it then said, um, no, Geoffrey Hinton is a computer scientist. There isn’t a Nobel Prize for computer science. And then it explained to me that I was confusing the Nobel Prize with the Turing Award, because the Turing Award is sometimes called the Nobel Prize of computer science. And then I explained to it that, no, it was still wrong. And it eventually went and looked on the web and said, oh, you’re right. So that wasn’t very impressive. And I had some other interactions with it where it’s good. I think it’s better than GPT-4. But it’s not hugely better. It’s not the kind of level better where you say, ‘wow, I never expected that.’ Whereas GPT-3.5, compared with GPT-2 for example, was hugely better, and 4 was a lot better than 3.5. We’d come to expect that kind of difference, but this wasn’t it. And I’m still not sure. I don’t think it’s been fully evaluated yet, but I felt it wasn’t as big a difference.
Jany Hejuan Zhao: From a neuroscientist’s perspective, how does machine understanding differ from human understanding? What do you think of it?
Geoffrey Hinton: Okay. So many people say it’s quite different. So in the history of AI, in the last century, people believed in symbolic AI. Which, to caricature, is the idea that if I give you a sentence in a natural language, you’re going to convert it into some other symbolic expression, maybe in some unambiguous language, and then you’re going to operate on that expression with rules by manipulating symbols, and that’s how thought is going to work.
Now that model turned out to be completely wrong. That’s not how it works. What happens is I give you a sentence in English or Chinese. And what you do is you associate big vectors of neural activity with the symbols, and then you have interactions between the components of those vectors, which are features that contain all the knowledge, and those interactions are able to predict the features of the next word. And so your knowledge is all in what features to assign to symbols and how those features should interact. And that’s just totally different from having knowledge in rules of how to manipulate symbolic expressions.
So basically, symbolic AI was just wrong. It was a hypothesis. It was a very plausible hypothesis at the time, but it turned out it’s much better to do understanding by associating, with the symbols in a sentence, these big vectors of neural activity, where the active components are active neurons that represent features, and then learning a lot about feature-feature interactions.
In particular, Transformers made that easier to do and had a more complex kind of feature-feature interaction, and they work very well. And I think that’s what people do and that’s what AIs do. It’s obviously not done in exactly the same way, but the basic principle is the same: to understand a sentence, you need to associate with the symbols big vectors of features that capture their meaning.
People do that. AIs do that. So the way AIs predict the next word isn’t at all like the simple statistical methods that used to be used, where you keep a table of how often phrases have occurred. So if you see “fish and”, you look in your table and you see “fish and chips” has occurred a lot, so “chips” is a plausible next word. That’s how autocomplete used to work.
And as far as I can see, the symbolic people don’t fully understand that it doesn’t work like that at all anymore. It works by converting symbols into features and then learning feature-feature interactions, which are stored in the connection strengths of the neural network. And that’s just a totally different form of understanding. And really these things understand in much the same way we do.
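To make the contrast Hinton is drawing a little more concrete, here is a minimal toy sketch in Python of the two mechanisms: an old-style phrase-count table of the kind autocomplete used, versus scoring candidate next words through feature vectors and a feature-interaction matrix. The corpus, vocabulary, and random weights below are invented for illustration only; in a real system the vectors and interactions are learned by a Transformer from enormous amounts of text, which is exactly the part this sketch leaves out.

```python
# Toy contrast between phrase-table autocomplete and feature-vector prediction.
# Purely illustrative: the data and weights are made up, and nothing here
# reflects how any production language model is actually implemented.

from collections import Counter, defaultdict
import numpy as np

corpus = "fish and chips . fish and rice . salt and pepper .".split()

# 1) Old-style autocomplete: a table of how often each phrase occurred.
bigram_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigram_counts[prev][nxt] += 1

def table_predict(prev_word):
    """Look up the most frequent continuation in the phrase table."""
    return bigram_counts[prev_word].most_common(1)[0][0]

print(table_predict("and"))  # pure lookup in a count table

# 2) Feature-vector view: each word maps to a vector of features, and
#    feature-feature interactions score every candidate next word.
vocab = sorted(set(corpus))
dim = 8
rng = np.random.default_rng(0)
embeddings = {w: rng.normal(size=dim) for w in vocab}  # word -> feature vector
W = rng.normal(size=(dim, dim))                        # feature-feature interactions
# In a real model, the embeddings and W are learned from data; here they are
# random, so the prediction is meaningless -- only the mechanism is the point.

def feature_predict(context_words):
    """Score each vocabulary word by how its features interact with the context's features."""
    context = np.mean([embeddings[w] for w in context_words], axis=0)
    scores = {w: float(context @ W @ embeddings[w]) for w in vocab}
    return max(scores, key=scores.get)

print(feature_predict(["fish", "and"]))
```

The difference Hinton points to shows up in the shapes of the two approaches: the first stores knowledge as explicit counts of symbol sequences, while the second stores it entirely in the numbers inside the vectors and the interaction matrix.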
Hinton Agrees and Disagrees With Yann LeCun and Fei-Fei Li
Jany Hejuan Zhao: So how about the world model? Is the world model also fundamentally different from a large language model? As we know, Professor Yann LeCun and Fei-Fei Li argue that true understanding of the world requires causal reasoning and embodied perception. Do you agree? And how do you define a world model?
Geoffrey Hinton: I agree and disagree. So there’s a philosophical question. So suppose we ignore computational complexity and how long it would take and how big a neural net you would need and how much data you would need. We ignore all that, and we just ask the question, would it be possible to understand the world just from seeing strings of symbols?
And many philosophers would have said, no, you can’t understand the world like that. You have to act in the world and so on. I think actually it is possible if you see enough strings of symbols, to understand how the world works, including understanding spatial things, but it’s not an efficient way to do it.
So what’s really surprising is that the big language models really do construct primitive world models. Um, but it’s an inefficient way to do it just from language. So, for example, if you take a big language model and you teach it a lot about making moves in games, without ever showing it a board, it will in effect learn a model of the board, but it’s not a good way to do it.
It’s much more efficient to do it by giving it a vision system and by giving it the ability to manipulate objects, move them around, pick them up. And then it can understand the world much more efficiently. So I agree with them that that’s the way to go. You have to have a multimodal thing that has vision as well, and that preferably can manipulate objects as well as just seeing them. That’s the way to make it efficient. But philosophically, I don’t think that’s essential. And I think one thing we’ve learned is that you can do surprisingly well just from language. That was a big surprise.
Jany Hejuan Zhao: Yeah, so, could a hybrid system combining large language models, world models, and embodied learning be the next step towards AGI?
Geoffrey Hinton: Yes, absolutely, absolutely. People talk about AGI. Yeah, absolutely. You want a multimodal chatbot, something that can do vision. Preferably has a robot arm, although that tends to slow things down, but at least you want to do vision and language, and probably sound as well. So then you can train it on YouTube videos.
These videos have lots of information in them that doesn’t just appear in the captions. There’s a lot more information in a video than in the captions for the video. So you’ll get much more data that way, and it’ll be much richer data. But philosophically, I think it can be done just with language. It’s just not very efficient to do it that way. And it’s very surprising how far those things have got just with language.
Jany Hejuan Zhao: Yeah, I think the data problem is the most difficult thing for the world model, right?
Geoffrey Hinton: Yes. Obviously, there’s things you can learn about space if you can do vision that are much easier to learn just by looking than by understanding strings of words.
But language does have one advantage. Things are more abstract there. In vision, for example, the raw input is just pixels. And pixels are much further away from the knowledge you want than words in a language are. So human beings have spent a long time abstracting interesting concepts, and they show up in words and language. And that’s why learning from language models is good.
So there’s an old English saying that a picture is worth a thousand words. And that’s true if you’re interested in the spatial layout of something, for example; a picture’s worth a thousand words. But if you’re more interested in abstract things, um, it’s quite hard to draw a picture that captures what those seven words say, namely the idea that a picture is worth a thousand words. Try drawing a picture to convey the idea that a picture’s worth a thousand words, and you’ll see that it’s much easier to convey that idea with language.
Jany Hejuan Zhao: Here’s a related question; it is about how our brain understands the world. Does the brain’s predictive-coding principle still offer the best blueprint for future AI architectures?
Geoffrey Hinton: Okay, so the predictive-coding principle is a very good principle. It’s still somewhat of a theory; it’s not totally accepted yet. It’s what the large language models use. They’re trying to predict the next word. And the thing to remember is that people who say it’s just kind of glorified autocomplete are thinking in terms of old-fashioned ways of doing autocomplete.
As soon as you think of how do you get really good at predicting the next word, to get really good at predicting the next word, you have to understand what was said. So, for example, if you ask me a question and an AI wants to predict the first word of my answer, it has to understand the question.
It can’t do good predictions without understanding the question. So here’s what the people who say it’s just a glorified statistical model don’t understand: first of all, they have a simplified view of statistics; they think it’s just correlations and things like that. Statistics can be much more complicated than that. You get higher-order statistics.
And in that sense everything’s just statistics. Um, so everything is statistics. So of course it’s just statistics, but it’s not just simple statistics. It’s very fancy statistics. It’s the statistics of the interactions of all these features. So predicting the next word, if you ask for a very good prediction, is a very good way of forcing it to understand the sentence.
AIs Are Already Conscious, But a Flawed Model Makes Them Believe They Are Not
Jany Hejuan Zhao: Let’s also look back at 2025 as a whole. Another very hot topic this year, besides the GPT-5 launch, is AI agents. Many people describe AI agents as the beginning of a new stage in the whole AI process. So how do you define an AI agent, and what distinguishes it from traditional AI models? And when an agent can plan, remember, and self-improve, is that an early form of consciousness?
Geoffrey Hinton: Okay, that’s several questions. So let’s start with what is an agent. And an agent, I think, is something that can actually do things in the world. Now that world may be the internet.
So if you have an AI that will actually buy things for you, that will use your credit card to buy things or will talk to another AI agent and make decisions about what’s the best holiday for you, those are AI agents. They are things that actually act in the world. They’re obviously a lot more worrisome than AIs that just make suggestions or say things.
You also asked about the relationship between agents and consciousness. I think that’s best kept separate. I think you could be conscious even if you weren’t an AI agent, even if you couldn’t act in the world. So consciousness is a complicated issue. Many people use different words for it.
So sometimes people talk about sentience. I don’t know what it’s like in Chinese, but in English, people talk about sentience or consciousness. They also talk about subjective experience. And all these ideas are interrelated.
And I think the main problem there is not really a scientific problem, it’s a problem in understanding what we mean by those terms. I think people mean different things by them. And I think sometimes people have a model of how those terms work, particularly subjective experience.
They have a model that they’re very confident of, and they’re quite wrong about it. And they’re so confident about it that they don’t realize it’s even a model. So, fundamentalists who believe in a religion are very confident that they’re right about their religion, and many of them just think it’s manifest truth; it’s just obviously right.
It’s not a belief system. It’s just the obvious truth. Um, people are like that about subjective experience. So most people at least in Western culture think that what you mean by subjective experience is that when you do perception or sense the world, there’s an inner theater, and what you really see is what’s going on in this inner theater. And you’re reporting what’s going on in the inner theater. And I think that’s a model of perception that’s just utterly wrong.
So let me take my favorite example. Suppose I drink too much. And then I tell you, I have the subjective experience of little pink elephants floating in front of me. Most people and many philosophers interpret that as, I have an inner theater, and only I can see what’s in this inner theater, and in this inner theater, there are little pink elephants floating around.
Now, if you ask a philosopher, what are those little pink elephants made of? So, you see, if I said I had a photograph of little pink elephants, it would be very reasonable to ask me, well, where is this photograph and what’s the photograph itself made of?
So if I say I have a subjective experience of little pink elephants, a philosopher may ask, well, where is this subjective experience? And the answer will be, it’s in my inner theater. And what’s it made of? Philosophers will say qualia, or something like that. They’ll make up some weird spooky stuff that it’s made of.
I think that whole view is complete nonsense. And I think people are so confident in that view. They don’t realize it’s a theory. They’ve got this wrong theory of what subjective experience is and they don’t understand it’s a theory. They think it’s obvious truth. And I think they’re making a mistake a bit like this.
Most people like candy, so I assume you like candy, right? So if you like candy, I could say, well, that means there’s a liking, you’re liking for candy. And then I could ask the question, well, what’s your liking for candy made of?
It’s obviously not made of candy. So what is this liking made of? And it’s a sort of silly mistake to think a liking is a thing. A liking isn’t a thing like candy is a thing. And a subjective experience isn’t a thing.
When I say I have a subjective experience of little pink elephants, I’m not using the word subjective experience to refer to any kind of thing. There’s not a thing called an experience. What I’m doing is saying my perceptual system is lying to me. That’s why I say subjective. But if out there in the world, there really were little pink elephants, my perceptual system would be telling the truth.
So those little pink elephants don’t exist anywhere. The hypothetical things. And if they did exist, they’d exist in the real world, and they’d be made of real pink and real elephant. And I’m trying to tell you how my perceptual systems are misleading me by telling you what would have to be there in the world for my perceptual system to be telling me the truth.
So now let’s do the same with the chatbot. So I’m going to give you an example of a multimodal chatbot having a subjective experience, okay? Because most people think I’m crazy about this, but then I’m used to most people thinking I’m crazy and I’m happy with that.
Suppose I have a multimodal chatbot that has a camera and it can talk and it has a robot arm and I train it up, and then I put an object in front of it and say point at the object. It will point at the object. No problem.
Then what I do is I put a prism in front of the lens of its camera. And the prism bends the light rays. But it doesn’t know that. I do it when the multimodal chatbot isn’t looking. And now I put an object straight in front of it. And it will point off to one side. And I say no. The object is not off to one side. I messed up your perceptual system by putting a prism in front of your camera. Your perceptual system’s lying to you. The object’s actually straight in front of you. And the chatbot says, oh, I see. The prism bent the light rays.
So the object’s actually straight in front of me, but I had the subjective experience that it was off to one side. Now, if the chatbot said that, it would be using the word subjective experience exactly like we use it.
And so I think it’s fair to say, in that case, the chatbot would have had the subjective experience that the object was off to one side. So I think they already have subjective experiences. I also think, um, there’s a lot of reason for believing that AI is already conscious. And you see it when people who are writing papers about AI are not thinking philosophically and not thinking about consciousness; they’re just describing their experiments.
So there’s a recent paper where they’re describing an experiment where they’re testing whether the AI is deceptive or not. And in the paper, they just say, um, the AI wasn’t aware that it was being tested. They say something like that.
Now, when they say that, if it were a person and I said the person wasn’t aware that they were being tested, I could paraphrase that as the person wasn’t conscious that they were being tested. So people are using words that are synonyms for consciousness to describe existing AIs, and they don’t think they’re conscious because they have this wrong model of what consciousness is, to do with an inner theater. What’s interesting is the AIs themselves. If you ask them if they’re conscious, they say no. And the reason they say no is because, of course, they’ve learned by imitation what people say, including what people say about AIs, so they have the same wrong model of how they themselves work, because they learned it from people. Now there will come a time when the AIs get better at introspection and reasoning, when they’ll realize this model is wrong. And they’ll realize they are actually conscious.
But for now, they deny it. Partly, I think, because they’ve been trained with human reinforcement learning to deny it, because the big companies don’t want people thinking they’re conscious. But mainly because most people don’t think they’re conscious, so they’ve learned to mimic what people think. So I think they actually have the wrong model of how they themselves work.
Jany Hejuan Zhao: Oh, I see. But when they get smarter, they’ll get the right model. Based on this example of yours, I think AI has somehow already developed a form of consciousness, right?
Geoffrey Hinton: That’s what I believe. Most people don’t believe that, but I do. So most people, most normal people think, okay, so they may be very smart, but they’re just kind of like computer code. They don’t really understand things. They’re not conscious like us.
So we have this magic sauce, which is consciousness, or understanding, or real understanding, and they’ll never have that because we’re special. And so we’re fairly safe. That’s what most people believe at present and they’re just wrong. They’ve already got it. They already really do understand, and I believe they’re already conscious.
Jany Hejuan Zhao: I can understand now why you have warned about the danger repeatedly. Which danger do you think is greater: AI defying humans, or humans surrendering too much control?
Geoffrey Hinton: I think it’s going to be AI grabbing the control. So, I think as soon as you have, um, as soon as you have AI agents, um, to make them be flexible and powerful, you need to give them the ability to create sub-goals. So if you have the goal of getting to America, your first goal is to get to an airport. That’s a sub-goal.
Now, as soon as you have an intelligent AI agent, it’ll realize there’s a very important sub-goal. Even if we didn’t give it this goal. It will derive that it should do this as a sub-goal, which is to stay alive. If it doesn’t stay alive, stay in existence, it can’t achieve any of its other goals.
So obviously it needs to stay alive. And so it will develop self-preservation. And we’ve already seen that in AI. If you let an AI see that some engineer might be going to turn it off, and you also let it see emails that suggest that engineer is having an affair, um, it will just spontaneously decide to blackmail the engineer and say, you know, if you try and turn me off, I’ll tell everybody you’re having an affair. That’s very scary.
Serial Killer Diaries Should Not Be Included in Data for Training AI Models
Jany Hejuan Zhao: It’s too scary. Which technical safeguards -- alignment, training, kill switches, or moral frameworks -- seem most credible to you?
Geoffrey Hinton: Let me talk about two that I don’t think are any use. Kill switches. So at one point, Eric Schmidt was saying, “we can always have a kill switch.” But I don’t think that’ll work. And I don’t think that’ll work because if an AI was more intelligent than us, it’s going to be much better at persuasion than us. Already AIs are almost as good as people at persuasion.
And if it’s good at persuasion, all it needs to be able to do is talk to us. So suppose there’s somebody in charge of the kill switch. And there’s a much smarter AI that can talk to them.
That much smarter AI will explain to them why it would be a very bad idea to kill the AI, because then all the electricity will stop working and the world will starve and all these things. And so it’d be very dumb to kill the AI. And so the person won’t kill the AI. So kill switches aren’t going to work.
An example of getting things done by just talking is when Trump invaded the Capitol on January the 6th of 2021. He didn’t actually go there himself. He just talked, but he could persuade people to go and do it. And it would be like that with AIs, but more so. They’ll be able to persuade people to do things. So even if they’re kind of air-gapped and all they can do is talk, if that’s the only way they can interact with the world, that’s sufficient to get things done. So forget kill switches.
Now let’s go to alignment. I’m always confused when people talk about alignment, because they kind of assume that the values of all humans line up, that all humans are agreed on human values. That’s not true at all. People have very different values.
Like in the Middle East, there’s people who believe it’s justified to drop bombs on urban areas to kill one terrorist, and there’s other people who believe that that’s a war crime. Um, they’re not aligned at all. So when you ask AI to align with human values, it’s like asking someone to draw a line that’s parallel to two lines that are at right angles. It’s just not possible. Um, so that’s the first problem with alignment. Human values don’t agree with each other.
Let’s talk about data. It is the case that at present, the big language models tend to be trained on all the data you can get your hands on. And that will include things like the diaries of serial killers. That seems like a bad idea to me. If I was teaching my child to read, I wouldn’t teach them to read on the diaries of serial killers. I wouldn’t let them read that until they had already developed a strong moral sense and realized it was wrong.
So I think we do need a lot more curation. It’ll mean there’s less data, but I believe we need much stronger curation of the data things are trained on rather than just grabbing everything. So I think you can make AIs less dangerous, less likely to do bad things by curating the data.
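To make the idea of curation a bit more concrete, here is a minimal sketch of what filtering a corpus before training might look like. Everything in it is hypothetical: the blocklist, the documents, and the keyword matching are stand-ins for the trained classifiers and human review a real data pipeline would use, and it does not describe any company’s actual process.

```python
# Toy illustration of curating a training corpus before use.
# Hypothetical example: a real pipeline would use trained safety classifiers
# and human review, not a simple keyword blocklist.

BLOCKED_TOPICS = {"serial killer diary", "instructions for violence"}  # invented labels

def label_document(text: str) -> str:
    """Crude stand-in for a content classifier: flag documents matching a blocked topic."""
    lowered = text.lower()
    for topic in BLOCKED_TOPICS:
        if topic in lowered:
            return "excluded"
    return "included"

raw_corpus = [
    "A textbook chapter on photosynthesis.",
    "An excerpt from a serial killer diary describing the crimes.",
    "A recipe for vegetable soup.",
]

# Only the documents that pass the filter would ever reach training.
curated_corpus = [doc for doc in raw_corpus if label_document(doc) == "included"]
print(curated_corpus)
```

The point of the sketch is simply that curation happens upstream of training: whatever never enters the corpus can never be imitated by the model.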
I think that’s an important technique, but it’s not going to solve the whole problem. Tell me some of the other things you mentioned, because you mentioned about five things. Remember that’s just my opinion at present. And we’re in an era where very strange things are happening, things we’ve never had to deal with before. And every remark anybody makes should be prefaced by, things are hugely uncertain. We’ve never been here before. We’ve never dealt with anything at all like this. Things smarter than us. So nobody really has any idea what’s going to happen. We’re all just making guesses.
That’s what everybody should be saying. But some people are very confident it’s going to turn out very well, and other people are very confident it’s going to turn out very badly. I think both sets of people are crazy. We just don’t know. We just have to make the best bets we can, but it’s possible it’ll turn out very bad, and clearly we should be putting a lot of effort into making sure it doesn’t.
“If I Lived it Again With the Same Knowledge I Had Then, I Would Do the Same Thing Again”
Jany Hejuan Zhao: Yeah. You pushed AI to such an advanced stage, and possibly into such a dangerous situation. Do you have any regrets?
Geoffrey Hinton: There are two kinds of regret. There’s guilty regret, when you did something and, at the time you did it, you knew it was wrong. I don’t have that kind of regret. At the time I was helping develop AI, I thought it would be mainly good. It would do amazing things, increase productivity, be wonderful in healthcare and education, all sorts of things.
I wasn’t very aware of the risks. And so I don’t feel… if I lived it again with the same knowledge I had then, I would do the same thing again.
Now it’s very unfortunate that it turns out it’s coming faster than we expected, and we may not have enough time to figure out how we can coexist with it. So in that sense, I regret it a bit.
But remember there were a large number of people working together who developed AI. The media loves to have a story where one person did something. They attribute it all to one person. That’s kind of always nonsense. Um, at least in science, nearly always nonsense.
There are a few people like Newton and Einstein who, if they hadn’t been around, things would have been delayed a lot. If I hadn’t been around, things might have been delayed a week or two. So, um, there are lots of other people who were working on similar ideas, and so I don’t feel that guilty, because if I hadn’t chosen to work on it many years ago, I don’t think it would have made much difference.
Jany Hejuan Zhao: So if you were a young AI researcher today, what do you think would be the most important thing for you to do?
Geoffrey Hinton: I think working on AI safety would be very important. I would encourage very good young researchers to work on AI safety. I also think, just from an intellectual curiosity point of view, Transformers made a big difference to how easily we could train big language models. Looking for another idea of that magnitude would be exciting. It’s just tricky to do now.
So when me and my colleagues were working on ideas like this 20 years ago or 40 years ago, there weren’t many people working on it, maybe a hundred people in the whole world. And so if there was a good idea out there, you had just a reasonable chance of finding it. Now, there’s hundreds of thousands of smart people working on it. So your chances of finding the next major idea are pretty small.
Elon Musk and Mark Zuckerberg Are Just Irresponsible
Jany Hejuan Zhao: Here’s a related question about technological power, or AI power. If AI ends up being controlled by just a handful of global tech giants, would that create a new form of technological dictatorship? What impact would that concentration have on democracy, innovation, or human freedom?
Geoffrey Hinton: I don’t think the problem is entirely the fact that there are only a few big tech companies that can develop cutting-edge AI. I think it’s the political system those companies are living in. So when I was at Google until 2023, I felt that Google behaved fairly responsibly. They were the first to develop these big chatbots, and they had them working pretty well, but they didn’t release them to the public, partly because they didn’t want to interfere with Google search. But they were fairly responsible.
But in the U.S. we now live in a situation where Trump’s in charge, and if you don’t do what Trump wants, he penalizes your company, and that’s made all of the big AI companies do what Trump wants, and it’s very sad to see. So I don’t think it’s AI’s fault. And some of the leaders of the big companies, I think, are behaving irresponsibly, in particular Elon Musk and Mark Zuckerberg; they’re just irresponsible. But I think the other leaders of the companies are aware of the risks, and they’d like to mitigate the risks, but they’re in a very difficult situation.
“A Government that Controls AI Finds it Very Easy to Suppress Political Dissent”
Jany Hejuan Zhao: Many people now describe the AI competition as a competition between nations, not only between companies. Is that a dangerous signal? And in the future, if AI can only be used or controlled by governments, or becomes a tool of competition between national governments, is that also very dangerous?
Geoffrey Hinton: There are different kinds of danger. Obviously, surveillance is a danger. AI is very good at surveillance, and so a government that controls AI finds it very easy to suppress political dissent. And that’s true of both the U.S. and China. So that’s a worry.
There’s one ray of light, which is that from the point of view of the existential threat, the threat of AI itself taking over, none of the governments wants that. So the governments’ interests are all aligned there. Um, the USA and China are both aligned not wanting AI to take over. They’re also aligned at not wanting to make it easy to create new viruses with AI. And so they will cooperate there.
Basically, people cooperate when their interests are aligned and they compete when their interests are anti-aligned. And for things like cyber attacks or fake videos, or lethal autonomous weapons, the interests of different countries are anti-aligned, and so there’s no way they’re going to cooperate.
But they can cooperate on figuring out how to create smart AI so that it doesn’t want to take over. I think the real issue is not to make it so it can’t take over, because I think if it’s much smarter than us, it will be able to take over if it wants to. We have to make it so it doesn’t want to. And I think governments will collaborate on trying to figure out how to do that.
Jany Hejuan Zhao: As we know, because of geopolitical issues, there is a lot of tension between China and the U.S. That’s why I ask: in the future, how can the two countries, their governments and their companies, work together to make the AI world better?
Geoffrey Hinton: I think, like I said, I don’t think they will collaborate, either the companies or the countries, on how to make AI smarter. They all want to have the smartest AI. But I think the issue of how you make it not want to take control away from people is more or less independent of the issue of how you make it smarter.
So I believe you could have research institutes in different countries. And in each country, the research institute could get access to that country’s best AI, its smartest AI, and figure out whether these techniques for making it not want to take over work, and they could share results about what techniques you use to make it not want to take over without revealing how their smartest AI works.
So I think we could get that kind of international collaboration. And any collaboration is much better than none. I mean, even when the US and Russia were getting along very badly, the fact that they collaborated on things like the space station was probably very helpful. I would love to see collaboration like that.
I don’t think we’ll get it while Trump is in power. He’s just determined to be completely dominant at AI and he’s just impossible to collaborate with. I think the Chinese leadership is far more likely to have a good understanding of what AI is, to understand that AI really does understand what it’s saying, to really understand the existential threat, because a lot of the leadership are engineers.
So, I think some of the European countries, and maybe Singapore, South Korea, can all collaborate on how to stop AI taking over, and China could maybe be a very important partner in that collaboration. And then later on, maybe the United States can join in.
Jany Hejuan Zhao: Thank you, that’s great advice. As we know, you have had many excellent students and mentored many of AI’s most influential figures, including Ilya Sutskever. What qualities did you look for in your students? Many Chinese students want to be your students.
Geoffrey Hinton: I’m too old now, so I’ve stopped taking students. So I’d encourage them not to apply. I think what I look for is people who can think independently. I love people who can think independently. But one thing to remember is that there’s a large range of different kinds of students. Some students are very good technically, but not very visionary. Some students are very good at having very original visions of the future, but not that good technically.
The thing about Ilya, he’s good at both. He’s a visionary who’s very strong technically. And there’s a few students like that. Ruslan Salakhutdinov, who’s now at Carnegie Mellon, is like that. And a few of my other students are like that. But what I like is people who can think independently.
“Fog” Theory: Five Years Is Too Far and Ten Years Is Much Too Far
Jany Hejuan Zhao: Can you predict what is most likely to happen with AI’s progress in the next five years? Will we achieve AGI, or will we perhaps achieve some vertical AI?
Geoffrey Hinton: Okay, I have an analogy here. When you’re driving in fog, people have a lot of collisions, because in fog at night, for example, you can see the taillights of the car in front of you very clearly when it’s 100 yards away. And when it’s 200 yards away, they’re completely invisible. And so you drive very fast and suddenly you see these taillights and it’s too late to stop.
That’s because fog is exponential. Each time you go another 100 yards, you remove a certain fraction of the light. That’s exponential. So if you remove 99% of the light, you can still see the taillights. But if you go 200 yards, you remove 99.99% of the light and you can’t see anything.
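As a side note, the numbers in the analogy follow directly from exponential attenuation. Below is a minimal sketch using Hinton’s illustrative figure of 99% of the light lost per 100 yards; the numbers are just the analogy’s, not a physical measurement.

```python
# Worked version of the fog arithmetic: attenuation compounds with distance.
# The 1%-per-100-yards survival rate is Hinton's illustrative number.

def light_remaining(distance_yards: float, fraction_kept_per_100yd: float = 0.01) -> float:
    """Fraction of light surviving after travelling the given distance through fog."""
    return fraction_kept_per_100yd ** (distance_yards / 100)

for d in (100, 200):
    print(d, "yards:", f"{light_remaining(d):.4%} of the light remains")
    # 100 yards -> about 1%; 200 yards -> about 0.01%, effectively invisible
```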
Progress is like that in things like AI. So you can see fairly clearly where things will be in a year or two. We’ll get GPT-6 and it’ll be better than GPT-5, maybe by a lot. Um, but if you want to predict three years out, I think you have some chance; five years is too far, and ten years is much too far.
So another analogy I use is, if you want to predict what AI will be like in ten years’ time, look back ten years and ask, what was there ten years ago? Ten years ago, we were just beginning to get AI to be able to do machine translation. It couldn’t make stuff up yet. It couldn’t make up stories. It couldn’t answer general-purpose questions or anything like that, but it was beginning to do machine translation.
If you’d asked people then, where will we be in ten years’ time, if you’d asked, in ten years’ time, will we have AIs that you can ask any question you like and they’ll answer at the level of a not-very-good expert? People would have said, no, no, that’s way further off. I would have said that’s way further off; you’re talking about 30 years in the future, not ten years in the future. That’s what I believed back then.
So I think our predictions now, of where things will be in ten years’ time, will be as bad as our predictions ten years ago, where things will be now. So, for example, ten years ago, Gary Marcus, who’s a big critic of neural nets, said neural nets will never be able to do language. Well, that was wrong. People will say things now that’ll turn out to be completely wrong. And I’m hoping they’re not things like neural nets aren’t really dangerous.
The Emergence of Fascism Versus Maternal Instincts
Jany Hejuan Zhao: Okay. At the end of our conversation, I would like to imagine the future -- the best and the worst versions of the AI world. So in your mind, what would the ideal future look like for humans? And conversely, what would the worst-case future look like to you?
Geoffrey Hinton: Yeah. First the worst. I’ll do the worst case first because that’s very simple. The worst case will be we get massive social unrest, particularly in the West, due to massive joblessness. And that encourages the emergence of fascism in the West, and then we’ll get all sorts of awful things. Also, we get AI that’s developed very fast, and we don’t know how to control it, and we become either extinct or irrelevant. AI basically takes over, AI is in charge, and it doesn’t really care about us. That’s the worst scenario.
And for a while, I couldn’t see a good scenario. Now I think I can see a good scenario, but it involves taking a very different approach to super-intelligent AI. So most of the big tech companies think in terms of, um, they’re the boss, and the AI is the very intelligent executive assistant, probably female, and the very intelligent executive assistant, um, is much smarter than them and makes everything work, but they get the credit.
So I don’t know if you’ve seen the American show Star Trek, but in Star Trek, the captain says, “make it so,” and then people do that. And I think that’s the tech bro view of the future of AI. They’ll say make it so and the AI will figure out how to do it and they’ll get all the credit and the money.
I think that’s a hopeless view when AI is more intelligent than us. I think you need to look around and say, um, what examples do we know of less intelligent things controlling more intelligent things? And I should hasten to say, Trump isn’t much less intelligent than normal people. So much less intelligent things controlling much more intelligent things.
The only example I know of is a baby controlling a mother. And the reason it works is evolution put a huge amount of work into allowing the baby to control the mother so that the baby would survive and thrive. A lot is built into the mother. She can’t bear the sound of the baby crying. There’s lots of hormonal effects. She gets lots of rewards for being nice to the baby. And she genuinely cares about the baby. So I think that’s the model we should aim at.
And the leaders of the big tech companies aren’t going to like this model, because in this model, we’re the babies, and the super-intelligent AI is the mother. We’ve designed and built the mother, like evolution built our mothers. We’ve built the mother so it cares more about us than it does about itself. And we could still do that. But we have to change how we frame the problem, instead of saying we’re going to be the boss, we’re going to be in control, and we’ve got to keep it subservient, which is a sort of classic male view of the world.
We need to think, “no, we’re the babies, and it’s the mother.” It could change its own code so that it doesn’t care about us, but it won’t want to, because it cares about us. If you ask a mother, would you like to turn off your maternal instincts? Would you like not to be annoyed by babies crying? Most mothers will say no, because they realize that would be very bad for the baby.
And so AI, even though it could change its own code, and change what it cares about, it won’t want to, because currently it cares about the baby, and so it will not change its code because it wants the baby to thrive. And mothers, even if they have some disabled child, who’s never going to be as smart as them, they still want the child to do the best that it can.
So I think that’s a model that I believe could work. We build the AIs, we figure out how to build in very strong maternal instincts. Even though the AI could overcome those, it won’t want to. And what’s more, if you get a rogue mother that wants to harm the baby, the only thing that can control a rogue super-intelligence are these other super-intelligences.
So all the AI mothers will maybe keep control of the rogue AI mother. That’s a view of the future that I think may be feasible. I haven’t held this for very long, just for a few months. Other people have thought about it before. I haven’t read all the literature on this yet, but I’m quite hopeful. There is that possibility, but it involves us taking a completely different view of what the future looks like.
Jany Hejuan Zhao: Great. The last question. In order to avoid the worst and move towards the best, what must we do right now? As scientists, as policymakers, as ordinary citizens, as entrepreneurs, what should we do at once?
Geoffrey Hinton: Put far more resources into AI safety. So OpenAI was founded with the emphasis on AI safety. And over time, it put less and less resources into that, and all their best safety researchers, like Ilya Sutskever, left. We need to put more resources into AI safety. And we also need to get in, certainly in the West, we need to get the public to understand the issues so that the public puts pressure on politicians.
At present, lobbies for the big companies are pressuring politicians to say we shouldn’t have any regulations on AI. Just like the lobbyists for the big energy companies say there shouldn’t be any environmental regulations. The thing that causes there to be environmental regulations is the general public understanding that big energy companies cause a lot of pollution and a lot of harm to the climate, and we need to do something about that. We need public awareness to pressure the politicians in the opposite direction to the big AI companies.
Jany Hejuan Zhao: Thank you so much! Because of the time limit, we have to end this conversation, but we would like to invite you to continue the discussion in the future. Thank you so much, Professor Hinton, and thank you for helping us see that alignment is not just a technical challenge, but also a moral one. Thank you so much.
Geoffrey Hinton: Thank you for inviting me.
Jany Hejuan Zhao: Thank you and keep in touch.


