Dialogue with the initiator of the "AI Risk Statement": Steam engines will not make humans extinct, but AI will!

This article was reported from Beijing by Zhang Xiaojun of Tencent News "Periscope"

Image source: Generated by Unbounded AI tool

Half a month ago, hundreds of leading artificial intelligence figures, including OpenAI CEO Sam Altman and Turing Award winner Geoffrey Hinton, signed a short statement on the risk that AI could cause human extinction.

The statement consists of a single sentence: "Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war."

David Krueger, an assistant professor researching artificial intelligence at the University of Cambridge, was one of the original initiators of the statement and is also a signatory.

On June 9, while attending the 2023 Beijing Zhiyuan Conference, Krueger had an in-depth conversation with Tencent News "Periscope". He has long focused on AI alignment and safety, and since 2012 he has been concerned about the potential risk that artificial intelligence could cause human extinction, which the industry calls "X-Risk" (existential risk).

In response to these worries about AI, one view counters that the intense concern may resemble the excessive panic people felt when the steam engine appeared hundreds of years ago. But Krueger said the biggest difference between the two is that the steam engine will not make humans extinct, whereas AI will.

Krueger believes three factors will greatly increase the risk of AI systems getting out of control: AI becoming smarter than humans, disorderly competition, and the building of AI systems that have a more direct impact on the world. "The more open the system, the more autonomous, the more intelligent, and the more it is designed to achieve long-term goals, the greater the risk of the system getting out of control," he said.

In his view, the safety problem of artificial intelligence is like global climate change: all parties have their own interests, and there will be many conflicts and disagreements, which ultimately makes it a complex global coordination problem. Work therefore needs to start as soon as possible, and through the efforts of all parties, so that humanity is not reduced to the fate of being taken over by AI.

Only in this way can human beings survive.

David Krueger

The following is the essence of David Krueger's talk.

01 "AI Risk Statement" joint letter is only one sentence, it is carefully designed

**Tencent News "Periscope": You are one of the signatories of the "Statement on AI Risk". Can you tell us how this statement came about? **

David Krueger: I came up with this idea more than a year ago, because people were becoming more and more concerned about the risks of AI, especially the risk that AI may cause human extinction, yet many people were not discussing the issue openly.

One big reason is historical: the idea was once considered fringe, and people feared that discussing it publicly would reflect badly on them, or even harm their career prospects.

A few months ago, a good moment arrived. Since the release of ChatGPT and GPT-4, attention to AI has reached unprecedented levels. For a long time, people acknowledged that this could hypothetically become an issue in the future, but would say it was too early to worry about it.

As for how the statement came about, I reached out to a number of colleagues, including Dan Hendrycks, director of the Center for AI Safety. I told him we should make such a statement, and that I intended to do it as soon as possible, but I wasn't sure I was the best person to do it. It was urgent. So Dan took up the idea and pushed for the statement's release.

**Tencent News "Periscope": What comments do you have on the wording of this letter? **

David Krueger: I proposed using only one sentence, for several reasons.

First, when you have a long statement, there is a good chance that someone will disagree with some of it.

We saw this a few months ago, when the Future of Life Institute issued an open letter calling on all AI labs to immediately pause the training of AI systems more powerful than GPT-4 for at least six months. Many people's reaction was: that sounds great, but I don't think we can pause the development of artificial intelligence.

Of course, they made that call and it is still valuable, because once people say "we can't pause," that itself is a sign that we need to act. We really need the ability to suspend a technology that is too dangerous to keep developing.

I use this example to illustrate that the more you say, the more people will disagree. In this case, we didn't mention how to deal with the risk, because people have different opinions on the correct method; we also didn't say why it might lead to human extinction, because different people have different views on that: some are more worried about the technology being misused, while others are more worried about the technology getting out of control, which would not be the intentional result of any malicious actor.

Either way, as long as a lot of people agree that this is a huge risk and we need to act, then that's fine.

02 AI risk is fundamentally different from the steam engine

**Tencent News "Periscope": What is the biggest difference between people's worries about the threat of artificial intelligence and people's fear of steam engines two or three hundred years ago? **

David Krueger: I don't know much about that history. I'm not sure whether anyone at the time said it would lead to the extinction of humanity. If someone had said that, I'm also not sure what kind of arguments they would have used; it seems unlikely to me.

The key difference is that we're talking about extinction. We're talking about a technology that is potentially smarter and more powerful than humans in every relevant capacity.

The steam engine allows us to create physical forces that are stronger and faster than humans. But the steam engine is not intelligent and is relatively easy to control. Even if one gets out of control, the worst-case scenario is that it malfunctions and the people on it are killed or injured. But if an intelligent system or a self-replicating system gets out of hand, many people can die, because it can grow and gain more power, and that is the key difference.

**Tencent News "Periscope": Some people think that public statements can only stop good people, trying to make good people pay more attention to security issues and slow down the speed of research and development, but they cannot stop the actions of bad people. How do we prevent the bad guys? **

David Krueger: Through regulation and international cooperation.

I don't really like talking about it in terms of "good guys" and "bad guys," because everyone always thinks they are the good guys. The main risk I worry about is not the malicious misuse of AI systems by some bad guy or malicious actor, but something more like climate change: individuals can gain a lot from burning more fossil fuels, or from building more powerful systems that are harder to control, while everyone bears part of the cost. In the case of climate change, that cost is damage to the environment. In the case of artificial intelligence, the risk is that the systems spin out of control and lead to catastrophe.

This is more a question of incentives. Humans simply care more about themselves, their friends, their loved ones, and their community than about some stranger on the other side of the world. So no malicious intent is required, only selfish instincts. That's why regulation is needed; it is the way to solve these kinds of problems that concern the common interest of humanity.

03 AI alignment work still has many unsolved mysteries

**Tencent News "Periscope": Your research interests are deep learning, AI alignment and security. Can you explain what is alignment in a language that ordinary people can understand? You said that "alignment will be one of the key drivers of AI development", why is it so important? **

David Krueger: I like to say that people use this term with three different meanings. One is making AI systems act according to our will. But I don't think that's a good definition; it's too broad, and every engineer is trying to make their AI system behave as they want.

There is also a more specific definition, "intent alignment." In my opinion this is the correct definition, and it refers to making the system try to do what we want it to do. When designing a system, you want it to have the right intentions, motivations, and goals. It still may not be able to act as you wish, because it may not be capable or smart enough to know how to carry out your wishes. But if it has the right intent, you can say it is aligned.

The final meaning people give to alignment is any technical effort to reduce the risk of human extinction. Sometimes it also refers to the community of people like me who specialize in the field of alignment and safety. That's not my preferred definition either; this is just one idea people have of how to solve the problem. Ultimately, more work on governance, regulation, and international cooperation, such as concluding treaties, will be necessary to mitigate this risk.

**Tencent News "Periscope": What new progress have technology companies and scientific research institutions made in alignment technology recently? What are the most pressing challenges and problems facing you? **

David Krueger: The most important progress is in fine-tuning techniques for large language models. A lot of work has gone into changing the models' behavior. For example, the difference between GPT-3 and GPT-4 is that the newer system is aligned to act more according to the intentions of its designers, mainly through reinforcement learning from human feedback, although the details are not public (a simplified sketch of the general idea follows this answer). This has largely succeeded, but it has not completely eliminated the problems with these models.

I worry that this technique may not be sufficient for more powerful systems, since the changes it makes to behavior may be relatively superficial. The problem could become more serious as systems become more powerful.

It's a bit like training an animal, say training a dog not to get on the furniture. Maybe it behaves perfectly while you're in the room, but once you leave, it still climbs on the furniture. A similar situation may occur with these models: they may appear to be aligned, but they can still misbehave if they think we won't notice.
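To give readers a concrete, deliberately simplified picture of the reinforcement-learning-from-human-feedback idea Krueger refers to: the sketch below is not OpenAI's actual pipeline, whose details are not public. It is a toy Python/PyTorch illustration in which a hypothetical hard-coded reward stands in for a learned reward model, and a REINFORCE-style update nudges a tiny policy toward the responses that the reward scores highly.

```python
import torch
import torch.nn as nn

# Toy "policy": instead of a full language model, it just holds a
# distribution over three canned responses (hypothetical examples).
RESPONSES = ["helpful answer", "evasive refusal", "toxic reply"]

class TinyPolicy(nn.Module):
    def __init__(self, n_actions: int):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(n_actions))

    def dist(self) -> torch.distributions.Categorical:
        return torch.distributions.Categorical(logits=self.logits)

# Stand-in reward model: real RLHF trains this on human preference data;
# here the scores are simply made up to keep the sketch self-contained.
REWARD = torch.tensor([1.0, 0.1, -1.0])

policy = TinyPolicy(len(RESPONSES))
optimizer = torch.optim.Adam(policy.parameters(), lr=0.1)

for _ in range(300):
    d = policy.dist()
    action = d.sample()                      # sample a response
    reward = REWARD[action]                  # score it with the "reward model"
    loss = -d.log_prob(action) * reward      # REINFORCE: raise log-prob of rewarded responses
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

probs = policy.dist().probs.tolist()
print({r: round(p, 3) for r, p in zip(RESPONSES, probs)})
```

In real systems the policy is a full language model, the reward model is trained from human preference comparisons, and the update typically uses a more stable algorithm such as PPO, but the basic loop of sampling, scoring, and reinforcing is the same.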

**Tencent News "Periscope": When AI intelligence is much smarter than humans, how can humans complete the alignment work on a super intelligent body? **

David Krueger: This is an open research question. Therefore, it is important to conduct research on AI alignment in order to find the answer to this question.

**Tencent News "Periscope": How can we make AI love human beings instead of harming them through alignment? **

David Krueger: This is the same question as the previous one. I wish I had an answer, but I don't have one yet.

04 These three factors can increase the risk of AI getting out of control

**Tencent News "Periscope": In your opinion, at what point in the history of AI is this point in time? **

David Krueger: We have reached a point where the world is waking up to the risks. I have been waiting for this moment for a long time.

**Tencent News "Periscope": Since you listened to the deep learning course taught by Geoffrey Hinton ten years ago, you have been worried that AI may lead to the extinction of human beings. Why did you start worrying at such an early stage? **

David Krueger: I was worried in principle that this would happen at some point, because AI would one day be smarter than humans, but when I saw Hinton's class my concerns changed: deep learning seemed to have more potential to produce real intelligence than any other method I had heard of before.

**Tencent News "Periscope": Under what circumstances will the artificial intelligence system go out of control? **

David Krueger: One factor is if the systems are smarter than us. That is when you start worrying about them getting out of hand, but it's hard to predict exactly how that would happen in detail.

A second factor that increases risk is intense competition to develop and deploy powerful AI systems as quickly as possible. We currently see this competition between Google and Microsoft. There are also concerns about international competition, which could be economic, geopolitical, or even military.

The third factor is building AI systems that have a more immediate impact on the world. The systems we see so far are just language models; they only generate text. But many people are looking at combining them with other systems, for example using them to write code or to control other things, whether online or in the real world. Giving these systems more control and autonomy increases risk.

Compare that with the systems we have today, which are mostly just trained to predict text. That is a relatively safe way to build systems, compared with asking a system to achieve a goal in some environment, and especially compared with a system pursuing goals in an environment where it interacts frequently with the real, physical world. When systems try to achieve goals in the real world, they may naturally try to acquire more resources and power, because these are helpful for achieving long-term goals.

Thus, the more open, autonomous, and intelligent a system is, and the more it is designed to achieve long-term goals, the greater the risk that the system will spin out of control.

**Tencent News "Periscope": If you think that a framework for global collaboration should be formulated to ensure that countries follow common principles and standards in AI development, then what should these specific principles and standards be? **

David Krueger: We absolutely need to do this, and we need to start urgently, because it will be difficult and will require a lot of discussion and negotiation, given the many conflicts and differences between countries.

As for the specifics, that's something I'm still thinking about. We want to make sure that we have some very legitimate governing body or governance system that can push for a moratorium if at some point in the future we feel the need to do so. This is an important part of it.

Things get more complicated when it comes to the systems we are developing and deploying. We would like to have testing, evaluation, and auditing mechanisms. We may also need to consider some form of licensing, but there are many details to work out. Right now, I don't have a complete plan in my head. That's why I hope we can encourage more people in policymaking, with expertise in policy and international relations, to think about it.

**Tencent News "Periscope": In the current artificial intelligence system, which aspects need to be improved as soon as possible to deal with potential risks and threats? **

David Krueger: One is robustness (note: robustness refers to a system's ability to keep working in abnormal and dangerous situations). Our current systems have significant problems with robustness, most notably adversarial robustness: small changes to an input, even ones imperceptible to humans, can have a large impact on the system's behavior (a small illustration follows this answer). This has been a known problem for about 10 years, but it still seems to have no solution. This is very troubling if we consider systems that pursue some goal and try to optimize their understanding of that goal, because depending on their understanding of the goal, the optimal outcome can be very different from what we imagine or intend. And that is hard to spot with the evaluations we are doing right now.

Another is our lack of understanding of how these systems work. We really want to be able to understand how they work; it is one of the best ways we have to predict their behavior. We want to make sure they don't behave in unexpected and dangerous ways in new situations. This is related to the robustness issue.
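To make the adversarial robustness problem Krueger describes concrete, here is a minimal sketch of the classic fast gradient sign method (FGSM), one standard way such imperceptible perturbations are constructed. It assumes a PyTorch image classifier; the toy model, image, and label below are made up purely for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def fgsm_perturb(model: nn.Module, x: torch.Tensor, label: torch.Tensor,
                 eps: float = 0.01) -> torch.Tensor:
    """Return a copy of x nudged by at most eps per pixel to increase the loss."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), label)
    loss.backward()
    # Move each pixel a tiny step in the direction that hurts the model most.
    return (x + eps * x.grad.sign()).detach()

# Hypothetical toy classifier and input, just to make the sketch runnable.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
image = torch.rand(1, 1, 28, 28)
label = torch.tensor([3])

adv = fgsm_perturb(model, image, label)
print("largest pixel change:", (adv - image).abs().max().item())   # <= eps, invisible to a human
print("prediction before:", model(image).argmax(dim=1).item(),
      "after:", model(adv).argmax(dim=1).item())                   # may flip despite the tiny change
```

Against a well-trained image classifier, a perturbation of only a few intensity levels per pixel is often enough to flip the prediction, which is why this problem is described as both long-standing and unsolved.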

05 Is human extinction far away?

**Tencent News "Periscope": Looking at it now, is human beings far away from extinction? How many years is it expected to be? **

David Krueger: Geoffrey Hinton keeps saying that it will take 20 years or less for us to get artificial general intelligence (AGI), which is a reasonable time frame. That is quite similar to my view.

I think humans could go extinct shortly after that, but it could also take longer. I guess that's what I'm trying to emphasize: even if it is decades away, we need to start addressing it as soon as possible.

Returning again to the climate change analogy. It took us decades to start actually taking effective action, and still not enough is being done to prevent the dire consequences of climate change. This is because it is a complex global coordination problem. Artificial intelligence will face a similar situation. We should start as early as possible.

**Tencent News "Periscope": Can a large language model bring AGI? **

David Krueger: A lot of people are asking this question right now. My answer is more complicated. I'd say it's possible, but it is more likely that they will need to be combined with other technologies, and maybe some new techniques will even need to be developed.

**Tencent News "Periscope": How do you view the relationship between humans and artificial intelligence? Will humans be an intelligent transition? **

David Krueger: Only time will tell. I hope not. But right now, this is a question where we still have some initiative and some ability to guide and decide how the future develops. If we can act intelligently and in a coordinated way, and if we are lucky, then it will be up to us as humans to decide whether AI takes over at some point.

**Tencent News "Periscope": Hinton has a very interesting point of view. He said: Caterpillars will extract nutrients and then transform into butterflies. People have extracted billions of cognitive nutrients. GPT-4 is the human butterfly. Do you agree with this point of view? **

David Krueger: Very poetic, and I don't think it's entirely accurate, but it may touch on an essential truth: an AI system doesn't necessarily need to learn everything the hard way, from scratch. Humans had to go through a long evolution to reach their level of intelligence, but now humans have produced all these cultural artifacts, including all the text on the Internet, which AI systems can learn from. So they don't necessarily need to repeat all of that evolution to reach a similar level of intelligence.

**Tencent News "Periscope": Is this your first time in China? What is your impression of coming to China? Do you have any suggestions for the development of artificial intelligence and large-scale models in China? **

David Krueger: This is my first time in China. I just arrived yesterday morning. The whole visit has been spent meeting and talking with people; everyone has been friendly, and I have had a good experience here. But I don't feel I have really experienced China. I'm just meeting with researchers. Unfortunately it will be a short trip for me, but I hope to at least get a good look at Beijing on my last day before leaving.

(Advice for China) I think it's important to think about and understand safety and alignment. From the conversations I've had, it's clear that people are already working on this, at least to some extent.

**Tencent News "Periscope": Many scenes in the movie "Her" have gradually appeared in our real world. Do humans have emotions for artificial intelligence? Do you feel an emotional attachment to the AI model you develop? **

David Krueger: I don't have one, but as far as I know, some people do.

AI girlfriend chatbots of that kind do exist, and some people have become emotionally dependent on those relationships. That is a sad consequence of this technology.
