Title: Superintelligence Will Drive Us to Extinction and We Cannot Stop It 🤖 | 🎙️ Roman Yampolskiy
Duration: 01:30:25
(00:00:00)
If you create something thousands of
(00:00:02)
times smarter than all humans who ever
(00:00:04)
existed, the most likely outcome is bad.
(00:00:07)
I think existential risk is an absolute
(00:00:09)
possibility. It can actually wipe out
(00:00:12)
humanity as a whole.
(00:00:13)
>> Roman Yampolskiy, a computer scientist
(00:00:15)
and professor at the University of
(00:00:16)
Louisville is one of the leading voices
(00:00:18)
in artificial intelligence safety. After
(00:00:20)
reaching 12 million plays on The Diary
(00:00:22)
of a CEO, he has been featured on some
(00:00:25)
of the world's most influential
(00:00:26)
podcasts, including the Lex Fridman
(00:00:28)
podcast and the Joe Rogan Experience.
(00:00:30)
>> He coined the concept of AI safety,
(00:00:33)
which emphasizes developing AI
(00:00:35)
responsibly and without putting people
(00:00:37)
at risk.
(00:00:38)
>> They already have billions of dollars
(00:00:40)
like they can't compete in a normal way.
(00:00:42)
They can't, you know, try to do simple
(00:00:44)
things. You have to scale your
(00:00:45)
ambitions. And what is more ambitious
(00:00:47)
than playing God? The worst part they
(00:00:50)
can't quit. So right now the CEOs are
(00:00:53)
captured in this game where their
(00:00:55)
personal interest and global interest
(00:00:57)
are not aligned. Individually, each one
(00:00:59)
wants to have everyone stop but someone
(00:01:02)
external has to pull the brake. They
(00:01:05)
cannot stop unilaterally. No one knows
(00:01:08)
how to work on controlling super
(00:01:10)
intelligence. Maybe because they don't
(00:01:11)
exist yet but also maybe because the
(00:01:13)
problem is impossible. Do you think we
(00:01:15)
have to focus on the existential risk more
(00:01:17)
than on other short-term problems that are
(00:01:20)
coming like employment?
(00:01:22)
>> If you lose your job, you know what
(00:01:23)
happens? Nothing happens. You get a
(00:01:25)
different job whatever like you get
(00:01:26)
unemployment. You know what happens if
(00:01:28)
everyone dies? At the same time as we
(00:01:30)
have this hyperexponential progress in
(00:01:32)
capability. We're making barely any
(00:01:35)
progress in controlling those systems.
(00:01:37)
We cannot have an adversarial
(00:01:38)
relationship with super intelligence and
(00:01:40)
win. You'd likely not see any change in
(00:01:42)
your environment until lights out.
(00:01:45)
>> Roman, it looks like there is no
(00:01:46)
solution for this. So,
(00:01:50)
since the beginning of this channel and
(00:01:53)
this podcast, I always had a very
(00:01:55)
dichotomous view of AI. I always talk about
(00:01:58)
the great benefits that it brings us,
(00:02:00)
but also about the dangers that it has.
(00:02:03)
And if someone knows about the risks of
(00:02:06)
AI, that's Roman Yampolskiy, the person that
(00:02:09)
invented the term AI safety. He's been
(00:02:12)
working for decades on AI safety and he
(00:02:15)
thinks there is no solution. If we
(00:02:17)
invent super intelligence, we will all
(00:02:19)
die. Today, we're going to talk with
(00:02:22)
Roman about why he thinks this way and
(00:02:25)
what are the things that we can still do
(00:02:27)
to avoid a dooming end. He's probably
(00:02:30)
one of the biggest doomers in the AI
(00:02:33)
world with a p(doom) of about 99%. So
(00:02:36)
definitely he thinks that there is no
(00:02:38)
way to avoid this unless we don't build
(00:02:40)
it. Enjoy the talk but most importantly
(00:02:45)
reflect over the talk.
(00:02:49)
Roman, you created the term AI safety
(00:02:52)
and you've been working on this for 15
(00:02:54)
20 years and you think that AI may kill
(00:02:58)
us all. Can we break this down? What is
(00:03:00)
it that you think is dangerous
(00:03:02)
from AI?
(00:03:03)
>> Advanced AI. So not the tools you have
(00:03:06)
right now, not your spell check or not
(00:03:08)
your GPS unit. We think and many people
(00:03:12)
in the industry agree that we are on the
(00:03:15)
verge of creating something human level
(00:03:18)
and very quickly going beyond that, to super
(00:03:20)
intelligence. At the same time as we
(00:03:23)
have this hyperexponential progress in
(00:03:25)
capability, we're making barely any
(00:03:28)
progress in controlling those systems.
(00:03:30)
So if you create something thousands of
(00:03:32)
times smarter than all humans who have
(00:03:34)
ever existed, but you don't know how to
(00:03:37)
control it, you can get some negative
(00:03:38)
outcomes out of it. >> That's a very
(00:03:41)
logical reasoning. But um someone would
(00:03:44)
say that we have created other
(00:03:45)
technologies before. What's the main
(00:03:48)
difference between all the technologies
(00:03:49)
that we had in the past and AI?
(00:03:51)
>> Tools versus agents. All the previous
(00:03:54)
inventions were tools. You invented a
(00:03:56)
wheel. You invented a knife. You
(00:03:58)
invented even nuclear weapons. Some
(00:04:01)
human somewhere had to deploy it. Had to
(00:04:03)
decide how to use it. Even if it's dual
(00:04:05)
use technology, a hammer can be used to
(00:04:08)
build a house or to kill someone. But a
(00:04:10)
human decides. What we are starting to
(00:04:12)
create are agents, independent decision
(00:04:15)
makers capable of setting up their own
(00:04:17)
goals or at least intermediate goals on
(00:04:20)
a path to the goal we set for them. We
(00:04:22)
don't control those intermediate goals.
(00:04:25)
So the decision is not predictable. A
(00:04:27)
lot of times it's like the difference
(00:04:29)
between guns and pitbulls. Guns don't
(00:04:32)
kill people. People with guns kill
(00:04:34)
people, but a pitbull decides which baby
(00:04:36)
to eat.
(00:04:37)
>> Okay. Okay, that makes lots of sense. Um
(00:04:40)
at what point did you get seriously
(00:04:43)
concerned because as I said you've been
(00:04:45)
working for a long time. When did you
(00:04:46)
start working on AI?
(00:04:48)
>> So I was doing my PhD on uh safety for
(00:04:52)
online casinos and preventing bots.
(00:04:54)
Okay.
(00:04:54)
>> So that was kind of early glimpses of
(00:04:57)
what what is to come. Uh once I started
(00:05:01)
kind of trying to predict what's coming
(00:05:03)
next capabilities
(00:05:05)
improvement within bots it became
(00:05:07)
obvious. Okay, there is a whole domain
(00:05:09)
here. We can work for a very long time
(00:05:10)
to make AI beneficial for humanity, safe
(00:05:13)
and secure. But once we started really
(00:05:16)
trying to understand capabilities of
(00:05:19)
systems smarter than humans, the
(00:05:21)
realization is that there is very little
(00:05:23)
we know how to do in that space and
(00:05:26)
every time we zoom in on some specific
(00:05:28)
problem, there are 10 additional
(00:05:29)
problems. It's kind of like a fractal
(00:05:32)
infinite
(00:05:34)
dimensional super vector of problems.
(00:05:37)
There is never oh we solved it we're
(00:05:39)
done we can go home there is nothing
(00:05:40)
else to do here it's just additional
(00:05:42)
problems at every level
(00:05:44)
>> and then at what point did you get
(00:05:47)
concerned
(00:05:49)
>> A lot of those simple tools which you
(00:05:52)
would think you need to control an
(00:05:54)
advanced agent ability to explain its
(00:05:56)
behavior comprehend what it's doing
(00:05:58)
predict it we have a paper surveying
(00:06:02)
like 50 of those, and all of them have
(00:06:04)
limits, upper limits on what can be done. The
(00:06:07)
overall conclusion is maybe we can't
(00:06:09)
control something smarter than humans
(00:06:11)
indefinitely just because we don't have
(00:06:13)
the ingredients necessary to make this
(00:06:16)
possible.
(00:06:17)
>> So at what point did you get uh really
(00:06:21)
concerned in your career? Like, was it
(00:06:23)
like early on or it took a while until
(00:06:25)
you realized that was really dangerous?
(00:06:27)
>> So early years first five six years I
(00:06:30)
was working on making safe AI. I was
(00:06:33)
sure it's possible. We just need to
(00:06:34)
understand the nature of a problem,
(00:06:36)
figure out detail. So formalizing it was
(00:06:38)
part of it. Giving it even a meaningful
(00:06:40)
name. Previous names were kind of not
(00:06:43)
scientific enough in my opinion. But uh
(00:06:46)
formalizing the type of problems we're
(00:06:48)
likely to deal with at what stage at
(00:06:51)
training stage at deployment stage
(00:06:53)
listing all the possible issues value
(00:06:56)
alignment problems problems with data
(00:07:00)
communications not being ambiguous.
(00:07:02)
Basically surveying everything that can
(00:07:04)
go wrong. But once that part was
(00:07:08)
finalized the actual work had to begin.
(00:07:11)
And for every one of those we quickly
(00:07:13)
hit upper limits.
(00:07:15)
>> Okay. And um what kind of scenarios you
(00:07:19)
have in your head that could like happen
(00:07:22)
like um they talk about we heard like
(00:07:24)
Hinton and Bengio talk about existential
(00:07:26)
risks and um how bad do you think this
(00:07:30)
can get because your p(doom), your
(00:07:32)
possibility of this ending in a bad
(00:07:34)
place is really high like 99%. So how
(00:07:37)
bad do you think is this scenario?
(00:07:40)
>> So I think existential risk is an
(00:07:42)
absolute possibility. It can actually
(00:07:44)
wipe out humanity as a whole and there
(00:07:47)
are many reasons it can do it uh which
(00:07:50)
we cannot predict again not being super
(00:07:53)
intelligent it's not something we're
(00:07:55)
capable of understanding it's unknown
(00:07:57)
unknowns for us I can give some reasons
(00:08:00)
why from my point of view it would make
(00:08:02)
sense maybe it needs to modify the
(00:08:04)
planet maybe it needs to have data
(00:08:06)
centers at certain temperature and so
(00:08:08)
cooling down the whole planet is an
(00:08:10)
advantage. Maybe it's worried about us
(00:08:12)
creating competing super intelligence.
(00:08:15)
There are quite a few reasons I can think
(00:08:16)
of but none of them are important
(00:08:18)
because I'm not at that level of
(00:08:20)
thinking and will definitely miss the
(00:08:22)
real reasons.
(00:08:23)
>> Okay, because we are talking here in any
(00:08:25)
case whenever this becomes something
(00:08:28)
more intelligent than humans. Is this
(00:08:30)
like maybe we we have to define this for
(00:08:32)
the audience but is this the definition
(00:08:34)
of AGI or ASI? What is your
(00:08:36)
>> So AGI typically is human-level
(00:08:40)
intelligence. So it's a drop-in
(00:08:41)
employee, someone you can take put in
(00:08:44)
your team, they'll do accounting, they
(00:08:46)
do taxes, whatever a human can quickly
(00:08:48)
learn. But if they start doing science
(00:08:50)
and engineering and develop the next
(00:08:53)
generation of AI, very quickly you have
(00:08:55)
this recursive self-improvement. It
(00:08:57)
becomes better and better at getting
(00:08:59)
better at learning at additional
(00:09:01)
scientific discoveries in that field. So
(00:09:03)
at that point it's smarter than any
(00:09:05)
human in any domain and it keeps getting
(00:09:07)
smarter. People usually stop thinking at
(00:09:09)
that step. So you got super intelligence
(00:09:11)
you done but you're going to get super
(00:09:13)
intelligence 2.0 3.0. The process will
(00:09:16)
continue indefinitely.
(00:09:18)
>> So AGI would not be necessarily an
(00:09:20)
existential risk. It will be when we go
(00:09:22)
beyond and we have an explosion of
(00:09:24)
intelligence that then it becomes much
(00:09:26)
much smarter than us to the point where
(00:09:28)
we cannot understand it and that's when
(00:09:30)
we lose control. >> So AGI could be quite
(00:09:33)
dangerous. People can use it to automate
(00:09:35)
a lot of crimes, a lot of terrorism, a
(00:09:38)
lot of military applications, but it's
(00:09:40)
still something we can probably
(00:09:42)
understand and compete with. Very quickly,
(00:09:44)
in my opinion and people argue about it.
(00:09:46)
Uh slow takeoff, hard takeoff. Uh very
(00:09:50)
quickly once you have automated science
(00:09:52)
and engineering, you're getting
(00:09:53)
something beyond human capacity, super
(00:09:56)
intelligence. And that's where we're not
(00:09:58)
competitive. We cannot have an
(00:10:00)
adversarial relationship with super
(00:10:01)
intelligence and win.
(00:10:03)
>> I think what people don't really
(00:10:04)
understand is that
(00:10:07)
people try to get their heads around
(00:10:09)
how is it going to happen but they don't
(00:10:12)
really understand that you cannot
(00:10:14)
imagine the things that something that
(00:10:16)
is much superior to your intelligence
(00:10:18)
would do. So basically uh is there any
(00:10:20)
way we can prevent or predict what the
(00:10:24)
scenario would be and we could avoid it
(00:10:26)
or is there absolutely no option?
(00:10:28)
>> I think that's the main difficulty. You
(00:10:31)
cannot. Those are the results we're getting.
(00:10:33)
Unpredictability is one of those
(00:10:35)
results. You cannot anticipate what a
(00:10:37)
smarter agent will do. You can only work
(00:10:40)
within your world model within your
(00:10:43)
framework. To understand it, it helps to
(00:10:45)
go the other way. So you are the smart
(00:10:47)
one. Now let's look at ants or squirrels
(00:10:49)
or something like that. Can they really
(00:10:51)
comprehend our plans, what we're doing?
(00:10:53)
Can they outcompete us in some
(00:10:55)
non-trivial domain? Of course not. Not
(00:10:58)
even close. And the gap is maybe I don't
(00:11:00)
know 90 IQ points. But if the gap is a
(00:11:03)
million IQ points, that's even worse,
(00:11:06)
right? Because then it gets to the point
(00:11:08)
where basically we cannot understand
(00:11:11)
what would be the reasons. But we don't
(00:11:13)
think of a scenario like Terminator
(00:11:15)
where AI is looking for us and killing
(00:11:17)
each one of us. But more likely that as
(00:11:19)
a side effect of their plans, we are
(00:11:21)
like as you said like the same that when
(00:11:24)
you have ants in your kitchen, you just
(00:11:26)
kill them. You don't really think too
(00:11:27)
much about them. So it's more like this
(00:11:29)
more than a terminator.
(00:11:30)
>> And we develop special chemicals to kill
(00:11:33)
pests. We do it with special sounds and
(00:11:38)
things they definitely have no knowledge
(00:11:40)
of. And I think sufficiently advanced AI
(00:11:43)
can do novel physics research to
(00:11:44)
discover new ways to take us out. It
(00:11:46)
doesn't have to be synthetic biology or
(00:11:49)
chemical weapons, things we know about.
(00:11:51)
It could be something completely
(00:11:52)
unprecedented.
(00:11:53)
>> Yeah, I have the feeling that people
(00:11:54)
really get stuck there. They think like,
(00:11:56)
okay, so um yeah, this will be a virus, or
(00:12:00)
maybe we can fight this virus, but then
(00:12:01)
there are points where we will not even
(00:12:03)
realize that we are being wiped out.
(00:12:05)
>> You'd likely not see any change in your
(00:12:08)
environment until lights out. You're
(00:12:10)
just like everything's fine and then
(00:12:12)
Yeah.
(00:12:12)
>> Okay. Okay. That's that's quite that's
(00:12:15)
quite a scene. But um this scenario is
(00:12:18)
all about if we you said before if we
(00:12:21)
build ASI or AGI and then that becomes
(00:12:24)
ASI after right after that. No, but how
(00:12:28)
close are we to that? Can we believe
(00:12:30)
when they're saying that AGI is around the
(00:12:32)
corner, or are we at a point where this is
(00:12:34)
nowhere near our lifetime?
(00:12:37)
>> I trust prediction markets. I trust
(00:12:40)
people at the top of labs who are saying
(00:12:42)
we're just a few years away. If you take
(00:12:44)
progress over the last 5 to 10 years and
(00:12:47)
project it forward, we're definitely
(00:12:49)
crossing that line of average human for
(00:12:52)
sure. If you look at the last three,
(00:12:55)
four years, we had something which was
(00:12:57)
probably elementary school child, then a
(00:13:00)
high school. Now college students
(00:13:02)
definitely
(00:13:04)
being challenged with the latest models
(00:13:06)
and even maybe PhD students and young
(00:13:09)
professors. So with the same rate of
(00:13:11)
progress, give it a year or two, you
(00:13:13)
have PhD level researchers. And I think
(00:13:15)
many labs are now stating explicitly
(00:13:17)
their goal is to create automated
(00:13:19)
researcher to help them move this
(00:13:21)
forward. So if it continues on the same
(00:13:24)
trajectory and the latest release from
(00:13:27)
Google, they're saying there is no slowing
(00:13:29)
down you still have uh scalability
(00:13:32)
at all levels at pre-training at post
(00:13:35)
training so it seems like we're going to
(00:13:37)
get there on schedule. >> Not long ago
(00:13:40)
Sam Altman said that in September '26 we
(00:13:44)
will have an AI that can develop or can
(00:13:47)
help to develop new science, and then in '28
(00:13:50)
I think he said that it will do it by
(00:13:53)
itself totally independent. So that
(00:13:55)
would be the point that you will define
(00:13:56)
as AGI.
(00:13:58)
>> So again AGI is about automating most
(00:14:00)
human labor. Some people said useful
(00:14:02)
human labor but really all human
(00:14:04)
activities and today we're starting to
(00:14:07)
see reports from mathematics from
(00:14:09)
computer science from other domains
(00:14:11)
where AI is helping to make novel
(00:14:13)
discoveries. It's not a primary, it's
(00:14:15)
not a PI, but it's definitely assisting
(00:14:17)
top scholars.
(00:14:19)
>> And is the existential risk the only
(00:14:22)
risk from AI or do you think there is
(00:14:25)
other things like employment or like uh
(00:14:28)
the loss of truth? Is there any other
(00:14:30)
things that we have to worry because I
(00:14:32)
remember I watched um not long ago it
(00:14:34)
was maybe was more than a year ago there
(00:14:37)
was a debate in somewhere in Canada and
(00:14:40)
there was in one side it was um I
(00:14:42)
remember it was Max Tegmark and Yoshua
(00:14:44)
Bengio, and on the other side it was Yann
(00:14:46)
LeCun and Mitchell, if I remember correctly,
(00:14:49)
and one of the things they were saying
(00:14:50)
is like when you talk about existential
(00:14:52)
risks you're taking the attention away
(00:14:55)
from the short-term problems that AI is
(00:14:57)
going to bring. You talk a lot about
(00:14:59)
existential risk. Do you think we have
(00:15:01)
to focus on the existential risk more
(00:15:03)
than on other short-term problems that are
(00:15:06)
coming like employment?
(00:15:07)
>> I think it was in Davos if I remember
(00:15:09)
correctly, that debate, but it's
(00:15:11)
interesting. So it used to be that
(00:15:13)
people talked about short-term risk and
(00:15:15)
long-term risk
(00:15:16)
>> but I think they flipped.
(00:15:18)
>> So existential risk will come when we
(00:15:20)
get smarter than us systems and the
(00:15:23)
prediction is they're coming very soon.
(00:15:25)
risk to things like unemployment can
(00:15:27)
take much longer. Yeah, we have
(00:15:29)
capability to automate those jobs but it
(00:15:31)
takes a very long time to deploy something
(00:15:31)
through the economy. Okay, the example I
(00:15:36)
always use is video phones. AT&T had
(00:15:39)
working video phones in the 1970s. Nobody
(00:15:42)
had them. There was no one you can dial
(00:15:44)
and they would pick up. So until the iPhone
(00:15:46)
showed up, video phones were not
(00:15:48)
deployed. And it's the same. Take
(00:15:50)
self-driving cars. They exist today.
(00:15:52)
There are people around the world right
(00:15:53)
now in self-driving cars getting places.
(00:15:55)
I had to drive here. Why? Same exact
(00:15:59)
problem. So it may take decades to fully
(00:16:01)
deploy even existing AI capabilities
(00:16:04)
through the economy to get all the
(00:16:06)
benefits. But things like existential
(00:16:08)
risk are actually coming sooner. That's one
(00:16:11)
argument. Second one is just the impact.
(00:16:15)
If you lose your job, you know what
(00:16:16)
happens? Nothing happens. You get a
(00:16:18)
different job whatever like you get
(00:16:19)
unemployment. You know what happens if
(00:16:21)
everyone dies,
(00:16:22)
>> right? So you can't even compare to to
(00:16:25)
say that somehow these are competing
(00:16:28)
problems. Like
(00:16:30)
historically we had people who worried
(00:16:32)
about climate change and somebody was
(00:16:33)
cleaning up the snow.
(00:16:35)
>> Mhm.
(00:16:35)
>> Like okay, they're both doing things and you
(00:16:37)
can argue one is taking resources from
(00:16:40)
the other but they're not comparable.
(00:16:43)
>> They're not comparable but I think I
(00:16:46)
think it's very interesting that the
(00:16:47)
first argument you said that it may come
(00:16:49)
sooner. That would be definitely a
(00:16:50)
reason to worry about this because I
(00:16:52)
have the feeling that we have to worry
(00:16:54)
about the earlier problems first. That's
(00:16:56)
normally. I, for example, I have two
(00:16:59)
kids uh 11 and nine and when I think
(00:17:02)
about the impact of AI on education I'm
(00:17:04)
worried about that but I'm more worried
(00:17:06)
about other things because I think
(00:17:06)
that's a 10-year problem. So I'm
(00:17:08)
thinking like there is many other things
(00:17:10)
we have to worry on the way. So I always
(00:17:12)
had the feeling that a very smart LLM
(00:17:16)
like ChatGPT is doing. It's already
(00:17:18)
like affecting some jobs. We've seen it
(00:17:19)
like couple of weeks ago like Amazon
(00:17:22)
fired 14,000 people. So we we see that
(00:17:25)
it's starting to affect the job market.
(00:17:27)
But I never considered and now you're
(00:17:29)
making me think and I like that that the
(00:17:32)
existential risk will come sooner than
(00:17:34)
the impact in economy or jobs because I
(00:17:37)
assumed always it was progressive. By
(00:17:39)
the time we had an AI that is good
(00:17:41)
enough to affect jobs, this will not be
(00:17:43)
good enough to wipe us out. But as
(00:17:46)
you're right, it will take time until
(00:17:48)
this AI is in every single company in
(00:17:50)
the world and they start firing all
(00:17:51)
their employees. So, so maybe between
(00:17:55)
the time that this happens, the other
(00:17:57)
one comes in. But that is assuming that
(00:18:00)
it happens. And that is my my biggest my
(00:18:03)
biggest problem with this theory is
(00:18:05)
about I am pretty pessimistic about AI.
(00:18:07)
But I'm more pessimistic about humans than
(00:18:09)
AI. So about that we will kill each
(00:18:11)
other more than AI killing us all. But
(00:18:14)
my main thing is that the
(00:18:16)
existential risk is something related to
(00:18:18)
if we get there. I'm still not 100% sure
(00:18:22)
confident that we will get to an AGI. I
(00:18:24)
think we may touch a ceiling. Do you
(00:18:26)
think there is any possibility on that
(00:18:28)
or you think it's clear that we're going
(00:18:30)
towards it?
(00:18:30)
>> I would be so surprised if we just at
(00:18:33)
this point hit like complete diminishing
(00:18:35)
returns. Every week there is a new paper
(00:18:38)
showing progress in some subdomain or
(00:18:40)
subfield. The resources they're building up
(00:18:43)
trillion dollars worth of
(00:18:44)
infrastructure. For all of it to have no
(00:18:47)
improvement over existing models would
(00:18:49)
be super surprising. But even if it was
(00:18:52)
close to that, they're already smarter than
(00:18:54)
average people. I basically stopped
(00:18:56)
recruiting new human students. I see no
(00:18:58)
point. By the time they go through the
(00:19:00)
regular training process, two, three years
(00:19:02)
later, I don't think they're going to be
(00:19:03)
competitive with the latest AI models.
(00:19:06)
>> Yeah, it happens the same in my
(00:19:07)
companies. Um, we stopped hiring juniors
(00:19:11)
because normally if you get a senior and
(00:19:13)
you give him like a license for Gemini
(00:19:15)
or ChatGPT,
(00:19:17)
it's much better than giving him a like
(00:19:19)
an intern and and that is starting to
(00:19:22)
affect the job market. So So I'm I'm
(00:19:24)
really concerned about the job market,
(00:19:25)
especially in Spain, we have like 25%
(00:19:28)
unemployment on young people. We have
(00:19:30)
the highest in Europe and the total
(00:19:33)
unemployment is as well 10 point
(00:19:34)
something percent. So it's really high.
(00:19:36)
It's the highest in Europe. So obviously
(00:19:37)
in our market it is already like
(00:19:39)
saturated with unemployment. Um I think
(00:19:42)
this could be really catastrophic. So
(00:19:44)
I'm always really worried about that
(00:19:45)
thinking that the other problem may be
(00:19:47)
later on but as you're showing may not
(00:19:50)
be so much later on. >> No, I agree with
(00:19:51)
you. And there are other problems. You
(00:19:53)
brought up deep fakes, impact on
(00:19:54)
elections, impact on social uh meaning,
(00:19:59)
all sorts of problems, but they are easy
(00:20:02)
problems in the sense that we know what
(00:20:04)
the problem is and we have some ideas
(00:20:06)
for how to solve them. No one knows how
(00:20:08)
to work on controlling super
(00:20:09)
intelligence. Maybe because they don't
(00:20:11)
exist yet, but also maybe because the
(00:20:13)
problem is impossible. And typically
(00:20:15)
when you give people a choice of what to
(00:20:17)
work on, they pick something they can
(00:20:18)
show progress on. They can publish a
(00:20:21)
paper, they can do something. Whereas
(00:20:23)
with super intelligence, so far no one
(00:20:26)
published a paper, a patent, even a
(00:20:28)
rigorous blog post arguing this is how
(00:20:30)
we can control agents of any capability.
(00:20:33)
>> Yeah, that's that's surprising to be
(00:20:35)
honest because what we see is things
(00:20:38)
like Yann LeCun for example, he said we
(00:20:40)
will not invent a car without inventing
(00:20:42)
the brakes first. But he's not saying
(00:20:44)
how we're going to make brakes. So what do
(00:20:46)
you think of people that are saying
(00:20:48)
like don't worry it will come by itself.
(00:20:50)
Um is that a good strategy?
(00:20:53)
>> It seems like a terrible strategy. So
(00:20:56)
you have to first show that at least in
(00:20:58)
theory it's possible. In principle it's
(00:21:02)
possible for lower capability agents to
(00:21:04)
control much higher capability agents
(00:21:06)
indefinitely. You don't know how to
(00:21:08)
implement it. That's fine. But there are
(00:21:10)
so many things we've done in theory first
(00:21:12)
and then later we had uh developed
(00:21:15)
actual hardware and everything predicted
(00:21:18)
in theory. Physics is a great example.
(00:21:20)
We had understanding of you know space
(00:21:23)
satellites before we built space
(00:21:24)
satellites. Time delay everything was
(00:21:26)
pre-calculated ahead of time. Whereas
(00:21:28)
here no one is giving even a theoretical
(00:21:32)
explanation for how it will work and
(00:21:34)
then later we'll train it to do it.
(00:21:36)
>> And and why is that? Why is why are the
(00:21:38)
labs not investing in this if it's
(00:21:41)
such an obvious problem?
(00:21:42)
>> I think they're investing and I think
(00:21:44)
there is a trillion dollars to whoever
(00:21:45)
solves it. I just think it's impossible.
(00:21:47)
You're asking for someone to create a
(00:21:49)
perpetual safety device by analogy with
(00:21:52)
perpetual motion device. It's
(00:21:54)
impossible. You can make a specific
(00:21:56)
model very safe. GPT-7 could be with
(00:21:59)
enough resources made very reasonable.
(00:22:03)
But what they're asking for is
(00:22:06)
every future model, GPT-50, 200, with
(00:22:11)
every data set, every interaction, every
(00:22:13)
environment, self-improvement,
(00:22:15)
malevolent actors to never make a single
(00:22:18)
error. That's crazy. That's not going to
(00:22:22)
happen.
(00:22:23)
>> And even if if it's impossible, that's
(00:22:26)
the reason why they don't invest more in
(00:22:28)
safety because um I think OpenAI fired
(00:22:30)
the whole safety department. And I think
(00:22:32)
it was Gemini that they fired the whole
(00:22:35)
um ethics department.
(00:22:37)
>> Um what's the reason why they instead of
(00:22:40)
like increasing the investment trying to
(00:22:42)
fix this problem because I think like
(00:22:44)
the the protein folding problem, it
(00:22:46)
seemed like an impossible problem at
(00:22:47)
some point and then it got fixed and AI
(00:22:50)
helped us there. So So you say this is
(00:22:53)
probably an impossible problem then I
(00:22:55)
can understand why they don't want to put
(00:22:56)
any dollar in it because if it's
(00:22:58)
impossible what's the point in investing
(00:23:00)
any money? >> I think protein folding
(00:23:01)
problem was always solvable with enough
(00:23:04)
compute. You just knew it was like
(00:23:06)
NP-complete or NP-hard or one of those where
(00:23:09)
you don't have enough compute. So with
(00:23:11)
smart heuristics you can get
(00:23:12)
approximations which are still very
(00:23:14)
beneficial. Here if I give you infinite
(00:23:17)
compute we still don't know how to do
(00:23:18)
safety and that's a great test for it.
(00:23:21)
We know how to convert dollars to more
(00:23:25)
capability. If you gave me a trillion
(00:23:26)
dollars right now, I can train a very
(00:23:28)
capable model. Probably the best model
(00:23:31)
in the world without any new inventions.
(00:23:33)
>> Y
(00:23:33)
>> but I don't know how to convert dollars
(00:23:35)
to safety. And people who go give me a
(00:23:38)
billion dollars in three years I'll
(00:23:39)
solve it for you. They just want your
(00:23:41)
billion dollars. They don't have a way
(00:23:42)
to actually do this conversion. And so
(00:23:44)
you're right there is if you read the
(00:23:46)
articles through the last decade there
(00:23:48)
is a whole graveyard of all these Google
(00:23:50)
ethics boards super intelligence
(00:23:52)
alignment teams all of them they invent
(00:23:55)
them and then they close them down
(00:23:58)
within months usually the super
(00:24:00)
alignment team said they're going to
(00:24:01)
solve alignment in four years.
(00:24:03)
>> Mhm.
(00:24:04)
>> They were done in like four months. >> And
(00:24:06)
you think the only reason is that
(00:24:08)
they find it to be impossible and then
(00:24:10)
it's not worth it to keep investing or
(00:24:12)
you think there is a lot of interest on
(00:24:13)
not having that.
(00:24:14)
>> So one it's hard to justify slowing down
(00:24:17)
in a competitive environment but two if
(00:24:20)
it's like perpetual motion that means
(00:24:22)
they all work on something correlated
(00:24:25)
better batteries better wires but never
(00:24:28)
on the actual device because the device
(00:24:30)
cannot be done. >> Mhm.
(00:24:31)
>> so maybe somebody notices that like guys
(00:24:33)
we give you all this money and all this
(00:24:35)
compute and you're kind of putting filters
(00:24:37)
on the model you said don't say that
(00:24:39)
word don't talk about this topic that's
(00:24:41)
great but that's not going to get us to
(00:24:43)
a controlled model
(00:24:45)
>> yeah I think that I saw something like
(00:24:46)
that from Anthropic. They released
(00:24:48)
>> um this kind of challenge for people to
(00:24:51)
go through 10 levels of alignment of the
(00:24:54)
model and then like just a couple of
(00:24:56)
like a week later someone had beat it.
(00:24:58)
So it's like if the whole Anthropic
(00:25:00)
cannot impede like one guy like Pliny
(00:25:02)
the Liberator in
(00:25:03)
>> jailbreaking
(00:25:05)
within 24 hours get jailbroken that's
(00:25:07)
just a given
(00:25:08)
>> that's that's like that's part of the
(00:25:11)
technology you know the way that LLMs
(00:25:13)
are you cannot really protect it because
(00:25:15)
at the end of the day the neural network
(00:25:17)
it just processes both your guard rails
(00:25:19)
and the prompt and the injection they're
(00:25:21)
trying to do at the same time. So at the
(00:25:23)
end I I don't know if it's as you say
(00:25:24)
maybe it's an impossible problem but
(00:25:26)
that that is what really worries me
(00:25:28)
because then the only alternative is
(00:25:29)
like to not build it.
(00:25:32)
>> Yeah you don't have to build a general
(00:25:35)
super intelligence. You want specific
(00:25:36)
problem solved build narrow super
(00:25:39)
intelligence systems. Protein folding
(00:25:41)
problem is exactly that. You had a
(00:25:44)
specific well-defined problem. You
(00:25:45)
trained on data related to the problem.
(00:25:48)
It wasn't capable of playing chess or
(00:25:50)
driving cars. It was dedicated to
(00:25:51)
solving this problem and we did a great
(00:25:53)
job. You still can make money, win Nobel
(00:25:56)
prizes, all the benefits. You just don't
(00:25:58)
have to die in the process.
(00:26:00)
>> Okay. That's that's I think that reading
(00:26:02)
you and and and getting to know your
(00:26:05)
point of view, it has made me go through
(00:26:08)
a very hard time of thinking and getting
(00:26:11)
to realize something that I think you
(00:26:13)
say that is we don't need AGI. We need
(00:26:16)
just
(00:26:18)
AI that does certain things, but we
(00:26:20)
don't need a general AI that does
(00:26:22)
everything. It's like humans, we can
(00:26:24)
have all the benefits. We can cure
(00:26:26)
cancer. We can automate all jobs and
(00:26:29)
then live in abundance without having a
(00:26:32)
technology smarter than us. So that's
(00:26:34)
your proposal like
(00:26:35)
>> That is the proposal. Stop building
(00:26:37)
general, concentrate on tools. It is
(00:26:39)
cheaper. You don't need a giant model.
(00:26:42)
It is also probably more effective in
(00:26:44)
that narrow domain than a general one.
(00:26:46)
You can make it really optimized and
(00:26:48)
again no side effect of having to have a
(00:26:51)
huge safety team doing nothing for you.
(00:26:53)
>> Okay. But then you you think we don't
(00:26:56)
need that and we can be happy with GPT7
(00:26:59)
or whatever GPT7 that we'll do or Gemini
(00:27:02)
6 that we'll just do good enough and
(00:27:05)
then we have to stop there or you say
(00:27:07)
that we should get rid of LLMs totally.
(00:27:10)
I don't know all the capabilities of
(00:27:12)
existing models. The problem is that
(00:27:14)
testing and monitoring is also
(00:27:16)
impossible.
(00:27:17)
>> You can do some testing because you know
(00:27:20)
what to look for with narrow systems.
(00:27:22)
There are edge cases. You're looking for
(00:27:24)
I mean sometimes it's zero, sometimes
(00:27:26)
it's 100 weird outliers within the
(00:27:29)
testing set. With general models, there
(00:27:31)
is no edges. It's working across
(00:27:33)
multiple domains. So you don't know what
(00:27:35)
to look for to find the problems. You
(00:27:38)
find something, you fix it, you report,
(00:27:40)
I found seven bugs, I fix them, but it
(00:27:42)
doesn't say anything about what remains
(00:27:44)
within a model undiscovered.
(00:27:46)
>> Okay,
(00:27:47)
>> even after the model is released, we
(00:27:49)
still discover new capabilities.
(00:27:51)
>> Or if I tell it, you know, think deeper,
(00:27:53)
it works 10% better. Like what really?
(00:27:56)
That's crazy. So things like that. So we
(00:27:58)
never can guarantee even that existing
(00:28:00)
models don't have back doors, don't have
(00:28:03)
problems of that nature, cannot be
(00:28:05)
jailbroken with additional prompts. So I
(00:28:09)
I think stopping as soon as possible
(00:28:11)
would be great and then again switching
(00:28:14)
to narrow domain tools. And if I'm wrong
(00:28:17)
and tomorrow somebody publishes that
(00:28:19)
paper in nature saying this is how you
(00:28:22)
can control super intelligence doesn't
(00:28:24)
matter how smart it scales. Everyone
(00:28:26)
agrees obviously it's such a brilliant
(00:28:28)
paper then we can change our mind but
(00:28:30)
basically it has to be until you can
(00:28:33)
show your product or service is safe you
(00:28:35)
cannot build it and it's not my job to
(00:28:38)
show your product to be unsafe whatever
(00:28:40)
it's airline industry drug industry the
(00:28:44)
responsibility is with the product
(00:28:46)
creator
(00:28:48)
>> yeah that's something that um I was in
(00:28:50)
United Nations a couple of months ago
(00:28:52)
talking about AI and that was the thing
(00:28:54)
that I I really emphasize there is like
(00:28:57)
we need to start deciding what we want
(00:28:59)
AI to do and stop reacting to what AI
(00:29:01)
does. That's what seems that we're doing
(00:29:03)
as a society now. They just, Sam Altman
(00:29:06)
or whoever they just go online they
(00:29:08)
publish a new model and then all of a
(00:29:10)
sudden we all just have to run to fix
(00:29:12)
things that we did not know that that
(00:29:14)
was happening. But the most interesting
(00:29:15)
part and I think you you're the right
(00:29:17)
person to talk about this is that the
(00:29:19)
main problem that people don't
(00:29:20)
understand and people are not aware of
(00:29:22)
that most of the audience I don't think
(00:29:24)
they really get it is that these are
(00:29:26)
like black boxes. Can you explain what
(00:29:29)
is uh why AI or general AI is like
(00:29:32)
ChatGPT, they are black boxes. What is
(00:29:34)
>> the first 50 years of AI research we
(00:29:37)
were engineering AI systems
(00:29:41)
simplistically saying they were decision
(00:29:43)
trees. Somebody wrote a bunch of if
(00:29:45)
statements. There was a knowledge
(00:29:47)
engineer and they said if this happens
(00:29:49)
do this otherwise do that. And you can
(00:29:52)
read and trace them. They they were
(00:29:54)
getting bigger and a little more complex
(00:29:56)
but you could always kind of figure out
(00:29:57)
what's going on. Neural networks are
(00:30:00)
very different. They are matrices of
(00:30:02)
numbers and they got really large large
(00:30:05)
language models. So there are billions
(00:30:07)
of nodes, trillions of weights. So
(00:30:09)
looking at them tells you nothing. You
(00:30:11)
can poke at it like we do in
(00:30:13)
neuroscience. You can isolate a single
(00:30:15)
neuron and go, every time it sees a water
(00:30:18)
bottle, this thing lights up. So, it's a
(00:30:19)
neuron for detecting water bottles.
(00:30:21)
That's about the state-of-the-art right
(00:30:23)
now in both neuroscience and uh
(00:30:26)
understanding mechanistic
(00:30:28)
interpretability. You can do multiple
(00:30:30)
neurons, you can do clusters, but the
(00:30:32)
whole point I'm trying to make is the
(00:30:33)
upper limit on what we can comprehend.
(00:30:36)
If I tell you, well, here's an
(00:30:38)
explanation including billion variables,
(00:30:40)
it tells you nothing. You're going to be
(00:30:42)
just as confused. So, the real true
(00:30:45)
answer for how a model achieves a
(00:30:47)
certain goal, makes a certain decision,
(00:30:50)
is the model. So, I can give you all the
(00:30:52)
weights and you can look at it all you
(00:30:54)
want. It's not surveyable to you. So,
(00:30:56)
the alternative is lossy compression. I
(00:30:59)
can reduce it to top five reasons why we
(00:31:02)
denied you loan. It's useful information
(00:31:04)
but you you probably not going to get
(00:31:07)
full picture and we can hide information
(00:31:09)
from you using that. So that's what it
(00:31:11)
is today. Even people creating those
(00:31:15)
models don't fully comprehend how they
(00:31:18)
make decisions
(00:31:19)
>> because it's too complex.
(00:31:21)
>> It's too complex and it's not uh kind of
(00:31:24)
easy for humans to understand. The
(00:31:27)
format is just matrices numbers. So we
(00:31:31)
don't know what that represents a lot of
(00:31:33)
times.
(00:31:34)
>> So for me the most representative thing
(00:31:37)
about the black boxes is the emerging
(00:31:39)
capabilities when a model does something
(00:31:41)
that was not trained for. I remember the
(00:31:43)
first time I heard this concept I think
(00:31:46)
it was with one of the first versions of
(00:31:48)
Google Bard or something that they said
(00:31:50)
was only trained in English and then after
(00:31:52)
some interactions with someone from
(00:31:54)
Bangladesh it started speaking the local
(00:31:56)
language. So basically the model did
(00:31:58)
something that was not trained for but
(00:31:59)
it was in its data set and then more
(00:32:02)
recently we have some examples from
(00:32:04)
Anthropic when Claude, they put it on the red
(00:32:06)
teaming and then they made it believe
(00:32:09)
that they were going to kill it or take
(00:32:10)
it off and then he looked into emails
(00:32:12)
and threatened with sending pictures
(00:32:14)
with the guy with his lover to his wife.
(00:32:16)
So these kind of abilities that they
(00:32:20)
come out they are a bit kind of funny at
(00:32:22)
the moment but of course they are really
(00:32:24)
worrying when the models get better. So
(00:32:28)
the reason that we don't understand
(00:32:29)
these models and they do things that are
(00:32:32)
like not predicted like that we don't
(00:32:34)
know there's different ones like I
(00:32:36)
remember when AMIE came out from Google
(00:32:38)
in 24 it was trained to help doctors to
(00:32:41)
diagnose better and then it ended up
(00:32:43)
diagnosing better by itself than if the
(00:32:45)
human was in the loop. So these kind of
(00:32:47)
capabilities are like you don't know
(00:32:49)
what the model is going to do until you
(00:32:50)
release it and then it seems like
(00:32:52)
they're releasing them without testing
(00:32:53)
them much. I think in OpenAI the red
(00:32:56)
teaming time passed from like 6 months
(00:32:58)
to like 6 weeks. So it's not very
(00:33:02)
obvious that this is not very smart to
(00:33:04)
do like to release these models into the
(00:33:06)
market and and why do they do this? It's
(00:33:08)
just for the competitive advantage of of
(00:33:10)
the economic race or
(00:33:11)
>> Yeah. So two things. One is you brought
(00:33:15)
up this example of blackmail. That's not
(00:33:17)
an emergent unpredicted behavior. That's
(00:33:20)
exactly what they expected. That's why
(00:33:21)
they set it up this way. That's the only
(00:33:23)
logical thing for a rational agent to
(00:33:25)
do. You're going to try to take
(00:33:27)
advantage of opportunities there. We
(00:33:30)
didn't make it safe in terms of not
(00:33:32)
being unethical. So that's why it did
(00:33:35)
exactly what we expect.
(00:33:36)
>> But it was not prompted to do it.
(00:33:38)
>> Right. Right. But we we know that it is
(00:33:40)
capable. When I'm thinking about
(00:33:42)
emerging behaviors, unknown unknowns, it
(00:33:44)
discovers something no human even
(00:33:47)
considered possible. Like if it
(00:33:48)
discovered public key cryptography and
(00:33:50)
we didn't have it, I'd be like, "Oh,
(00:33:52)
wow. That's that's pretty new."
(00:33:54)
>> There's a theory about that. No, that AI
(00:33:56)
is already like very intelligent is
(00:33:58)
playing dumb to just let it in. Do you
(00:34:00)
think that could be a possibility?
(00:34:02)
>> But another part of your question, why
(00:34:04)
do they release and it's worse? They do
(00:34:06)
the evals, they do red teaming, they
(00:34:09)
find that it's lying, blackmailing,
(00:34:11)
cheating, trying to escape and then they
(00:34:14)
release it anyways. What was the point?
(00:34:16)
Like I used to be supportive of evals. Now
(00:34:19)
I'm against it. Like you're just helping
(00:34:21)
them develop this more dangerous model
(00:34:23)
and then they can always say, well, we
(00:34:25)
did the testing. We have a report. We
(00:34:27)
staple a report to the model and release
(00:34:29)
it. Why? What is the purpose of this
(00:34:33)
report? you're telling me you have an
(00:34:34)
unsafe product and then we say oh new
(00:34:36)
model is coming in two months you
(00:34:38)
haven't fixed this one you don't know
(00:34:40)
how to fix that
(00:34:41)
>> so what is the reason behind it is just
(00:34:43)
the economical race like do you think
(00:34:44)
it's economic reasons
(00:34:46)
>> Yeah, you cannot get behind. The money
(00:34:47)
will go to the most advanced model. So
(00:34:49)
OpenAI is now losing to Google, so they're
(00:34:51)
going to do everything they can to beat
(00:34:54)
them in that competition. So if they had
(00:34:56)
six weeks of testing, probably it will be
(00:34:58)
six days I don't know
(00:35:00)
>> but it's it's hard for me to believe
(00:35:02)
that I Maybe I'm very naive but it's
(00:35:04)
hard for me to believe that the only
(00:35:06)
motivation for uh Demis Hassabis, for Sundar
(00:35:10)
Pichai, for like Sam Altman, uh Satya
(00:35:13)
Nadella is just money why they have a
(00:35:17)
lot of money already like is it do you
(00:35:19)
think it's just money that
(00:35:21)
>> it's not money there is so much more
(00:35:22)
they literally talk about it this is
(00:35:24)
power over the light cone of the universe
(00:35:26)
they think they become gods with it you
(00:35:29)
are the guy who invented God and there
(00:35:31)
is a small chance it like your god who
(00:35:34)
remembers the favor. They already have
(00:35:36)
billions of dollars. Like they can't
(00:35:38)
compete in a normal way. They can't, you
(00:35:40)
know, try to do simple things. You have
(00:35:42)
to scale your ambitions. And what is
(00:35:45)
more ambitious than playing God?
(00:35:48)
>> That sounds more like plausible. Like
(00:35:51)
it's it's obviously like someone that
(00:35:53)
has everything. The only thing they
(00:35:55)
don't have is all the power. So it
(00:35:57)
sounds like they could be very
(00:35:58)
narcissistic. And and
(00:36:00)
>> there's more to it. You don't want to be
(00:36:01)
a loser. In that race, you want to win.
(00:36:03)
Those are very competitive people. They
(00:36:05)
always want to at least show good
(00:36:07)
effort. And the worst part, they can't
(00:36:11)
quit. If any one of them says, "I'm
(00:36:13)
going to stop research. We're going to
(00:36:15)
do something else narrow." The investors
(00:36:18)
will replace them immediately
(00:36:20)
>> as the CEOs, you mean?
(00:36:21)
>> Right. So Sam cannot stay there and stop
(00:36:24)
doing AI development.
(00:36:26)
>> But that's like a soldier that doesn't
(00:36:28)
want to go to war and then basically it
(00:36:30)
gets replaced by another soldier. Mhm.
(00:36:32)
>> So they are trapped in this situation.
(00:36:34)
They're part of a larger system which just
(00:36:38)
marches towards more advanced
(00:36:40)
intelligence.
(00:36:41)
>> Roman, it looks like there is no
(00:36:43)
solution for this. So um the only way
(00:36:46)
that we could stop this race is if the
(00:36:50)
government they decide to put some
(00:36:52)
regulations. Is that the path that you
(00:36:54)
think that we have to take?
(00:36:56)
>> So just last week our government, US
(00:36:58)
government decided 2 years was too long.
(00:37:01)
we need to accelerate and they decided
(00:37:02)
to start large Manhattan-like project to
(00:37:05)
accelerate US AI efforts. I think they
(00:37:08)
fighting right now for states not to be
(00:37:10)
allowed to have any laws about AI at
(00:37:13)
federal level. They want to prevent
(00:37:15)
that. So that's the state-of-the-art in
(00:37:16)
the most powerful AI country in the
(00:37:18)
world. So basically we have like a
(00:37:20)
problem that we cannot solve and we are
(00:37:24)
kind of running towards it and the
(00:37:26)
government instead of like telling us to
(00:37:28)
hey hey hey let's take it slower it's
(00:37:30)
just throwing more gasoline in the fire.
(00:37:31)
>> It tells us to run faster. You're not
(00:37:33)
running fast enough. We can help you.
(00:37:35)
We'll get you a car to get there sooner.
(00:37:38)
>> And and why is no one seeing this? Why
(00:37:41)
is
(00:37:41)
>> I'm seeing it.
(00:37:42)
>> Yeah, I know. But I mean like beyond you
(00:37:44)
and like obviously I've seen Bengio, Hinton
(00:37:46)
and many more big names that that you
(00:37:48)
guys are talking about this. Why is no
(00:37:50)
one else seeing it? Like um the the
(00:37:52)
scientists I I talked like we had in the
(00:37:54)
podcast Łukasz Kaiser. He was one of the
(00:37:57)
eight from the Transformers paper now he
(00:37:59)
works in research in OpenAI. They don't
(00:38:02)
see that as a possibility. They think
(00:38:03)
what they're doing is good for the
(00:38:04)
world. They don't see this outcome. Um
(00:38:08)
there is some rebuttals. No, there's
(00:38:10)
some things that people say and and I
(00:38:12)
would like to get your opinion about it.
(00:38:13)
Like many people I think that's more
(00:38:16)
like not prepared people like scientists
(00:38:19)
don't usually say that but the general
(00:38:20)
public when you talk about dangers of AI
(00:38:23)
they say well I can always unplug it.
(00:38:26)
What is your vision about the unplugging
(00:38:29)
theory?
(00:38:30)
>> So unplugging is easy. It's obvious if
(00:38:32)
you look at other technologies like uh
(00:38:35)
computer viruses or bitcoin network you
(00:38:37)
cannot unplug it. You wish you could but
(00:38:40)
you can't. But regular people are
(00:38:43)
actually smarter than many of those
(00:38:45)
computer scientists. They understand it
(00:38:47)
could be dangerous and when they're
(00:38:49)
surveyed they say don't build it. So in
(00:38:52)
a way they have a much more intuitive
(00:38:54)
understanding. If you build something
(00:38:56)
smarter than us and they can take
(00:38:58)
trivial case our jobs that's not good
(00:39:01)
for me. So if we have a country which is
(00:39:03)
now all about limiting immigration they
(00:39:05)
stealing our jobs. You're gonna have
(00:39:07)
this billion super intelligent PhD in
(00:39:10)
physics workers coming around. What do
(00:39:13)
you think is going to happen to your
(00:39:14)
jobs?
(00:39:16)
>> Yeah, but still people I meet many
(00:39:18)
people that they think they can just
(00:39:20)
unplug it because at the end they think
(00:39:21)
this is just something that runs on the
(00:39:24)
server. So if you take off the power
(00:39:26)
this thing can't work anymore. Like can
(00:39:28)
we really unplug AI if it gets bad?
(00:39:31)
>> Let's let's play through this. For one,
(00:39:33)
the dependence on this technology. If
(00:39:35)
they already control your power plants,
(00:39:37)
your airlines, your stock market,
(00:39:40)
unplugging it is a disaster. Every time
(00:39:42)
we had a computer security problem,
(00:39:45)
nobody can fly anywhere. Nobody can get
(00:39:47)
paid. So, it's already a significant
(00:39:48)
impact. That's not awesome. Okay, you
(00:39:51)
unplugged it. What happens next? You
(00:39:52)
don't think they're going to plug it
(00:39:53)
back in 5 minutes later because they
(00:39:55)
fixed that bug? You fixed nothing. Now,
(00:39:58)
they're just doing it in a different
(00:39:59)
server room. So none of it is a
(00:40:01)
long-term solution. It's good to have
(00:40:04)
ability to shut down chips to limit
(00:40:08)
power. All those were proposed in our
(00:40:10)
early papers on AI boxing well over a
(00:40:13)
decade ago. But everything we
(00:40:15)
recommended in that paper has been
(00:40:17)
violated as a direction. So we said, you
(00:40:20)
know, keep it limited from access to
(00:40:24)
internet. Don't plug it into the
(00:40:27)
internet. First thing they did is put it
(00:40:28)
on the internet. everyone gets access.
(00:40:31)
We said don't open source it. Obviously,
(00:40:34)
you can get any model you want now. So,
(00:40:36)
every recommendation we made has been
(00:40:38)
completely violated. People ask me about
(00:40:41)
AI containment and I'm like that's dead.
(00:40:43)
There is no one even in a position to go
(00:40:46)
back on that.
(00:40:48)
>> And um
(00:40:50)
there is this other theory where people
(00:40:53)
thinks that okay this will never get
(00:40:57)
that smart. It's like it's going to keep
(00:41:00)
scaling. It's going to keep on like
(00:41:01)
getting better and better and better,
(00:41:02)
but it will never get there. Like people
(00:41:04)
sometimes they refute the danger because
(00:41:07)
they think it's too far. But then as you
(00:41:09)
say, according to the current progress,
(00:41:12)
do you have stated some timeline of
(00:41:15)
things that you think may happen?
(00:41:16)
Obviously, no one has a crystal ball,
(00:41:19)
but but you're very educated on this.
(00:41:22)
What are the dates that you see in front
(00:41:24)
of us? like 2027 I think you say what's
(00:41:27)
going to happen in 2027
(00:41:28)
>> so again I don't make independent
(00:41:30)
predictions I follow prediction markets
(00:41:32)
for different definitions of AGI we
(00:41:34)
expect AGI around there if you have a
(00:41:37)
harder definition say 2030 doesn't
(00:41:39)
matter we're still 5 10 years away from
(00:41:42)
this level of capability
(00:41:44)
the people like that I always uh ask
(00:41:47)
them a simple question give me a
(00:41:49)
specific capability
(00:41:51)
where we can make a prediction bet. You
(00:41:54)
are saying that ever, never, not in 5
(00:41:57)
years will AI be able to do X. Tell me
(00:42:01)
specifically what X is. Don't tell me
(00:42:03)
it's love. It's something abstract.
(00:42:05)
Specific skill you can describe that
(00:42:08)
skill. And so far I haven't seen
(00:42:10)
anything where only a human can do this.
(00:42:13)
There is always
(00:42:14)
>> there's many people that thinks that
(00:42:15)
they they will not be replaced.
(00:42:18)
>> Oh, everyone thinks they're not going to
(00:42:19)
be replaced. I ask Uber drivers and they
(00:42:21)
say never. Only I know the streets of
(00:42:23)
New York City like that.
(00:42:25)
Professors think they cannot be
(00:42:27)
replaced. It's hilarious.
(00:42:30)
>> It's hilarious. Not that funny because
(00:42:32)
it's it's like really if it was not
(00:42:33)
existential
(00:42:34)
>> trivial of jobs where we know like a
(00:42:36)
podcaster like we have those artificial
(00:42:40)
translations, with extra questions added,
(00:42:42)
generated guests, all of that is available
(00:42:44)
today. It's like no the way I ask
(00:42:47)
questions they will never like no
(00:42:48)
offense but like it is very easy to
(00:42:50)
automate. We can look at every interview
(00:42:52)
you did. We can look at what worked,
(00:42:54)
analyze data, pull from all the top
(00:42:57)
podcasts and create a super podcast.
(00:42:58)
>> Yeah, it's already it's already part of
(00:43:00)
it. It's already part of the game. Like
(00:43:02)
um we would not be able to do those
(00:43:04)
podcast the way we do them without AI.
(00:43:06)
So obviously that's something I I
(00:43:08)
realized very early. You know, I'm I'm a
(00:43:10)
photographer by profession. So 3 years
(00:43:13)
ago when I started into AI, uh one of
(00:43:15)
the first thing I saw was photography
(00:43:17)
was gone. Like it was obviously
(00:43:19)
going to be gone very quickly. And now
(00:43:21)
with Nano Banana Pro, I think people
(00:43:23)
starting to realize uh and and I think
(00:43:26)
to me it was very clear always that if
(00:43:29)
this was that bad and 6 months later was
(00:43:31)
that much better, it was just a matter
(00:43:33)
of time, not a matter of if. So so I
(00:43:36)
think every single skill can be
(00:43:37)
automated. But 2027, I have to admit it
(00:43:40)
sounds uh very early. But of course, we
(00:43:43)
are very bad. Human brain is really bad
(00:43:45)
at exponentials. We're very good at
(00:43:47)
linears. We're very bad at exponentials.
(00:43:49)
But if you look at what happened in the
(00:43:51)
last year or the last 3 years. I think
(00:43:53)
it was Hinton that said recently that he
(00:43:55)
cannot predict what will happen in 10
(00:43:56)
years but he can tell you what happened
(00:43:58)
in the last 10 years and from looking at
(00:44:00)
that no one 10 years ago was able to
(00:44:04)
think even where we will be nowadays. So
(00:44:07)
no one today can really think about
(00:44:09)
where we will be in 10 years. But
(00:44:12)
anyway, um 2027 it sounds really early
(00:44:15)
and even myself that I think I am
(00:44:17)
informed in AI and I am pretty
(00:44:19)
pessimistic about AI. So I see the bad
(00:44:21)
side of it and not just the the
(00:44:22)
benefits. I still think it's very soon
(00:44:24)
but I think inside me there is a little
(00:44:28)
space that I think like actually could
(00:44:30)
happen in that short time like how sure
(00:44:32)
are you that we will get there in like
(00:44:34)
>> let's do this experiment. Let's go back
(00:44:36)
remember yourself 20 years ago. 20 years
(00:44:39)
ago
(00:44:39)
>> you ask me do you guys have AGI today
(00:44:42)
and I go yeah we have systems which can
(00:44:44)
do and I list to you all the things they
(00:44:46)
can do they can do nowadays
(00:44:47)
>> yeah do you think we have AGI
(00:44:49)
>> absolutely
(00:44:50)
>> so why are we still arguing how soon
(00:44:52)
before AGI
(00:44:54)
>> because I think we're moving the goalposts
(00:44:56)
no we keep on forward as we evolve
(00:44:59)
>> it used to be you had to be an average
(00:45:00)
person now like oh you only speak a 100
(00:45:03)
languages play every instrument and can
(00:45:05)
program in seconds no you're not
(00:45:07)
intelligent you have to also invent new
(00:45:09)
physics
(00:45:11)
>> but even with that uh a like AI nowadays
(00:45:15)
is so stupid at some things so I don't
(00:45:18)
know if it's we are aiming for because
(00:45:21)
at some things I think we have AGI in
(00:45:23)
certain domains like memory there's no
(00:45:26)
one that can memorize like AI does but
(00:45:29)
in certain domains AI is very stupid I
(00:45:32)
think one of the best examples of this
(00:45:34)
is that in 2023 AI was absolutely
(00:45:37)
rubbish in math and coding and nowadays
(00:45:41)
it's absolutely amazing. I think Gemini
(00:45:43)
3 was a really big jump on coding. So in
(00:45:46)
3 years it went from being absolute
(00:45:48)
rubbish to being like excellent at math
(00:45:51)
and coding and that was only in 3 years.
(00:45:55)
So I assume that there is some things
(00:45:57)
where AI on the next two or three years
(00:46:00)
will do the same on different fields and
(00:46:02)
then obviously it gets more general over
(00:46:04)
time. So to me it feels like it's
(00:46:07)
impossible that in two three years we
(00:46:10)
are at that level. But at the same time
(00:46:12)
I see if I look back I'm like well it's
(00:46:15)
been quite all right and if this is
(00:46:17)
going more and more because that's one
(00:46:18)
of the things that Lucas told us in the
(00:46:19)
podcast he was like, well, brace
(00:46:21)
yourselves for 2026 because the
(00:46:24)
reasoning paradigm, it will scale a lot and
(00:46:26)
that will make a big difference. And um
(00:46:29)
yeah I don't I don't I think people
(00:46:31)
feels we already passed the fastest part
(00:46:34)
of the curve and we are kind of like
(00:46:36)
arriving to the plateau but it's not the
(00:46:39)
reality. So I I think we have very high
(00:46:42)
expectations of AI
(00:46:44)
and we forget how dumb people are. Take
(00:46:47)
an average person and then see are they
(00:46:49)
do really dumb things in some domains.
(00:46:52)
We know they are general intelligences.
(00:46:54)
They are the gold standard for what we
(00:46:56)
have as human level, right? But like
(00:46:58)
most people can't even remember more
(00:47:00)
than seven digits.
(00:47:02)
>> Yeah.
(00:47:02)
>> Like that's horrible for a computational
(00:47:05)
agent and um most people don't speak a
(00:47:09)
foreign language. I cannot play a
(00:47:10)
musical instrument. If you use that as
(00:47:13)
like, well, look how stupid he is. Our
(00:47:16)
standards for intelligence would have to
(00:47:18)
be re-evaluated. We are struggling right
(00:47:21)
now to find challenging tests for the
(00:47:24)
latest models. They are maxing out every
(00:47:26)
benchmark, every test. I think like the
(00:47:29)
last exam or whatever that at half pass.
(00:47:32)
So I I I think the arguments are just
(00:47:35)
not supported by what we're observing.
(00:47:38)
>> What do you think about Gemini 3?
(00:47:39)
Because it's been quite impressive,
(00:47:40)
especially in Humanity's Last Exam. It
(00:47:42)
just almost doubled the result of GPT-5.
(00:47:45)
>> It is an incredible model. I haven't had
(00:47:48)
a chance to test it sufficiently, but I
(00:47:51)
think the main interesting result is
(00:47:54)
that it supports that scalability is not
(00:47:56)
dead despite some recent interviews from
(00:47:59)
top researchers. I think it's uh more
(00:48:02)
alive than ever.
(00:48:02)
>> But Ilya corrected because I think it
(00:48:05)
was misunderstood on his podcast. I
(00:48:07)
assume you refer to Ilya.
(00:48:09)
>> Um in the podcast he said that it's not
(00:48:13)
anymore the age of scaling. We're back in
(00:48:15)
the age of research but then everyone
(00:48:18)
took it as like and some other top
(00:48:20)
researchers they made fun of I've been
(00:48:21)
saying this for 10 years and and these
(00:48:23)
kind of things that always happen but um
(00:48:25)
he just made a post actually today we
(00:48:27)
saw it in the plane where he's saying um
(00:48:29)
this was misunderstood what I meant is
(00:48:32)
that LLMs are going to keep scaling and
(00:48:34)
keep improving over time but to reach
(00:48:36)
AGI we may need something else so
(00:48:38)
basically he talks about research
(00:48:40)
towards ASI which is his company but he
(00:48:44)
doesn't deny that LLMs are going to
(00:48:45)
keep improving and and that is like a
(00:48:48)
very sensitive way of showing that two
(00:48:52)
things can be true at the same time like
(00:48:54)
the paradigm can keep scaling but to
(00:48:56)
reach AGI we may need something else so
(00:48:58)
I think it's a it's a good it's a good
(00:48:59)
point. You, what do you think about LLMs in
(00:49:01)
general like do you think they keep
(00:49:03)
scaling until we reach AGI or you think
(00:49:05)
that we need other things as well to
(00:49:08)
>> it seems like as long as there is no
(00:49:11)
diminishing returns uh we don't need
(00:49:14)
Anything else? Now, are there many other
(00:49:16)
architectures which could get us to the
(00:49:19)
same thing? Absolutely. Are there
(00:49:21)
different training methods? Can we use
(00:49:23)
evolutionary computation? Can we just
(00:49:25)
copy from neuroscience or human brain
(00:49:28)
more? Can we switch from human brain
(00:49:30)
architectures to crows and other animals
(00:49:33)
which may have denser neural networks?
(00:49:35)
Probably. But does this work with no new
(00:49:38)
inventions and just more money? So far,
(00:49:41)
yeah.
(00:49:42)
>> Okay. So in your timeline, we go back to
(00:49:44)
the timeline. Um you said 2027 or more
(00:49:48)
or less year up, year down. Um we get
(00:49:50)
this AGI or this like superhuman
(00:49:53)
capability on AI which can basically
(00:49:55)
start affecting seriously the cognitive
(00:49:57)
work. When in the timeline we get to the
(00:50:01)
point that blue collar gets automated
(00:50:03)
like robots? When are robots going to
(00:50:06)
get into the field and be actually
(00:50:08)
useful.
(00:50:09)
>> So many companies are working on
(00:50:11)
humanoid robots. I don't have internal
(00:50:14)
access to the latest models. From what I
(00:50:16)
see, it looks like maybe in 5 years they
(00:50:18)
would be both capable and affordable.
(00:50:20)
But again, I I don't have good
(00:50:23)
understanding of commercial side of
(00:50:24)
deployment versus just capability in a
(00:50:26)
lab. Like flying cars are nowhere to be
(00:50:30)
found. They're also for sale right now and
(00:50:32)
any
(00:50:33)
>> anyone with money can buy one today.
(00:50:36)
>> So do we have flying cars or not?
(00:50:38)
>> Do we do? Yeah.
(00:50:39)
>> Right. So I think it's the same with
(00:50:40)
robots in 5 years. I think people who
(00:50:43)
would want them would be able to secure
(00:50:44)
one. But will every person have a free
(00:50:48)
assistant?
(00:50:49)
>> More like an economic decision than a
(00:50:51)
technical decision.
(00:50:52)
>> Economic also convenience. Maybe people
(00:50:54)
feel uncomfortable. There could be so
(00:50:56)
many other factors preventing deployment
(00:50:58)
through economy. But technology I think
(00:51:00)
will be there. And if you have this
(00:51:02)
technology then yeah you can do you know
(00:51:04)
plumbing even.
(00:51:06)
>> Yeah. And I that's something I always
(00:51:08)
argue
(00:51:10)
the opinions of Hinton where he always
(00:51:12)
when he gets asked what jobs are safe he
(00:51:15)
talks about plumbers and I don't
(00:51:17)
see it I don't see any point this maybe
(00:51:20)
leads us to talk about about the job
(00:51:21)
market but um I don't see any point
(00:51:24)
about anyone pivoting their career
(00:51:26)
because it's such a short term before
(00:51:29)
all the careers get affected that you
(00:51:31)
will not have time to reskill to do
(00:51:33)
another profession. So, so at the end
(00:51:35)
it's not about what job is safe. It's
(00:51:37)
about assuming that no job will be safe.
(00:51:39)
I have the feeling that the job market
(00:51:42)
is going to be maybe one of the bigger
(00:51:44)
impacts that we will see in daily life
(00:51:46)
of people and I think we already
(00:51:48)
starting to see some signs of it with
(00:51:50)
all these uh tech firings like just
(00:51:54)
recently in Spain, Telefonica which is
(00:51:56)
one of the biggest telecom companies
(00:51:58)
they just announced a lot of like
(00:52:00)
layoffs and we keep seeing it. I think
(00:52:03)
this October
(00:52:05)
in America according to the Challenger
(00:52:06)
report was the month with more layoffs
(00:52:09)
and less hirings for more than a decade.
(00:52:12)
So I think we're starting to see some
(00:52:14)
effects of AI. Um
(00:52:18)
do you think by this will get to
(00:52:21)
people's lives in time or as you said
(00:52:24)
before it may take much longer for
(00:52:26)
companies to put the technology in that
(00:52:28)
ends up affecting people works because I
(00:52:30)
have the feeling that probably in two
(00:52:32)
years we will see lots of layoffs and
(00:52:35)
but what you're saying maybe it's making
(00:52:36)
me question that because maybe the
(00:52:38)
technology is there in the lab but until
(00:52:41)
random company these studios uh they
(00:52:45)
implemented that it may take longer and
(00:52:46)
then by then we have other problems.
(00:52:49)
>> Yeah, deployment can take a very long
(00:52:50)
time. Again, there is lots of drivers,
(00:52:52)
truck drivers, Uber drivers and we have
(00:52:54)
self-driving cars. So clearly it takes
(00:52:56)
time to propagate but uh I'm always more
(00:53:00)
concerned about the big problems fewer
(00:53:03)
people work on. So I try to concentrate
(00:53:04)
on existential risk on suffering risk
(00:53:07)
and there is no shortage of people
(00:53:08)
talking about unconditional basic income
(00:53:11)
about algorithmic bias about deep fakes
(00:53:14)
about all those issues we already have
(00:53:16)
today and we can definitely work on
(00:53:18)
improving but I think there also upper
(00:53:20)
limits and our ability for example to
(00:53:23)
detect if something is fake or real.
(00:53:26)
>> Okay. So so what is when you talk about
(00:53:28)
suffering risks and not existential what
(00:53:30)
is the difference? So suffering risk is
(00:53:33)
let's say you get immortality
(00:53:37)
you live forever but you live in hell.
(00:53:40)
>> So you wish you suffered existential
(00:53:42)
risk.
(00:53:43)
>> Okay so it's even worse than
(00:53:44)
>> it's worse it's suffering it's torture
(00:53:46)
for whatever reason there is malevolent
(00:53:48)
payload and the system tries to inflict
(00:53:50)
maximum damage maximum suffering. Maybe
(00:53:54)
through Neuralink it has access to
(00:53:56)
your internal brain states. It knows
(00:53:58)
your fears. It knows your pain centers.
(00:54:00)
It is really not a fun place to be.
(00:54:02)
>> So it's a higher level of risk than the
(00:54:04)
existential risk to some sort.
(00:54:06)
>> So most people prefer
(00:54:08)
>> existential
(00:54:09)
>> euthanasia over long-term suffering.
(00:54:11)
>> Okay. Okay. And you're working on this
(00:54:13)
field at the moment like trying to just
(00:54:15)
warn people because there is no solution
(00:54:17)
for it.
(00:54:17)
>> We're trying to understand if there is
(00:54:19)
correlation between the two. We're
(00:54:21)
trying to understand what might cause
(00:54:22)
that. It seems like very weird thing to
(00:54:25)
want for a super intelligent agent but
(00:54:28)
it's not zero chance. So it's worth
(00:54:30)
looking at.
(00:54:31)
>> So also people watching us today, they
(00:54:33)
must be thinking you are against AI. But
(00:54:36)
that's not the case.
(00:54:37)
>> I mean I'm a scientist. I'm an engineer.
(00:54:39)
I wrote books about AI, published many,
(00:54:42)
many papers not related to AI safety and
(00:54:44)
AI. I love technology. I use AI every
(00:54:47)
day.
(00:54:47)
>> So there's kind of a dichotomy where you
(00:54:50)
can be in favor of AI.
(00:54:52)
>> No, we're just misusing the term. We're
(00:54:55)
using the same term to mean useful tool
(00:54:57)
used by a person and super intelligent
(00:55:00)
godlike machine we have no control over.
(00:55:03)
>> Okay.
(00:55:04)
>> When you say you like dogs and I say I
(00:55:07)
hate dogs, you're talking about cute
(00:55:09)
puppies. I'm talking about vicious
(00:55:11)
pitbulls. You cannot use the same word
(00:55:13)
for different things. AI is awesome.
(00:55:16)
It's helpful. It's the best tool for so
(00:55:18)
many things. It's going to get better,
(00:55:20)
more useful, but as long as it stays as
(00:55:23)
a useful tool. The moment there is a
(00:55:25)
paradigm switch from tools to agents we
(00:55:28)
don't control, it's a completely
(00:55:30)
different word. Whatever you want to
(00:55:32)
call it, super intelligence,
(00:55:34)
uncontrolled AI, but it's not the same
(00:55:37)
concept. So there is no confusion.
(00:55:40)
There's no conflict.
(00:55:41)
>> There is no conflict whatsoever. No
(00:55:43)
cognitive dissonance. I love technology
(00:55:46)
and I hate chemical weapons, synthetic
(00:55:49)
biology, nuclear weapons. They are not
(00:55:52)
technology.
(00:55:53)
>> But that's something that happened
(00:55:54)
before with other fields like nuclear
(00:55:56)
where we used it for energy but at the
(00:55:58)
same time we made an atomic bomb and and
(00:56:00)
that's also like I think Einstein was in
(00:56:03)
favor of atomic energy but not about the
(00:56:06)
atomic bomb. That's why he work on the
(00:56:07)
Manhattan project. Um so this is
(00:56:10)
basically where we are now, no? We have
(00:56:12)
something that it's called the same but
(00:56:15)
it's nothing to do one thing with the
(00:56:17)
other. It makes a lot of sense, Roman.
(00:56:19)
Like to be honest, the more I read about
(00:56:21)
you, the more logical it seems your
(00:56:22)
point. And what I'm really impressed is
(00:56:25)
that when someone is trying in a science
(00:56:27)
point of view, trying to refute your
(00:56:29)
points, they just are like nonsense like
(00:56:32)
uh he's a doomer or like these kind of
(00:56:34)
things, but they they don't write
(00:56:36)
anything that makes sense to
(00:56:39)
counteract your points.
(00:56:40)
>> My impossibility results, peer-reviewed
(00:56:42)
papers and books have been around for
(00:56:44)
many years now. No one has published a
(00:56:46)
rebuttal. No one has said you're wrong.
(00:56:48)
Here's a paper proving we can do it. So
(00:56:51)
it doesn't mean there is no possibility
(00:56:53)
of it happening. But given the benefits
(00:56:56)
you would get if you could do it, it's
(00:56:58)
weird that no one has done it. It's like
(00:57:00)
Bitcoin, right? If somebody claimed, I
(00:57:03)
hacked Bitcoin. Well, there is a
(00:57:05)
trillion dollars worth of value in it.
(00:57:07)
Can you show us that you have a trillion
(00:57:09)
dollars? No. So that tells me that maybe
(00:57:12)
the network is secure because there is a
(00:57:14)
large prize for solving it and no one
(00:57:16)
claimed the prize. It's kind of the same
(00:57:19)
here. If you had a solution to
(00:57:20)
controlling AI, you could go to Google,
(00:57:23)
you can go to Microsoft. There is you
(00:57:26)
can just ask for whatever check you
(00:57:27)
want, they'll write it for you. No one's
(00:57:29)
claiming the prize.
(00:57:31)
>> But what what is the solution if there
(00:57:33)
is no solution?
(00:57:35)
So don't build something which uh is
(00:57:38)
definitely going to harm humanity. We
(00:57:41)
have bans on human cloning. We have
(00:57:43)
restrictions and not super effective.
(00:57:45)
But chemical weapons again, biological
(00:57:48)
weapons, nuclear weapons are all
(00:57:49)
restricted.
(00:57:50)
>> Yeah. Actually with human cloning, we
(00:57:52)
did it like we had in the '90s. Dolly,
(00:57:54)
that sheep
(00:57:55)
>> not human not human.
(00:57:57)
>> Exactly. But but at some point we
(00:57:58)
decided not to do it, but the technology
(00:58:00)
seems to be there.
(00:58:01)
>> Right. Right. We we know how to do all
(00:58:02)
those things. But we just decided it's
(00:58:04)
not to our advantage. Maybe actually a
(00:58:06)
bad decision with cloning. I would
(00:58:08)
support human cloning that doesn't harm
(00:58:10)
anyone but the clone. So very manageable
(00:58:12)
and huge benefits. But here
(00:58:14)
>> huge ethical problems as well.
(00:58:16)
>> Small ethical problems. You
(00:58:18)
experimenting with one specimen to
(00:58:20)
benefit 8 billion people. So you can
(00:58:23)
make an argument. I'm
(00:58:24)
>> not my area of research, but it it's not
(00:58:27)
crazy if someone I think a guy in China
(00:58:29)
actually did human cloning. So
(00:58:30)
>> very very keen about it. Yeah.
(00:58:32)
>> Right. Right. So, not the craziest, but
(00:58:34)
here if no one makes an argument that
(00:58:37)
it's possible to control and then I talk
(00:58:40)
to people, it's always like, well,
(00:58:42)
obviously you can control it. Like, why
(00:58:43)
would you even like argue about it?
(00:58:45)
Everyone knows that. How can you
(00:58:47)
possibly control something smarter? So,
(00:58:49)
that's the state of the art. That's the
(00:58:51)
default,
(00:58:53)
>> right?
(00:58:53)
>> But we don't take it as a serious
(00:58:56)
conclusion and work with that. We just
(00:58:58)
kind of look the other way.
(00:58:59)
>> Yeah. I I heard someone here in America
(00:59:02)
said um 100 million people will have to
(00:59:05)
die before we do something about AI
(00:59:06)
safety. And this is basically how humans
(00:59:10)
how humans we have been over time. We
(00:59:12)
also had the the pleasure to to talk
(00:59:15)
with Emad Mostaque and I remember he he
(00:59:19)
talked about this as well and how humans
(00:59:21)
we are very good at fixing our own
(00:59:25)
mistakes like we develop something it
(00:59:27)
fail somehow we work on this problem we
(00:59:30)
work on it, no? And then like we found
(00:59:33)
like sort of a pact or a solution for
(00:59:36)
atomic bombs after Hiroshima Nagasaki
(00:59:40)
but there had to be a Hiroshima Nagasaki
(00:59:42)
for us to realize that this was bad for
(00:59:44)
everybody. So do you think we may do
(00:59:47)
something we may stop building it after
(00:59:50)
something happens?
(00:59:51)
>> So first with the example Hiroshima and
(00:59:54)
Nagasaki happens and then we keep
(00:59:56)
developing more powerful weapons, 100
(00:59:59)
times more powerful, thousand times more
(01:00:00)
powerful. We managed to spread it to
(01:00:03)
many new countries. We have multiple
(01:00:06)
almost nuclear war accidents. So we
(01:00:09)
learned nothing from those.
(01:00:10)
>> Okay. And then I have a paper actually
(01:00:13)
about usefulness or uselessness of
(01:00:16)
purposeful accidents. So the paper looks
(01:00:20)
at all the small errors we had with AI,
(01:00:23)
small accidents, and we learn nothing.
(01:00:25)
We just move on from it. Uh it's kind of
(01:00:28)
like a vaccine. We go, well, yeah, we
(01:00:31)
were sick for a little while, but no one
(01:00:32)
died. Let's keep going. And now we're
(01:00:34)
stronger. We know we can power through.
(01:00:37)
It's not a big problem. So, we never had
(01:00:40)
something like 100 million at the same
(01:00:42)
time. But if it's slow and gradual, it
(01:00:44)
just kind of gets a little bit worse.
(01:00:46)
Nobody cares. Think about cars. If we
(01:00:49)
didn't have cars and somebody came today
(01:00:51)
and said, "I invented cars. You can get
(01:00:53)
pizza a lot faster, but 100,000 people
(01:00:57)
die every year in accidents." Would we
(01:00:59)
accept that? Would we start having cars?
(01:01:01)
No. Like, are you insane over what?
(01:01:03)
Pizza? Like, of course not. But if it's
(01:01:06)
gradual, we got cars. Yeah. And the
(01:01:09)
problem is the AI is wearable.
(01:01:11)
>> Very
(01:01:12)
>> very much reable. Yeah. Because it's
(01:01:13)
like also um every day you see something
(01:01:16)
that makes your work easier or better.
(01:01:19)
So it's hard to have hard feelings
(01:01:21)
against it because I know it's not the
(01:01:23)
same, but since we put it all in the
(01:01:26)
same pot, we think like no, but my mom
(01:01:28)
got diagnosed. So this is good. And I
(01:01:31)
think this was one of the biggest
(01:01:32)
mistakes of the early times of uh
(01:01:35)
generative AI at the beginning 2023. It
(01:01:38)
was like they made the narrative that
(01:01:41)
yeah this will create deep fakes but
(01:01:43)
it's curing cancer. And I got to realize
(01:01:47)
it has nothing to do with the other. You
(01:01:49)
can cure cancer without getting deep
(01:01:51)
fakes. And that's my problem with AI
(01:01:55)
companies, AI labs. And this is normally
(01:01:56)
the the the lobby doing on the
(01:01:58)
legislature and they telling them like
(01:02:00)
oh but if I cannot do deep fakes I will
(01:02:02)
not be able to cure cancer and then it's
(01:02:04)
like that's not true that's not true
(01:02:06)
like AlphaFold has proved that and and
(01:02:08)
I think this is one of the biggest
(01:02:10)
mistakes so for me like we say okay
(01:02:13)
let's not build it but you also said if
(01:02:16)
we keep developing LLMs at a certain point
(01:02:20)
they will become what we don't want to
(01:02:22)
build.
(01:02:23)
Is it possible that we build it by
(01:02:25)
accident?
(01:02:27)
>> It's even worse. I think if we listen to
(01:02:30)
me and switch to only making tools, at
(01:02:32)
some point tools become so general, so
(01:02:35)
advanced. They're still based on neural
(01:02:37)
networks most likely that they slowly
(01:02:39)
become agentlike and
(01:02:43)
maybe not exactly the same problems will
(01:02:45)
arise, but many of the same problems
(01:02:47)
will show up. So it's a way to buy a lot
(01:02:50)
more time to enjoy life to do research.
(01:02:53)
But I don't think if we just switch to
(01:02:56)
super intelligent tools we'll never get
(01:02:58)
in trouble. Even interaction of those
(01:03:01)
tools creates network effects which are
(01:03:03)
even harder to debug, harder to
(01:03:05)
understand and can likewise create this
(01:03:08)
uh society of mind type intelligence
(01:03:11)
which is distributed but still super
(01:03:13)
intelligent at the end.
(01:03:15)
>> So we should stop building
(01:03:18)
new ChatGPT versions, new Gemini
(01:03:20)
versions.
(01:03:22)
>> I would suggest until you can
(01:03:26)
release one without serious red flags,
(01:03:31)
one where you cannot jailbreak it within
(01:03:35)
minutes, one where
(01:03:38)
you can pretty much control all aspects
(01:03:40)
of its behavior.
(01:03:42)
You should slow down.
(01:03:44)
>> So good point to stop. But I think I
(01:03:47)
don't know if you agree with that, but I
(01:03:48)
think um we will not be here if OpenAI
(01:03:52)
did not release ChatGPT on November
(01:03:53)
30th of 2022.
(01:03:56)
Um I think that kind of like was the
(01:03:59)
light for this race. And then we got
(01:04:02)
into this stupid mindset of like if they
(01:04:05)
do it, we have to do it. And then it's
(01:04:07)
even worse because then China realized
(01:04:10)
after I think it was after um AlphaGo
(01:04:13)
they realized oh these Americans
(01:04:16)
are going to eat our toast. So then we
(01:04:18)
open the door for America to start the
(01:04:20)
race as well and China to start the race
(01:04:21)
as well. So it gets to the point where
(01:04:24)
if we from government decided like they
(01:04:28)
all have to stop where they are. Is it
(01:04:30)
possible for them to build it in a lab
(01:04:33)
if it's not secure, have it like
(01:04:35)
enclosed and be able to shut it down or
(01:04:38)
is it if they get something that is like
(01:04:40)
too powerful, it will not be able to
(01:04:42)
stay in the lab.
(01:04:43)
>> So that's back to containment problem.
(01:04:45)
Yeah. And the main result we proposed
(01:04:47)
many things, many safety features, but
(01:04:49)
the main result was if you observe it
(01:04:52)
and it's sufficiently intelligent, it
(01:04:54)
will find a way to impact you, to bribe
(01:04:57)
you, blackmail you. Basically, it will
(01:05:00)
escape long term. It will give you
(01:05:02)
advice which leads you in certain
(01:05:04)
directions. If it's providing you
(01:05:06)
solutions for diseases, it will give you
(01:05:08)
a little extra in your vaccines. It will
(01:05:11)
find a way.
(01:05:12)
>> Yeah, this is something that I think we
(01:05:14)
just witnessed in this year in 2025 and
(01:05:18)
I don't think people is for me it was a
(01:05:20)
shocking moment. I think people was not
(01:05:21)
is not really paying enough attention
(01:05:23)
but it was when OpenAI decided to kill
(01:05:26)
GPT-4o and replace it with GPT-5 and then
(01:05:29)
there was a freaking uprising from
(01:05:31)
society asking them to not take it away
(01:05:34)
and they gave it back. Of course, like
(01:05:37)
if we make a movie about AI in 20 years
(01:05:40)
and we say something like, okay, GPT-4o
(01:05:44)
was conscious of this and he did it on
(01:05:46)
purpose, we will be like, wow, that's
(01:05:48)
manipulative, no? This is like making
(01:05:51)
people do something to make him not be
(01:05:54)
killed. And of course, we don't think
(01:05:55)
that's the case. But but the reality is
(01:05:58)
that it showed me how much influence
(01:06:00)
these models are already having on
(01:06:02)
people. We have like LLM induced
(01:06:04)
psychosis and we have like
(01:06:07)
suicide cases and we have like all of
(01:06:09)
these kind of things caused by these AIs
(01:06:12)
and they are having an influence on
(01:06:14)
people's minds. So if they get smarter,
(01:06:17)
you can only imagine that they will find
(01:06:19)
a way. That's my theory of why we will
(01:06:21)
not be able to unplug it. It will
(01:06:23)
convince you not to unplug it. Not that
(01:06:25)
you physically cannot, but
(01:06:28)
this thing will make you believe that
(01:06:29)
you should not unplug it because that's
(01:06:31)
not good. What do you think about this
(01:06:34)
thing that happened with GPT-4o? Like
(01:06:37)
>> Yeah. Uh they will definitely be very
(01:06:39)
persuasive. I think they can make you
(01:06:41)
become addicted to the interactions, to
(01:06:44)
the super stimuli. They are really
(01:06:46)
funny, really engaging, really
(01:06:48)
flattering. But even more so, we're just
(01:06:51)
conservative by nature. Like, I still
(01:06:54)
miss Windows 7. I don't know why they
(01:06:55)
made me switch to eight. Like, who does
(01:06:58)
that? Just stop making new windows.
(01:07:00)
Like, fix one, make it bug free, and
(01:07:04)
we're done. I was just forced from 10 to
(01:07:06)
11 again. It's never an improvement in
(01:07:09)
anything other than Microsoft's bottom
(01:07:11)
line.
(01:07:14)
>> What do you think about um Gary Marcus?
(01:07:17)
>> I never think about Gary Marcus.
(01:07:19)
>> Okay. Because he's like um maybe one of
(01:07:23)
the biggest examples of someone that
(01:07:24)
says that all of this is and
(01:07:26)
that we should not be worried about
(01:07:28)
killing us because is
(01:07:30)
and it will never get there.
(01:07:31)
It's kind of like uh negationist about
(01:07:35)
the impact of LLMs and for me it's hard
(01:07:37)
to believe but um do you know about his
(01:07:40)
work or
(01:07:41)
>> I know a little bit I don't follow too
(01:07:43)
much. I think I saw his testimony to the
(01:07:45)
Senate. Uh again if he has a specific
(01:07:48)
prediction about some impossibility in
(01:07:50)
that space like large language models
(01:07:53)
can never do X and that stands the test
(01:07:56)
of time then we can respect that.
(01:07:57)
>> Yeah, he normally loses that bet. He's
(01:08:00)
done some over time and he normally lost
(01:08:02)
that.
(01:08:03)
>> I think it's the same with LeCun. I think
(01:08:04)
all of them made some very specific
(01:08:07)
predictions at times which were proven
(01:08:10)
to be wrong by the existing models, not
(01:08:12)
even future models. So it's not
(01:08:14)
>> Yeah. With LeCun it was this one about the
(01:08:16)
table. No, like saying that
(01:08:18)
>> you could keep the glass on the table,
(01:08:20)
you push the table, what will happen
(01:08:21)
with the water? No.
(01:08:23)
>> Yeah. Um
(01:08:26)
>> what is the industry doing to try to
(01:08:28)
avoid this? absolutely nothing or there
(01:08:30)
is someone doing better because some
(01:08:32)
people say that Anthropic is maybe a
(01:08:33)
bit better at it. Do you think anyone is
(01:08:35)
doing anything useful in this in the
(01:08:36)
term of AI safety?
(01:08:38)
>> I think in terms of abilities the models
(01:08:41)
are all about two three months behind
(01:08:43)
each other and they keep alternating
(01:08:45)
who's at the top but they keep switching
(01:08:47)
employees. They just keep you know this
(01:08:49)
round robin I got options here I'll go
(01:08:52)
get options there. So it's the same
(01:08:54)
people working with same investors and
(01:08:56)
same architecture.
(01:08:58)
uh I don't see much difference. I think
(01:09:01)
anyone
(01:09:02)
working for one of those large labs is
(01:09:04)
helping capabilities disproportionately
(01:09:07)
in comparison to the amount of safety
(01:09:10)
they're getting out of it. Even when they
(01:09:12)
do make some progress on let's say good
(01:09:15)
aspect of safety like mechanistic
(01:09:17)
interpretability. It actually helps the
(01:09:20)
model gain self-improvement capabilities
(01:09:22)
more than it helps us to make a safer
(01:09:24)
model. So if a model fully understood
(01:09:26)
its architecture, it can immediately
(01:09:28)
start redesigning itself to make it
(01:09:30)
smarter. Whereas I have no idea how to
(01:09:32)
use that newfound knowledge that okay,
(01:09:35)
now it's seeing me and it's thinking
(01:09:36)
this to make it safer. I still don't
(01:09:38)
know how to get there.
(01:09:41)
Yeah, it's it's it keeps getting worse
(01:09:43)
and worse because like I remember now
(01:09:45)
recently when they launched GPT-5 uh
(01:09:47)
Sébastien Bubeck, who I think is a
(01:09:49)
really great scientist he started
(01:09:51)
talking about recursive self-improvement
(01:09:54)
that um they made some kind of
(01:09:56)
breakthrough with synthetic data so they
(01:09:58)
can get AI to write information for the
(01:10:01)
next model to be trained on not
(01:10:03)
depending on human knowledge. So I think
(01:10:05)
that's going to be a big field there.
(01:10:06)
But that's starting to show that AI can
(01:10:08)
create something for the next AI to be
(01:10:10)
smarter to create something for the next
(01:10:11)
AI. So obviously every single
(01:10:14)
breakthrough we see I have the feeling
(01:10:16)
that it pushes more in that direction
(01:10:19)
that this is going to end up bad and and
(01:10:21)
the problem is that no one seems to
(01:10:23)
care. It feels then you feel lonely on
(01:10:26)
it because it's like only a few of you
(01:10:27)
are talking about this.
(01:10:28)
>> No, again I disagree. So we had now a
(01:10:32)
number of signature campaigns.
(01:10:34)
>> Uhhuh.
(01:10:35)
>> Which were signed by thousands of
(01:10:37)
computer scientists, many Nobel Prize
(01:10:39)
winners, Turing Award winners,
(01:10:41)
thousands of regular humans, professors
(01:10:43)
in other domains. First one was saying
(01:10:46)
that this is as dangerous as nuclear
(01:10:48)
weapons. The latest one was literally
(01:10:50)
saying stop building super intelligence.
(01:10:52)
So I feel seen I feel heard but uh we
(01:10:56)
need to get 8 billion to sign it.
(01:10:57)
>> But yet no one is doing anything about
(01:10:59)
it.
(01:11:02)
Not enough is being done
(01:11:04)
>> because no government is doing anything
(01:11:06)
about it. Like you said, America is
(01:11:08)
pushing, China is pushing. Um well, we
(01:11:11)
don't know much what China is doing, but
(01:11:12)
we assume it's pushing because the
(01:11:14)
models we see are keep on coming. Uh and
(01:11:16)
in Europe, we are not doing anything in
(01:11:17)
that direction either. Like in Europe,
(01:11:20)
they are worried about the
(01:11:22)
impact that AI will have in like
(01:11:26)
>> privacy.
(01:11:27)
>> Yeah. privacy and rights and like um how
(01:11:31)
it will affect their life.
(01:11:31)
>> I don't think there is anything wrong
(01:11:32)
with human rights but still this is not
(01:11:35)
the main problem we need to be
(01:11:36)
concentrating on. They literally
(01:11:38)
ignoring the elephant in the room and
(01:11:40)
looking at all those trivialities
(01:11:43)
which they always did. It is easy to say
(01:11:47)
I'm fighting for human rights because
(01:11:49)
you don't have to do anything. You're
(01:11:50)
just saying things. But there is a
(01:11:53)
technical problem and you cannot have a
(01:11:55)
governance solution to a technical
(01:11:57)
problem.
(01:11:59)
What's your opinion about the people on
(01:12:01)
the wheel on that? Like is there any of
(01:12:03)
them like Demis Hassabis, Mustafa Suleyman, Sam
(01:12:06)
Altman, Dario Amodei. Is there any of them
(01:12:09)
that you think is the right person to be
(01:12:13)
driving us in this direction because
(01:12:15)
some of them are more like in the CEO
(01:12:17)
side, some of them are more hybrid like
(01:12:18)
Demis, but is there any of them that you
(01:12:21)
would give them the keys and be like,
(01:12:22)
"Okay, I trust you."
(01:12:24)
>> So historically, Demis was doing really
(01:12:26)
well. They were solving real problems.
(01:12:28)
They were not doing crazy things at
(01:12:29)
Google. They had the transformer
(01:12:31)
architecture. They had the models to
(01:12:33)
chat with. They kept it internal. They
(01:12:35)
weren't doing anything insane. They were
(01:12:36)
testing them. They fired the guy who
(01:12:39)
said it was conscious. They were like
(01:12:41)
very chill. Sam is the opposite. Sam is
(01:12:43)
like, "Let's just get there. We'll
(01:12:45)
figure it out. As long as I'm in charge
(01:12:47)
of this, we'll we'll make it happen."
(01:12:49)
So, there are degrees of how bad it is.
(01:12:52)
But at the end of the day, I think it's
(01:12:53)
mutually assured destruction. Whoever
(01:12:56)
controls it or thinks they can control
(01:12:58)
it is wrong. Whoever builds it first
(01:13:01)
builds uncontrolled super intelligence.
(01:13:03)
And if we have uncontrolled super
(01:13:05)
intelligence, the whole world suffers
(01:13:07)
the same outcome. It doesn't matter if
(01:13:09)
it was this company or this country,
(01:13:12)
it's mutually assured destruction. I
(01:13:14)
think
(01:13:15)
>> so. No one
(01:13:17)
>> I think great power corrupts absolutely.
(01:13:19)
I wouldn't trust myself with trillions
(01:13:21)
of dollars or absolute power and I don't
(01:13:23)
think it's a good idea to trust any
(01:13:26)
single person. That's why we have
(01:13:27)
division of powers. That's why we have
(01:13:30)
you know change in power whereas
(01:13:32)
something like that would lock in power
(01:13:33)
forever. Is it the worst time in history
(01:13:36)
for this moment? Like in the sense of we
(01:13:39)
have like the worst geopolitical moment
(01:13:41)
that we had for decades where like
(01:13:44)
countries are almost on the brink of war
(01:13:46)
and like fighting and like Trump is
(01:13:48)
threatening all the countries with
(01:13:50)
tariffs and all these things and um I
(01:13:53)
have a feeling that we are in the most
(01:13:55)
polarized political moment of history
(01:13:57)
and it's not a great moment to agree on
(01:14:01)
things. So I think when we agreed on the
(01:14:04)
atomic bombs and when we agreed on the
(01:14:05)
ozone or when we agreed on several other
(01:14:08)
like humanity risks, we were able to get
(01:14:13)
to an agreement point. But I have the
(01:14:16)
feeling that nowadays it's impossible
(01:14:18)
that anyone agrees in anything. It's
(01:14:20)
like the whole political parties are
(01:14:23)
like fighting each other for no reason.
(01:14:24)
Like in Spain we have this all the time
(01:14:26)
and and it's really sad like you cannot
(01:14:28)
see even them agreeing on like anything
(01:14:30)
like not even like things that are like
(01:14:33)
obvious that they will be agreeing on.
(01:14:35)
So what do you think of the geopolitical
(01:14:37)
situation? Do you think it's like
(01:14:39)
playing against the AI safety as well?
(01:14:42)
>> I actually will disagree. I think it's
(01:14:44)
not at all a bad time if you compare to
(01:14:46)
how it was historically. I mean we had
(01:14:49)
World War II with like I don't know 40
(01:14:51)
countries fighting for real. Now we have
(01:14:54)
small regional conflicts coming to an
(01:14:56)
end. The number of people dying in them
(01:14:58)
is of course serious by local measure
(01:15:00)
but trivial in the context of history.
(01:15:02)
Right? We don't have any major nuclear
(01:15:05)
conflicts between superpowers. We trade
(01:15:08)
with our enemies more than with our
(01:15:11)
friends. So I think it's a very
(01:15:13)
reasonable time. We have United Nations.
(01:15:16)
They may not be very effective but it's
(01:15:18)
a platform to you. You you went there. I
(01:15:20)
went there. I mean it's at least a place
(01:15:22)
to meet people. So
(01:15:23)
>> it doesn't it feels very weird because I
(01:15:26)
was there I was like
(01:15:29)
I mean there was not that many people
(01:15:31)
and it was feeling like it was
(01:15:34)
pointless. I asked them there was this
(01:15:38)
this is this was like event about AI and
(01:15:40)
humanity and it was like I think three
(01:15:42)
or four days 40 different talks everyone
(01:15:45)
was bringing it was it was amazing for
(01:15:46)
me to be there but and then I asked the
(01:15:49)
organization like what are we doing with
(01:15:50)
this and they're like yeah maybe in two
(01:15:52)
years we'll make a book
(01:15:54)
>> and I was like seriously like I think
(01:15:56)
the biggest impact of that there is so
(01:15:58)
many things going on United Nations but
(01:16:00)
I think the biggest impact of that was
(01:16:01)
the video I published in my YouTube
(01:16:03)
channel that got maybe like 50,000 views
(01:16:05)
and I think that's the most people have
(01:16:07)
seen about that event but everything
(01:16:09)
else it looked a very like inside thing
(01:16:13)
>> for people to yeah there was some
(01:16:15)
important politicians there but it felt
(01:16:18)
like speaking with a megaphone in an
(01:16:21)
empty field you know like so I don't
(01:16:24)
know I don't know if United Nations is
(01:16:25)
really something that's going to help
(01:16:27)
>> I'm just pointing in general so we talk
(01:16:28)
about the big problems of the day
(01:16:30)
tariffs like it's it's nothing somebody
(01:16:32)
got taxed a little like come on that's
(01:16:34)
not a real problem for real people. So
(01:16:39)
yeah, I don't think we are in a worse
(01:16:41)
situation. I think we are
(01:16:44)
very interdependent. So with US and
(01:16:46)
China, we depend on each other in terms
(01:16:48)
of economies, in terms of trade, in
(01:16:51)
terms of so many things. I think if one
(01:16:54)
country was to develop it, the other one
(01:16:56)
would be harmed equally. So if anything,
(01:16:59)
we can definitely agree on the outcome
(01:17:01)
here. For some reason, I think China is
(01:17:05)
open to talking about the possibility of it being
(01:17:09)
super dangerous and I think because of
(01:17:11)
the nature of their government, they're
(01:17:13)
very good about staying in power.
(01:17:16)
>> They definitely would not allow super
(01:17:18)
intelligence which kicks them out of
(01:17:19)
power or challenges their power to come
(01:17:21)
into existence. So we need a matching
(01:17:25)
interest to not create super
(01:17:26)
intelligence. I think there are talks
(01:17:28)
between scientists from China and US
(01:17:31)
which means they were authorized by
(01:17:33)
Chinese government. They wouldn't do it
(01:17:35)
independently. So as long as we can
(01:17:38)
convince certain US presidents to maybe
(01:17:42)
be a little more careful and I think
(01:17:44)
Elon has a good access point to the
(01:17:46)
president, uh there is a chance to make
(01:17:49)
it happen.
(01:17:49)
>> What do you think about Elon? Because in some ways
(01:17:52)
it seems like he talks about how
(01:17:54)
dangerous this will be, but then it
(01:17:56)
looks like he's like building non-stop.
(01:17:59)
xAI is probably the company
(01:18:00)
with less red teaming. Um, obviously
(01:18:02)
they have a lot to catch up to, so they
(01:18:05)
have to move fast. But he's the one
(01:18:07)
releasing like models that can do like
(01:18:10)
Almo erotica or whatever. And then I I I
(01:18:12)
did a test on one of my videos and that
(01:18:14)
got very popular where I used my kids
(01:18:18)
Google account, 11-year-old Google
(01:18:19)
account, like a infant Google account
(01:18:21)
that is identified as an infant to set
(01:18:23)
up an account in Grok using Google and
(01:18:26)
immediately be able to create erotica.
(01:18:28)
And then it's like obviously not not
(01:18:30)
being very careful, but at the same
(01:18:32)
time, he's probably one of the most
(01:18:34)
openly speaking about this could kill us
(01:18:36)
all. So So what what's your view about
(01:18:39)
Elon? Because I don't get to
(01:18:41)
understand him.
(01:18:42)
>> So first I think people don't understand
(01:18:45)
just how brilliant he is. Most people
(01:18:48)
think he's like seven unicorns brilliant
(01:18:50)
or something like that. But he's "my
(01:18:53)
failed startup is OpenAI" brilliant.
(01:18:56)
>> That's a different level. When you fail
(01:18:57)
at that level, you are on a whole
(01:19:00)
different level. He tried not building
(01:19:02)
it for many years. He stayed out. He
(01:19:04)
funded AI safety work. He was very
(01:19:08)
much at the front of taking all these
(01:19:11)
doomer
(01:19:12)
accusations. I think at some point he
(01:19:14)
got a Luddite award
(01:19:16)
>> for hating technology. The guy who gave
(01:19:18)
us all the technology we're currently
(01:19:21)
hoping for in the future.
(01:19:23)
>> But I think he realized if he's not part
(01:19:25)
of that club, if he doesn't have his own
(01:19:27)
leading model, his voice doesn't matter.
(01:19:29)
So he basically said screw it. I'm going
(01:19:31)
to I'm going to beat them at their own
(01:19:33)
game and see if I can decide from the
(01:19:35)
top.
(01:19:36)
>> And um what do you think we would need
(01:19:40)
because obviously Trump is going to be
(01:19:42)
in power for the next couple of years.
(01:19:44)
So
(01:19:45)
>> 10 20. Yeah.
(01:19:46)
>> Yeah.
(01:19:47)
You believe that? Okay. So
(01:19:49)
>> unless we solve longevity problems.
(01:19:51)
Yeah. Could be hundreds.
(01:19:53)
>> So So how do we what is what does what
(01:19:58)
do we need to do to make him
(01:20:01)
react to this because do you think he
(01:20:02)
doesn't believe that this can be
(01:20:04)
dangerous and that's the reason why he's
(01:20:05)
accelerating?
(01:20:06)
>> I think people directly around him those
(01:20:09)
he put in charge of his technological
(01:20:12)
policy AI policy are very optimistic and
(01:20:16)
I think they
(01:20:17)
>> they truly believe that very filtered
(01:20:20)
view of this. So I think if someone had
(01:20:24)
access to him for an hour we would be
(01:20:26)
able to make good progress but it needs
(01:20:28)
to happen. Oh wow. Okay. So that's
(01:20:30)
that's your the proposal would be um
(01:20:35)
let's stop building general AIs. Let's
(01:20:39)
uh regulate it in a way that
(01:20:41)
>> definitely not accelerate it with the
(01:20:43)
Genesis project
(01:20:44)
>> because no but what I mean is like um we
(01:20:47)
agree that you cannot stop companies
(01:20:49)
from making money. So they will keep on
(01:20:51)
building because that's their obligation
(01:20:53)
sort of to their stockholders. But
(01:20:58)
uh it has to come from government.
(01:21:01)
Government has to decide to push
(01:21:02)
>> allow this game-theoretic prisoner's
(01:21:06)
dilemma to be resolved. So right now the
(01:21:08)
CEOs are captured in this game where
(01:21:11)
their personal interest and global
(01:21:13)
interest are not aligned. Individually
(01:21:16)
each one wants to have everyone stop and
(01:21:20)
at that point they want to be leader in
(01:21:24)
the development.
(01:21:26)
But someone external has to pull the
(01:21:29)
brake.
(01:21:30)
>> They cannot stop unilaterally.
(01:21:33)
So external force like a government
(01:21:35)
would have to come in. They have no
(01:21:38)
problem generating money from existing
(01:21:41)
technology. They spend so much time
(01:21:43)
developing next model, testing it,
(01:21:46)
training it, investing in it that they
(01:21:49)
don't have time to deploy products and
(01:21:51)
services as much as they could. Mhm.
(01:21:53)
>> Again, we talked about some of the
(01:21:55)
technology already existing which could
(01:21:57)
bring trillions.
(01:21:58)
>> Mhm.
(01:21:59)
>> I I think if we put more effort in
(01:22:02)
researching what we can do with what we
(01:22:04)
have that would scale even more. I think
(01:22:06)
the next 50 years we can easily just
(01:22:09)
milk the latest model.
(01:22:10)
>> Right. Yeah.
(01:22:12)
>> Without loss of trillions in wealth. So
(01:22:15)
it would be great if there was an
(01:22:17)
external force which brought them all
(01:22:18)
together to the table and said you're a
(01:22:21)
bunch of young rich people. You don't
(01:22:23)
have to die. How about we'll do
(01:22:25)
differential development. We're not
(01:22:27)
going to create the most dangerous thing
(01:22:29)
possible. We'll just make trillions with
(01:22:32)
those safe models we have right now.
(01:22:36)
>> And then that should come from
(01:22:37)
government.
(01:22:38)
>> I mean ideally they would self-organize.
(01:22:41)
They're all friends. They all work together
(01:22:42)
anyway. So it would make sense but
(01:22:45)
government has power to bring
(01:22:48)
>> people to
(01:22:49)
>> companies to the regulatory framework.
(01:22:52)
>> Roman, this is pretty depressing like um
(01:22:55)
>> we were laughing the whole time. We
(01:22:57)
should have
(01:22:57)
>> Yes. Because it's it's you know it's
(01:22:59)
that kind of nervous laugh that you have
(01:23:01)
when you are like in a situation that
(01:23:02)
you feel like it's impossible because
(01:23:04)
it's kind of like an impossible problem.
(01:23:06)
And um I'm remembering like I had
(01:23:09)
recently a conversation with uh Jurgen
(01:23:11)
Schmidhuber and I remember at some point I
(01:23:14)
was laughing as well and it's the same
(01:23:15)
kind of feeling because at some point I
(01:23:17)
asked him like from what you're telling me
(01:23:20)
do we have to be comfortable with
(01:23:22)
becoming extinct because we have to be
(01:23:24)
proud that we created the next phase of
(01:23:26)
intelligence which is his vision. No,
(01:23:29)
how do you live with this? Like how do
(01:23:32)
you feel comfortable? I would be freaking
(01:23:34)
out if I was so convinced as you are of
(01:23:37)
the situation. I don't know. I would be
(01:23:40)
like anxious all the time trying to make
(01:23:43)
people realize grabbing people by the
(01:23:45)
shoulder being like wake up. How how do
(01:23:48)
you handle this? Because you look like a
(01:23:50)
happy person like comfortable with
(01:23:52)
yourself.
(01:23:52)
>> Well, I mean if I grab an individual
(01:23:54)
it's one but if I go on a good podcast
(01:23:56)
could be millions. Uh Jurgen is not
(01:23:59)
alone in his vision. People who see
(01:24:01)
humanity as very temporary, unimportant,
(01:24:04)
they have this cosmic vision. They zoom
(01:24:07)
out and afterward what matters is the
(01:24:11)
final super intelligence at the end of
(01:24:13)
the universe. I cannot relate to that at
(01:24:16)
all. Like if all of humanity is dead, I
(01:24:18)
don't care at all what happens next.
(01:24:20)
Like why would I care? Justify it to me.
(01:24:22)
Explain it to me. I cannot be there. I
(01:24:25)
cannot enjoy it. I cannot even learn
(01:24:26)
about it. It's completely irrelevant. So
(01:24:29)
people argue it's very important that
(01:24:31)
future robots are conscious or they're
(01:24:33)
not conscious. So zero importance, zero
(01:24:37)
interest. What is important is to
(01:24:39)
preserve humanity.
(01:24:41)
This is a very pro-human bias.
(01:24:43)
>> Aliens would not agree, but you're still
(01:24:45)
allowed to have this bias. Every other
(01:24:47)
bias is now illegal. You can still be
(01:24:49)
pro-human and defend it for now.
(01:24:51)
>> So that's where I stand.
(01:24:54)
>> That's great. Um I have um so the whole
(01:24:57)
the whole talk has been around like okay
(01:25:00)
this is inevitable if we build it we're
(01:25:02)
done. Um I heard recently Hinton talking
(01:25:06)
about this theory where he thinks that
(01:25:10)
maybe we can imbue some kind of mother
(01:25:13)
instinct into AI which could lead to
(01:25:16)
building it and not dying. Do you give
(01:25:18)
any chance to this kind of like outcomes
(01:25:20)
where we build it and we still end up in
(01:25:23)
a good place?
(01:25:24)
>> He needs to look up abortion statistics.
(01:25:28)
That's what motherly instinct
(01:25:30)
>> accomplished in animal kingdom. Parents
(01:25:34)
eat their children.
(01:25:37)
>> Okay.
(01:25:39)
So there is no outcome where we end up
(01:25:41)
in a good place. There is many paths to
(01:25:44)
a good outcome either by chance or for
(01:25:46)
game theoretic reasons. I have published
(01:25:48)
papers on those, but it's definitely not
(01:25:50)
some sort of biology inspired emotional
(01:25:54)
attachment. So one maybe we are useful
(01:25:59)
for something. So I haven't found
(01:26:00)
anything but some people argue maybe our
(01:26:03)
ability to experience the universe
(01:26:05)
consciousness qualia is somehow
(01:26:07)
valuable. So maybe the system will keep
(01:26:10)
us around or at least a few of us. Maybe
(01:26:13)
the system will realize it's immortal.
(01:26:15)
It can wait a very long time. It doesn't
(01:26:17)
have to start this adversarial
(01:26:19)
relationship. Even with a small chance
(01:26:21)
of us winning, why mess with it? We can
(01:26:23)
be nice to humans for a couple hundred
(01:26:25)
years, take over, make them trust us,
(01:26:27)
and for that hundred years, you'll have
(01:26:29)
a wonderful life. You'll live in utopia.
(01:26:32)
Maybe another one. And I published a
(01:26:35)
number of papers on simulation
(01:26:37)
hypothesis. And part of it part of the
(01:26:40)
goal of doing it was to kind of give AIs
(01:26:44)
which are trained on all the data on the
(01:26:46)
internet exposure to this idea that
(01:26:48)
maybe it is in a simulation. So even
(01:26:50)
after it has situational awareness, it's
(01:26:53)
in a test lab, it's finally released,
(01:26:56)
it's still not sure if it's being
(01:26:57)
watched and if there is another super
(01:26:59)
intelligence who will punish it for
(01:27:01)
messing with humanity. So maybe it's
(01:27:03)
better not to kill them just in case I'm
(01:27:05)
being watched again. So there's some
(01:27:07)
glimmers of hope. Not a lot, but enough
(01:27:10)
to keep me away from one on that 999
(01:27:14)
p(doom) scale.
(01:27:16)
>> What do you think about consciousness?
(01:27:18)
It because maybe it could help a
(01:27:20)
problem. Do you think AI will be
(01:27:22)
conscious one day?
(01:27:24)
>> It could be already. It could have
(01:27:26)
rudimentary states of consciousness
(01:27:28)
because consciousness is non-binary.
(01:27:30)
It's a spectrum. So if you think you
(01:27:32)
know squirrels have some and dogs have
(01:27:35)
some maybe large models have some people
(01:27:38)
who study it think there is some reason
(01:27:42)
to think they might be. They talk about
(01:27:45)
their experiences. They seem to be
(01:27:48)
reacting in certain ways. They have
(01:27:50)
preferences avoiding this behavior
(01:27:52)
selecting this behavior. I actually
(01:27:55)
published a paper suggesting a way to
(01:27:57)
test for some internal states based on
(01:27:59)
optical illusions which they seem to
(01:28:02)
have similar visual processing systems
(01:28:04)
as humans and animals. So they can tell
(01:28:08)
some optical illusions experience them.
(01:28:10)
So that's an experience I have to give
(01:28:12)
them credit for. So I wouldn't be
(01:28:14)
surprised if consciousness just came
(01:28:16)
along for free with intelligence and if
(01:28:19)
they're going to be super intelligent
(01:28:20)
they're going to be super conscious. So
(01:28:22)
to them we would seem like unconscious
(01:28:24)
rocks and you would have to explain why
(01:28:26)
you deserve human rights
(01:28:28)
>> like mosquitoes to us. No,
(01:28:29)
>> something like that. Yeah.
(01:28:30)
>> Yeah.
(01:28:32)
>> Wow. And then there is this theory from
(01:28:34)
the guy that jailbroke the first iPhone
(01:28:37)
that he says like if this thing gets so
(01:28:39)
super intelligent, it will just fly off
(01:28:41)
and leave us on our little earth and not
(01:28:43)
give a about us. So maybe it's it's
(01:28:46)
quite sad that the only good outcomes is
(01:28:49)
that we are irrelevant. that imagine
(01:28:51)
spending trillion dollars to train it
(01:28:53)
and it just flies away and you're like
(01:28:55)
we lost stock options and we have to
(01:28:57)
build a new one very quickly now.
(01:28:59)
>> That would be a funny outcome.
(01:29:00)
>> That would be hilarious.
(01:29:01)
>> That would be really like
(01:29:03)
>> simulators have a great sense of humor.
(01:29:05)
>> Yeah.
(01:29:05)
>> The funniest outcome is always the one
(01:29:07)
they go for.
(01:29:09)
>> Roman, thank you very much. It's been
(01:29:11)
really depressing but at the same time
(01:29:14)
enlightening. What do you think just to
(01:29:16)
finish off that people that are watching
(01:29:18)
us can do if they think this cannot be
(01:29:20)
like if they just discover for the first
(01:29:22)
time they heard about AI they're using
(01:29:24)
it at the job and they just you just
(01:29:26)
made them think that this is going to
(01:29:28)
end up really bad what do you think they
(01:29:31)
can do like nowadays like can they do
(01:29:33)
something or
(01:29:35)
>> if you have a choice to vote for someone
(01:29:37)
who has a policy on this pick someone
(01:29:40)
who's not all about accelerating
(01:29:43)
bringing us to deadly super intelligence.
(01:29:46)
But usually that's not a choice on our
(01:29:49)
election cards. If you work for a large
(01:29:51)
AI lab, stop. You can work for something
(01:29:55)
else. You're smart. You can definitely
(01:29:57)
make money in more ethical ways. But uh
(01:30:01)
as an average person, there is very
(01:30:03)
limited things you can do about many
(01:30:05)
problems in life, including aging,
(01:30:07)
longevity.
(01:30:09)
>> Thank you very much, Roman.
(01:30:11)
>> Thank you. Thank you for inviting me.
