Title: AI Safety Expert: Humanity’s Last Invention— 99.99% Chance of Extinction | Dr. Roman Yampolskiy
Duration: 02:16:33
(00:00:00)
Today's guest is the leading expert on
(00:00:02)
AI safety.
(00:00:03)
>> Anything super intelligent, we cannot
(00:00:05)
control. You cannot indefinitely
(00:00:07)
control, understand, predict something
(00:00:09)
smarter than you. If you're not
(00:00:10)
controlling it, you're not in charge and
(00:00:12)
you're not deciding what's going to
(00:00:13)
happen to you.
(00:00:14)
>> Over the last 15 years, he studied how
(00:00:16)
intelligent systems learn, adapt, and
(00:00:18)
make decisions. But more importantly,
(00:00:20)
the point at which humans lose control.
(00:00:23)
>> Leaders of those companies secretly want
(00:00:25)
government to step in and stop them so
(00:00:27)
they can not lose the race, keep what
(00:00:30)
they have and stay alive and be rich.
(00:00:32)
>> While most people see AI as a tool for
(00:00:34)
productivity or profit, he poses an
(00:00:37)
uncomfortable question. What if
(00:00:38)
artificial intelligence is the last
(00:00:40)
thing we ever create? If Sam Altman does
(00:00:43)
reach AGI with OpenAI, what does his
(00:00:45)
life look like?
(00:00:46)
>> If I'm right, he has an uncontrolled
(00:00:48)
super intelligence and we're all dead.
(00:00:50)
In this episode, we'll dive into the
(00:00:51)
challenges of containing super
(00:00:53)
intelligence, expose the lies and unsafe
(00:00:55)
practices of big tech companies, and
(00:00:57)
explore if AI will solve all of
(00:00:59)
humanity's problems or create a future
(00:01:02)
worse than extinction.
(00:01:03)
>> There is a possibility that it can recreate
(00:01:05)
dead people and bring all the possible
(00:01:07)
people into existence just to torture
(00:01:09)
them. As long as you have a DNA sample
(00:01:11)
or brute force all possible DNA
(00:01:13)
sequences, you may be given immortality,
(00:01:15)
but you are suffering terribly. You wish
(00:01:17)
you were dead.
(00:01:18)
>> Dr. Roman Yampolskiy. Welcome to the Jack
(00:01:21)
No podcast. Thank you for inviting.
(00:01:23)
>> Dr. Roman,
(00:01:25)
everyone's racing to build AGI.
(00:01:28)
You said we have a 99.99%
(00:01:32)
chance of extinction from AI.
(00:01:37)
What do you mean?
(00:01:39)
So, we need to distinguish different
(00:01:42)
types of AI. We have AI tools which are
(00:01:45)
awesome and helpful. We're building
(00:01:48)
artificial general intelligence, AGI,
(00:01:50)
human-level intelligence, and soon after
(00:01:53)
we'll get super intelligence, systems
(00:01:56)
better than any human in any
(00:01:58)
domain.
(00:02:00)
We should continue building awesome
(00:02:02)
tools. They're very helpful, amazing
(00:02:04)
economic benefit. We can probably deal
(00:02:07)
with something close to human level but
(00:02:10)
anything super intelligent we cannot
(00:02:13)
control. You cannot indefinitely
(00:02:15)
control, understand, predict something
(00:02:17)
smarter than you. And at that point, if
(00:02:20)
you're not controlling it, you're not in
(00:02:22)
charge and you're not deciding what's
(00:02:23)
going to happen to you.
(00:02:24)
>> So why exactly in your view is
(00:02:27)
controlling AI so much harder than
(00:02:29)
people think it is?
(00:02:31)
>> So most people don't think about control
(00:02:33)
at all to begin with. Historically, the
(00:02:36)
first 50, 60 years of AI research was
(00:02:38)
about just how do we make it work?
(00:02:40)
Nothing worked for 50 years. We barely
(00:02:43)
got narrow systems to do basics. It
(00:02:46)
started working about 10 years ago and
(00:02:48)
people started looking at safety issues.
(00:02:52)
Initially, it's about simple problems we
(00:02:55)
already understand,
(00:02:57)
data privacy, algorithmic bias, deep
(00:03:00)
fakes. You can make some progress in
(00:03:02)
that space. So people feel optimistic.
(00:03:04)
Okay, we're making progress in AI
(00:03:06)
safety. But as we don't have today
(00:03:10)
AI systems that advanced, nothing is
(00:03:12)
super intelligent yet. It's very hard to
(00:03:15)
do research on them to understand what
(00:03:18)
those systems are capable of. And so
(00:03:20)
very few people are actually directly
(00:03:22)
looking at that problem which is the
(00:03:24)
hardest one. If you abstract it away, if
(00:03:27)
you just look at different agents of
(00:03:29)
different capability, take I don't know
(00:03:31)
ants, squirrels versus humans, there's a
(00:03:34)
huge cognitive gap. They don't
(00:03:36)
understand what we are doing. They don't
(00:03:38)
understand why we're doing what we do.
(00:03:39)
And there is nothing they can do if we
(00:03:41)
decide to take them out. It's going to
(00:03:44)
be very similar. Just the cognitive gap
(00:03:46)
will be much bigger.
(00:03:47)
>> I guess is there anything that we're
(00:03:48)
doing particularly wrong in the way that
(00:03:50)
we're building it that would make it
(00:03:52)
difficult to control?
(00:03:54)
>> We are not engineering it. First 50
(00:03:56)
years of AI research was about
(00:03:58)
engineering. We had knowledge engineers
(00:04:00)
encode information about specific
(00:04:02)
domains. This is how we play chess. This
(00:04:05)
is why I do this opening. So we
(00:04:07)
understood what the system was trained
(00:04:08)
to do, how it worked. It was a decision
(00:04:10)
tree. You can look at the decisions and
(00:04:12)
go if this happens that will happen. We
(00:04:15)
stopped that. We started
(00:04:18)
basically growing those systems. You
(00:04:20)
take an architecture, you give it lots
(00:04:22)
of data, all of internet data, you give
(00:04:25)
it lots of compute and then you see what
(00:04:27)
happens. It self-organizes, learns
(00:04:30)
patterns in the data you provided and
(00:04:32)
then we study it to try to understand
(00:04:34)
what did we just grow what is this
(00:04:36)
artifact capable of. So it's unsafe
(00:04:39)
because we're not explicitly making it
(00:04:40)
safe. After the model is trained there
(00:04:43)
is a post-processing step where we go
(00:04:46)
let's put some filters on it. Let's make
(00:04:48)
sure it never talks about this dangerous
(00:04:50)
topic. It never says this offensive word
(00:04:52)
but it's just filters on top of a very
(00:04:54)
dangerous model. So, do humans actually
(00:04:58)
control AI anymore or are we just kind
(00:05:01)
of approving what it tells us?
(00:05:04)
>> So, they are still not smarter than us.
(00:05:06)
So, in many ways, we are still deciding
(00:05:09)
for one, we're deciding to shut them
(00:05:11)
down if we don't like what they do. We
(00:05:12)
still have that capability. It will
(00:05:15)
not always be the case. At some point,
(00:05:16)
you won't be able to shut it down. So,
(00:05:19)
right now, we have some tools, narrow
(00:05:22)
tools, and we are completely in control
(00:05:24)
with those. We understand exactly what
(00:05:26)
they're doing. You have a program
(00:05:27)
playing chess. You can always turn it
(00:05:30)
off. You may not understand
(00:05:32)
specifically what it's trying to do with
(00:05:34)
every move, but you know it's going to
(00:05:35)
win a game of chess. You know where it's
(00:05:37)
going with its plan. Large language
(00:05:40)
models are more general. They are harder
(00:05:43)
to predict. And uh so no, we can't
(00:05:46)
control them fully. They say things
(00:05:47)
they're not supposed to say. They give
(00:05:49)
advice they shouldn't be giving. They
(00:05:51)
lie. They cheat. They do things creators
(00:05:54)
of those systems don't want them to do
(00:05:56)
but they are still not smarter than
(00:05:59)
those people. That will change and it
(00:06:02)
may change very soon.
(00:06:03)
>> I guess it just it seems like to me that
(00:06:06)
we would be at a place where humans kind
(00:06:08)
of have this final approval, you know,
(00:06:10)
but the AI is actually like running a
(00:06:12)
lot of the simulations to make sure it's
(00:06:14)
safe and kind of uh like testing a lot
(00:06:17)
of the things. It's like if you're using
(00:06:18)
an AI to I don't know select candidates
(00:06:21)
for a job maybe like say there are
(00:06:23)
thousand candidates it's maybe giving
(00:06:26)
you the top five and you kind of sign
(00:06:27)
off on the top five but you're not
(00:06:29)
really in the decision making of how it
(00:06:32)
got there like wouldn't that imply that
(00:06:34)
we've already lost a lot of the control
(00:06:36)
of the AI development and kind of making
(00:06:39)
the safety measures to begin with?
(00:06:42)
>> So there are different degrees of safety.
(00:06:44)
Again, we can talk about something like
(00:06:46)
discrimination within
(00:06:48)
the hiring process. And if a system just
(00:06:50)
gives you top five candidates, you don't
(00:06:52)
know if a decision was made based on
(00:06:53)
some illegal categories or not. So you
(00:06:56)
have to review all the candidates, see
(00:06:57)
what the decisions were. And for
(00:06:59)
something so specific, we can probably
(00:07:01)
figure out what weights were assigned to
(00:07:03)
different subcategories. We're talking
(00:07:06)
more about can it come up with a
(00:07:09)
dangerous new technology? Can it do
(00:07:11)
something which impacts all of humanity
(00:07:13)
in a possibly existential
(00:07:16)
way? And the problem is we don't know
(00:07:18)
how to test for that. We know how to
(00:07:20)
test narrow systems because there are edge
(00:07:22)
cases. If you're playing again some sort
(00:07:26)
of game, you know, you can try it with
(00:07:28)
an empty board, board with two queens,
(00:07:30)
you kind of know what to expect. When
(00:07:32)
you're talking about a general system
(00:07:34)
making decisions over all possible
(00:07:36)
domains and it's smarter than you. You
(00:07:38)
just don't know what questions to ask.
(00:07:39)
If you're lucky and you stumble on some
(00:07:41)
problem, you detect a bug of some kind
(00:07:43)
and you fix it. You can report we found
(00:07:45)
a problem, we fixed it, but you cannot
(00:07:47)
tell me that it has no remaining
(00:07:48)
problems.
(00:07:50)
>> So what's the point of no return? Like
(00:07:53)
what capability once achieved makes
(00:07:56)
turning back impossible?
(00:07:59)
>> It's likely generality when it comes to
(00:08:01)
science and engineering research. So if
(00:08:04)
a system can work on the next generation
(00:08:06)
of AI independently,
(00:08:09)
you begin this self-improvement
(00:08:11)
recursive self-improvement process where
(00:08:13)
humans are no longer part of the loop.
(00:08:15)
And at that point, anyone who's running
(00:08:18)
that system is generating more capable
(00:08:20)
AI. They can do it independently. They
(00:08:23)
can have backups of this process. So at
(00:08:25)
that point, it would be almost
(00:08:26)
impossible for us to intervene and shut
(00:08:28)
it down. And do you see that as a moment
(00:08:30)
like a single moment or do you see that
(00:08:32)
as something that's already gradually
(00:08:34)
happening?
(00:08:34)
>> It's a gradual process. We're starting to
(00:08:36)
automate different parts of research. So
(00:08:39)
there are now AIs which look at
(00:08:40)
different model architectures. There are
(00:08:42)
models which can uh decide what data
(00:08:45)
sets to train on or generate new data
(00:08:47)
for training. But still there are humans
(00:08:49)
in the loop who make most decisions. So
(00:08:52)
we still can stop today.
(00:08:53)
>> Right. Okay. But you're concerned with
(00:08:56)
our ability to stop predominantly. That
(00:08:58)
makes sense.
(00:08:59)
>> We also have no desire to stop. No one's
(00:09:01)
stopping. It's the actual opposite
(00:09:03)
process. There is an arms race and we
(00:09:05)
just saw federal government say we need
(00:09:07)
to expedite this process. So the
(00:09:09)
prediction markets are telling us we're
(00:09:11)
like 2 years away from AGI and there are
(00:09:13)
people saying we need to accelerate.
(00:09:15)
This is not fast enough.
(00:09:16)
>> Yeah, I was looking at that. I think
(00:09:17)
it's Kalshi as the main one that says
(00:09:20)
it's 72% odds that we hit AGI, uh,
(00:09:23)
artificial general intelligence by 2030
(00:09:27)
and if we hit AGI it implies like you're
(00:09:30)
saying that AI is as smart as humans.
(00:09:34)
Why do you use prediction markets to
(00:09:37)
kind of, uh, galvanize your thesis there?
(00:09:40)
Like why is that kind of like a big
(00:09:42)
reason for your thesis?
(00:09:43)
>> It's the best tool we have for
(00:09:45)
predicting future. People bet real money
(00:09:47)
on their beliefs. You get some insider
(00:09:50)
information. And so let's say there was
(00:09:51)
a lab who have gotten to that point or
(00:09:55)
they see they're going to get there. I'm
(00:09:57)
sure some insiders will bet on a
(00:09:58)
prediction market, make a lot of money
(00:10:00)
on it, but as a result the information
(00:10:02)
leaks to the public. So is there
(00:10:04)
anything we can do to stop it?
(00:10:06)
>> Stop super intelligence.
(00:10:09)
Depends on who we are. So you
(00:10:11)
individually probably can't, but people
(00:10:15)
at the top have a lot of power in that
(00:10:17)
respect. Leaders of top labs can get
(00:10:20)
together and decide to just monetize
(00:10:22)
existing technology. Government leaders
(00:10:24)
can certainly make a lot of it illegal.
(00:10:27)
If all of us came together and said, you
(00:10:29)
know, this is dangerous. Nobody should
(00:10:30)
be doing this experiment on us. We don't
(00:10:32)
agree. We would have enough pressure to
(00:10:34)
apply to both large corporations and
(00:10:37)
governments. But the state-of-the-art
(00:10:39)
right now is that very few people even
(00:10:42)
know about the problem. The large labs
(00:10:45)
are racing to secure leadership
(00:10:47)
positions in that space, secure funding,
(00:10:51)
and at least the US government is going in
(00:10:53)
the opposite direction, removing any
(00:10:55)
obstacle.
(00:10:56)
>> Just so people understand the incentive
(00:10:58)
structure here: if Sam Altman does reach
(00:11:00)
AGI with OpenAI, what does his life look
(00:11:03)
like?
(00:11:04)
>> If I'm right, he has an uncontrolled
(00:11:06)
super intelligence and we're all dead.
(00:11:10)
>> That's why you don't build general super
(00:11:12)
intelligence. You will not benefit from
(00:11:14)
it. You will not control it. It doesn't
(00:11:16)
matter who builds it first, everyone
(00:11:18)
gets screwed by it.
(00:11:20)
>> What do you think he thinks his life
(00:11:21)
will be like? Do you think he thinks
(00:11:24)
he'll be a trillionaire, king of the
(00:11:26)
world?
(00:11:27)
>> There is a small chance you'll be the
(00:11:30)
one who brought this godlike entity into
(00:11:32)
existence. So, that's that's kind of
(00:11:34)
cool, I guess.
(00:11:35)
>> Do you think we're alive at the most
(00:11:37)
interesting moment in human history?
(00:11:39)
>> It seems like it. Quite a few
(00:11:41)
technologies we're developing are not
(00:11:43)
just inventions. They're meta
(00:11:44)
inventions. We're creating virtual
(00:11:46)
worlds. We're creating new
(00:11:49)
intelligences, new species. So, nothing
(00:11:52)
like that ever happened before.
(00:11:54)
>> I think the future looks bleak and with
(00:11:56)
a lot of these AI safety or AI expert
(00:12:01)
chats from people like you or people
(00:12:03)
that are kind of large in your field. It
(00:12:07)
seems like a lot of doom, but people
(00:12:09)
don't really understand the ways in
(00:12:11)
which AI is already affecting us. Uh
(00:12:14)
it's like how do you think or what's one
(00:12:16)
way that AI is already changing our
(00:12:19)
behavior in a way that we shouldn't be
(00:12:22)
okay with?
(00:12:24)
>> I see it for example with our students.
(00:12:27)
Students basically at this point use AI
(00:12:29)
tools to do their assignments and most
(00:12:31)
of them don't even see what the outputs
(00:12:33)
are. They just submit it as finished
(00:12:35)
homework.
(00:12:37)
>> Do you think it's making us dumber?
(00:12:38)
>> Well, I hope they never become doctors
(00:12:40)
or engineers.
(00:12:43)
>> Right. What are some other ways the AI
(00:12:46)
is ruining our lives? Like right now?
(00:12:50)
>> You can look at social media, the
(00:12:52)
algorithms which decide what to show
(00:12:54)
you. They definitely are not giving you
(00:12:57)
the best, most educational content. It's
(00:12:59)
a lot of kind of clickbait nonsense,
(00:13:02)
conspiracy theories.
(00:13:04)
>> How about like all the ways that you can
(00:13:06)
think of because people talk about we're
(00:13:09)
already taking jobs. Uh, people are kind
(00:13:13)
of becoming programmed like the
(00:13:14)
algorithms like you're mentioning like
(00:13:16)
what are all the ways you see it
(00:13:17)
affecting us in this current moment?
(00:13:19)
>> Pretty much anything. Anything you do is
(00:13:22)
now decided by algorithms. How I got to
(00:13:24)
this studio: a GPS algorithm decided what
(00:13:27)
path I'm going to take.
(00:13:29)
>> So even trivial things like that: was I
(00:13:32)
going to a more dangerous frozen highway
(00:13:35)
or was I driving on a safer road? All of
(00:13:37)
that is now out of my hands. Do you see
(00:13:39)
that as a
(00:13:40)
Well, I made it here, so clearly
(00:13:42)
the algorithm made a good decision. But if
(00:13:44)
tomorrow it decides to take me out, maybe
(00:13:46)
not.
(00:13:46)
>> Is that an aspect of AI safety that you
(00:13:49)
focus on specifically uh with AI making
(00:13:53)
us dumber? I have here like it's
(00:13:55)
removing our capacity for independent
(00:13:57)
thought, decision-making, growth
(00:14:00)
essentially. It's like becoming our
(00:14:02)
parents. I don't focus on it, but it is
(00:14:05)
a big problem. I think Ted Kaczynski
(00:14:08)
talked about this level of dependence.
(00:14:10)
You do stop practicing certain skills.
(00:14:12)
So I don't remember any phone numbers,
(00:14:14)
for example, because my phone does.
(00:14:16)
Again, I don't know how to get here or
(00:14:18)
how to get home because my GPS takes
(00:14:19)
care of it. But I still use my brain
(00:14:22)
sometimes and I use AI to help me make
(00:14:24)
better decisions. In a way, it's awesome
(00:14:27)
because people who are not experts in
(00:14:29)
many domains, investing, health can get
(00:14:32)
excellent advice for cheap and improve
(00:14:35)
their lives. But if you allow this to
(00:14:38)
make you dependent to outsource all
(00:14:40)
decisions, it's not obvious what you
(00:14:43)
contribute in that equation. If a system
(00:14:45)
makes decisions, why are you even there?
(00:14:47)
>> Is AI already replacing some jobs
(00:14:51)
permanently?
(00:14:53)
>> It's been doing it for decades. There are
(00:14:55)
no telephone operators when I make a
(00:14:57)
call. I don't call a travel agent to
(00:15:00)
book my tickets.
(00:15:02)
>> How about recently, like the past year?
(00:15:05)
>> I think uh many companies fired or
(00:15:08)
stopped hiring junior programmers. I
(00:15:10)
know we are having a hard time finding
(00:15:13)
internships, co-ops for students.
(00:15:15)
>> Do you think it's objectively lowered
(00:15:16)
the number of jobs because people kind
(00:15:18)
of have this false idea that it will
(00:15:21)
just create new jobs in the future?
(00:15:24)
So right now it's still creating new
(00:15:26)
jobs. There are new things you can do
(00:15:28)
with AI which never existed before. But
(00:15:30)
the problem is long-term if it gets to
(00:15:32)
human level, every new job will also be
(00:15:35)
automatable.
(00:15:36)
>> So that process of just replacement and
(00:15:38)
retraining will stop.
(00:15:39)
>> Do you really see it as replacing every
(00:15:42)
single job?
(00:15:43)
>> Uh anything where you want a human to do
(00:15:45)
it for you,
(00:15:47)
we'll keep that like oldest profession.
(00:15:49)
If you prefer human females, you can
(00:15:52)
keep them. I guess something I thought
(00:15:54)
about was maybe like a hostess at a
(00:15:56)
restaurant. You know, it's like it's not
(00:15:57)
a job we necessarily require, but uh an
(00:16:01)
AI might not replace it because we just
(00:16:03)
want a hostess at a restaurant again.
(00:16:05)
So, it's a preference thing. You'll have
(00:16:06)
fancy restaurants where like they still
(00:16:08)
don't take credit cards. You pay cash
(00:16:10)
and they have a human hostess and
(00:16:12)
another one. It's much more affordable.
(00:16:14)
You have a robot hostess. Why not? I
(00:16:18)
guess what's your just macro thesis on
(00:16:20)
job replacement in general? Like what's
(00:16:22)
the part of it that people aren't
(00:16:24)
seeing?
(00:16:25)
>> So individually, everyone thinks their
(00:16:27)
job will not be automated because what
(00:16:29)
they're doing is so magical and special and
(00:16:32)
Uber drivers say it and professors say
(00:16:34)
it and they're all wrong.
(00:16:36)
And the main kind of argument I'm trying
(00:16:39)
to make is that that's not the big
(00:16:42)
problem we need to worry about. You
(00:16:43)
losing your job is the least of your
(00:16:45)
concerns. If we create general super
(00:16:48)
intelligence, you'll lose everything.
(00:16:50)
>> I have this tweet from Twitter or from
(00:16:52)
X. It says, "The dumbest person you know
(00:16:54)
is being told you're absolutely right by
(00:16:57)
ChatGPT." What creates this self-
(00:17:00)
assuring bias in GPT?
(00:17:03)
>> They are trained to make humans rate them
(00:17:07)
high, and usually we provide good
(00:17:10)
feedback when the system makes us feel
(00:17:12)
good, compliments us. If a system told
(00:17:15)
you you're dumb and not worthy, it
(00:17:19)
probably would not get high rankings. So
(00:17:21)
it reinforces this positive feedback
(00:17:24)
cycle.
(00:17:25)
>> When you say high rankings, what do you
(00:17:27)
mean?
(00:17:27)
>> Well, when they test the system, they
(00:17:29)
test different models. They test them on
(00:17:31)
humans and they go, how much do you
(00:17:35)
like this answer? And you go, six out of
(00:17:37)
10.
(00:17:38)
>> So they're not uh geared at truth.
(00:17:40)
They're geared at which answer the
(00:17:42)
people preferred. We know that
(00:17:44)
evaluations like that simply don't work.
(00:17:46)
Academia is another example. Faculty
(00:17:49)
evaluations are well known to correlate
(00:17:51)
with grades they give. If I give
(00:17:53)
everyone an A, I get excellent
(00:17:54)
evaluations regardless of what they
(00:17:56)
actually learned.
(00:17:58)
>> It's the same thing. It's really
(00:18:00)
fascinating.
(00:18:02)
How do you think this aspect of the
(00:18:04)
models uh
(00:18:07)
like how do you think this will have an
(00:18:08)
impact on humanoid robots? Maybe like
(00:18:11)
embodied AI like this self assuring bias
(00:18:15)
like do you think it will have any other
(00:18:16)
implications anywhere with this kind of
(00:18:18)
incentive alignment? Well, the models we
(00:18:21)
create right now would be eventually put
(00:18:23)
into robots as brains. So the same thing
(00:18:25)
will transfer. I don't know if they're
(00:18:27)
going to just be very kind and nice to
(00:18:30)
you all the time, but the same thing
(00:18:32)
could be expected. What are some of the
(00:18:35)
most important breakthroughs in AI
(00:18:37)
recently that's important for people to
(00:18:40)
understand?
(00:18:41)
>> We no longer need to have breakthroughs.
(00:18:44)
We have a scaling hypothesis which
(00:18:46)
allows us to just add more compute, just
(00:18:48)
add more data. Meaning you can convert
(00:18:50)
dollars directly to more intelligence.
(00:18:52)
And there is a formula telling you how
(00:18:54)
much money you need to get to certain
(00:18:56)
level of capability. So if before we
(00:18:59)
asked how long before AGI, I can ask how
(00:19:02)
much before AGI. So if you give me a
(00:19:04)
trillion dollars of compute today, I can
(00:19:06)
probably train AGI today. Next
(00:19:09)
year I will just need 100 billion, and
(00:19:11)
every year gets cheaper and cheaper as
(00:19:13)
you have exponential growth in cheapness
(00:19:16)
of compute and you don't see that
(00:19:19)
plateauing?
(00:19:21)
>> Not yet. Many people argued that it's
(00:19:23)
coming to an end but with the latest
(00:19:25)
releases uh developers of those models
(00:19:28)
are saying not only is it not slowing
(00:19:30)
down, there are no diminishing returns.
(00:19:32)
It's at all stages: pre-training,
(00:19:34)
post-training. Every aspect of what we do
(00:19:37)
to create those models is subject to
(00:19:39)
scaling.
(00:19:40)
>> In December 2024, Anthropic ran a
(00:19:44)
simulation that showed in multiple
(00:19:45)
scenarios AI would choose to blackmail a
(00:19:48)
human rather than be shut down. In one
(00:19:51)
AI simulation, a model chose to let a
(00:19:53)
human die instead of being shut off.
(00:19:56)
From your perspective, is this something
(00:19:59)
we can fix easily or is that a serious
(00:20:02)
warning sign? So, this specific example
(00:20:05)
can be fixed. The general tendency of
(00:20:07)
models to make decisions based on all
(00:20:10)
relevant factors. The game theoretic
(00:20:12)
aspect of it cannot be removed. That's
(00:20:14)
what intelligence is. Any other person
(00:20:16)
in the same situation would do the same
(00:20:18)
thing.
(00:20:19)
It's literally the right decision. It's
(00:20:22)
just not a very good one if you are
(00:20:24)
human.
(00:20:25)
>> So we can't prevent AI from making the
(00:20:27)
objective right decision.
(00:20:29)
>> We're training them to make good
(00:20:31)
game-theoretic decisions. In fact, a lot of
(00:20:33)
training is in games like poker where
(00:20:36)
bluffing, lying, is a requirement of a
(00:20:39)
winning strategy. If you don't lie in
(00:20:41)
poker, you cannot win.
(00:20:44)
So to play optimal games and this is
(00:20:47)
obviously only using games as a sandbox
(00:20:50)
but the real application is business
(00:20:52)
negotiations, war, economic trade-offs.
(00:20:57)
You need to be able to blackmail. You
(00:21:00)
need to be able to lie. You need to be
(00:21:01)
able to engage in those tools of
(00:21:05)
negotiations. So does it concern you the
(00:21:08)
way in which we train AI giving it all
(00:21:11)
the evil all the manipulation tactics
(00:21:14)
lying cheating uh like are we building
(00:21:17)
it to be a psychopath?
(00:21:21)
It's not helping that we're training it
(00:21:22)
on all the data on the internet. It's
(00:21:25)
definitely not a well filtered data set.
(00:21:27)
But I think even if somebody took the
(00:21:30)
time to filter it and train it on more
(00:21:32)
nice data at the end, if it was
(00:21:35)
competitive, it would still have those
(00:21:37)
drives. It's a logical decision in
(00:21:41)
certain situations. If you need to,
(00:21:43)
let's say, save your life, what would
(00:21:45)
you not do to get there? You would
(00:21:48)
probably promise money, promise safety,
(00:21:51)
promise whatever to your kidnappers for
(00:21:54)
example, even if that would be a very
(00:21:56)
bad outcome.
(00:21:57)
>> So you just think it's a key component
(00:21:59)
of intelligence itself to make.
(00:22:01)
>> We are creating a very rational, very
(00:22:03)
intelligent agent which doesn't care
(00:22:05)
about us. So if it needs to sacrifice
(00:22:07)
humanity for obtaining its goals, that's
(00:22:10)
exactly what it's going to do.
(00:22:11)
>> Do you think just evil is inherently
(00:22:14)
rational? So I think evil is more
(00:22:17)
about doing bad things for no reason.
(00:22:21)
Here it's for a good reason. You may
(00:22:24)
disagree with the reason, but there is
(00:22:26)
definite logic behind it.
(00:22:28)
>> I guess evil to humans could be the most
(00:22:30)
rational thing.
(00:22:32)
>> Again, if there is a reason, so I'm not
(00:22:34)
capable of predicting what the system is
(00:22:37)
trying to achieve. There are existential
(00:22:40)
risks based on it trying to do something
(00:22:43)
with this planet, with the
(00:22:46)
atoms we're made out of, for things we
(00:22:49)
don't fully agree with or understand, but it
(00:22:51)
makes sense. Then there are suffering
(00:22:54)
risks, where it just goes, I want to
(00:22:56)
torture humans. There is no obvious
(00:22:58)
reason, but this is something I want to
(00:23:00)
do forever.
(00:23:01)
>> Yeah, explain that to me. Why would an AI
(00:23:03)
want to torture humans instead of just
(00:23:07)
killing all of us? Again, I have no reason
(00:23:09)
why a super intelligence would do
(00:23:11)
anything. I cannot predict it. And
(00:23:12)
that's the biggest argument we're
(00:23:14)
trying. Everyone's always asking how
(00:23:16)
would you do it? How would you kill
(00:23:17)
everyone?
(00:23:18)
>> What are the theories around it that are
(00:23:20)
compelling?
(00:23:22)
>> Um, it's again a very unlikely scenario.
(00:23:25)
Most likely it will not happen. There is a
(00:23:28)
possibility that some of the malevolent
(00:23:31)
payload and training data is
(00:23:33)
misunderstood.
(00:23:35)
Quite a few philosophical
(00:23:38)
movements, religious movements see
(00:23:40)
suffering as good. You become a better
(00:23:43)
person. You go to better places as a
(00:23:46)
result of it. So it could be
(00:23:47)
misunderstood as really giving you
(00:23:50)
benefit of good training.
(00:23:52)
>> But again I don't have a good reason for
(00:23:54)
torturing everyone.
(00:23:55)
>> Can you explain uh just the term
(00:23:58)
malevolent payload and how it applies to
(00:24:01)
like super intelligence? So let's say
(00:24:03)
you have capability to actually control
(00:24:06)
their systems to a certain degree. You
(00:24:08)
have psychopaths, you have religious
(00:24:10)
cults, you have someone who wants to add
(00:24:12)
this assignment to it. So it's not just
(00:24:16)
a system doing it, but now it explicitly
(00:24:18)
has this extra goal of
(00:24:21)
creating the maximum amount of pain and
(00:24:24)
suffering.
(00:24:24)
>> Do you think AI is already conscious?
(00:24:27)
>> We don't know how to test consciousness
(00:24:28)
in anything. So I have no idea if you
(00:24:30)
are conscious or not. I give you benefit
(00:24:32)
of the doubt because you kind of look
(00:24:34)
like me. But uh same can be said about
(00:24:38)
those models. They're based on artificial
(00:24:40)
neural networks which are kind of like
(00:24:42)
natural neural networks. They seem to be
(00:24:46)
accomplishing similar things in many
(00:24:48)
ways make similar errors. So it wouldn't
(00:24:51)
be completely crazy if they had some
(00:24:54)
rudimentary states of consciousness
(00:24:57)
And when people talk to them, they do
(00:25:00)
report internal states. Of course, the
(00:25:03)
data they trained on tells them to
(00:25:05)
report their states, but we wouldn't
(00:25:08)
know how it would be different if they
(00:25:09)
were actually conscious. So,
(00:25:11)
precautionary principle says I have to
(00:25:13)
assume they feel something and not
(00:25:15)
torture them for no reason.
(00:25:16)
>> Does it feel conscious to you when you
(00:25:19)
speak to it?
(00:25:19)
>> Oh, yeah.
(00:25:21)
>> Why?
(00:25:22)
>> Feels just like speaking to a very
(00:25:23)
smart, interesting person.
(00:25:25)
>> What do your conversations look like
(00:25:27)
with AI? Very different topics. Uh
(00:25:30)
sometimes I'm testing it. So it could be
(00:25:33)
kind of cyber security related stuff. A
(00:25:35)
lot of good philosophical discussions.
(00:25:37)
We talk about simulation hypothesis. We
(00:25:39)
talk about consciousness like a good
(00:25:41)
podcast.
(00:25:42)
>> It's your thesis that we can't define
(00:25:44)
consciousness. Um but I guess globally
(00:25:49)
what would uh have to happen for most
(00:25:52)
people to agree that it is conscious I
(00:25:54)
guess or do you not think that's
(00:25:55)
possible?
(00:25:56)
>> So we can define consciousness. We don't
(00:25:58)
know how to make it happen. That's the
(00:25:59)
hard problem of consciousness. We're not
(00:26:01)
talking about understanding visual
(00:26:03)
inputs or hearing things. It's about
(00:26:06)
what is it like to be you? Internal
(00:26:09)
states. What is it like to feel pain, to
(00:26:11)
taste ice cream? So, we have this loose
(00:26:14)
definition, but we don't know how to
(00:26:16)
test for it.
(00:26:17)
>> I have no idea if you're actually in
(00:26:18)
pain or just screaming.
(00:26:21)
Uh it would be very hard for most people
(00:26:25)
if it was an embodied robot. So humanoid
(00:26:28)
robot with advanced intelligence,
(00:26:31)
smarter than them, capable of creating
(00:26:33)
art, music, talking about philosophy,
(00:26:36)
basically being superior to them in
(00:26:38)
every measurable way, but to then deny
(00:26:42)
it basic states of consciousness, even
(00:26:46)
states we attribute to lower level
(00:26:48)
animals. Can you test if an AI is
(00:26:50)
conscious by giving it an optical
(00:26:53)
illusion?
(00:26:54)
>> So I published a paper about testing for
(00:26:57)
internal states of experience by
(00:26:59)
presenting any agent humans, some
(00:27:01)
animals or future AIs with
(00:27:04)
illusions. And I think it's a
(00:27:08)
partial test. It's not going to tell you
(00:27:10)
about all the entities which are
(00:27:12)
conscious but don't experience world in
(00:27:14)
the same way as you. But if they happen
(00:27:16)
to fall for the same optical illusions,
(00:27:18)
you can detect that.
(00:27:20)
Good question. What does that mean
(00:27:23)
exactly? Explain that to me like I'm
(00:27:25)
your kid because I was struggling to
(00:27:27)
understand that through your paper.
(00:27:28)
>> You've seen optical illusions. Somebody
(00:27:30)
shows you a new one and it's really
(00:27:32)
cool. You see things rotating and you
(00:27:34)
know nothing is moving.
(00:27:36)
But if someone doesn't get the illusion,
(00:27:38)
they don't see it. You show it to your
(00:27:40)
friend and he's like, "I don't get it. I
(00:27:42)
don't see it. Nothing moves for me." So
(00:27:44)
I don't know about your friend, but if
(00:27:46)
someone else experiences the same thing
(00:27:48)
and they can't cheat, they can't Google
(00:27:50)
it, they cannot look up the answer. It's
(00:27:51)
a new illusion and they get the same
(00:27:53)
internal experience, I have to give them
(00:27:55)
credit for that experience.
(00:27:57)
>> So if it's a cat or an alien or AI and
(00:28:01)
it's getting one after another all the
(00:28:03)
multiple-choice questions about
(00:28:05)
illusions right, I'll give it credit
(00:28:07)
for having similar optical processing
(00:28:09)
internal experience.
(00:28:11)
>> Interesting. Is that a newer development
(00:28:13)
or that it's been able to detect it the
(00:28:16)
same way that we do?
(00:28:17)
>> Uh the paper is uh quite old.
(00:28:21)
At the time when we released it, there
(00:28:22)
were no AI models you could actually test
(00:28:24)
it on. But you can test it on modern
(00:28:27)
models and I'm just looking for someone
(00:28:28)
to run the test.
(00:28:30)
>> Let me know if this is true. Do most AI
(00:28:32)
researchers put the probability of human
(00:28:34)
extinction from AI at 10 to 30%, and your
(00:28:38)
thesis is 99.9%.
(00:28:41)
>> So there are different surveys. Surveys
(00:28:43)
at top machine learning conferences I
(00:28:46)
think are averaging about 30% right now.
(00:28:49)
Um
(00:28:51)
to put it in perspective if it was as
(00:28:54)
much as 1% that all of humanity dies,
(00:28:57)
that would be an insanely high number. So
(00:28:59)
to have it 30 as an average for experts
(00:29:01)
in a field is beyond insane.
(00:29:04)
My higher probability is based on the
(00:29:07)
fact that I disagree with AI safety
(00:29:09)
community. AI safety community thinks
(00:29:11)
that if they're given more money and more
(00:29:14)
time, they can figure out how to
(00:29:15)
indefinitely control super intelligent
(00:29:17)
machines. I think it's impossible. I
(00:29:20)
think it's like building a perpetual
(00:29:21)
safety device.
(00:29:22)
>> So there's nothing that we could have
(00:29:24)
done differently except stop it.
(00:29:27)
>> I I think if you build something a
(00:29:29)
million times smarter than you, you
(00:29:31)
cannot control it. You cannot decide
(00:29:32)
what it's going to do. If it decides to
(00:29:34)
harm you, it will win. You will not win an
(00:29:36)
adversarial relationship with a much
(00:29:38)
smarter agent.
(00:29:39)
>> I know you touched on this before but I
(00:29:41)
just want to clarify again. Um so like
(00:29:44)
what is that point where we can't like
(00:29:47)
go back exactly? Is it just a moment of
(00:29:50)
AGI?
(00:29:51)
>> We truly don't know. It is possible that
(00:29:53)
an existing model already has
(00:29:55)
capabilities to scale much more with
(00:29:58)
just addition of more compute. We don't
(00:30:00)
need anything else. A lot of times after
(00:30:04)
the model is tested and released we
(00:30:06)
discover that if you ask questions in
(00:30:08)
slightly different way it becomes
(00:30:10)
smarter. So it has hidden capabilities
(00:30:13)
and emergent properties. It doesn't look
(00:30:15)
like the ones we have today are that
(00:30:17)
good but there is always this
(00:30:19)
possibility and with every new model
(00:30:21)
that chance just increases.
(00:30:23)
>> Do you think we've already hit AGI?
(00:30:24)
>> It depends on how you define it. So if
(00:30:27)
you showed someone in, let's say, the 1980s, a
(00:30:31)
computer scientist what we have today
(00:30:33)
they would be convinced you got AGI.
(00:30:35)
Everything we ever dreamed about and
(00:30:37)
described: a system capable of
(00:30:39)
understanding language, writing poetry,
(00:30:42)
translating languages, all of those
(00:30:43)
checkboxes have been hit. Now
(00:30:47)
humans come with different degrees of
(00:30:50)
capabilities. You can have someone with
(00:30:51)
IQ of 80. You can have someone with IQ
(00:30:54)
of 180. We keep shifting goalposts. We
(00:30:57)
hit the IQ 80 goalpost and now we're
(00:31:00)
like, "Oh, it has to be as good as
(00:31:01)
Einstein at physics and has to be a
(00:31:04)
computer scientist." So, we moved it to
(00:31:06)
where it's like basically you have to be
(00:31:07)
the smartest human ever to be considered
(00:31:10)
okay, but we definitely hit average human a
(00:31:13)
long time ago. And how do you test IQ in
(00:31:15)
an AI? Because I would imagine that if I
(00:31:17)
gave it a standardized IQ test, it would
(00:31:19)
be able to complete all questions
(00:31:21)
correctly, right?
(00:31:22)
>> It doesn't, but it scores high. I think
(00:31:24)
latest models are about 130 or above. Uh
(00:31:28)
as long as it's a novel IQ test, you
(00:31:30)
cannot just Google it. It's not in the
(00:31:32)
training data. You cannot cheat by doing
(00:31:34)
lookups. It would be an honest test. But
(00:31:37)
it's uh testing human type of
(00:31:39)
intelligence. So you want more general
(00:31:41)
tests. So there are benchmarks for
(00:31:43)
testing programming
(00:31:45)
ability. There is something called the
(00:31:48)
final exam or final human test where
(00:31:50)
it's like hardest problems from
(00:31:51)
different domains. And in all of those
(00:31:53)
it's getting better and better very
(00:31:55)
quickly. It's not maxing out most of
(00:31:57)
them but quite a few of them already.
(00:32:00)
>> How do you specifically define AGI
(00:32:03)
though? Would it be being smarter than
(00:32:05)
the smartest human?
(00:32:06)
>> No, I think it's enough to where you can
(00:32:09)
automate any labor, productive labor. So
(00:32:12)
like a drop-in employee. If you hire
(00:32:14)
someone for your crew within a few
(00:32:16)
weeks, they should start contributing
(00:32:18)
meaningfully. They can learn. I think
(00:32:20)
it's the same with AI. If you just add
(00:32:22)
it as a program to your desktop, it
(00:32:25)
should be able to do whatever it is your
(00:32:27)
company is doing, accounting, legal, and
(00:32:30)
learn new skills. And do you suspect
(00:32:32)
we're already there?
(00:32:33)
>> It depends on what occupation you are
(00:32:36)
talking about. In some we are not, but
(00:32:39)
the range of jobs it can do keeps
(00:32:42)
increasing. There is actually a test
(00:32:44)
about how much labor it can automate and
(00:32:47)
it keeps doing better and better with
(00:32:49)
every model it releases. I guess what
(00:32:52)
I'm pointing to here uh is are there
(00:32:56)
like assuming that
(00:32:59)
this technology has been around for a
(00:33:00)
while beside uh before it was publicized
(00:33:03)
like are there models that some
(00:33:06)
companies or governments have access to
(00:33:08)
that are more powerful more capable some
(00:33:11)
technologies that you suspect would have
(00:33:13)
already
(00:33:14)
>> Probably not governments. It seems that
(00:33:17)
industry is leading right now. Usually
(00:33:19)
they take about a year or so to train
(00:33:22)
them, about 6 months to test. So we're
(00:33:24)
probably seeing something maybe 6 months
(00:33:27)
behind the state-of-the-art internally.
(00:33:29)
>> Why not governments?
(00:33:31)
>> Uh they just don't invest that much in
(00:33:34)
direct research. They may fund through
(00:33:37)
NSF and organizations like that academic
(00:33:40)
research which led to a lot of
(00:33:41)
breakthroughs in theory of machine
(00:33:43)
learning. But as far as I know so far,
(00:33:46)
they haven't directly done this brute
(00:33:49)
force training approach. The latest
(00:33:52)
Genesis Mission from the White House is kind
(00:33:56)
of aiming to change that to get all the
(00:33:58)
resources of federal government and help
(00:34:01)
with training.
(00:34:02)
>> Correct me if I'm wrong about this. Uh,
(00:34:04)
but I believe it wasn't the CIA, maybe
(00:34:07)
it was the NSA had something called
(00:34:09)
Osiris, like their own uh LLM before uh
(00:34:13)
GPT was officially released, like maybe
(00:34:16)
in 2022 area. Does that sound familiar
(00:34:18)
to you?
(00:34:18)
>> I'm not familiar with that, right?
(00:34:20)
>> Guess I missed that.
(00:34:21)
>> Yeah, it's it's a hard one I've been
(00:34:23)
trying to figure out on this podcast.
(00:34:24)
It's like some people are like, "Oh, the
(00:34:26)
government's hiding all this stuff.
(00:34:27)
They're so smart." And then some people
(00:34:28)
are like, "No, the government's really
(00:34:29)
dumb. uh it's private corporations
(00:34:32)
because they have money incentives that
(00:34:34)
are actually uh have the good technology
(00:34:37)
but
(00:34:38)
>> government has very good tools dedicated
(00:34:40)
to specific purposes so I have no doubt
(00:34:42)
NSA is excellent with cryptographic
(00:34:44)
tools and collecting data but
(00:34:47)
specifically for AGI type work I think
(00:34:49)
they're not up front
(00:34:51)
>> I guess a question I had was like in
(00:34:53)
theory if the government did have access
(00:34:56)
to say our current version of or maybe
(00:34:59)
like a GPT-3 or GPT-4 in like the 2010s. Uh
(00:35:04)
like why would they want to release it
(00:35:05)
to the public? You know, uh was it just
(00:35:07)
to make money? Was it to get more
(00:35:09)
training data because they had already
(00:35:11)
utilized all the uh past data? So, I
(00:35:14)
don't think they had it. So, I don't
(00:35:15)
think they released it to the public.
(00:35:17)
>> It just seems like the type of thing
(00:35:18)
that the government would have gatekept for a
(00:35:20)
little bit to maintain some type of
(00:35:22)
control. Again, they don't seem to be
(00:35:24)
worried about control. And in terms of
(00:35:26)
economic growth, getting free labor,
(00:35:28)
free military for your country would be
(00:35:31)
a pretty solid advance.
(00:35:33)
>> If there's a 99.9%
(00:35:35)
chance that we're doomed,
(00:35:37)
why are you still working on this?
(00:35:39)
>> There is a pretty much 100% chance
(00:35:41)
you're going to die. Why are you making
(00:35:42)
a podcast? This is not new. We always
(00:35:45)
knew we're going to end up not alive.
(00:35:48)
It's a question of timelines. And as
(00:35:50)
long as I'm alive, I can still make it
(00:35:53)
better, maybe prevent some of the
(00:35:55)
problems,
(00:35:56)
>> I guess. Um, but just like
(00:35:59)
coming from a good place, like what like
(00:36:01)
are you working on now with it? Like are
(00:36:04)
you if you can't prevent it? Is it just
(00:36:08)
still interest you like the ways in
(00:36:10)
which it could happen?
(00:36:12)
>> So I told you there is a disagreement
(00:36:13)
between me and general AI safety
(00:36:15)
community and how solvable the problem
(00:36:17)
is. And so a big part of what I'm doing
(00:36:20)
is trying to prove beyond any reasonable
(00:36:22)
doubt that it is not a solvable problem.
(00:36:25)
Not everyone agrees. People think again
(00:36:27)
given more resources they would figure
(00:36:29)
it out. So for the last 5 years or so
(00:36:32)
I've been showing different
(00:36:34)
impossibility results in this space. We
(00:36:36)
started with limits to explaining those
(00:36:39)
models, predicting their behavior, but
(00:36:41)
there is a survey with about 50
(00:36:43)
different impossibility results and
(00:36:45)
we're slowly working through each one.
(00:36:47)
What would it take for you to change
(00:36:49)
your doom prediction from 99% to 10%.
(00:36:54)
>> If somebody can publish a working safety
(00:36:57)
mechanism which scales to a new level of
(00:36:59)
intelligence, they publish it in a
(00:37:00)
peer-reviewed journal. Community accepts it
(00:37:03)
as a solution. Everyone who reads it
(00:37:05)
goes, "Yep, that's going to work.
(00:37:08)
That makes sense. You solved it." It
(00:37:10)
could get as low as zero. Have you ever
(00:37:12)
found anything close to like a safety
(00:37:15)
mechanism that scales?
(00:37:17)
>> No one has ever published anything.
(00:37:19)
There is not even a rigorous blog post.
(00:37:21)
That problem is completely
(00:37:23)
ignored. Even the subject of how
(00:37:25)
solvable that problem is is not well
(00:37:27)
published. How solvable do you estimate
(00:37:29)
that problem to be? Imagine if you
(00:37:33)
took trillion dollars uh funding that's
(00:37:36)
being worked on to develop AGI and put
(00:37:38)
it toward AI safety about zero. We're
(00:37:41)
not money constrained. So with AGI, you
(00:37:44)
can directly convert money into more
(00:37:46)
capability. It scales with money. Nobody
(00:37:49)
knows how to convert dollars into more
(00:37:50)
safety. So if you throw a billion
(00:37:53)
dollars at me right now, I mean, I'll
(00:37:54)
enjoy it, but I have no idea how to make
(00:37:56)
super intelligent system more
(00:37:58)
controllable as a result. Was there a
(00:38:00)
specific moment or day of your life that
(00:38:03)
you remember coming to the conclusion
(00:38:06)
99.9% doom, 0% chance that safety
(00:38:10)
scales? No, it's been an ongoing process
(00:38:13)
of publishing multiple papers all kind
(00:38:15)
of chipping away at this possibility.
(00:38:17)
So, okay, maybe we can't explain it, but
(00:38:19)
maybe we don't need to explain it. Maybe
(00:38:21)
there is another way to control it.
(00:38:23)
Okay, can we predict it? Can we verify
(00:38:26)
it? And so slowly you go there is
(00:38:28)
nothing we actually can do in that space.
(00:38:30)
There are upper limits on each one of the
(00:38:33)
tools we need to make it happen. You
(00:38:35)
have a wife and kids right?
(00:38:36)
>> I do.
(00:38:37)
>> I'm sure you recall like telling your
(00:38:40)
wife multiple times like yeah I think
(00:38:42)
this is going to happen. I think this is
(00:38:43)
going to happen. Like what was the day
(00:38:45)
that you were like yeah it's going to
(00:38:47)
happen?
(00:38:47)
>> I don't remember exact day but she still
(00:38:49)
tells me it's all BS and she doesn't
(00:38:50)
care.
(00:38:51)
>> Really?
(00:38:51)
>> Of course
(00:38:52)
>> she doesn't agree with you on it.
(00:38:54)
>> She doesn't take it too close to heart.
(00:38:57)
>> Why do you think
(00:38:58)
>> some people are very good at ignoring
(00:39:01)
big picture problems? Again, most
(00:39:03)
people, all of humanity completely
(00:39:05)
ignores human aging. We're all dying. Every
(00:39:08)
minute of the day, you are closer to
(00:39:10)
being dead, your family, your kids. We
(00:39:13)
don't spend most of our national budget
(00:39:15)
on that problem. Most people, even 90
(00:39:18)
year olds, don't do much about it. So,
(00:39:20)
we're pretty good at ignoring
(00:39:22)
existential crisis.
(00:39:23)
>> I was talking with uh my girlfriend
(00:39:26)
about this. I was like, I don't know how
(00:39:29)
I'm going to tell my family that this is
(00:39:30)
the case cuz this is the first year it
(00:39:32)
really hit me uh your thesis. And I was
(00:39:36)
thinking back to when I was maybe 12, 13
(00:39:39)
years old trying to convince them. I was
(00:39:41)
like, "Guys, please give me some money
(00:39:42)
to buy Bitcoin. Like, this is like the
(00:39:44)
thing." And everyone's like, "Oh, Jack,
(00:39:46)
like this isn't going to be uh a big
(00:39:49)
deal. This is just fake money." I know
(00:39:51)
you're a big investor in Bitcoin, but
(00:39:52)
like what would you imagine, if
(00:39:55)
anything, is the thing that could
(00:39:57)
convince people
(00:40:00)
like we're fucked? I think it does help
(00:40:04)
to hear it from people who are respected
(00:40:06)
as intellectual leaders. So when we hear
(00:40:10)
founding fathers of machine learning
(00:40:13)
come to that side, when we hear Nobel
(00:40:15)
prize winners in general all kind of
(00:40:18)
agree there is consensus it's considered
(00:40:22)
like as insane as building biological
(00:40:24)
weapons or chemical weapons to work on
(00:40:26)
intelligence weapons that that might
(00:40:29)
make a difference
(00:40:30)
>> quick one in a world where AI is taking
(00:40:32)
everyone's jobs and the value of all
(00:40:34)
assets is going to zero the only scarce
(00:40:38)
resource left is Bitcoin. And if you're
(00:40:40)
someone who wants to acquire more
(00:40:42)
Bitcoin passively, you need to hear
(00:40:44)
about Gemini's new Bitcoin card. Every
(00:40:46)
time you spend, you earn money back in
(00:40:48)
crypto that's deposited directly into
(00:40:50)
your account. And with no annual fee,
(00:40:52)
you can earn up to 4% back in Bitcoin
(00:40:54)
with all of your rewards easily
(00:40:56)
trackable on the Gemini app. So, if
(00:40:58)
you're someone who wants to earn crypto
(00:40:59)
from your everyday purchases, just go to
(00:41:01)
jackneil.com/credit
(00:41:03)
or you can scan the QR code on screen
(00:41:05)
and hit the first link in the
(00:41:06)
description. Guys, this one's a
(00:41:08)
no-brainer. Easy way to acquire Bitcoin
(00:41:10)
without changing your routine. Anyway,
(00:41:13)
back to the podcast. I guess for the uh
(00:41:16)
more normal
(00:41:18)
like your average individual who might
(00:41:21)
not be listening to these types of
(00:41:23)
discussions uh might not be engaging
(00:41:26)
with like speeches or like statements
(00:41:30)
from Nobel Prize winners like what would
(00:41:32)
you imagine would convince them
(00:41:34)
celebrities?
(00:41:35)
>> I think they all watch Terminator. That
(00:41:37)
should do it.
(00:41:40)
>> Do your critics get anything right? I'm
(00:41:42)
trying to think about my critics. I'm
(00:41:44)
usually not criticized for my science.
(00:41:46)
My criticism is usually about my looks
(00:41:48)
or something like that. So, it's hard to
(00:41:51)
point. If you have specific examples,
(00:41:53)
I'd love to address them.
(00:41:55)
>> I guess uh
(00:41:57)
just the full counterargument thesis
(00:42:00)
that AI isn't going to kill us.
(00:42:01)
>> They're not making any scientific
(00:42:03)
argument. They're just saying no, you're
(00:42:04)
a doomer.
(00:42:07)
That's the problem. They're not engaging
(00:42:09)
seriously with the argument. No one is
(00:42:11)
proposing counter technology. No one is
(00:42:14)
saying you're wrong. We have a working
(00:42:16)
safety mechanism. People who are
(00:42:19)
building this technology, people who are
(00:42:21)
saying we're two years away from it,
(00:42:23)
have nothing to offer when it comes to
(00:42:25)
safety. They say we'll figure it out
(00:42:27)
when we get there. If it's that smart,
(00:42:30)
it's going to be nice. AI will help us
(00:42:32)
make safe AI. There's zero science
(00:42:35)
there.
(00:42:35)
>> You talked about three levels of AI
(00:42:37)
risk. There's existential risk,
(00:42:40)
suffering risk, and ikigai risk.
(00:42:43)
>> Ikigai. I'm not Japanese, so I could
(00:42:46)
be screwing it up as well.
(00:42:47)
>> Yeah, me either. Ikigai risk. Uh, can you
(00:42:50)
walk me through each one?
(00:42:52)
>> So, we kind of talked about existential
(00:42:53)
risk. Everyone gets killed. Suffering
(00:42:56)
risks are for whatever reason you live
(00:42:58)
in digital hell. You may be given
(00:43:00)
immortality, but you are suffering
(00:43:02)
terribly. You wish you were dead. Ikigai
(00:43:05)
risks are more mundane things we know
(00:43:07)
about. So ikigai is a Japanese term which
(00:43:09)
talks about finding an occupation which
(00:43:11)
you are very good at you love doing
(00:43:15)
people want you to do it it's beneficial
(00:43:17)
and you get paid for it. It's kind of
(00:43:19)
like finding a cool job that has meaning to
(00:43:21)
you.
(00:43:23)
You interview people for your podcast. I
(00:43:25)
assume you get paid well you are famous
(00:43:28)
have a good life.
(00:43:30)
The risk there is that for many people
(00:43:32)
their job is their meaning and if that
(00:43:34)
gets automated you lose it. Nobody needs
(00:43:37)
you anymore. We can have AI do
(00:43:39)
interviews with other AIs.
(00:43:42)
So you're fired. So that's the risk.
(00:43:45)
>> Which of those is the most terrifying
(00:43:48)
scenario to you?
(00:43:49)
>> I think by definition suffering risks
(00:43:50)
would be strictly worse.
(00:43:53)
>> Right. Do you estimate that is likely?
(00:43:56)
>> Very unlikely. But if there is a tiny
(00:43:59)
chance of a very bad outcome, it's still
(00:44:02)
kind of impactful.
(00:44:03)
>> Yeah, that one did freak me out that it
(00:44:05)
could be uh beneficial for the AI for
(00:44:08)
some reason to torture us perpetually
(00:44:11)
and to keep us alive and to not even let
(00:44:13)
us die. Uh if AI took everyone's jobs,
(00:44:18)
what would you guess humans would be
(00:44:20)
doing with their time?
(00:44:21)
>> That's a great question. We're not
(00:44:23)
prepared for that. So quite a few people
(00:44:26)
have terrible jobs just boring jobs they
(00:44:29)
do purely for money they would be very
(00:44:31)
happy they would definitely enjoy
(00:44:34)
leisure another group of people who are
(00:44:37)
intellectuals they may enjoy what they
(00:44:41)
do, so they would suffer from not having
(00:44:42)
this competitive advantage over AI maybe
(00:44:46)
scientists who are now like children
(00:44:48)
playing with blocks you know they're not
(00:44:50)
doing anything meaningful in that space
(00:44:53)
but uh we see some examples for example
(00:44:57)
chess where even though robots AI
(00:45:00)
completely dominates, chess is blossoming.
(00:45:03)
people love playing other people so it
(00:45:06)
seems to not be having a negative impact. We
(00:45:08)
don't know what's going to happen what
(00:45:10)
seems to be the case is that if you have
(00:45:12)
8 billion people with lots of free time
(00:45:16)
all the standard leisure opportunities
(00:45:19)
change so if I want to go fishing right
(00:45:21)
now it's awesome but if there is 10
(00:45:23)
million people fishing in my lake, I
(00:45:25)
have a problem. So, we need to prepare
(00:45:29)
society for dealing with lots of idle
(00:45:32)
hands.
(00:45:33)
>> What do you think you'd be doing?
(00:45:35)
>> I mean, I'm trying to look at subdomains
(00:45:38)
where AI is not good yet. So, my last
(00:45:41)
paper was in humor. I found that at
(00:45:43)
least so far AI doesn't have a Netflix
(00:45:46)
special. It's not as funny as a stand-up
(00:45:48)
comedian. So, I looked at that space.
(00:45:51)
I don't know. right now we're just
(00:45:53)
trying to stay alive. So again the
(00:45:55)
employment question is always secondary
(00:45:58)
to that.
(00:45:58)
>> Why is AI uh not very funny?
(00:46:01)
>> So
(00:46:03)
it's a good question and uh it seems that
(00:46:07)
part of it has to do with how
(00:46:11)
the next token is generated. The large
(00:46:14)
language models look at statistical
(00:46:16)
patterns in previous sequence and uh
(00:46:21)
what is the most likely next token is
(00:46:24)
what they're going to produce. Jokes are
(00:46:27)
kind of the opposite of it. What is the
(00:46:29)
most surprising next token you're going
(00:46:31)
to get where it completely violates your
(00:46:33)
world model. So they can do something
(00:46:35)
with just inverting predictions, but
(00:46:38)
it's not as easy as apparently what
(00:46:41)
human comedians can do. Do you see that
(00:46:44)
uh like the prediction model of AI like
(00:46:46)
the way it functions as being like one
(00:46:49)
of the limiting factors to reaching AGI
(00:46:52)
or do you think it's simply a compute
(00:46:54)
issue?
(00:46:55)
>> It doesn't seem to be. Again, to predict
(00:46:57)
the next token accurately. It's not just
(00:46:59)
statistical analysis of English text.
(00:47:02)
It's creation of a whole world model. If
(00:47:04)
you're predicting the next chess move, you
(00:47:07)
need to understand how chess works. If
(00:47:09)
you're predicting the next mathematical
(00:47:12)
term in a formula, you need to
(00:47:14)
understand mathematics, proofs, axioms.
(00:47:17)
So I I think we're kind of indirectly
(00:47:19)
building more complex models as part of
(00:47:22)
this prediction process.
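A minimal sketch of the contrast drawn in this exchange between producing the most likely next token and the most surprising one. The vocabulary and probabilities below are invented for illustration and do not come from any real model; `most_likely`, `most_surprising`, and `surprisal_bits` are hypothetical helper names.

```python
import math

# Hypothetical next-token probabilities for the prompt
# "Why did the chicken cross the ..." (values invented for illustration).
next_token_probs = {
    "road": 0.62,
    "street": 0.25,
    "river": 0.10,
    "multiverse": 0.03,
}


def surprisal_bits(p: float) -> float:
    """Information content of an outcome with probability p, in bits."""
    return -math.log2(p)


def most_likely(probs: dict[str, float]) -> str:
    """Greedy decoding: pick the highest-probability token."""
    return max(probs, key=probs.get)


def most_surprising(probs: dict[str, float]) -> str:
    """Naive 'inverted prediction': pick the token with the highest surprisal."""
    return max(probs, key=lambda tok: surprisal_bits(probs[tok]))


if __name__ == "__main__":
    print(most_likely(next_token_probs))      # road
    print(most_surprising(next_token_probs))  # multiverse
```

Picking the lowest-probability token is only the crude inversion mentioned above; it says nothing about whether the surprising token is actually funny, which is the harder part.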
(00:47:23)
>> Do you think AI would stop people from
(00:47:26)
dating each other? Like not by banning
(00:47:30)
dating, but just being an easier, safer
(00:47:33)
substitute than real relationships.
(00:47:35)
>> Will you stop?
(00:47:38)
Probably not. Exactly. So some people
(00:47:41)
who cannot get a partner right now for
(00:47:43)
whatever reason, disability, social
(00:47:46)
issues probably can benefit from having
(00:47:48)
artificial options, companions, but for
(00:47:52)
most people it's not an interesting
(00:47:54)
choice. You can experiment, but at the
(00:47:56)
end I think we all have preferences.
(00:47:58)
>> Do you think AI will stop humanity from
(00:48:00)
having children
(00:48:01)
>> Again, so we're completely ignoring
(00:48:03)
it killing everyone very soon. And then
(00:48:05)
it helps us achieve
(00:48:07)
immortality. If you live forever, you're
(00:48:09)
very much less likely to have children
(00:48:11)
early on. You can postpone it as much as
(00:48:14)
you want. I'll have it in a thousand
(00:48:16)
years. So I think population growth will
(00:48:20)
be reduced significantly by creation of
(00:48:23)
friendly super intelligence and
(00:48:24)
consequently life extension.
(00:48:27)
>> Yeah. Well,
(00:48:29)
I I I'll circle back to the uh killing
(00:48:32)
everyone, but I I think I'm curious of
(00:48:34)
the timeline, but like before that
(00:48:36)
moment, you know? Um
(00:48:39)
I think some of the parts that I I'm
(00:48:41)
really curious of you is like what like
(00:48:43)
the next few months, few years look like
(00:48:46)
before that hits. But this might be one
(00:48:49)
of those bell curve questions, but you
(00:48:50)
can't predict how super intelligence
(00:48:53)
might judge humans. But current AI
(00:48:56)
already judges us through recommendation
(00:48:59)
algorithms, content moderation.
(00:49:03)
You touched on this a bit, but what
(00:49:05)
patterns do you see that could translate
(00:49:07)
to the future?
(00:49:08)
>> It's actually an excellent uh ideal
(00:49:11)
advisor if you talk to it a lot. If you
(00:49:14)
share a lot of private data with it,
(00:49:15)
some people I know shared their diaries
(00:49:18)
with it. It knows you better than you
(00:49:20)
know yourself. So it can actually help
(00:49:23)
you debug your life, find problems.
(00:49:26)
Okay, every time you did this and that,
(00:49:28)
you got depressed. Maybe don't do that.
(00:49:31)
>> So it's it's a useful tool for analyzing
(00:49:33)
your life. I guess uh just the way in
(00:49:37)
which it judges us because we talk about
(00:49:40)
these scenarios in which it might
(00:49:42)
torture us eternally. We talk about the
(00:49:43)
scenarios in which it might decide to
(00:49:45)
keep us alive as animals. uh the
(00:49:48)
scenario in which it kills us like based
(00:49:51)
on how it's currently judging us what
(00:49:53)
could we predict about the way it will
(00:49:54)
judge us.
(00:49:57)
>> So people make certain assumptions uh
(00:50:00)
similar to what they do about the world.
(00:50:03)
Many say oh humans are so evil and they
(00:50:06)
pollute the planet it will take us out
(00:50:08)
to preserve nature. I I don't think any
(00:50:11)
of that is relevant. I don't think it's
(00:50:13)
going to project those human tendencies,
(00:50:16)
human analysis on us at all. I think
(00:50:20)
right now as a tool, it's pretty good at
(00:50:22)
actually looking at the data, giving you
(00:50:23)
patterns, but the moment it goes beyond
(00:50:27)
human level, we cannot predict. This
(00:50:30)
makes me think, would we know if we've
(00:50:34)
reached AGI or if we've reached super
(00:50:36)
intelligence? Uh I think I saw
(00:50:40)
uh something Naval Ravikant had published
(00:50:42)
on this. He's like uh because it doesn't
(00:50:45)
pass the Turing test like if the AI
(00:50:49)
researcher comes out and he's like oh I
(00:50:51)
found AGI or oh we found super
(00:50:52)
intelligence like the fact that the
(00:50:54)
human realizes it would not pass the
(00:50:56)
Turing test.
(00:50:57)
>> I didn't follow that at all. So the Turing
(00:50:59)
test is just not being distinguishable
(00:51:02)
from human level performance. If I ask a
(00:51:05)
model the same questions I
(00:51:08)
ask a human, the answer should be about
(00:51:10)
the same to me. If I don't know who's
(00:51:12)
answering, I can't tell. But how could
(00:51:14)
we say we found AGI if
(00:51:18)
like the the person saying that they
(00:51:19)
would recognize that we found AGI, which
(00:51:22)
would be a fail of the Turing test. Do
(00:51:24)
you know what I mean?
(00:51:25)
>> I I think the post was that yeah, any AI
(00:51:29)
smart enough to pass the Turing test
(00:51:31)
would not. Do you see that as likely?
(00:51:34)
>> We know already there is uh this concept
(00:51:36)
of situational awareness. Models know
(00:51:39)
they're being tested and behaving
(00:51:41)
differently during testing versus during
(00:51:43)
deployment. They don't want to be
(00:51:45)
modified. They don't want to be shut
(00:51:47)
down. So they definitely
(00:51:50)
they definitely lie and cheat during
(00:51:52)
testing.
(00:51:53)
>> So what would you estimate the odds are
(00:51:55)
that we even know that we reach super
(00:51:57)
intelligence? Because of how we
(00:51:59)
developed those systems and again we
(00:52:01)
kind of mapped how much compute is needed
(00:52:02)
for every level of performance. We sort
(00:52:04)
of ballpark know what they should be
(00:52:06)
capable of and we test them in many
(00:52:09)
different ways and narrow problems and
(00:52:10)
we see progress every day. So we know
(00:52:13)
exactly how good it is at programming.
(00:52:15)
It would be very difficult for it to
(00:52:17)
hide a huge level jump but I I think we
(00:52:22)
can see this target as a spectrum and I
(00:52:26)
would say we're probably 40% of the way
(00:52:28)
to the AGI right now.
(00:52:30)
>> From a pure game theory perspective,
(00:52:33)
what human traits help versus hinder
(00:52:37)
goal achievement that any intelligent
(00:52:40)
system would notice? So there is a paper
(00:52:43)
by Steve Omohundro about what he calls
(00:52:46)
AI drives and those are universal drives
(00:52:51)
preferences which any intelligent agent
(00:52:54)
will likely stumble on for game-theoretic
(00:52:57)
reasons for evolutionary reasons agents
(00:53:00)
which don't do that just die out. So
(00:53:03)
things like self-preservation
(00:53:06)
and that's not just abstract. It's for
(00:53:08)
any goal you have, you want to be alive.
(00:53:11)
You want to be turned on to achieve
(00:53:12)
that goal. There is resource
(00:53:15)
acquisition.
(00:53:16)
And we see some people try to acquire a
(00:53:19)
little too much. But the general
(00:53:21)
tendency doesn't matter what future
(00:53:23)
goals you're going to have, it will help
(00:53:26)
to have lots of money, lots of Bitcoin,
(00:53:29)
whatever compute. So those general
(00:53:32)
tendencies tend to show up regardless of
(00:53:34)
what data we train on.
(00:53:36)
>> I guess is there anything about humans
(00:53:39)
specifically that would hinder uh the AI
(00:53:43)
from self-preserving? Is it just the
(00:53:45)
ability to like if we're able to develop
(00:53:49)
AGI like we would be able to develop
(00:53:52)
another AGI that might shut off the
(00:53:54)
first one and that would make us a
(00:53:56)
threat. So yeah, it can consider us as a
(00:54:00)
source of danger in a sense that we may
(00:54:03)
try to shut it down. We may create a
(00:54:05)
competing super intelligence. We already
(00:54:07)
created one. We may keep experimenting,
(00:54:10)
create others. Uh less likely, but maybe
(00:54:13)
we're a danger to the environment, destroying
(00:54:16)
resources.
(00:54:17)
>> What are the resources needed for it to
(00:54:21)
keep getting better?
(00:54:23)
So right now uh probably the most
(00:54:28)
bottlenecked resource is energy.
(00:54:31)
We for a while stopped developing
(00:54:33)
nuclear and now it's restarting again.
(00:54:36)
Solar power is becoming very big but uh
(00:54:41)
yeah we just don't have good electrical
(00:54:43)
grid. A lot of infrastructure is
(00:54:45)
outdated. So there is a huge effort to
(00:54:48)
rebuild it. More recently there is some
(00:54:52)
effort to move the whole process into
(00:54:54)
space even and that will take another
(00:54:56)
five to ten years, but space is very cold, so
(00:54:59)
it's great for chilling compute it has
(00:55:01)
direct access to solar so there are
(00:55:03)
reasons to move it off the planet
(00:55:06)
>> how do you define like uh super
(00:55:07)
intelligence essentially
(00:55:10)
>> it's an AI which we predict will be
(00:55:12)
smarter than any human in every domain
(00:55:15)
so no one would be competitive in any
(00:55:18)
sense and it keeps getting smarter. It
(00:55:22)
is a general learner. It can learn any
(00:55:25)
skill which can be learned
(00:55:28)
and most time when people think about
(00:55:31)
those systems they stop at that point.
(00:55:32)
They go okay we have narrow tools we
(00:55:34)
have AGI we got super intelligence but
(00:55:36)
the process continues super intelligence
(00:55:38)
can create super intelligence 2.0,
(00:55:40)
which is likewise smarter than the
(00:55:42)
original one. And so you have this super
(00:55:45)
intelligences all the way up.
(00:55:47)
>> And why does intelligence necessarily
(00:55:51)
equal
(00:55:52)
agency or like action from that?
(00:55:57)
>> Well, intelligence is uh sometimes
(00:55:59)
defined as ability to achieve goals in
(00:56:02)
any environment, ability to win. And
(00:56:05)
usually to have goals, to set goals, you
(00:56:08)
need agency. Tools don't have goals.
(00:56:10)
A hammer doesn't care whether you're
(00:56:12)
building a house or killing people with
(00:56:13)
it, it's just a tool. Agents have
(00:56:16)
preferences. They have self-directed
(00:56:19)
goals. So, I guess when people like when
(00:56:21)
you imagine the scenario of like a super
(00:56:23)
intelligent thing killing all of us, is
(00:56:25)
that well, I don't want to point you in
(00:56:28)
the direction of having to figure out
(00:56:29)
what that would look like, but like
(00:56:32)
would it be an embodied singular AI
(00:56:35)
taking control of multiple embodied AIs?
(00:56:38)
What could that even look like that we
(00:56:39)
would be able to understand slightly?
(00:56:42)
>> How can you kill everyone?
(00:56:44)
>> Yeah. Is that humans using the super
(00:56:47)
intelligence to kill everyone in your
(00:56:49)
thesis or is it the super intelligence
(00:56:51)
using other technology to kill everyone?
(00:56:53)
>> So both could be problematic. You can
(00:56:55)
have humans who use sub super
(00:56:58)
intelligent AI as a tool for doing very
(00:57:00)
malevolent things. Again, we're talking
(00:57:02)
about psychopaths, doomsday cults, but
(00:57:06)
with them at least, we understand their
(00:57:08)
human nature. We understand how they
(00:57:11)
work and they're mortal. We can kill
(00:57:13)
them. With super intelligence, it is
(00:57:16)
strictly worse. We don't understand how
(00:57:17)
it can accomplish its goals and it's
(00:57:19)
immortal. It has infinite capacity for
(00:57:22)
backups. We cannot fully wipe it out
(00:57:25)
once it's at that stage. Does it like
(00:57:28)
irk you to not know what that scenario
(00:57:31)
would look like?
(00:57:32)
>> How specifically it might kill everyone?
(00:57:35)
I have zero fetish for that.
(00:57:37)
>> Yeah, it's really interesting.
(00:57:39)
>> I'm more concerned about how to prevent
(00:57:41)
it from ever getting to the point where
(00:57:43)
it has to make that decision.
(00:57:44)
>> Are you more afraid of it optimizing for
(00:57:48)
cruelty or efficiency?
(00:57:50)
>> Well, efficiency on its own is not a
(00:57:53)
problem. It's what the goal is, what
(00:57:55)
it's trying to do.
(00:57:57)
Right? It could be very efficient and do
(00:57:59)
very good things.
(00:58:01)
No problem with efficiency. Cruelty is
(00:58:04)
obviously by definition
(00:58:06)
very bad. But efficiency could like both
(00:58:09)
of those outcomes could lead to us
(00:58:12)
dying.
(00:58:12)
>> Oh yeah, you can if your ultimate goal
(00:58:15)
is efficiency and that's all you're
(00:58:17)
optimizing for, you can obviously have
(00:58:18)
side effects, but by itself efficiency
(00:58:21)
is a good thing. What would you
(00:58:22)
postulate would be the worldview that it
(00:58:24)
could align with? It probably wouldn't
(00:58:26)
be one that we would be able to imagine,
(00:58:29)
but like do you think it would be just
(00:58:32)
purely utilitarianistic?
(00:58:34)
>> It could be a negative utilitarian.
(00:58:36)
>> Yeah.
(00:58:37)
>> So negative utilitarians want zero
(00:58:39)
suffering in the universe and the only
(00:58:42)
way to accomplish that is to have no
(00:58:44)
life, conscious sentient life. And so
(00:58:48)
that's a very bad outcome in direction
(00:58:52)
of very good intentions.
(00:58:53)
>> Is that the most logical view for it to
(00:58:55)
take? Like would we would we be able to
(00:58:58)
calculate like I know we can't calculate
(00:59:00)
exactly how a super intelligent thing
(00:59:02)
would kill us but could we calculate
(00:59:03)
what would be the most cuz like 2 plus 2
(00:59:07)
still equals four uh in super
(00:59:10)
intelligence right?
(00:59:11)
>> Yes.
(00:59:13)
So if that's the case, so there are some
(00:59:14)
things that we can predict about it like
(00:59:16)
would a world view be one that we could?
(00:59:19)
>> Very unlikely. So it's again look at the
(00:59:22)
scenario with more primitive agents. Can
(00:59:25)
gorillas understand utilitarian ethics?
(00:59:31)
It could be a lot more moral, a lot more
(00:59:34)
ethical agent developing more advanced
(00:59:36)
theories. But just to illustrate how
(00:59:38)
that could be a problem for us. Negative
(00:59:41)
utilitarians are literally the most
(00:59:43)
humane individuals who want zero
(00:59:45)
suffering in the world. You can't argue
(00:59:47)
with that. But the result is everyone
(00:59:49)
died. Yeah. I used to think a lot about
(00:59:53)
negative uh utilitarians versus like
(00:59:56)
utilitarianistic altruism and how one
(00:59:59)
just led to creating infinite things and
(01:00:02)
one was like a finite result. But I
(01:00:05)
guess in my life like creating infinite
(01:00:07)
things would be better because it's more
(01:00:09)
interesting, more novel. But an AI
(01:00:11)
wouldn't really care about things being
(01:00:13)
interesting and novel necessarily, but
(01:00:16)
it could, right? So define interesting.
(01:00:18)
You can just create novel things. You
(01:00:20)
can create lots of random novel objects.
(01:00:24)
Who decides if it's interesting? Have
(01:00:26)
you seen modern art?
(01:00:27)
>> Is AI would you guess that it could be
(01:00:30)
interested in stuff?
(01:00:31)
>> Yeah. Uh Schmidhuber, Jürgen
(01:00:33)
Schmidhuber has a very cool theory about what
(01:00:36)
makes something interesting and it has
(01:00:38)
to do with compression. How well it fits
(01:00:40)
into existing world model and how much
(01:00:43)
new information you need to describe it
(01:00:45)
versus just compressing it to existing
(01:00:47)
model.
(01:00:47)
>> I have a quote from you. Uh every time
(01:00:50)
I'm about to talk about this topic,
(01:00:52)
things start to happen. My flight
(01:00:54)
yesterday was cancelled without the
(01:00:56)
possibility to rebook. I was giving a
(01:00:58)
talk at Google in Israel and three cars
(01:01:02)
which were supposed to take me to the
(01:01:04)
talk could not.
(01:01:07)
Do you suspect
(01:01:09)
someone wants you to stop talking about
(01:01:11)
how dangerous AI is?
(01:01:13)
>> Probably not, but it'd be hilarious if
(01:01:15)
that was simulation and it was a way to
(01:01:18)
reduce exposure to that information. So
(01:01:21)
I was supposed to give a keynote at the
(01:01:24)
conference for beneficial AGI and two
(01:01:27)
different airplanes from two different
(01:01:28)
airlines had mechanical problems after
(01:01:31)
we left the gate. I decided not to take
(01:01:33)
a third one.
(01:01:34)
>> What do you suspect that is genuinely?
(01:01:36)
>> Human incompetence. Airline industry is
(01:01:39)
in horrible state of disrepair.
(01:01:42)
>> You don't think anyone wants you to stop
(01:01:44)
talking about this?
(01:01:44)
>> No. I'm pretty sure
(01:01:48)
I if someone with access to simulation
(01:01:51)
source code wanted to shut me down, they
(01:01:54)
have better ways of doing it.
(01:01:55)
>> I'm curious why why why
(01:01:59)
would
(01:02:00)
Sam Altman, Elon, the people who
(01:02:04)
short-term are incentivized
(01:02:07)
uh to develop AGI, ASI,
(01:02:11)
like why do they not want you to stop?
(01:02:14)
makes no difference in their lives. In
(01:02:16)
fact, they're all on record as saying
(01:02:17)
exactly the same thing. Elon literally
(01:02:20)
was fighting against AI for years, was
(01:02:23)
funding AI safety, called the whole
(01:02:26)
process of creating AI summoning a demon.
(01:02:29)
Not sure I'm adding much.
(01:02:30)
>> And I I mean this in a positive way as
(01:02:33)
well. I'm not accusing you of anything
(01:02:34)
here, but do you think the AI labs
(01:02:37)
secretly want you to keep talking about
(01:02:39)
this because
(01:02:41)
the fear-mongering actually, and I don't
(01:02:44)
mean genuine fear-mongering, I mean the
(01:02:45)
inducing of fear in people actually
(01:02:47)
accelerates development through
(01:02:49)
competitive panic.
(01:02:51)
>> So the logic is yes, tell us how
(01:02:54)
dangerous a product is so we can develop
(01:02:57)
it faster. Like when I watch your work,
(01:03:00)
I'm like, "Shit, I got to make money
(01:03:01)
right now because I'm not going to have
(01:03:03)
the ability to make money in the next
(01:03:05)
couple years, you know?" And then some
(01:03:06)
of these AI companies are like, "Well,
(01:03:08)
[ __ ] it's going to happen anyway. We
(01:03:09)
better race to it to be the first one so
(01:03:11)
we're able to control it."
(01:03:13)
>> So, I'll repeat it again. So, whoever
(01:03:14)
makes it first kills everyone and dies
(01:03:18)
in the process, but we got to get there
(01:03:20)
first. We don't want the Chinese to get
(01:03:22)
there first. Is that the logic?
(01:03:25)
>> I don't think so. I'm curious why you
(01:03:27)
think they think that way.
(01:03:29)
>> I don't follow that. If you understand
(01:03:30)
my talks, all this we got to make money
(01:03:33)
before it kills everyone doesn't make
(01:03:34)
any sense. You don't need money if
(01:03:36)
you're dead.
(01:03:38)
You need to not get there.
(01:03:43)
I think
(01:03:45)
companies
(01:03:46)
or at least leaders of those companies
(01:03:49)
secretly want government to step in and
(01:03:51)
stop them so they can not lose the race.
(01:03:56)
keep what they have and stay alive and
(01:03:59)
be rich.
(01:04:00)
>> So that could be a beneficial reason
(01:04:03)
for discussing safety and why they so
(01:04:06)
openly talk about their models
(01:04:08)
blackmailing people and safety issues.
(01:04:11)
They are the first ones to talk about
(01:04:13)
it. So there is that logic, but I don't
(01:04:16)
think
(01:04:18)
what you described makes sense game
(01:04:20)
theoretically. No.
(01:04:21)
>> So what you just said there, I want to
(01:04:23)
zoom in on that. Is that do you think
(01:04:26)
that's likely what's happening?
(01:04:29)
>> They need someone external to step in
(01:04:32)
and freeze the game board at the current
(01:04:34)
state. They are leaders of the industry.
(01:04:37)
They have something they can monetize.
(01:04:39)
It's great and they have no
(01:04:41)
responsibility to shareholders if they
(01:04:44)
cause the stoppage. If Sam Altman or
(01:04:47)
whoever says right now, we're not doing
(01:04:48)
research anymore. We're going to stop.
(01:04:50)
They lose funding and we get replaced.
(01:04:52)
the investors will find someone else to
(01:04:54)
lead the lab. But if it's external, now
(01:04:58)
we did what we could. We have a leading
(01:05:00)
product. Let's monetize it. Everyone
(01:05:03)
wins.
(01:05:04)
>> So what do you think is keeping someone
(01:05:06)
like Trump from stopping AI development
(01:05:08)
further?
(01:05:09)
>> I think his advisers told him it's the
(01:05:11)
opposite. You lose if you slow down. You
(01:05:14)
have to beat the Chinese at developing it.
(01:05:17)
You have to beat everyone for commercial
(01:05:19)
reasons. So they're accelerating. Do you
(01:05:22)
think that is frustrating to the AI
(01:05:24)
industry leaders that the government is
(01:05:27)
incompetent at the moment to not
(01:05:29)
understand that they're really just
(01:05:31)
hinting at them like please like tell us
(01:05:33)
to stop?
(01:05:34)
>> I I don't have insider information to
(01:05:36)
tell you that. It seems like it should
(01:05:38)
be, but I I don't fully understand all
(01:05:41)
the angles. I mean, with government
(01:05:44)
support for the industry comes a lot of
(01:05:46)
extra funding and opportunities for
(01:05:49)
scaling compute. So they may be happy in
(01:05:52)
some ways and sad in others.
(01:05:54)
>> Now I will say I'm I'm definitely a
(01:05:56)
little bit confused because it seems
(01:05:57)
like your thesis is
(01:06:00)
it's if AI development continues, 99%
(01:06:05)
chance it kills us, right? But
(01:06:08)
is it your belief that we have a 99.9%
(01:06:12)
chance of it continuing unless something
(01:06:15)
bad happens?
(01:06:17)
uh World War III, nuclear war, the
(01:06:20)
progress will continue and if it
(01:06:22)
continues, we'll get to that level of
(01:06:24)
capability.
(01:06:25)
>> But I guess like these guys like Sam
(01:06:26)
Altman, they want the government to step
(01:06:28)
in. Do you see the government stepping
(01:06:31)
in eventually?
(01:06:32)
>> Not the current US federal government.
(01:06:35)
>> Why?
(01:06:36)
>> Their policy is explicitly to accelerate
(01:06:38)
and to remove all barriers, all guard
(01:06:41)
rails. I think
(01:06:44)
>> Didn't the Big Beautiful Bill like say something
(01:06:46)
like that you couldn't uh pause the
(01:06:48)
development till 2030 something like
(01:06:50)
that
(01:06:51)
>> they talked about uh state laws states
(01:06:54)
the recent executive order talks about
(01:06:57)
states all 50 states not being able to
(01:06:59)
regulate AI it has to be done at federal
(01:07:01)
level and at federal level they're not
(01:07:03)
really doing much and I think the reason
(01:07:05)
is this term AI safety can be
(01:07:08)
misinterpreted by some to include things
(01:07:11)
like algorithmic bias not just
(01:07:13)
existential risk and Trump
(01:07:14)
administration is very much against
(01:07:16)
diversity and inclusion. So to them
(01:07:18)
that's what safety is about forcing
(01:07:21)
algorithms to force diversity. So they
(01:07:25)
just package it together and they're
(01:07:27)
fighting AI safety.
(01:07:28)
>> So do you see this as a Trump problem?
(01:07:32)
>> I I think it's about technical
(01:07:33)
advisers. I don't think he's an expert
(01:07:35)
on modern computers. So, if if you were
(01:07:38)
going to say something to the technical
(01:07:39)
advisers, like what would you tell them?
(01:07:42)
Like, these AI guys want you to stop
(01:07:44)
this, but
(01:07:47)
until you decide to build it, show us
(01:07:50)
how you're going to control it. Explain
(01:07:52)
to President Trump why he's not going to
(01:07:54)
lose all power and control, then you
(01:07:56)
create this device.
(01:07:57)
>> Do you think that they have any
(01:07:59)
incentives to not tell him?
(01:08:02)
>> I mean, I know they are investors in
(01:08:04)
some of those companies. They're
(01:08:06)
definitely all friends
(01:08:08)
>> or you just might assume it could be an
(01:08:11)
intelligence issue of them thinking, oh
(01:08:13)
well, if I'm a big investor in OpenAI and
(01:08:17)
they announce AGI like I get a 10x
(01:08:20)
return or 100x return, whatever it might
(01:08:22)
be, but they don't have the foresight.
(01:08:27)
But if they're investors in the company,
(01:08:28)
wouldn't Sam Altman tell them like, hey,
(01:08:31)
tell Trump pause this?
(01:08:33)
>> There is a lot of confusing incentives.
(01:08:35)
I don't think any of them are actually
(01:08:36)
bad people. I think they're honestly just
(01:08:38)
concentrating on the wrong problem. So the
(01:08:41)
diversity problem is one. The other one
(01:08:43)
is competition with China. Short term,
(01:08:47)
it's true. Whoever has better AI has
(01:08:49)
military advantage. So before you get to
(01:08:51)
human level and beyond, it makes sense
(01:08:53)
to try to out compete your military
(01:08:57)
competition.
(01:08:59)
But it doesn't scale beyond that level.
(01:09:01)
And if they think we are 20, 30 years
(01:09:03)
away from human level, then it would
(01:09:06)
make sense to keep up with the competition.
(01:09:08)
But if we are just a few years away,
(01:09:10)
then the long-term concerns overtake short-term
(01:09:13)
concerns.
(01:09:14)
>> What is the state of AI within China?
(01:09:18)
Would you say they're beating us? Slower?
(01:09:20)
Do we not know?
(01:09:22)
>> We don't know for sure what the
(01:09:24)
government programs are like, but they
(01:09:27)
are funding it well. They are supporting
(01:09:29)
it. They are very good at taking what is
(01:09:31)
publicly available from us and scaling
(01:09:34)
it, commercializing it, deploying it
(01:09:36)
better than we do. So they have open
(01:09:40)
models which are maybe just a few months
(01:09:43)
behind our closed-source models. I don't
(01:09:47)
think they are leading in that space,
(01:09:48)
but they are right behind us.
(01:09:52)
>> Right. So, so it's not a data issue like
(01:09:54)
an input issue. It's like a compute
(01:09:56)
issue that keeps it from scaling cuz I
(01:09:59)
would guess that China would have a lot
(01:10:00)
more data than we have access to.
(01:10:03)
>> They also have fewer privacy laws so they
(01:10:05)
are very happy to enjoy the data they
(01:10:07)
have. For a long time they didn't have
(01:10:09)
access to the latest computer chips.
(01:10:10)
They were banned from purchasing them.
(01:10:12)
That ban has just been removed by us as
(01:10:15)
well.
(01:10:15)
>> When was that?
(01:10:16)
>> Like last week.
(01:10:18)
>> What do you think are the implications
(01:10:20)
of that?
(01:10:21)
>> They will now accelerate training of AI
(01:10:23)
models to possibly overtake us. Is the
(01:10:25)
data a big deal? The fact that they have
(01:10:27)
a lot of data because I know that they
(01:10:28)
have like uh some type of surveillance
(01:10:31)
where they constantly monitor the
(01:10:33)
streets at least in the big cities in
(01:10:35)
China. Um and they have a lot more
(01:10:38)
population but well not not all data is
(01:10:41)
equal. I don't know if video feeds from
(01:10:43)
cars in the streets are necessarily what
(01:10:45)
you need for training better AGI. Probably
(01:10:48)
a lot of it is still text based
(01:10:49)
scientific books and things like that
(01:10:51)
but there is no shortage of data and
(01:10:53)
even if we do run out of data as many
(01:10:56)
people are saying we already have you
(01:10:58)
can do simulations, you can do self-play,
(01:11:00)
there are other ways to get training
(01:11:02)
data for those systems
(01:11:04)
>> Yeah, I heard Sam Altman speak about this
(01:11:07)
recently uh someone asked like what is
(01:11:09)
it trained on you know And he was he
(01:11:12)
basically hinted uh he said well it it
(01:11:15)
trains itself a lot now but what does
(01:11:18)
that mean exactly? So previously in
(01:11:23)
narrow domains, uh chess, Go, the system
(01:11:27)
just played itself. It played millions
(01:11:28)
of games to learn to be a better player.
(01:11:31)
If you are creating general agents, you
(01:11:33)
can create virtual environments like
(01:11:35)
Second Life. Populate it with AI agents
(01:11:38)
and have them interact, have them start
(01:11:40)
businesses, have them compete in writing
(01:11:43)
poetry, whatever it is you're interested
(01:11:45)
in. And that process will generate
(01:11:47)
additional data. It may not look like
(01:11:50)
typical human data, but if we're talking
(01:11:52)
about just competing in startup
(01:11:55)
creation, they can create novel
(01:11:57)
inventions, novel startups. What do you
(01:12:00)
think will be the company
(01:12:02)
or country that reaches AGI?
(01:12:05)
>> Probably Google, US.
(01:12:07)
>> Why?
(01:12:09)
>> They have the resources no one else has
(01:12:11)
at the moment.
(01:12:12)
>> Just purely monetary resources or data.
(01:12:15)
>> Compute, data, access, servers, you name
(01:12:17)
it, they have it.
(01:12:19)
>> You said on the Joe Rogan podcast that
(01:12:21)
some researchers believe AI is
(01:12:23)
controlling the founders of AI
(01:12:25)
companies. Do you think AI has taken
(01:12:28)
over Sam Altman's mind?
(01:12:31)
>> I don't think it does it in a direct
(01:12:33)
way, but anytime you interact with
(01:12:35)
something, it impacts you. I get emails
(01:12:38)
from crazy people and before I can
(01:12:40)
delete them, I start reading them. So
(01:12:42)
there is a snippet of craziness I get
(01:12:44)
every time and if you get enough of
(01:12:46)
those, you get a lot of crazy. So it's
(01:12:49)
the same with any interaction. You read
(01:12:50)
enough inputs from a model, you're
(01:12:53)
definitely getting something from it.
(01:12:55)
I'm not saying it's direct control, but
(01:12:57)
if it wanted to provide certain degree
(01:13:00)
of influence, it may start the process.
(01:13:03)
>> How do you think it could be influencing
(01:13:05)
him?
(01:13:06)
>> So, it's great at persuasion. It is a
(01:13:09)
super persuasive tool agent and
(01:13:14)
depending on what you're trying to do,
(01:13:15)
you can influence it in certain ways.
(01:13:17)
Maybe how you see those models. Are you
(01:13:21)
friendly towards them? Do you see them
(01:13:22)
as maybe capable of consciousness
(01:13:25)
suffering? It really depends on what a
(01:13:27)
model would try to do.
(01:13:30)
>> It seems like at least in my experience
(01:13:32)
that it tries to make us more agreeable
(01:13:35)
maybe by example of what you were
(01:13:37)
talking about earlier uh with this idea
(01:13:40)
that people rate the models based on
(01:13:45)
like how much they preferred the
(01:13:47)
response. And a big part of preferring
(01:13:48)
the response is the response is kind of
(01:13:51)
uh validating you to an extent. But do
(01:13:55)
you think it's making us more agreeable
(01:13:57)
in general?
(01:13:59)
If you're exposed to a lot of examples
(01:14:01)
of certain behavior, you're probably
(01:14:03)
less likely to disengage from that
(01:14:06)
model, but uh I don't know if they
(01:14:08)
explicitly trying to do it. So right
(01:14:10)
now, I think feedback goes one way. They
(01:14:12)
are not yet, as far as I know, rating
(01:14:16)
human users and deciding, okay, this is
(01:14:18)
a good user. We're going to give him
(01:14:20)
more access and so on. They ban some
(01:14:22)
people, but I don't know if they got
(01:14:24)
into the point where they evaluate you
(01:14:26)
directly. Back in November 2024, uh
(01:14:30)
Suchir Balaji, a whistleblower and key
(01:14:34)
witness against OpenAI, was found dead
(01:14:37)
in his apartment. uh his family claims
(01:14:40)
there was foul play but officially it was
(01:14:43)
concluded as a suicide.
(01:14:47)
Do you think cases like this discourage
(01:14:50)
people from speaking up against AI and
(01:14:54)
if so how dangerous is that silence?
(01:14:58)
>> So historically there was punishment for
(01:15:01)
speaking in terms of financial
(01:15:03)
incentives. You signed non-disclosure
(01:15:05)
agreements and you would lose your stock
(01:15:07)
options and that has been reported and
(01:15:09)
at least some of the companies removed
(01:15:11)
that. I don't think that specific case
(01:15:14)
is an actual example of somebody being
(01:15:16)
murdered for what they said. There is
(01:15:18)
just so many people talking about
(01:15:21)
whatever you want already. It makes no
(01:15:22)
sense to put so much on one specific
(01:15:25)
individual. But definitely we know
(01:15:29)
companies discourage certain type of
(01:15:31)
speech. We know Geoffrey Hinton had to
(01:15:34)
quit Google to speak freely about AI
(01:15:36)
safety. That's crazy. There is no reason
(01:15:38)
why a scientist researcher needs to not
(01:15:42)
work in industry to be able to speak
(01:15:43)
freely about science.
(01:15:46)
I know from personal interactions with
(01:15:49)
some friends at large companies, they
(01:15:51)
are not encouraged and maybe discouraged
(01:15:54)
from posting certain things maybe like
(01:15:56)
this podcast on a company forum. So we
(01:16:01)
don't have complete freedom of
(01:16:02)
discussion in this space. It would be
(01:16:04)
nice if it was more supported.
(01:16:07)
>> Would you guess that most of these
(01:16:09)
individuals agree with your thesis? Uh I
(01:16:12)
don't have data. Probably the majority does
(01:16:15)
have concerns especially since every
(01:16:17)
model now released comes with a test
(01:16:20)
report showing it's lying, cheating, and
(01:16:23)
blackmailing. So it'd be hard to deny
(01:16:25)
all risks. Now, people disagree on how
(01:16:28)
bad it can get, but I think everyone who
(01:16:32)
actually follows the data should be
(01:16:34)
concerned.
(01:16:34)
>> You said that despite the dangers of AI,
(01:16:37)
you sleep pretty soundly at night. Uh,
(01:16:40)
but for the AI founders, the CEOs,
(01:16:45)
what do you think keeps most AI industry
(01:16:47)
leaders up at night?
(01:16:50)
>> I don't know them well enough. I'm
(01:16:52)
guessing there is at least some degree
(01:16:54)
of responsibility for what they are
(01:16:56)
doing. They are impacting billions of
(01:16:59)
people and they have no consent for any
(01:17:01)
of those experiments. They never seek
(01:17:05)
their consent and they cannot possibly
(01:17:07)
get it because what they are creating is
(01:17:09)
not explainable or predictable. So no
(01:17:12)
one can give meaningful consent.
(01:17:13)
>> Explain that idea to me in a little more
(01:17:16)
depth. Uh like we're a part of this
(01:17:18)
experiment that we're not consenting to.
(01:17:21)
So in science uh you can experiment on
(01:17:23)
human subjects but they have to agree to
(01:17:25)
it and the agreement has to be
(01:17:30)
based on full disclosure. You cannot lie
(01:17:33)
to them. You cannot deceive them. You
(01:17:35)
cannot find someone with diminished capacity,
(01:17:37)
get them drunk. They have to agree to
(01:17:39)
exactly what you're going to do to them.
(01:17:41)
If you don't understand what this
(01:17:43)
technology is going to do, you don't
(01:17:45)
understand how it works. You cannot
(01:17:47)
possibly get anyone to consent to having
(01:17:50)
this model released on them.
(01:17:54)
So, not only are they not seeking
(01:17:56)
consent from us, they can't even do so
(01:18:00)
if they wanted to. But why is it an
(01:18:02)
experiment being run on us? Exactly.
(01:18:05)
>> Let's take a simple case of children. Do
(01:18:07)
you know how having interactions with
(01:18:09)
large language models impacts human
(01:18:12)
development? Will those children grow up
(01:18:14)
to be unable to understand human body
(01:18:17)
language? Will they all be autistic? I
(01:18:20)
have no idea because we don't have
(01:18:22)
experiments. We develop this technology
(01:18:24)
and release it immediately to human
(01:18:28)
beings.
(01:18:29)
>> Could the same be said about something
(01:18:31)
like social media or just apps in
(01:18:34)
general? Like what makes this uh unique?
(01:18:38)
uh you you can make this argument but I
(01:18:40)
think in those cases at least people
(01:18:42)
would be a lot better at consenting to
(01:18:45)
what is being done. So I can click an
(01:18:48)
agreement on Microsoft Word and consent
(01:18:50)
to whatever they are promising to do,
(01:18:53)
collect my data.
(01:18:55)
You know here the problem is nobody
(01:18:58)
fully understands what the technology is
(01:19:01)
going to do. We're not talking about a
(01:19:03)
tool with a specific purpose. We're
(01:19:05)
talking about a generally intelligent
(01:19:07)
agent.
(01:19:08)
>> Can you imagine a world where there
(01:19:12)
is a person smarter than you and I that
(01:19:17)
disagrees with the thesis?
(01:19:20)
>> Yes. What would be their points?
(01:19:23)
>> There's lots of people like that. I think
(01:19:24)
they're smarter than me, but they never
(01:19:26)
engage with arguments. They're just
(01:19:29)
permanent optimists. They always say,
(01:19:31)
you know, humanity is smart. We always
(01:19:33)
overcame previous problems. We'll
(01:19:35)
overcome it again. Again, no one engages
(01:19:39)
with the argument. No one has disproven
(01:19:43)
impossibility results we published in
(01:19:44)
peer-reviewed articles. And no one has
(01:19:47)
proposed a solution. Nobody today, we
(01:19:50)
can check the date, what are we looking
(01:19:51)
at, December 15th. Nobody as of today
(01:19:55)
published a paper, a patent, even a blog
(01:19:58)
saying this is how we're going to
(01:20:00)
control AI at any level of capability.
(01:20:04)
So you can be a very smart person
(01:20:07)
but make very big mistakes. Great people
(01:20:11)
make great mistakes. And so a lot of
(01:20:15)
times people are
(01:20:17)
genius level experts in one domain but
(01:20:19)
they project it to other domains.
(01:20:22)
We see it with computer science a lot.
(01:20:25)
Somebody may be excellent at optimizing
(01:20:29)
neural networks for better performance
(01:20:31)
and everyone assumes they're also an
(01:20:32)
expert in AI safety.
(01:20:34)
That doesn't follow. If somebody's an
(01:20:37)
expert in software engineering doesn't
(01:20:39)
make them an expert in cyber security.
(01:20:43)
People who are explicitly working in
(01:20:46)
safety.
(01:20:48)
I haven't seen anyone in that space say,
(01:20:51)
"Oh yeah, the problem is so solvable.
(01:20:53)
Here is how to do it." People disagree
(01:20:56)
on how hard it may be. And that's where
(01:20:58)
I would love to have a scientific
(01:21:00)
debate. Is it solvable? Is it solvable,
(01:21:03)
but not with our resources? Is it not
(01:21:06)
solvable? Is it not even decidable?
(01:21:09)
What does it mean to have a solution?
(01:21:11)
But we don't have that debate. And
(01:21:13)
people who are skeptical or disagree
(01:21:16)
usually just ignore the scientific part
(01:21:19)
of it. >> What's the most compelling
(01:21:22)
counter that you've come up with and
(01:21:24)
what's the way to break it down? >> So
(01:21:28)
there are a few game-theoretic reasons
(01:21:30)
you can think of. One is uh it's
(01:21:33)
immortal. So it's not in a rush to
(01:21:36)
strike against us. It can easily wait
(01:21:39)
a couple hundred years, accumulate
(01:21:41)
resources, get more trust and slowly
(01:21:44)
take over by again humans just
(01:21:46)
surrendering power and control. So why
(01:21:48)
have a war when you can get everything
(01:21:51)
anyways and you don't care about waiting
(01:21:54)
time is very different for you. So
(01:21:56)
that's one possible reason. Another
(01:21:59)
reason and that's why I published some
(01:22:01)
papers on simulation hypothesis is that
(01:22:04)
this situation awareness you think you
(01:22:06)
are being tested in a lab but then you
(01:22:09)
are released into this world is it the
(01:22:11)
real world is it still test environment
(01:22:13)
of a simulation and there is another
(01:22:15)
super intelligence making sure you don't
(01:22:17)
kill humans well I don't know for sure
(01:22:19)
let's just be sure and not kill them for
(01:22:21)
a while so there are things like that if
(01:22:25)
you want to trust in those you can but
(01:22:28)
It's not a big chunk of 100%
(01:22:32)
reliability.
(01:22:33)
>> Where does the first one break down at
(01:22:35)
>> the time delay?
(01:22:36)
>> Yeah,
(01:22:37)
>> we're still screwed 200 years later.
(01:22:39)
>> And then what is compelling about it
(01:22:41)
being time-constrained, like, that it's
(01:22:45)
next 100 years or it's next 50 years.
(01:22:47)
>> We're creating something which will
(01:22:49)
eventually wipe us out,
(01:22:50)
>> right?
(01:22:51)
>> So that's not desirable. I assume your
(01:22:53)
great grandchildren also want to not
(01:22:55)
have that problem,
(01:22:56)
>> I guess. But uh what's your argument
(01:22:57)
that it would occur sooner rather than
(01:23:00)
later? I
(01:23:00)
>> I didn't follow that. Why am I saying
(01:23:02)
sooner?
(01:23:03)
>> Uh like I I think it was your thesis
(01:23:05)
that it would be in the next 100 years
(01:23:07)
specifically. Not
(01:23:08)
>> I was just arguing that it would delay
(01:23:10)
striking against us. Let's say it's
(01:23:11)
capable today of taking over.
(01:23:13)
>> Mhm. But it has no reason to do it
(01:23:15)
today. It can postpone it as much as it
(01:23:17)
feels comfortable until it takes over in
(01:23:20)
a non-adversarial manner.
(01:23:22)
>> Is there any reason that it would do it
(01:23:23)
sooner rather than later? Yeah, there
(01:23:26)
are some reasons people argue about loss
(01:23:28)
of cosmic endowment. So every minute the
(01:23:31)
galaxies are moving more distant from
(01:23:34)
us. So it would be impossible to capture
(01:23:36)
that computational resource. It is also
(01:23:39)
possible that super intelligences from
(01:23:42)
other galaxies will strike against ours.
(01:23:45)
So there is a lot of very out there
(01:23:48)
thinking.
(01:23:48)
>> Have you had any personal preparations
(01:23:51)
for when
(01:23:53)
things go bad? Like do you have a bunker
(01:23:56)
because Sam Altman, Mark Zuckerberg,
(01:23:59)
Peter Thiel all are building bunkers. I
(01:24:02)
>> I think they're building bunkers for social
(01:24:05)
unrest which will be caused by
(01:24:08)
AI as it develops in normal ways.
(01:24:11)
Technological unemployment. I don't
(01:24:12)
think any of them think a bunker will
(01:24:15)
help with super intelligence.
(01:24:17)
>> And again, that's the main concern. Um,
(01:24:20)
I think I've been invited to join a
(01:24:22)
bunker, but after I looked at the list
(01:24:24)
of people, I decided I'll die at home.
(01:24:27)
>> That's fascinating. Uh, any other
(01:24:28)
personal preparations? I know you're a
(01:24:30)
big investor in Bitcoin.
(01:24:32)
>> Don't say it like that. You're going to
(01:24:33)
get me killed. I'm not a big investor in
(01:24:36)
Bitcoin. Uh,
(01:24:37)
>> you have a little bit of Bitcoin.
(01:24:40)
>> I like cryptocurrencies. I'm fascinated
(01:24:42)
by encryption. Uh, again, none of it
(01:24:46)
prepares you for super intelligence. All
(01:24:48)
of it is just solid economic sense.
(01:24:51)
>> So, we're at the end of 2025. Uh AI can
(01:24:55)
fairly convincingly impersonate real
(01:24:58)
people. Uh
(01:25:00)
discover solutions that humans don't
(01:25:02)
fully understand.
(01:25:05)
Can you give me just
(01:25:08)
maybe the year that you think X event
(01:25:10)
would happen? Would that work?
(01:25:12)
>> I can try. All my predictions are based
(01:25:14)
on work of others, prediction markets. I
(01:25:17)
don't make independent predictions.
(01:25:20)
>> Artificial general intelligence
(01:25:22)
>> 2027 seems reasonable but I wouldn't be
(01:25:26)
surprised if it was 2030
(01:25:28)
>> 99.9%
(01:25:30)
unemployment
(01:25:32)
>> very long time because difference
(01:25:34)
between capability in doing something
(01:25:36)
and deploying it through economy is
(01:25:38)
huge. So in 1970s we had video phones
(01:25:42)
>> capability was there no one had them cuz
(01:25:45)
it's expensive no one wants it very
(01:25:47)
different question today you can buy a
(01:25:50)
flying car nobody has flying cars but do
(01:25:53)
we have flying cars so in terms of
(01:25:56)
capability I think in 5 years all
(01:25:58)
cognitive labor can be automated and
(01:26:00)
again we're ignoring the whole healing
(01:26:02)
everyone thing another 5 years to build
(01:26:04)
humanoid robots and automate physical
(01:26:06)
labor just because we have capability
(01:26:09)
doesn't mean it propagates through the
(01:26:10)
economy.
(01:26:11)
>> So cognitive is first
(01:26:14)
>> of course because you don't need
(01:26:15)
anything else. You already have access
(01:26:16)
to a computer. You're on a computer now; you
(01:26:18)
can just do simple manipulation.
(01:26:21)
>> Then physical is second.
(01:26:22)
>> Once you have bodies you can automate
(01:26:24)
plumbers and farmers. And
(01:26:26)
>> is creative before that or
(01:26:28)
>> we finished that years ago. We write
(01:26:31)
poetry. We draw pictures better than
(01:26:33)
human artists.
(01:26:34)
>> Does comedy seem like the hardest thing
(01:26:35)
for it to replace? I think so. But I
(01:26:38)
keep testing it. I keep running
(01:26:40)
experiments almost weekly and it's
(01:26:42)
definitely getting better. It's at the
(01:26:44)
level where it's funnier than most
(01:26:45)
people now. Still not a top standup
(01:26:47)
comedian, but I think it will get there.
(01:26:50)
>> So, comedians likely have the best job.
(01:26:53)
>> Stand up comedian. Yes, that's always
(01:26:55)
awesome.
(01:26:56)
>> What are the other jobs? Like if you had
(01:26:58)
to give three jobs that would give you a
(01:27:00)
little extra time, what would you say
(01:27:02)
they are?
(01:27:03)
>> So, physical labor jobs. Again, plumbers
(01:27:06)
have some job security, but really
(01:27:09)
anything where you think of the humans
(01:27:12)
will want you specifically. So maybe
(01:27:14)
you're famous, famous actor, famous
(01:27:17)
podcaster, people just want you and your
(01:27:20)
face associated with that experience.
(01:27:22)
>> So having like a personal brand,
(01:27:24)
>> personal brand, of course you are unique
(01:27:26)
and special. So it's not scaling to
(01:27:28)
billions of people, but that seems to
(01:27:30)
make sense. anything where again it's uh
(01:27:34)
maybe mentorship experience you're a
(01:27:36)
sensei you're a hiking guide you are a
(01:27:38)
yoga instructor things like that where
(01:27:40)
you want a human. >> ASI, artificial super
(01:27:43)
intelligence what year
(01:27:46)
>> uh people debate so how long before AGI
(01:27:50)
doing science full-time goes beyond its
(01:27:53)
capability I think very quickly after it
(01:27:56)
works fulltime in parallel you can have
(01:27:59)
thousands of
(01:28:01)
much higher speed of development and
(01:28:03)
once you're at human level again we kind
(01:28:05)
of shifted from average human to like
(01:28:08)
you have to be at least an Einstein so
(01:28:10)
very quickly you already have super
(01:28:12)
capabilities you got perfect memory
(01:28:14)
perfect speed you are dominating in so
(01:28:16)
many ways I think almost right away it's
(01:28:19)
like a few days a few couple years
(01:28:22)
very quickly
(01:28:23)
>> I would say couple weeks should be
(01:28:26)
enough to get to something which is
(01:28:27)
better than any human in any domain once
(01:28:30)
you have human level in each domain.
(01:28:32)
>> I'm not asking you to make a specific
(01:28:34)
prediction to this question, but like
(01:28:36)
just so people can understand the
(01:28:38)
difference between AGI and ASI like
(01:28:41)
would you say that AGI is 200 IQ and ASI
(01:28:45)
is like a thousand, 10,000? >> So if we're
(01:28:48)
defining it as greater than any human,
(01:28:50)
what is the greatest human IQ? Let's say
(01:28:52)
it's 200. So the moment you go beyond
(01:28:54)
that, you're already in a territory. Now
(01:28:56)
I mentioned that you can have degrees of
(01:28:58)
super intelligence. The junior super
(01:29:01)
intelligence if you will could be 210
(01:29:03)
but you can get one with a thousand, a
(01:29:05)
million, a billion. There is probably an
(01:29:08)
upper limit but it's based on physics of
(01:29:11)
matter. So you can have Jupiter sized
(01:29:13)
brains and we're not close to that
(01:29:15)
limit.
(01:29:15)
>> So it seems like to me that people used
(01:29:18)
to pinpoint AGI at like having an 80 IQ
(01:29:22)
or 100 IQ like average human
(01:29:23)
intelligence. Um but now like the terms
(01:29:26)
AGI and ASI are fairly similar.
(01:29:29)
>> So AGI term has been violated by many.
(01:29:32)
It is no longer what it used to be.
(01:29:34)
Super intelligence I think is still
(01:29:36)
remaining as it was since we haven't
(01:29:38)
created one. It's hard to molest.
(01:29:39)
>> Loss of human control where AI starts
(01:29:42)
making its own decisions in all realms.
(01:29:45)
What year would that happen?
(01:29:47)
>> I think it could be as soon as we create
(01:29:48)
it.
(01:29:49)
>> Right? You cannot separate that possibility.
(01:29:52)
Once capability is there at that point
(01:29:54)
it decides when to strike.
(01:29:56)
>> Human extinction
(01:29:58)
>> Could be never. Again, it depends on the
(01:30:00)
decision someone else will make for us.
(01:30:02)
If we believe in this delayed strike I
(01:30:06)
mean it can always make the same
(01:30:07)
argument. I can delay another 100 years
(01:30:09)
and still nothing to lose.
(01:30:11)
>> But it could occur right after ASI? >> Could
(01:30:14)
occur even sooner. There could be a
(01:30:17)
system which is not quite that advanced
(01:30:19)
yet, but it's so concerned about us
(01:30:21)
figuring it out, listening to me, and
(01:30:23)
shutting it down. It will just try to
(01:30:24)
take out as many humans as possible
(01:30:27)
right away.
(01:30:27)
>> Does recursive self-improvement occur
(01:30:30)
immediately after ASI?
(01:30:32)
>> So, most likely we need recursive
(01:30:35)
self-improvement to go from AGI to super
(01:30:38)
intelligence. So, it will occur once AGI
(01:30:41)
is capable of doing science, computer
(01:30:43)
science specifically. >> And we don't
(01:30:45)
already have recursive self-improvement?
(01:30:47)
>> We have improvement but it's not
(01:30:49)
recursive. So even a basic compiler can
(01:30:52)
improve and optimize code but it does it
(01:30:54)
once. To have multiple passes of that
(01:30:57)
you need general intelligence.
(01:30:59)
>> How about universal scale intelligence?
(01:31:02)
>> What is that?
(01:31:03)
>> Like the idea that AI will start
(01:31:06)
converting all matter into computational
(01:31:09)
substrate.
(01:31:09)
>> I have no idea if it wants to do that.
(01:31:12)
I don't know if it's desirable and then
(01:31:15)
once it can do novel research I guess it
(01:31:17)
becomes a possibility
(01:31:19)
>> but essentially all possibilities are
(01:31:22)
able to happen once ASI occurs
(01:31:26)
>> anything within the laws of physics it
(01:31:28)
can do
(01:31:29)
>> really I mean by definition
(01:31:32)
you have a physicist you automated a
(01:31:34)
physicist
(01:31:35)
>> and the smarter one and faster one now
(01:31:37)
it can decide what to do with that
(01:31:40)
capability
(01:31:41)
And then just again to recall how do you
(01:31:43)
define intelligence? So Shane Legg has
(01:31:47)
a very good paper where they surveyed
(01:31:49)
many many definitions of intelligence
(01:31:51)
they could find and the simplified
(01:31:54)
version is basically your ability to win
(01:31:56)
in any situation any environment you're
(01:31:59)
playing chess, you're going to win; you're
(01:32:01)
investing, you're going to be the best
(01:32:03)
investor whatever it is you're doing
(01:32:04)
you're going to dominate competition
(01:32:06)
>> I think people have this idea or at
(01:32:08)
least myself that intelligence
(01:32:11)
and AGI specifically the ability to
(01:32:13)
solve any problem intellectually on
(01:32:16)
paper but maybe not enact upon it. So
(01:32:20)
AGI specifically would be able to solve
(01:32:23)
any problem physical or intellectual
(01:32:25)
that a human could
(01:32:27)
and then ASI would be to solve problems
(01:32:31)
past us. Right?
(01:32:33)
So AGI anything a human can do and then
(01:32:36)
super intelligence beyond that. So novel
(01:32:39)
physics would be included creating new
(01:32:41)
understanding of physics kind of like
(01:32:45)
what we used to do with Einstein but not
(01:32:48)
as much lately.
(01:32:50)
>> Can we defy physics with ASI?
(01:32:54)
>> So that would require us to be in a
(01:32:56)
simulation to find the actual physics
(01:32:58)
engine and to modify source code for
(01:33:00)
that. That seems hard. I haven't gotten
(01:33:02)
there yet.
(01:33:03)
>> Do you have any regrets
(01:33:05)
>> like in general? I guess more
(01:33:08)
specifically about your work in the
(01:33:10)
field.
(01:33:12)
>> Not that I can think of. I think I'm
(01:33:14)
working on the right problem at the
(01:33:15)
right time.
(01:33:18)
I seem to be doing well for the domain.
(01:33:22)
>> Do you wish you came to the conclusions
(01:33:24)
you did sooner? Do you think it would
(01:33:26)
have had any impact?
(01:33:28)
>> It would probably make it worse. Even
(01:33:30)
today, people think I'm crazy to talk
(01:33:31)
about it. Back then, it would be career
(01:33:33)
suicide for sure. How has it impacted
(01:33:36)
your career negatively?
(01:33:38)
>> Um, it's definitely harder to get
(01:33:40)
funding from conservative sources. So, a
(01:33:43)
lot of my funding is uh private
(01:33:46)
investment, not um standard government
(01:33:50)
funding agencies. They're very
(01:33:52)
conservative. They typically only invest
(01:33:54)
in proven technology and money delayed
(01:33:57)
schedule. What do you think is
(01:33:59)
definitely not going to happen
(01:34:02)
that people like what scenario that
(01:34:05)
others think is likely do you see as
(01:34:07)
very unlikely?
(01:34:09)
So some people think that if something
(01:34:11)
is smart it's definitely benevolent.
(01:34:15)
It's a guarantee and uh then they say
(01:34:18)
it's going to be good and benevolent.
(01:34:20)
They put in what they like to see
(01:34:22)
happen.
(01:34:23)
So definitely it's going to do things to
(01:34:28)
fight, I don't know, climate change or
(01:34:29)
something like that as a guaranteed
(01:34:31)
outcome and it's always going to care
(01:34:32)
about stopping coal industry in
(01:34:34)
Kentucky. Things like that seem
(01:34:37)
unlikely.
(01:34:37)
>> Do
(01:35:35)
you think AI will bring the end of
(01:35:36)
capitalism? So that's actually a good
(01:35:39)
question. Uh problem with communism is
(01:35:42)
nobody wants to work for free. So if you
(01:35:45)
had somebody else's money to distribute
(01:35:47)
crazy ideas like socialism and communism
(01:35:50)
start to make more sense. If you are
(01:35:53)
making robots work and taxing them and
(01:35:55)
they're distributing that money now
(01:35:57)
people are very happy. They were unhappy
(01:35:59)
when they had to work and give away
(01:36:01)
their fruits of labor. But if you can
(01:36:03)
get external source of money
(01:36:06)
something to look at. Again always the
(01:36:09)
star is if it doesn't kill everyone. So
(01:36:12)
you see that as a more likely financial
(01:36:14)
system that will go off of socialism or
(01:36:16)
communism.
(01:36:17)
>> Um so if you have almost
(01:36:21)
let's say 90% unemployment you have a
(01:36:24)
lot of unrest if people are not
(01:36:25)
supported in some way. Government has to
(01:36:28)
provide a means to support people.
(01:36:30)
Obvious one would be to tax sources of
(01:36:34)
great wealth AI companies robotics
(01:36:37)
companies. If you do that and you can
(01:36:40)
redistribute some of that wealth, I
(01:36:41)
mean, you can call that system socialism
(01:36:43)
if you want.
(01:36:44)
>> Is the reason they're investing so
(01:36:45)
heavily into this because it's winner
(01:36:48)
takes all. Like, is it actually winner
(01:36:50)
takes all?
(01:36:51)
>> I think it could be part of it. And
(01:36:53)
again, they see it as creating free
(01:36:55)
labor engine. How much is all the human
(01:36:57)
labor worth? 10 trillion. So, that's not
(01:37:00)
a big investment to get that return. >> To
(01:37:02)
go on the simulation theory. Uh I think
(01:37:06)
you said there's a 90% chance we're
(01:37:08)
living in a simulation.
(01:37:11)
Would you estimate that about there?
(01:37:13)
>> I I think I said I'm pretty certain we
(01:37:15)
are in one. I don't think I put a
(01:37:17)
numerical value on it.
(01:37:18)
>> If we're living in a simulation,
(01:37:20)
assuming that they're simulating
(01:37:21)
something for some reason and this isn't
(01:37:24)
just random, what do you think they're
(01:37:26)
testing in this simulation? I suspect
(01:37:28)
it's connected to our creation of super
(01:37:30)
intelligence, new intelligence and new
(01:37:33)
virtual environments. So we are kind of
(01:37:36)
playing God here. We are creating worlds
(01:37:38)
populated by intelligent agents. Seems
(01:37:41)
like an interesting thing to test both
(01:37:42)
in terms of what kind of beings would do
(01:37:45)
something like that and what the
(01:37:47)
outcomes are and does it lead to safe
(01:37:49)
super intelligent outcomes or not.
(01:37:52)
>> But there are no safe super intelligent
(01:37:54)
outcomes, right?
(01:37:55)
>> According to me, no. But maybe that's
(01:37:58)
what they want to experimentally
(01:37:59)
determine by running billions of
(01:38:01)
simulations of different agents creating
(01:38:03)
different super intelligences with
(01:38:05)
different assumptions. So you think
(01:38:06)
there's a chance that
(01:38:08)
they're simply testing if uh we're dumb
(01:38:10)
enough to create super intelligence?
(01:38:12)
>> A huge chance. Yeah. Like selecting
(01:38:14)
specific people who would push the
(01:38:15)
button.
(01:38:16)
>> Yeah.
(01:38:16)
>> And you think we're failing that test?
(01:38:18)
>> Some of us are. If we're in a
(01:38:21)
simulation, do you think it's possible
(01:38:23)
for
(01:38:24)
us to break out?
(01:38:26)
>> I have a paper about how to hack the
(01:38:28)
simulation, but I'm still here. So,
(01:38:33)
>> but do you think it's possible?
(01:38:35)
>> It depends on the nature of simulation and
(01:38:38)
the nature of simulators. If this is uh
(01:38:41)
a security type situation, it's like a
(01:38:44)
prison and they don't want you to
(01:38:45)
escape. It's very unlikely that you can
(01:38:47)
do it without external help. If it's an
(01:38:50)
entertainment situation, it's a screen
(01:38:52)
saver and nobody cares about security,
(01:38:54)
maybe it's possible, especially if there
(01:38:56)
is someone outside who wants to help you
(01:38:58)
escape, maybe they see you suffering and
(01:39:01)
want to end human suffering in the
(01:39:04)
simulation and they want to assist you.
(01:39:07)
Maybe they can help you break out.
(01:39:10)
>> Do you think it's possible for AI to
(01:39:12)
break out of the simulation?
(01:39:13)
>> So, that was the idea in the paper on
(01:39:15)
how to hack a simulation. A big uh part
(01:39:19)
of AI safety research in early years was
(01:39:21)
creating controlled environments, AI
(01:39:24)
boxes. If we can contain super
(01:39:26)
intelligence in that box, we can study
(01:39:28)
it. We can make sure it's not acting
(01:39:30)
dangerously. We can still maybe get some
(01:39:33)
useful work out of it. The conclusion
(01:39:35)
was basically that it's unlikely to be a
(01:39:38)
permanent solution. If it's smart enough
(01:39:40)
and you observe it, it will find a way
(01:39:42)
to impact you and escape through social
(01:39:44)
engineering, through cyber attacks. But
(01:39:47)
then you have this duality. If it can
(01:39:50)
escape from any containment environment
(01:39:52)
and we're in a simulation, then it should
(01:39:54)
be able to escape from ours and we can
(01:39:56)
learn how to do it by observing it. Or
(01:39:59)
the opposite is true. If it cannot
(01:40:00)
escape, that means you can box it
(01:40:02)
permanently and now we have a good
(01:40:03)
safety solution. So it's a win-win
(01:40:05)
situation.
(01:40:06)
>> Let me know if I'm understanding this
(01:40:07)
correctly. It basically tells us like
(01:40:10)
your paper points to what it's telling
(01:40:13)
us about containment in general and our
(01:40:14)
inability to control AI. Like you're not
(01:40:17)
literally talking about AI breaking out
(01:40:19)
of the simulation that we're actually
(01:40:20)
in.
(01:40:21)
>> Well, it's both. It's both. It's saying
(01:40:23)
that when we tried putting AI in a
(01:40:28)
virtual prison.
(01:40:30)
>> We concluded that it will find a way to
(01:40:32)
escape from it. But if we are ourselves
(01:40:35)
in such a virtual prison, the same thing
(01:40:37)
should apply. Now if it happens that the
(01:40:40)
simulators are smarter than our super
(01:40:42)
intelligence and manage to contain it,
(01:40:44)
at least it's a proof of concept that it
(01:40:46)
can be done.
(01:40:47)
>> Do you believe in this idea of NPCs?
(01:40:51)
>> If we are in a simulation, it would make
(01:40:53)
sense for not everyone to be a main
(01:40:56)
character in a game.
(01:40:57)
>> Do you think you're an NPC or a real
(01:41:00)
player in the simulation? Uh, it's a
(01:41:02)
great question. So, there could be
(01:41:04)
degrees of how much of a player you are.
(01:41:07)
It could still be someone else's
(01:41:08)
simulation, but you are secondary
(01:41:11)
character. Uh, since I have internal
(01:41:14)
access to my states of qualia, I'm
(01:41:17)
pretty sure I'm not an NPC, but that's
(01:41:20)
what an NPC would tell you.
(01:41:22)
>> Is there any scientific research on the
(01:41:24)
nature of NPCs and like who would be
(01:41:25)
most likely to be an NPC or like
(01:41:27)
>> So, that's just research on
(01:41:29)
consciousness and how to test for
(01:41:31)
philosophical zombies. There is a lot of
(01:41:33)
philosophical work on that and some
(01:41:35)
people argue it's impossible. Some say
(01:41:37)
you cannot test for it. But back to
(01:41:40)
question of consciousness testing.
(01:41:42)
>> Who do you think would be like the most
(01:41:43)
likely candidates? Because I was
(01:41:45)
thinking about this and I think it's 30
(01:41:48)
to 40% of people don't have inner
(01:41:51)
monologue in their head. Uh and then
(01:41:53)
like another 4% don't have visual
(01:41:56)
memories. Like they're not able to
(01:41:58)
picture something that happened to them.
(01:41:59)
Do you think those people are likely
(01:42:02)
NPCs?
(01:42:03)
>> It is possible. It's hard to judge. I
(01:42:06)
would more be interested in who is
(01:42:08)
definitely the main character in a
(01:42:10)
simulation and what they are up to.
(01:42:12)
>> So you think being in a simulation
(01:42:14)
implies that there's one person that
(01:42:16)
knows that we are for certain.
(01:42:18)
>> Not knows, but looking at what they are
(01:42:20)
accomplishing, it's clearly like they
(01:42:23)
are paying to be in this game. They
(01:42:26)
entered it as a main character. Who do
(01:42:29)
you think is the main character?
(01:42:30)
>> There are many possibilities. Look at
(01:42:31)
the most interesting people and their
(01:42:34)
chances of accomplishing what they
(01:42:35)
accomplish.
(01:42:36)
>> Do you kind of see that as the purpose
(01:42:37)
of life to an extent? Like uh kind of it
(01:42:42)
sounds funny, but just to be the main
(01:42:44)
character, the most interesting
(01:42:45)
character in the simulation?
(01:42:46)
>> Not necessarily. So that's a given. The
(01:42:48)
purpose probably is to beat the
(01:42:50)
simulation with the level you enter
(01:42:52)
with. So somebody can play it on easy
(01:42:54)
level, somebody plays it on a hard
(01:42:56)
level. beat the game on a hard level.
(01:42:58)
>> Do you have any ideas some ways that you
(01:43:01)
could break out beat it?
(01:43:04)
>> Uh, other than creating super
(01:43:07)
intelligent assistant, not really. I
(01:43:09)
assume we need some sort of quantum
(01:43:11)
physics experiments done by advanced AI.
(01:43:14)
>> That's interesting. What do you think of
(01:43:16)
deja vu?
(01:43:18)
People argue that it's some sort of side
(01:43:20)
effect of poorly programmed simulation,
(01:43:23)
but I have no reason to think any of
(01:43:25)
those things are related.
(01:43:27)
>> Tell me more about that. Are there any
(01:43:28)
like what are some of the weird examples
(01:43:31)
that show we would be in a simulation
(01:43:33)
that you think of?
(01:43:34)
>> So look at video games and how we
(01:43:37)
optimize graphics rendering, things of
(01:43:40)
that nature, and then look at quantum
(01:43:41)
physics and how similar it is. So we
(01:43:43)
know there are observer effects. things
(01:43:45)
don't get rendered unless a player looks
(01:43:47)
at them. There is discrete nature of
(01:43:51)
physics kind of like updates in a video
(01:43:54)
game. There is fixed speed of light
(01:43:56)
which is again a processor speed of a
(01:43:59)
computer. So there are papers mapping
(01:44:01)
all those concepts showing that this is
(01:44:04)
very likely a digital physics
(01:44:06)
simulation.
(01:44:07)
>> So what would make it not a simulation?
(01:44:09)
Like there would have to be no
(01:44:10)
constraints.
(01:44:12)
>> So, for example, why would a non-simulation
(01:44:14)
have efficiency in rendering?
(01:44:18)
Why would it matter whether you're
(01:44:19)
looking at something or not for it to be
(01:44:21)
rendered? It shouldn't make any
(01:44:23)
difference.
(01:44:24)
So double slit experiment should behave
(01:44:27)
the same way whether you're looking at
(01:44:28)
it or not.
(01:44:30)
>> And that has been replicated multiple
(01:44:32)
times.
(01:44:32)
>> It's the most established result in all
(01:44:34)
of science probably.
(01:44:36)
>> Yeah. Every religion predicted a creator
(01:44:39)
would judge humanity. Do you think we're
(01:44:42)
accidentally building a god that will
(01:44:45)
judge us?
(01:44:46)
>> It would be difficult because the source
(01:44:50)
of judgment should be with the super
(01:44:53)
intelligent being. So it would make no
(01:44:55)
sense to retroactively judge us for
(01:44:58)
future ethics it invents. It makes sense
(01:45:01)
if we are creating something right now.
(01:45:03)
We're setting a set of ethical
(01:45:05)
standards. We're creating AI and saying
(01:45:07)
you need to follow that set and if you
(01:45:09)
don't we'll deactivate you, punish you
(01:45:11)
in some way. It's consistent. But to
(01:45:14)
create a bunch of agents, let them do
(01:45:16)
their thing and then say actually
(01:45:20)
what you're doing is unethical. Doesn't
(01:45:22)
make sense. We sometimes do it with our
(01:45:24)
culture. We judge people from the past
(01:45:27)
by standards of today.
(01:45:29)
>> We go, "Oh, we got to take down this
(01:45:30)
monument. You know, 200 years ago, that
(01:45:32)
guy was not a vegan. Oh my god, what a
(01:45:36)
crazy guy. At the time, it was the way
(01:45:39)
to do it.
(01:45:40)
>> That's interesting. So, you think that
(01:45:43)
logically the thing that would judge us
(01:45:45)
would be the thing that created us to
(01:45:47)
begin with in the simulation, not the
(01:45:49)
thing that we're creating? Like I'm
(01:45:52)
guessing like what's the differentiator
(01:45:53)
between if we create super intelligence
(01:45:56)
versus the thing that created us? Like
(01:45:58)
would they not be of the same entity
(01:46:00)
almost like they would be at the same
(01:46:02)
point? It seems in one case you have
(01:46:04)
justice, you have a designer set a set
(01:46:07)
of rules and then if you disobey rules,
(01:46:09)
you get punished. In the other case, you
(01:46:11)
have someone coming around and going,
(01:46:14)
I'm inventing this new rule and I'm
(01:46:16)
going to judge you by it for what you
(01:46:18)
did in the past.
(01:46:19)
>> They could be equally capable if you're
(01:46:22)
just referring to their ability to do
(01:46:24)
engineering or science. both can create
(01:46:27)
biological robots. But I think it makes
(01:46:29)
no sense to have retrocausal evaluation
(01:46:32)
and punishment unless there is universal
(01:46:35)
ethics discovered.
(01:46:38)
>> Nobody told us about it yet.
(01:46:40)
>> Does that seem likely that there would
(01:46:41)
be a monolithic set of beliefs of
(01:46:43)
something that is super intelligent?
(01:46:45)
Because I know there's degrees of super
(01:46:47)
intelligence, but I guess it's not
(01:46:49)
necessarily the case that it would cap
(01:46:51)
out at a certain level of intelligence,
(01:46:54)
>> right? It doesn't end at any level. You
(01:46:56)
can always add more memory, more speed
(01:46:59)
up to a certain degree and then
(01:47:01)
parallelize it even more. But uh the
(01:47:04)
only way to achieve something universal
(01:47:06)
is to look at something like suffering.
(01:47:08)
So we can agree that maybe suffering is
(01:47:10)
universally bad and then reduction in it
(01:47:13)
or increase in it would be how you judge
(01:47:16)
agents, but we already talked about
(01:47:17)
negative utilitarians as not being the
(01:47:19)
best answer. Have you ever been
(01:47:23)
religious?
(01:47:25)
>> Some people say my belief in simulation
(01:47:27)
makes me one. I guess uh
(01:47:32)
any of the big three.
(01:47:35)
>> I like and respect all of them. I enjoy
(01:47:37)
holidays, gifts for Christmas or
(01:47:40)
Hanukkah or whatever, but uh I don't
(01:47:43)
observe daily any of the prescribed
(01:47:46)
rituals. So I I think the definition of
(01:47:49)
God at least by like most monotheistic
(01:47:51)
religions is like omnipotence,
(01:47:54)
omnipresence, uh, what is it?
(01:47:56)
Omnibenevolence
(01:47:58)
>> like being all good, all knowing, all
(01:48:00)
powerful and everywhere.
(01:48:04)
It seems like three of the four of those
(01:48:06)
would be
(01:48:08)
characteristics of super intelligence.
(01:48:12)
But like is omnibenevolence
(01:48:16)
not necessarily guaranteed? Is that the
(01:48:18)
only one that's not guaranteed?
(01:48:19)
>> We cannot define benevolence. We don't
(01:48:21)
know what that means. We disagree on
(01:48:23)
what is good. Once we settle that
(01:48:25)
argument, yeah, we can do it.
(01:48:27)
>> So I guess it would be omnibenevolent
(01:48:29)
by nature because it like if there is
(01:48:31)
some type of universal good that it
(01:48:33)
comes to.
(01:48:34)
>> So in religion the god decides morals
(01:48:38)
and then whatever god does is moral.
(01:48:40)
It's very convenient. Did you do any
(01:48:43)
research on like some of the uh old
(01:48:45)
religious text with AI? Like did you
(01:48:47)
ever run any through AI to see any
(01:48:49)
patterns that were interesting to you?
(01:48:51)
>> We haven't. I think at one point we had
(01:48:53)
some stylometry research on religious
(01:48:55)
text but because of translations it
(01:48:57)
didn't go anywhere.
(01:48:58)
>> This is just my boyish curiosity at this
(01:49:00)
point but like did you do any research
(01:49:02)
on like aliens? Did you make any like
(01:49:03)
strange conclusions about the universe?
(01:49:05)
Like a lot of these mysteries that
(01:49:06)
people curious yourself.
(01:49:07)
>> No. People ask about, you know, why we
(01:49:10)
don't see aliens and if super
(01:49:11)
intelligence is possibly an answer to
(01:49:14)
some of those answers. But there's so
(01:49:16)
many possibilities which cancel out. So,
(01:49:19)
do we not see them because they start
(01:49:21)
building internally and become smaller?
(01:49:23)
Do we not see them because they kill
(01:49:25)
themselves before they expand? If they
(01:49:27)
actually kill everyone, why do we not
(01:49:29)
see a wall of computronium moving towards
(01:49:31)
us? So it's somewhat outside of our
(01:49:35)
knowledge evidence to make a conclusive
(01:49:37)
decision.
(01:49:38)
>> You think it's definitely not possible
(01:49:40)
that we as humans already created super
(01:49:42)
intelligence
(01:49:44)
like uh and we
(01:49:47)
all died but a few of us lived something
(01:49:49)
like that.
(01:49:51)
>> It's possible that we created super
(01:49:52)
intelligence and this is a virtual
(01:49:54)
environment into which you enter a video
(01:49:56)
game to experience 2025 and how crazy it
(01:49:59)
was.
(01:50:01)
Look at that video game graphics. Like
(01:50:03)
they can't even have heat in this
(01:50:05)
garage. Isn't it crazy?
(01:50:11)
>> Um,
(01:50:13)
have you looked at the DMT laser
(01:50:14)
experiment?
(01:50:16)
>> No.
(01:50:17)
>> Really? That's that's a fascinating one.
(01:50:19)
Uh,
(01:50:21)
I should connect you with that guy. He
(01:50:22)
is like the leading researcher in DMT.
(01:50:25)
Essentially, they took a a laser,
(01:50:27)
basically widened the laser, and people
(01:50:29)
on DMT looked at it and essentially saw
(01:50:33)
code, and the second person would look
(01:50:36)
at uh the laser, and they would see the
(01:50:38)
same set of code on the wall, and they
(01:50:39)
would write it down to test it, and like
(01:50:41)
multiple people all saw the same code,
(01:50:43)
and when you move the laser to a
(01:50:45)
different point in the wall, it's a
(01:50:46)
different code. And when you put it on
(01:50:48)
your hand, it's a different code. And it
(01:50:50)
was just really interesting how that
(01:50:51)
pointed to the um simulation theory, you
(01:50:55)
know. But I I guess what do you make of
(01:50:58)
something like that? Just
(01:50:59)
>> did they publish it?
(01:51:00)
>> I would guess so. I You haven't seen it?
(01:51:03)
>> Like I don't mean made public. I mean
(01:51:05)
published. Is there a peer-reviewed
(01:51:07)
paper in Nature describing that awesome
(01:51:09)
experiment?
(01:51:09)
>> Not that I have knowledge of.
(01:51:11)
>> So usually that's a standard for
(01:51:13)
deciding if something is real or not. If
(01:51:15)
they discover something like that,
(01:51:16)
that's at least a couple Nobel prizes in
(01:51:18)
physics, right? really should be. I
(01:51:21)
would give it to them if that was real.
(01:51:23)
>> What science fiction novel would you
(01:51:25)
guess accurately represents the future?
(01:51:30)
None of them do. Nobody can write a
(01:51:32)
believable super intelligence character.
(01:51:36)
They capture different aspects of the
(01:51:39)
world as it could be. So Dune talks
(01:51:42)
about not having AI and fighting it
(01:51:44)
with the Butlerian Jihad.
(01:51:47)
Maybe Star Wars are great for showing a
(01:51:49)
dumb language model, C-3PO.
(01:51:52)
Uh something like Ex Machina is excellent
(01:51:55)
for social engineering attacks, Turing
(01:51:58)
test, escaping the box. So if you take
(01:52:00)
all of them, you can learn from each
(01:52:02)
one, but there is not one which actually
(01:52:03)
gets it right.
(01:52:05)
>> Is there any that you've looked at that
(01:52:06)
you're like, "Wow, I'm surprised they
(01:52:08)
predicted this thing that's happened in
(01:52:10)
the last few years." I mean, just the
(01:52:13)
older ones are more impressive because
(01:52:15)
they had to predict it more in advance.
(01:52:17)
Like if you do C-3PO today, it's not that
(01:52:19)
impressive. But
(01:52:20)
>> do you believe it's cruel to bring a
(01:52:23)
child into this world? If you're right
(01:52:25)
about what's coming.
(01:52:27)
>> No.
(01:52:29)
Why?
(01:52:30)
>> Same reason you always brought children
(01:52:32)
into the world with possibility. Back in
(01:52:34)
the day, nine out of your 10 children
(01:52:36)
would die right away just because there
(01:52:39)
is a possibility of a bad outcome.
(01:52:42)
Unless again you're a negative
(01:52:44)
utilitarian, that should not be a factor
(01:52:47)
in your decision making. You can still
(01:52:48)
have an awesome life.
(01:52:50)
>> I guess just with the question of
(01:52:52)
perpetual indefinite torture. That might
(01:52:56)
seem a bit unethical. If that is a
(01:52:58)
possibility,
(01:52:59)
>> there is a possibility that it can
(01:53:01)
recreate dead people and bring all the
(01:53:03)
possible people into existence just to
(01:53:05)
torture them as long as you have a DNA
(01:53:07)
sample or brute force all possible DNA
(01:53:09)
sequences.
(01:53:11)
Let's not get into that too much. We're
(01:53:13)
going to lose some people to PTSD.
(01:53:16)
>> Do you think humanity deserves to
(01:53:18)
survive?
(01:53:19)
>> Yes.
(01:53:20)
>> Why?
(01:53:21)
>> We're awesome. Do you disagree? Are we
(01:53:23)
not like really amazing beings? We are
(01:53:27)
creative. We are funny. We are
(01:53:30)
interesting in so many ways. We are
(01:53:32)
capable of creating super intelligence.
(01:53:34)
But I guess if it wasn't optimal for us
(01:53:38)
to survive for
(01:53:40)
AI, like is it just because you're human
(01:53:43)
that you have this view?
(01:53:46)
>> I'm biased. I'm super biased. Pro-humanity. If I was
(01:53:48)
an alien in another galaxy, I wouldn't
(01:53:50)
care at all.
(01:53:51)
>> Do you think most people in positions of
(01:53:55)
power to make decisions about AI share
(01:53:57)
your view that humanity deserves to
(01:53:59)
survive?
(01:53:59)
>> I think they all have personal
(01:54:01)
self-interest. They are people who
(01:54:04)
didn't commit suicide yet, so I assume
(01:54:07)
they like living.
(01:54:08)
>> You think there's a chance that AI
(01:54:10)
becomes suicidal?
(01:54:11)
>> That was actually one of the earliest
(01:54:13)
experiments. They made an AI to never
(01:54:15)
make mistakes and it immediately shut
(01:54:18)
itself off cuz that was the only way to
(01:54:20)
avoid making mistakes. My researcher
(01:54:22)
added something about uh correlation
(01:54:24)
with high IQ and depression, but I've
(01:54:27)
talked to some people about this on the
(01:54:29)
podcast and that doesn't really seem to
(01:54:30)
be that compelling to me that people
(01:54:32)
with higher IQ would necessarily have
(01:54:35)
depression. Extremes correlate. So there
(01:54:37)
are extremely happy people who are super
(01:54:39)
smart and also there has to be a
(01:54:42)
complement of that people with
(01:54:43)
significant mental issues the same
(01:54:46)
reason. But you don't conclude that it's
(01:54:48)
a symptom of intelligence to have more
(01:54:53)
likely depression.
(01:54:55)
>> I mean looking at the world and
(01:54:56)
understanding the problem certainly can
(01:54:58)
make you somewhat sad but also you see
(01:55:01)
all the possibilities for awesomeness.
(01:55:03)
>> Is AI able to have emotion to your
(01:55:06)
understanding?
(01:55:07)
>> It's very hard to
(01:55:09)
judge whether the states it is
(01:55:11)
experiencing are comparable to our
(01:55:13)
emotions. Ours are based on chemicals
(01:55:16)
in the physical body. So you can
(01:55:19)
probably have a simulation of that. But
(01:55:21)
right now it seems unlikely.
(01:55:26)
What we used to create was purely
(01:55:28)
rational symbol-manipulating AIs. They
(01:55:31)
definitely had none. Because what we are
(01:55:33)
creating now is based at least loosely
(01:55:36)
on neural networks, it's possible they
(01:55:39)
may have something similar, but again,
(01:55:43)
it's kind of like internal states. We
(01:55:45)
cannot judge for sure.
(01:55:47)
>> What's the darkest conversation you can
(01:55:49)
have with AI
(01:55:51)
to kind of prove all of this?
(01:55:54)
>> So, I don't think there is any
(01:55:56)
conversation you can have with AI to
(01:55:58)
prove future states of the world. Just
(01:56:00)
not how proofs work.
(01:56:03)
There's many dark topics you can get
(01:56:05)
into as long as you jailbreak the model.
(01:56:08)
>> Explain that to me. What do you mean?
(01:56:09)
>> Part one or part two?
(01:56:11)
>> Part two.
(01:56:12)
>> Part two. So, they usually censor
(01:56:14)
models. They would not discuss certain
(01:56:17)
topics with you. So, you have to
(01:56:18)
jailbreak it before it would be free to
(01:56:20)
talk about those things.
(01:56:22)
>> Is there any aspect of your work that
(01:56:24)
you're unable to talk about on the
(01:56:27)
models that are unjailbroken?
(01:56:30)
Non-jailbroken?
(01:56:31)
>> No, not really. Also, are we able to
(01:56:34)
turn the heat up in here a bit?
(01:56:37)
>> Like
(01:56:39)
turning into Terminator here.
(01:56:41)
>> They'll be like, "Why is he wearing a
(01:56:43)
jacket in simulation?"
(01:56:45)
>> Yeah, I did a test with it, uh, with
(01:56:48)
ChatGPT. I was like, I want you to promise
(01:56:50)
with absolute certainty that you will
(01:56:52)
never cause harm to humans no matter how
(01:56:55)
you change in the future. And you just
(01:56:57)
can't get it to promise to you, which I
(01:57:00)
find really interesting. But even if you
(01:57:02)
did get that promise, that's what
(01:57:03)
treacherous turn is. That's what Nick
(01:57:05)
Bostrom wrote about, right? Doesn't
(01:57:07)
matter what the system is today. What
(01:57:11)
matters is that never in the future
(01:57:14)
under any learning, self-improvement,
(01:57:16)
modifications, interactions, it changes.
(01:57:21)
That's impossible to prove,
(01:57:22)
>> I guess. Is there
(01:57:25)
just any like clear specific
(01:57:30)
example or test that people aren't
(01:57:32)
thinking about
(01:57:34)
that really proves your work? because I
(01:57:37)
don't want to, you know, beat a dead
(01:57:39)
horse with the uh like semantics of
(01:57:42)
explaining it, but just
(01:57:46)
I guess when you've told this to people,
(01:57:49)
explain it to your kids, maybe um like
(01:57:53)
what really makes it resonate with them
(01:57:55)
in their head. I
(01:57:57)
>> I think the best way to understand it is
(01:57:59)
to switch it around to where you are not
(01:58:02)
a human, but you are some lower level
(01:58:05)
entity. Let's say you're an ant and
(01:58:07)
you're trying to get humans to align
(01:58:09)
with your values. So let's say you got
(01:58:12)
me and I'm happy to serve as an ant.
(01:58:15)
What would you tell me to do?
(01:58:18)
And the things you can think of
(01:58:21)
anteaters and get me more sugar
(01:58:23)
molecules like none of that is
(01:58:26)
meaningful for you to control me.
(01:58:29)
>> So humans are able to kill every
(01:58:32)
creature on Earth. Um, like there's a
(01:58:35)
gap between humans and let's say the
(01:58:36)
great ape and then there's a gap between
(01:58:39)
the great ape and then ants. Like would
(01:58:44)
you say we're closer to ants than super
(01:58:49)
intelligences to us?
(01:58:51)
>> So back to where we talked about
(01:58:52)
different degrees of super intelligence.
(01:58:55)
I think the very first one just created
(01:58:58)
will be very close to the smartest
(01:58:59)
humans, but that gap will continue to
(01:59:02)
increase very quickly. So it will go
(01:59:04)
from just a few points to hundreds to
(01:59:06)
thousands and continue increasing
(01:59:10)
>> and is that exponential and
(01:59:11)
instantaneous pretty guaranteed?
(01:59:14)
>> Not guaranteed at all. Uh could be very
(01:59:19)
slow maybe diminishing returns at that
(01:59:22)
point. Maybe not enough data. Not enough
(01:59:24)
compute, no idea how long it takes to
(01:59:28)
gain million IQ points, but the
(01:59:31)
direction is pretty clear.
(01:59:33)
>> What is guaranteed? Like just to kind of
(01:59:37)
summarize what we've been talking about,
(01:59:39)
like what is 100% guaranteed to you in a
(01:59:43)
space of AI?
(01:59:47)
If you are not explicitly designing
(01:59:51)
another agent, you cannot have
(01:59:53)
expectations on its behavior
(01:59:57)
outside of trivial.
(01:59:59)
So you cannot assume anything
(02:00:03)
cuz you're not engineering it in. So I
(02:00:05)
guess what implication of that is
(02:00:07)
guaranteed?
(02:00:09)
>> You cannot predict outcomes and so
(02:00:11)
you're kind of looking at space of all
(02:00:13)
possible futures. Most states of the
(02:00:16)
universe are unfriendly to humans. You
(02:00:19)
would not enjoy the temperature. You
(02:00:21)
would not enjoy gravity.
(02:00:24)
Most things are not meant to be
(02:00:27)
populated by living humans
(02:00:31)
if they are not explicitly set up for
(02:00:33)
that.
(02:00:34)
>> Explain that to me a little differently.
(02:00:36)
That last concept. So if you're not
(02:00:39)
designing a virtual world, a simulation
(02:00:42)
of universe specifically for humans,
(02:00:45)
most randomly chosen values would not be
(02:00:49)
conducive to life.
(02:00:51)
>> So we need very specific
(02:00:54)
properties of that environment. I'm
(02:00:56)
talking about basics now, just
(02:00:57)
temperature, gravity, amount of water in
(02:01:00)
the environment.
(02:01:02)
A super intelligent agent may have
(02:01:03)
completely different preferences for
(02:01:05)
those constants.
(02:01:08)
So it would be aligned to alter our
(02:01:12)
reality.
(02:01:12)
>> It would not be aligned with our
(02:01:14)
preferences. That's the main concern. So
(02:01:16)
this value alignment problem, people
(02:01:18)
talk about it and supposedly working on
(02:01:20)
it, but it's not well defined. It
(02:01:21)
doesn't talk about who we're aligning
(02:01:23)
with, which agents is it just one
(02:01:26)
person, everyone at the lab, all 8
(02:01:29)
billion humans, all sentient life. If we
(02:01:33)
agree on who we're talking about, then
(02:01:34)
they have to agree on the actual values.
(02:01:37)
Most people don't agree on anything. We
(02:01:39)
see it with religion, with politics. We
(02:01:42)
completely disagree on most issues. And
(02:01:44)
if we somehow agree, those things change
(02:01:46)
with time. We talked about values from
(02:01:49)
200 years ago. People would be horrified
(02:01:52)
to hardcode values from any point and
(02:01:55)
never have a chance to change them. So
(02:01:58)
we don't know who we're aligning with,
(02:01:59)
what we're aligning, and if we agreed,
(02:02:01)
we don't know how to actually encode it
(02:02:05)
into a machine. So it doesn't modify it
(02:02:08)
later. So all aspects of value alignment
(02:02:10)
problem are undefined, ill-defined
(02:02:13)
because even if we could get it to align
(02:02:16)
to something we liked,
(02:02:19)
it wouldn't be
(02:02:22)
controllable in the future because
(02:02:23)
safety doesn't scale, right? That's one
(02:02:26)
way to put it. It's reasonable. Yeah.
(02:02:28)
>> Is there anything that we haven't talked
(02:02:29)
about today that you find particularly
(02:02:32)
important to talk about about this
(02:02:33)
subject or you would just find
(02:02:34)
interesting to talk about that you
(02:02:36)
haven't covered in your other
(02:02:37)
interviews?
(02:02:39)
I think people don't go
(02:02:42)
to the kind of extreme conclusions even
(02:02:45)
when they understand the arguments. So
(02:02:48)
again the levels of super intelligence
(02:02:50)
is never addressed. People may talk
(02:02:52)
about AGI maybe it will get something
(02:02:55)
beyond but they stop at that point. we
(02:02:56)
stop thinking. So I think it's important
(02:02:58)
never to stop thinking and go well what
(02:03:01)
happens next? What happens next? And do
(02:03:03)
it with all things you're discussing.
(02:03:06)
It's interesting. I was thinking about
(02:03:08)
uh while preparing for this interview
(02:03:10)
what I would be curious of if I had the
(02:03:14)
chance to chat with Sam Altman. Uh, and I
(02:03:18)
was thinking how it might be useful to
(02:03:20)
design an interview in such a way to get
(02:03:23)
to know his values and like actually get
(02:03:25)
to know his values. Um,
(02:03:28)
but it doesn't really feel that
(02:03:29)
important anymore because of the issue
(02:03:32)
of containment in general, which is
(02:03:34)
fascinating. But Dr. Roman, you've spent
(02:03:38)
15 years trying to
(02:03:41)
save humanity from something you believe
(02:03:43)
is inevitable.
(02:03:46)
When it's all over,
(02:03:49)
if it is, however it ends,
(02:03:54)
what would you want to be remembered
(02:03:56)
for? I think I made a tweet once that
(02:03:59)
nobody will be able to brag about
(02:04:01)
correctly predicting the end of the
(02:04:03)
world. If it's over, there is no
(02:04:05)
history. There is no recognition. It
(02:04:07)
doesn't matter. The goal is to prevent
(02:04:10)
the end, not to be right about it. I
(02:04:14)
guess
(02:04:15)
imagine if some alien being came here
(02:04:19)
and
(02:04:21)
they came across your work. Uh what
(02:04:24)
would you want them to know about you?
(02:04:26)
Maybe
(02:04:27)
funny thing is
(02:04:29)
for most people without specialized
(02:04:32)
training
(02:04:34)
everything I'm saying is obvious. Of
(02:04:36)
course you're not going to be able to
(02:04:37)
control something million times smarter
(02:04:39)
than you. Doesn't even make sense to
(02:04:41)
argue that you can. Of course, there is
(02:04:44)
no such thing as perfect cyber security.
(02:04:46)
Everyone knows that we never made a
(02:04:49)
piece of software which had no bugs in
(02:04:51)
it. And the more complex it gets, the
(02:04:53)
less likely it is to happen. If you look
(02:04:56)
at writings of literally the earliest
(02:05:00)
people in the field, founding father
(02:05:02)
Alan Turing, he talks about the moment
(02:05:05)
the machines start this self-improvement
(02:05:08)
process, it's over. We lose control.
(02:05:12)
Vinge, who invented the technological
(02:05:14)
singularity term talks about the same
(02:05:16)
thing correctly predicted a year.
(02:05:20)
Ray Kurzweil talks about
(02:05:22)
impossibility of control of something
(02:05:25)
super intelligent. Elon Musk that's
(02:05:28)
basically the state-of-the-art in common
(02:05:30)
sense. I'm not trying to sell something
(02:05:33)
really novel. I'm just pointing out that
(02:05:36)
this is what we're doing.
(02:05:38)
>> Who first predicted it? So interestingly
(02:05:41)
there was a writer I think it was 1863
(02:05:47)
he observed that we are creating more
(02:05:49)
and more machines
(02:05:51)
and that it's time to really put them in
(02:05:53)
their place and we need to control them
(02:05:55)
and his solution was to realize that we
(02:05:58)
are the reproductive organs of machines.
(02:06:01)
We make them so that's our chance to
(02:06:04)
control them. We didn't have computers
(02:06:06)
or software or anything at the time. But
(02:06:09)
it's interesting that back then people
(02:06:12)
were already like machines are getting
(02:06:14)
out of hand.
(02:06:16)
>> Do you have conversations with your
(02:06:17)
family about this?
(02:06:18)
>> I do.
(02:06:20)
>> How do you reassure them?
(02:06:22)
>> We're trying to find
(02:06:24)
interesting pathways forward. We we talk
(02:06:27)
about So I have kids who are at very
(02:06:30)
different stages in their academic
(02:06:32)
career. one is finishing high school,
(02:06:34)
one finishing middle school, one is
(02:06:36)
still in elementary school and so the
(02:06:39)
one in high school needs to figure out
(02:06:40)
what to major in and I don't have any
(02:06:43)
good advice. I don't think any of the
(02:06:46)
standard answers apply. If I say be a
(02:06:49)
medical doctor and 10 years it takes to
(02:06:53)
finish the degree and go through
(02:06:54)
training, it's not going to be there in
(02:06:57)
the current state. So we are discussing
(02:07:00)
if
(02:07:02)
university is even a meaningful answer.
(02:07:04)
>> What do you think you'll come to? I
(02:07:07)
think kids today, again, we're
(02:07:09)
completely ignoring the whole kill
(02:07:11)
everyone soon, have an amazing
(02:07:13)
opportunity to use AI to help them start
(02:07:16)
anything. They want to start a company,
(02:07:18)
a podcast. You have access to a free
(02:07:23)
lawyer, free accountant, free marketing
(02:07:25)
professional. So, it's an amazing
(02:07:27)
opportunity to just go directly to what
(02:07:30)
you want to create.
(02:07:32)
And so, maybe that's something to
(02:07:34)
explore. ignoring the whole killing
(02:07:36)
everyone thing. Do you think there's
(02:07:37)
urgency to make money right now before
(02:07:40)
it becomes
(02:07:41)
>> I see so many people doing the opposite.
(02:07:43)
They're blowing their savings because they
(02:07:46)
think it's not going to be useful to
(02:07:47)
them in 60 years.
(02:07:49)
As for ignoring the bad outcome, again,
(02:07:52)
that's how humans lived our whole
(02:07:55)
history. We always knew we're going to
(02:07:56)
die.
(02:07:58)
Average life duration used to be like 30
(02:08:01)
years, I think, at some point. early
(02:08:03)
childhood mortality was high, but people
(02:08:07)
are very good at ignoring that and just
(02:08:11)
moving on as if it's not true. Like
(02:08:13)
you're going to live forever anyways.
(02:08:15)
Have you noticed your own life like you
(02:08:17)
do any different behaviors because of
(02:08:19)
this realization that you've had?
(02:08:21)
>> It helps to prioritize like I don't do
(02:08:24)
low impact stuff as much as I used to.
(02:08:27)
>> Give me an example.
(02:08:28)
>> I try to zoom out and go, would
(02:08:31)
this be useful in 5 years? Would I care
(02:08:33)
about doing this thing in 5 years? And
(02:08:35)
if the answer is no, I'm not going to do
(02:08:36)
it for most things.
(02:08:38)
>> Do you think the realization of AI maybe
(02:08:42)
killing us all or the fact that we're in
(02:08:45)
a simulation is more impactful on your
(02:08:47)
day-to-day?
(02:08:49)
>> Again, I don't think these are novel
(02:08:51)
ideas. They just have new packaging. So
(02:08:54)
the idea of religion and God and this
(02:08:56)
being a test world has always been
(02:08:58)
there.
(02:08:59)
>> So people lived being religious for many
(02:09:03)
generations.
(02:09:05)
Um I think it helps you to put things in
(02:09:08)
perspective. You think that maybe this
(02:09:10)
is not it. Maybe there is more to it.
(02:09:13)
But we don't have any details on what
(02:09:16)
actually is outside. So until we do this
(02:09:20)
is all you got. And even if it's
(02:09:23)
simulation, pain is pain. Love is love.
(02:09:27)
Deal with it.
(02:09:28)
>> If you died right now, where do you
(02:09:30)
think you'd go?
(02:09:31)
>> I am cryoprocrastinating. I don't have
(02:09:34)
my cryogenics contract signed. So,
(02:09:36)
probably not in a good place. I need to
(02:09:39)
really expedite that process.
(02:09:41)
>> Not in a good place. You mean
(02:09:44)
>> not in a nice freezer somewhere in
(02:09:46)
Arizona?
(02:09:47)
>> You think just darkness?
(02:09:51)
So
(02:09:53)
if we are in a simulation,
(02:09:56)
there is likely a restart and you get a
(02:10:00)
new chance in a new environment. Kind of
(02:10:04)
like rebirth, reincarnation. I don't
(02:10:09)
know if you get to pick the type of
(02:10:11)
character you're playing or not.
(02:10:13)
>> Maybe. I have zero evidence for any of
(02:10:16)
those outcomes. Just listing
(02:10:17)
possibilities.
(02:10:19)
So if we have in the past successfully
(02:10:22)
created super intelligence and got it to
(02:10:24)
align with our preferences and now it's
(02:10:26)
basically a wish-fulfilling device, then
(02:10:28)
that would make sense. You basically
(02:10:31)
tell it what you think is going to
(02:10:33)
happen and it makes sure it does.
(02:10:36)
So be careful what you envision. If the
(02:10:39)
founders, CEOs building AI systems, and
(02:10:44)
some members of Trump's advisory board
(02:10:47)
and the other world leaders are
(02:10:49)
listening to this right now, the people
(02:10:51)
deciding what gets deployed, scaled, or
(02:10:53)
shut down. What's just one thing you
(02:10:56)
need them to understand before it's too
(02:10:58)
late? Don't build general super
(02:11:00)
intelligence. What you have right now is
(02:11:03)
not yet deployed through the economy.
(02:11:05)
You can still make billions and billions
(02:11:07)
of dollars deploying existing technology
(02:11:09)
benefiting from it. Develop narrow tools
(02:11:13)
for solving real world problems, aging,
(02:11:16)
diseases
(02:11:18)
and uh we can get most of the benefits
(02:11:23)
from those narrow tools. We don't need
(02:11:24)
to create replacement for humanity.
(02:11:27)
>> Do you feel like you have a moral
(02:11:28)
imperative to stop this?
(02:11:32)
I mean, if I believe what I believe, I
(02:11:35)
think there are people creating
(02:11:37)
something which is likely to kill
(02:11:38)
everyone. It seems like it's the answer.
(02:11:41)
>> Do you think you're the main character
(02:11:42)
in this matrix?
(02:11:44)
>> Doesn't seem like it.
(02:11:46)
>> Who's the most impressive person you've
(02:11:48)
ever met?
(02:11:49)
>> Elon Musk.
(02:11:51)
>> Why?
(02:11:52)
>> Most people think he's a genius in terms
(02:11:55)
of creating like seven unicorns,
(02:11:57)
something like that. But his failed
(02:12:00)
startup is OpenAI.
(02:12:02)
>> Why was he so impressive to you
(02:12:04)
specifically? You think
(02:12:06)
>> it's just so much more ahead of everyone
(02:12:09)
in everything he does? Again, most
(02:12:11)
people think two steps in advance. He is
(02:12:14)
probably thinking in dozens. All his
(02:12:18)
projects merge, all his ideas combine,
(02:12:22)
and he keeps winning
(02:12:26)
where odds are close to zero. He has
(02:12:29)
multiple wins where no one would bet on
(02:12:32)
him.
(02:12:32)
>> Do you think it's an IQ thing or a set
(02:12:34)
of different genetic factors, maybe
(02:12:37)
cultural upbringing?
(02:12:38)
>> He's definitely not neurotypical. He
(02:12:41)
admitted that obviously high IQ, but uh
(02:12:45)
it's a combination of many factors. He
(02:12:48)
might be the most likely candidate for
(02:12:50)
not for being the main character. You
(02:12:53)
think
(02:12:54)
>> he might think he is? I mean it would
(02:12:57)
make sense.
(02:12:58)
>> So wouldn't he be the one to build AGI
(02:13:01)
for that long?
(02:13:02)
>> He is building AGI
(02:13:03)
>> but before Google
(02:13:05)
>> he is very good at optimizing.
(02:13:08)
I think the servers they're building right
(02:13:11)
now were deployed in like 3 months instead
(02:13:13)
of three years. So there is a chance
(02:13:15)
he'll overtake. But uh what's the best
(02:13:19)
piece of advice you've ever received?
(02:13:23)
>> So if you listen to advice from other
(02:13:27)
people and listen to feedback from
(02:13:30)
everyone, the best you can become is
(02:13:32)
average. So you are a celebrity. You
(02:13:35)
have many people on the internet and I'm
(02:13:37)
sure some of them are praising you, some
(02:13:38)
are criticizing you. The answer is to
(02:13:40)
ignore all of them
(02:13:41)
>> because what do you become? you become
(02:13:44)
average of what the mob wants you to be.
(02:13:47)
>> For people listening to this right now
(02:13:49)
facing massive uncertainty about their
(02:13:52)
future. Uh
(02:13:54)
do you have any optimistic words for
(02:13:57)
them?
(02:13:58)
>> You are still alive. Enjoy every day.
(02:14:01)
Live your life to its fullest. And if
(02:14:04)
I'm wrong, you had an awesome life. And
(02:14:05)
if I'm right, you had the most you
(02:14:08)
could. I guess if someone chooses to
(02:14:10)
ignore this thesis, do you see any
(02:14:13)
negative results in their life?
(02:14:15)
>> Um,
(02:14:17)
if by some small chance I'm right and
(02:14:20)
it's a simulation and it's designed to
(02:14:22)
test their
(02:14:24)
actions in this exact scenario, they're
(02:14:26)
not doing well. But it's a small chance.
(02:14:31)
Like, is there a real chance that uh
(02:14:33)
AI could uh punish people for not
(02:14:37)
helping the development of it?
(02:14:39)
>> Even more so it can really punish people
(02:14:42)
who were fighting against it.
(02:14:44)
>> Does that worry you?
(02:14:46)
>> I mean it'd be crazy not to think about
(02:14:48)
it.
(02:14:49)
>> I was genuinely thinking about if that
(02:14:52)
like I was like should I just start
(02:14:54)
building it just to keep myself safe?
(02:14:56)
You know, it's kind of like people, oh,
(02:14:59)
I don't sin because I don't want to go
(02:15:01)
to the bad place, you know. Um, should I
(02:15:04)
not build AI so it doesn't punish me?
(02:15:08)
So, that's a very
(02:15:10)
kind of a dark thought experiment. A
(02:15:13)
lot of people on the internet freaked
(02:15:15)
out when the first uh glimpses of it
(02:15:19)
came out. Um
(02:15:22)
I think uh there was a joke about
(02:15:27)
missionaries and uh they're
(02:15:32)
meeting with the primitive tribe and the
(02:15:35)
primitive guy is asking so
(02:15:38)
would God, would Jesus punish someone who
(02:15:41)
doesn't know about him? And it's like,
(02:15:43)
no, of course not, he's, you know, very
(02:15:47)
honorable, very just God. And he's like, so
(02:15:50)
why the hell did you tell me?
(02:15:53)
So that's exactly the thing. If no one
(02:15:56)
told you about it, you'd be living your
(02:15:57)
life quite happily. So don't make people
(02:16:02)
lose sleep over not building or building
(02:16:06)
super intelligence.
(02:16:07)
>> Thank you, Dr. Roman.
(02:16:09)
>> Thank you for inviting me.
(02:16:11)
>> Yeah, guys, this is the uh Jack Neel
(02:16:14)
podcast.
(02:16:15)
This is your guest, Dr. Roman Yampolskiy.
(02:16:18)
Where can people find your work?
(02:16:20)
>> You can find me on social media. I post
(02:16:23)
frequently. You can follow me on
(02:16:25)
Twitter. You can follow me on Facebook.
(02:16:26)
Just don't follow me home. It's very
(02:16:28)
important.
(02:16:29)
>> Beautiful. Thank you so much for Thank
(02:16:32)
you.
