Title: Open AI Chief Outlines His AI Thesis in 9 Minutes
Duration: 00:09:01
(00:00:00)
So many people are trying to invest in
(00:00:01)
AI now. This is it's like there's
(00:00:03)
literally I this just feels like it's
(00:00:05)
like it's like a bubble time again. Like
(00:00:07)
there's $100 million rounds here.
(00:00:08)
There's $200 million rounds. There's
(00:00:10)
everyone we pitch who pitches us.
(00:00:11)
There's the AI part of the pitch. All my
(00:00:13)
old SaaS companies, one of them was a
(00:00:15)
legal tech company. You probably don't
(00:00:16)
even know this. That's partnered with
(00:00:17)
OpenAI and suddenly sold for like eight
(00:00:19)
times the last round valuation to
(00:00:21)
Thomson Reuters because it was all of a
(00:00:22)
sudden like doing really useful things
(00:00:23)
for paralegals. I mean this is just
(00:00:25)
it's changed a lot of things. like in
(00:00:26)
terms of like other people doing AI like
(00:00:28)
what else is useful that people are
(00:00:30)
working on that you're excited that
(00:00:31)
you're seeing people do because this
(00:00:32)
this is outside of OpenAI now is there
(00:00:34)
infrastructure you like is there is
(00:00:35)
there a need for infrastructure outside
(00:00:36)
of OpenAI like like how do you see this
(00:00:38)
>> yeah I think I I'm always a little
(00:00:39)
worried about the infrastructure work
(00:00:41)
because I think you know you're solving
(00:00:42)
the problems as they exist today and
(00:00:44)
when GPT-4.5 and GPT-5 come out you know
(00:00:47)
they're going to have fundamentally
(00:00:47)
different use cases and fundamentally
(00:00:49)
different infrastructure what I like to
(00:00:50)
see is people who are using AI to solve
(00:00:53)
something that wasn't possible with AI
(00:00:54)
>> You like the application layer.
>> Yeah.
(00:00:56)
And what I what I think about is AI now
(00:00:59)
it's like having an infinite number of
(00:01:02)
interns with very short attention spans.
(00:01:05)
And so anything you can you could have
(00:01:07)
solved with interns. You know GPT-3 was
(00:01:09)
sort of like a high school student.
(00:01:11)
GPT-3.5 is maybe a college freshman. GPT-4
(00:01:13)
is maybe a college junior. GPT-5 is going
(00:01:17)
to be something else.
(00:01:18)
>> Bigger intern.
(00:01:18)
>> And so you know if you have these
(00:01:20)
interns, what can you do with them? And
(00:01:22)
what new business does that allow you to
(00:01:24)
create? Like that's the thing I'm really
(00:01:25)
excited about.
(00:01:26)
>> Lots of free interns.
(00:01:27)
>> Yeah,
(00:01:29)
that's cool. Well, the app layer's, you
(00:01:31)
know, where I've built a ton of things.
(00:01:32)
So, I I'm getting pitched tons of
(00:01:33)
infrastructure. I'm getting I tons of my
(00:01:35)
friends are like, you know, getting
(00:01:37)
billions of dollars of loans to get like
(00:01:39)
A100s and H100s and like doing like
(00:01:41)
crazy amounts of like hardware
(00:01:43)
infrastructure. Is that is that going to
(00:01:44)
be needed still? I assume that's going
(00:01:46)
to be needed still. Or how do you think
(00:01:47)
about that side?
(00:01:48)
>> Yeah, I think it's really hard to say. A
(00:01:50)
lot of people have tried building new
(00:01:51)
chips and you know I think what what
(00:01:54)
everybody has seen so far is that Nvidia
(00:01:56)
has just been able to continually do
(00:01:58)
better and better and better because
(00:01:59)
they have you know the big market and a
(00:02:01)
lot of capital to throw at it. So I
(00:02:02)
think I think it's pretty hard to bet
(00:02:04)
against Nvidia. On the other hand, a
(00:02:05)
small chance of a really big thing uh is
(00:02:07)
also something worthwhile.
(00:02:08)
>> Yeah. You know when I was at Stanford in
(00:02:10)
computer science doing graphics I was
(00:02:11)
obsessed with Nvidia at the time. That's
(00:02:12)
where I wanted to work and then PayPal
(00:02:14)
and Peter but that was
(00:02:15)
>> you probably did okay.
(00:02:16)
>> Probably did fine. But it's pretty funny
(00:02:18)
how the company I was really into ended
(00:02:19)
up pivoting and doing doing this.
(00:02:21)
>> So my PhD before I started Palantir was
(00:02:23)
in AI and I left uh because I felt AI
(00:02:26)
wasn't happening and this was 2005 and
(00:02:28)
in fact AI was not happening in 2005.
(00:02:30)
>> Yes. So it would have been very hard to
(00:02:31)
work on AI 20 years ago.
(00:02:32)
>> The the key thing about AI was actually
(00:02:34)
just big data and you didn't need AI to
(00:02:36)
unlock it.
(00:02:37)
>> The lesson was that everyone who got
(00:02:38)
went to work on this just never it was
(00:02:39)
just like a bad choice forever. It was
(00:02:41)
exciting but then it was never the right
(00:02:42)
choice.
(00:02:43)
>> Yeah. until it was
(00:02:43)
>> until it was which which so how do you
(00:02:45)
know that it was like how do you how do
(00:02:46)
you have the intuition that it might be
(00:02:48)
>> well so what happened is uh in I think
(00:02:50)
2011 uh there was a paper that was
(00:02:52)
published by Ilya Sutskever and some
(00:02:54)
other people, Ilya's our chief
(00:02:55)
scientist at OpenAI, where they
(00:02:58)
basically reinvented uh neural networks
(00:03:01)
and they showed that if you ran them on
(00:03:02)
GPUs you could plug in huge amounts much
(00:03:04)
bigger amounts of compute and much
(00:03:06)
bigger amounts of data and neural
(00:03:08)
networks went from being oh that one
(00:03:10)
thing that you can kind of use for
(00:03:11)
handwriting to something that was
(00:03:13)
actually the best way to identify
(00:03:14)
images.
(00:03:14)
>> And for some of our listeners who aren't
(00:03:15)
as technical, we're talking about a
(00:03:17)
neural network. That's like that's
(00:03:18)
something that gives that gives
(00:03:19)
iterative feedback on things. Explain
(00:03:21)
explain how would you explain neural
(00:03:22)
network?
(00:03:22)
>> So a neural network um the the analogy
(00:03:25)
that most people use is that it's like
(00:03:26)
the brain and it is but at a very very
(00:03:28)
high level. So a neuron is basically um
(00:03:32)
you can think of it as as having
(00:03:34)
connections to other neurons and the
(00:03:36)
strength of those connections determines
(00:03:38)
when you know the first neuron the lower
(00:03:40)
level neuron fires then that makes the
(00:03:42)
the higher level neuron fire. And if you
(00:03:45)
train these um you do it by showing them
(00:03:48)
the right answer and then basically
(00:03:50)
doing what's called backpropagating the
(00:03:51)
error, you know, from sort of the top of
(00:03:54)
the network where the answer is all the
(00:03:56)
way back down um through the whole
(00:03:58)
network.
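
The training loop described here can be sketched in a few lines of Python. This is only an illustrative toy, not anything from OpenAI: it assumes a chain of two sigmoid neurons, a squared-error loss, and a made-up task (output the opposite of the input). The names `forward`, `train`, and `loss` are hypothetical. The error is computed at the top of the network and then passed back down through the connection strengths.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(params, x):
    """Run the input through a lower-level neuron, then a higher-level one."""
    w1, b1, w2, b2 = params
    h = sigmoid(w1 * x + b1)  # lower-level neuron
    y = sigmoid(w2 * h + b2)  # higher-level neuron
    return h, y

def loss(params, data):
    """Total squared error over the dataset."""
    return sum(0.5 * (forward(params, x)[1] - t) ** 2 for x, t in data)

def train(params, data, lr=1.0, epochs=5000):
    w1, b1, w2, b2 = params
    for _ in range(epochs):
        for x, t in data:  # show the network the right answer t
            h, y = forward((w1, b1, w2, b2), x)
            # Error signal at the top of the network.
            delta_out = (y - t) * y * (1.0 - y)
            # Propagate that error back down through the connection w2.
            delta_hidden = delta_out * w2 * h * (1.0 - h)
            # Adjust each connection strength against its gradient.
            w2 -= lr * delta_out * h
            b2 -= lr * delta_out
            w1 -= lr * delta_hidden * x
            b1 -= lr * delta_hidden
    return (w1, b1, w2, b2)

# Assumed toy task: output 1 for input 0, and 0 for input 1.
data = [(0.0, 1.0), (1.0, 0.0)]
initial = (0.5, 0.0, 0.5, 0.0)
trained = train(initial, data)
print("loss before:", loss(initial, data))
print("loss after: ", loss(trained, data))
```

Real networks do the same thing with millions or billions of connections and rely on automatic differentiation rather than hand-written gradients.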
(00:03:58)
>> So these things are learning patterns
(00:04:00)
and and is it a pattern detection type
(00:04:02)
of situation then or what is it with the
(00:04:03)
with the flashing what they're
(00:04:04)
detecting?
(00:04:04)
>> Yeah, you're building hierarchical
(00:04:06)
representation. So at the very bottom,
(00:04:07)
think about think about a vision network
(00:04:08)
because it's easy to visualize. So at
(00:04:09)
the very bottom you're the neurons are
(00:04:11)
detecting edges and then you they go up
(00:04:13)
a little bit higher and they're
(00:04:14)
detecting corners and if you go high
(00:04:16)
enough they're detecting wheels and then
(00:04:18)
above that they're detecting cars
(00:04:19)
>> Corners in this way then it's just then
(00:04:20)
it's this a square if this and this this
(00:04:22)
could be a wheel and then it could be a
(00:04:23)
car. Got it.
(00:04:23)
>> Yeah. And and at some point there's a
(00:04:24)
Joe Lonsdale neuron
(00:04:26)
>> um that you know is out there you know
(00:04:28)
and and individual people you can
(00:04:30)
actually find these neurons in the
(00:04:31)
networks.
(00:04:31)
>> I remember reading On Intelligence by
(00:04:33)
Hawkins who didn't actually end up
(00:04:34)
solving all these problems but he
(00:04:35)
believe they said there's like six
(00:04:36)
layers of the neocortex that is part of
(00:04:38)
the vision system. Is that is that right
(00:04:39)
or is there actually a lot more than
(00:04:40)
that?
(00:04:41)
>> You know, I'm not I'm not a cognitive
(00:04:42)
scientist. So I the other thing is I
(00:04:44)
think these the analogy between the
(00:04:46)
brain and the neural networks it it's
(00:04:47)
it's very inspirational but you can take
(00:04:49)
it too far.
(00:04:49)
>> It's very imprecise. What OpenAI did is
(00:04:51)
say but you basically didn't base it on
(00:04:52)
the brain. You based it on based it on
(00:04:54)
like building it up from scratch from
(00:04:55)
from just first principles.
(00:04:56)
>> So I think that the thesis behind OpenAI
(00:04:58)
when I joined in 2017 was that neural
(00:05:01)
networks were the final architecture
(00:05:03)
that could take AI all the way to human
(00:05:05)
level intelligence or AGI.
>> But wasn't
(00:05:08)
there like a trans wasn't there a
(00:05:09)
transformer breakthrough that was really
(00:05:10)
important though right around 2017?
(00:05:12)
>> Yeah, that's right. So um you know the
(00:05:14)
first couple years of OpenAI we were
(00:05:17)
using sort of standard neural networks
(00:05:19)
and then there was a a paper at Google
(00:05:21)
some people came up with an idea called
(00:05:22)
transformers that allow you to um take
(00:05:26)
better understanding of the context. Uh
(00:05:29)
so for example you could look at
(00:05:31)
documents and now you could really
(00:05:34)
understand the preceding three or four
(00:05:36)
pages when the prior architectures had
(00:05:38)
struggled just to understand like the
(00:05:40)
the previous 10 words.
(00:05:41)
>> What's the intuition for why
(00:05:42)
transformers worked better? Like what's
(00:05:44)
going on there?
(00:05:45)
>> It's basically because they have a
(00:05:46)
notion of attention. So you can pay
(00:05:48)
attention to particular words in a
(00:05:51)
document versus uh the previous
(00:05:53)
architecture had a notion of memory. So
(00:05:56)
you could sort of remember the words
(00:05:58)
that you had seen, but it's much easier
(00:06:00)
to be able to look at a sheet of paper
(00:06:02)
and sort of see the different words and
(00:06:04)
think about, you know, reading those
(00:06:05)
words.
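
A minimal sketch of that notion of attention, assuming the scaled dot-product form used in transformer models. The two-dimensional vectors and the `attend` helper are made up for illustration. Each word gets a key and a value vector; a query is compared against every key, and the resulting weights decide how much of each word's value flows into the output.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, output

# Hypothetical key and value vectors for three words on the "page".
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
query = [1.0, 0.0]  # most similar to the first word's key

weights, output = attend(query, keys, values)
print(weights)  # the first word receives the largest weight
print(output)
```

Because every word on the page is scored directly against the query, nothing has to be carried along in a running memory.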
(00:06:05)
>> Pay attention to what matters.
(00:06:06)
>> Pay attention to what matters,
(00:06:07)
>> which is kind of just how the brain
(00:06:09)
works. Things jump out at us obviously
(00:06:10)
when we look at things.
(00:06:11)
>> Yeah, it's very intuitive.
>> The
(00:06:13)
intuition, I know this is not based,
(00:06:15)
you're not a cognitive scientist, but I
(00:06:16)
want to try to build intuition here for
(00:06:18)
people. My understanding, the big
(00:06:20)
breakthrough uh we had for how the brain
(00:06:21)
works is you're constantly predicting
(00:06:22)
what you're going to see next. like this
(00:06:24)
may be why we see ghosts but either way
(00:06:26)
your brain's constantly looking at what
(00:06:27)
it thinks it expects to see and then if
(00:06:29)
something is not expected it kind of
(00:06:30)
jumps out at you is there anything like
(00:06:32)
that with transformers or there's
(00:06:33)
nothing there's nothing like that where
(00:06:34)
it's trying to predict things
(00:06:35)
>> well that's exactly how you train a
(00:06:36)
model like GPT-4 so um you know if you
(00:06:40)
take it to to train a model like GPT-4
(00:06:42)
basically we take all of the text
(00:06:43)
everything we can scrape off the
(00:06:44)
internet um and
(00:06:48)
what the model is trying to do is it
(00:06:49)
goes character by character and it's
(00:06:51)
trying to predict the next character. So,
(00:06:54)
you know, uh if if it sees, you know,
(00:06:56)
the rain in Spain falls mainly on the
(00:06:58)
it's going to guess, oh, that's a P and
(00:06:59)
it's going to be plain. And it does this
(00:07:01)
over and over again for huge amounts of
(00:07:04)
documents, like trillions of characters.
(00:07:06)
And over time, it seems as though that
(00:07:10)
yields something that looks like
(00:07:11)
intelligence.
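
The next-character objective can be illustrated with a toy that is deliberately not a neural network: a simple count-based model that remembers which character followed each 7-character context. The function names and the context length are assumptions made for this sketch, and production GPT models predict sub-word tokens rather than single characters, but the prediction game is the same one described here.

```python
from collections import Counter, defaultdict

def train_char_model(text, k=7):
    """Count which character follows each k-character context."""
    counts = defaultdict(Counter)
    for i in range(k, len(text)):
        counts[text[i - k:i]][text[i]] += 1
    return counts

def predict_next(counts, prompt, k=7):
    """Return the most frequent next character, or None if the context is unseen."""
    context = prompt[-k:]
    if context not in counts:
        return None
    return counts[context].most_common(1)[0][0]

def complete(counts, prompt, k=7, max_chars=20):
    """Greedily extend the prompt one predicted character at a time."""
    generated = ""
    for _ in range(max_chars):
        nxt = predict_next(counts, prompt + generated, k)
        if nxt is None:
            break
        generated += nxt
    return generated

corpus = "the rain in spain falls mainly on the plain"
model = train_char_model(corpus)
prompt = "the rain in spain falls mainly on the "
print(predict_next(model, prompt))  # p
print(complete(model, prompt))      # plain
```

A real language model replaces the lookup table with a neural network, so it can make sensible guesses for contexts it has never seen verbatim.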
(00:07:11)
>> The prediction seems very very similar
(00:07:12)
to intelligence, which which may which
(00:07:14)
may be what we're doing, too.
(00:07:15)
>> Yeah. I I at this point, I think we
(00:07:18)
understand neural networks a lot better
(00:07:19)
than we understand the brain. So
(00:07:20)
whenever someone talks to me about the
(00:07:21)
brain I always think well what does a
(00:07:23)
neural network do and then probably
(00:07:24)
actually the brain does that.
(00:07:26)
>> Do you think of yourself as a neural
(00:07:27)
network now?
(00:07:28)
>> I do actually. Um I actually think of my
(00:07:30)
kids as neural networks.
(00:07:31)
>> How has that changed your interactions
(00:07:33)
with them?
(00:07:35)
>> You know as I was starting at OpenAI
(00:07:36)
first I worked on robotics and then I
(00:07:38)
worked on I we did some problems with
(00:07:39)
math and then programming and uh early
(00:07:42)
on you know the kids would always get
(00:07:44)
the problems before the robots were and
(00:07:46)
now it's the other way around. Now the
(00:07:48)
neural networks are better are you know
(00:07:50)
better at solving problems than the
(00:07:51)
kids.
(00:07:52)
>> Has it given you any intuition on how to
(00:07:53)
train young minds?
(00:07:55)
>> Uh yeah, you just show them lots and
(00:07:57)
lots of things and you just have them
(00:07:58)
read.
(00:07:59)
>> Show them three trillion things.
(00:08:00)
>> I mean that's that's probably how you
(00:08:02)
and I learned, right?
(00:08:03)
>> Just just read a lot of books when we
(00:08:05)
were kids.
(00:08:05)
>> I guess there's a thing about attention,
(00:08:07)
too. I've always found that kids do
(00:08:08)
better when they're confident. I don't
(00:08:10)
know if the computers need to be given
(00:08:11)
confidence, although maybe that's a
(00:08:12)
signal for something else like
(00:08:13)
attention.
>> Well, that's actually one of
(00:08:14)
the cool things that we've discovered um
(00:08:17)
with GPT-3 and GPT-4 is that uh the model
(00:08:21)
so the models are trained on the
(00:08:22)
internet and um they're reading
(00:08:26)
everything, right? Then they try to act
(00:08:28)
like what they've seen, right? They're
(00:08:29)
predicting and so they don't know
(00:08:31)
whether to predict being someone who's
(00:08:33)
dumb or someone who's really smart.
(00:08:35)
>> That's why you have to tell them they're
(00:08:36)
really smart when you're talking to
(00:08:37)
them. It's so funny.
(00:08:38)
>> Exactly. Like you should you're a
(00:08:39)
confident, you know, powerful physicist
(00:08:41)
who knows everything. the best physicist
(00:08:43)
in the world. Now answer my question.
(00:08:45)
>> And you sound like Shakespeare because
(00:08:46)
it's fun if you make them sound cool.
(00:08:47)
>> That's right. Or or you you have to do
(00:08:49)
everything in a limerick.
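
The prompting trick from this exchange amounts to putting a persona and a style instruction in front of the question. A small sketch follows; `build_prompt` and the example question are hypothetical, and no particular model API is assumed.

```python
def build_prompt(question, persona=None, style=None):
    """Prefix a question with an optional persona and style instruction."""
    parts = []
    if persona:
        parts.append(f"You are {persona}.")
    if style:
        parts.append(f"Answer {style}.")
    parts.append(question)
    return " ".join(parts)

prompt = build_prompt(
    "Why is the sky blue?",  # made-up question for illustration
    persona="the best physicist in the world",
    style="in a limerick",
)
print(prompt)
# You are the best physicist in the world. Answer in a limerick. Why is the sky blue?
```

The resulting string would then be sent to whatever model is being used. The persona nudges the model toward predicting text written by someone knowledgeable.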
