Title: Jeff Dean Says AI’s Biggest Opportunity Is Still Largely Untouched
Duration: 00:25:05
(00:00:01)
[music]
(00:00:08)
Jeff Dean, thank you for joining us here
(00:00:10)
in sunny San Diego, right in front of
(00:00:12)
the NeurIPS conference center. You are
(00:00:14)
the chief scientist at Google, co-tech
(00:00:16)
lead, and you all recently made an
(00:00:20)
announcement about a version of the new
(00:00:22)
TPU chip.
(00:00:23)
>> Yeah.
(00:00:24)
>> Let's talk about it. The seventh
(00:00:25)
generation of TPUs.
(00:00:27)
>> Yeah.
(00:00:27)
>> what's special about it? Uh I mean like
(00:00:30)
every next generation of TPU it's better
(00:00:33)
than the previous one and you know it
(00:00:35)
has uh quite a lot of new capabilities
(00:00:37)
uh it has you know it's connected
(00:00:39)
together into these very large
(00:00:40)
configurations that we call pods um I
(00:00:43)
think it's 9216 chips or something like
(00:00:46)
that per pod um and it has you know much
(00:00:49)
higher performance especially for lower
(00:00:51)
precision floating-point formats like FP4
(00:00:53)
um so that's going to be really useful
(00:00:55)
for training large models uh for
(00:00:57)
inference for a lot of things like that.
(00:00:58)
So, we're pretty excited about it.
(00:01:00)
>> Nice. If zooming out, Google started
(00:01:02)
building TPUs for their own internal
(00:01:04)
needs. Google's preeminent AI
(00:01:07)
applications company, AI research
(00:01:09)
organization in the world, and the need
(00:01:12)
for control of the full vertically
(00:01:14)
integrated stack was the original
(00:01:16)
motivation as I understand it, if I've
(00:01:18)
read about it, and then eventually
(00:01:19)
externalizing access to those to be a
(00:01:21)
global competitor in the ecosystem of
(00:01:24)
accelerator people who build and sell
(00:01:26)
accelerators. And now there's a lot of
(00:01:28)
people excited about the opportunity for
(00:01:31)
there to be a massive market for
(00:01:34)
TPUs. How do you relate with your role
(00:01:37)
at Google to the objectives of Google's
(00:01:40)
internal use of TPUs versus the
(00:01:43)
marketplace that you're competing in to
(00:01:44)
to kind of compete and enable millions
(00:01:47)
and billions of people outside of Google
(00:01:48)
to get the advantages through reselling
(00:01:50)
TPUs in the competitive space? Yeah, I
(00:01:52)
mean uh the origin of the TPU program
(00:01:55)
was really for our own internal needs
(00:01:57)
initially focused on inference. So in
(00:01:59)
even in as far back as 2013, you know,
(00:02:02)
we saw that uh kind of deep learning
(00:02:04)
methods were going to be very successful
(00:02:06)
and every time we trained a slightly
(00:02:08)
larger model with more data, the results
(00:02:10)
got better in things like speech and
(00:02:12)
vision. And uh I started to do some back
(00:02:15)
of the envelope calculations of like
(00:02:17)
what would happen if we actually wanted
(00:02:18)
to serve this much better speech model
(00:02:20)
that's more compute-intensive to say 100
(00:02:23)
million users for a few minutes a day.
(00:02:25)
And [snorts] the compute requirements
(00:02:27)
got quite scary. Uh we would actually
(00:02:28)
need to double the number of computers
(00:02:30)
Google had overall in order to just roll
(00:02:33)
out this improved speech model. Wow.
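[Editor's note: the back-of-the-envelope estimate described above can be sketched roughly as follows. Only the "100 million users for a few minutes a day" framing comes from the interview; the per-second model cost and per-machine throughput are hypothetical placeholders chosen for illustration, and the function name is ours.]

```python
# Back-of-envelope sketch. Constants marked "hypothetical" are placeholders,
# not figures from the interview or from Google.

def machines_needed(users, seconds_per_user_per_day, ops_per_audio_second,
                    ops_per_machine_per_second, utilization=1.0):
    """Average number of machines needed to serve the daily speech load."""
    seconds_per_day = 24 * 60 * 60
    # Total arithmetic work generated by all users in one day.
    total_ops_per_day = users * seconds_per_user_per_day * ops_per_audio_second
    # Work one machine can complete in one day at the given utilization.
    ops_per_machine_per_day = (
        ops_per_machine_per_second * utilization * seconds_per_day
    )
    return total_ops_per_day / ops_per_machine_per_day


estimate = machines_needed(
    users=100_000_000,                # "say 100 million users" (from the interview)
    seconds_per_user_per_day=180,     # "a few minutes a day", assumed to be 3 minutes
    ops_per_audio_second=1e12,        # hypothetical model cost per second of audio
    ops_per_machine_per_second=1e11,  # hypothetical sustained CPU throughput
)
print(f"average machines needed: {estimate:,.0f}")
```

[The point of the exercise is the shape of the arithmetic rather than the specific result: the load scales linearly with users, minutes, and model cost, so a more compute-intensive model quickly translates into a fleet-sized hardware requirement.]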
(00:02:35)
>> Um if we wanted to do it on CPUs. And so
(00:02:38)
that was really the the genesis of hey
(00:02:41)
if we build specialized hardware that is
(00:02:43)
tailored for these kinds of ML uh
(00:02:46)
computations you know essentially dense
(00:02:48)
low precision linear algebra um we could
(00:02:51)
actually be way more efficient and
(00:02:54)
that was borne out the first TPU ended
(00:02:57)
up being 30 to 70 times more energy
(00:03:00)
efficient than contemporary CPUs or GPUs
(00:03:03)
and 15 to 30 times faster
(00:03:05)
>> and that was 2015 you said
(00:03:07)
>> yeah So we started uh the thought
(00:03:09)
experiment was 2013. The chips that
(00:03:11)
landed in our data center in 2015 and we
(00:03:14)
wrote a paper about
(00:03:14)
>> pre-transformer architecture
(00:03:15)
>> pre-transformer. Yeah. So uh we actually
(00:03:18)
were focused on speech recognition and
(00:03:20)
kind of vision convolutional models at
(00:03:23)
the time. We squeezed in a little bit of
(00:03:25)
design change at the last minute into
(00:03:27)
the TPUv1 to make it support LSTMs as
(00:03:30)
well, uh, which were kind of in vogue
(00:03:32)
at the time for language modeling and
(00:03:34)
that also enabled us to support uh
(00:03:36)
language translation tasks and then
(00:03:38)
subsequent versions of TPUs have focused
(00:03:40)
much more on much larger scale systems
(00:03:42)
that are not just a single PCIe card but
(00:03:44)
are you know a whole machine learning
(00:03:46)
supercomputer including the latest
(00:03:48)
Ironwood one
(00:03:50)
>> um and you know every generation has
(00:03:52)
been a big improvement in both energy
(00:03:54)
efficiency performance per dollar all
(00:03:56)
these things that we that we care about
(00:03:59)
um and enable us to scale much larger
(00:04:02)
training jobs much you know more serving
(00:04:05)
of of requests to lots of users
(00:04:08)
>> and of course the transformer
(00:04:09)
architecture itself born at Google
(00:04:12)
pretty similar timeline but with the TPU
(00:04:16)
invented before that and then transformer
(00:04:19)
architecture happening do you think
(00:04:20)
there was serendipity in terms of co- uh
(00:04:23)
design between the applications of the
(00:04:25)
transformer architecture as they've grown
(00:04:27)
up to change the world as we know it now
(00:04:29)
and Google's access to this vertically
(00:04:31)
integrated hardware stack.
(00:04:33)
>> Yeah, I mean every every generation of
(00:04:34)
TPU we really try to take advantage of
(00:04:37)
the co-design opportunities we have
(00:04:39)
with, you know, having a lot of
(00:04:41)
researchers thinking about where are,
(00:04:43)
you know, ML computations we're going to
(00:04:45)
want to run, you know, two and a half to
(00:04:48)
six years from now going, which is the
(00:04:50)
the exercise you have as a hardware
(00:04:52)
designer is like trying to predict a
(00:04:55)
very fast-moving field. It's not a very
(00:04:56)
easy thing, but having a lot of people
(00:04:59)
kind of seeing where the field is going
(00:05:01)
or you know this kind of thing might be
(00:05:03)
interesting. We're not quite sure yet,
(00:05:04)
but we could put in this kind of
(00:05:07)
hardware feature or this particular kind
(00:05:09)
of capability and if we did that and
(00:05:12)
this turned out to be important, then we
(00:05:13)
could have the hardware support there
(00:05:15)
ready when that thing, you know,
(00:05:17)
hopefully bears out, you know, that it
(00:05:19)
is an important thing. And if it doesn't
(00:05:21)
pay off, then sometimes you've just
(00:05:23)
devoted maybe a small area piece of the
(00:05:25)
chip area to this thing that turned out
(00:05:27)
to be less important than you thought.
(00:05:29)
But you really do want to be prepared
(00:05:31)
for if this thing matters a lot, your
(00:05:34)
hardware can support it.
(00:05:35)
>> Yeah.
(00:05:36)
>> So it's an interesting forecasting
(00:05:38)
exercise is forecasting the whole ML
(00:05:41)
field and trying to guess what we want.
(00:05:43)
Well, if we could let one person do it,
(00:05:45)
uh, Chuck Norris of Computer Sciences
(00:05:47)
would get my vote and has has obviously
(00:05:50)
enough votes that that you are doing it
(00:05:52)
at Google. And with your track record at
(00:05:54)
Google, there's a legacy of inventing
(00:05:57)
things for Google's internal needs.
(00:05:59)
Google being the world's best systems
(00:06:02)
building company for the applications
(00:06:04)
Google's built, which have now become
(00:06:05)
many-headed, MapReduce, Google File
(00:06:07)
system, something you did inside, you
(00:06:09)
co-invented or invented inside of
(00:06:10)
Google. And then eventually you have
(00:06:13)
been able to witness that what Google
(00:06:16)
built and demonstrates to the world as
(00:06:18)
value and then publishes with the TPU
(00:06:20)
architecture obviously the transformer
(00:06:23)
the ideas in the transformer are paper
(00:06:24)
themselves but now do you think that
(00:06:26)
there's a tipping point with Ironwood
(00:06:28)
for the rest of the world to sort of
(00:06:30)
have access to the advantages that
(00:06:32)
Google has had and I would imagine if I
(00:06:33)
put myself in your shoes this experience
(00:06:35)
where it's like that was awesome and we
(00:06:36)
did it at Google and we paved the way
(00:06:37)
and now like a holy moly the rest of the
(00:06:40)
world is also getting all the benefits
(00:06:42)
researchers think about impact and that
(00:06:44)
feels like the moment we live for to be
(00:06:47)
able to have it and if you're if you
(00:06:48)
feel like you're at the tipping point on
(00:06:50)
the TPU moment. Yeah, I mean I think
(00:06:52)
obviously we've been using TPUs now for
(00:06:54)
more than a decade or about a decade and
(00:06:57)
been really happy with them and the
(00:06:59)
co-designed properties really make them
(00:07:01)
sort of useful for the kinds of machine
(00:07:03)
learning computations we want to run and
(00:07:05)
we've also been renting them externally
(00:07:07)
through our cloud TPU program for a
(00:07:08)
number of years and so many many
(00:07:11)
customers are using them for all kinds
(00:07:13)
of things. Uh we've built a bunch of
(00:07:15)
software layers on top of TPUs that make
(00:07:18)
them sort of quite convenient and easy
(00:07:20)
to use. So you have I mean the most
(00:07:22)
well-worn path for TPUs is JAX on top
(00:07:26)
of Pathways which is an internal system
(00:07:28)
we've built that uh we're sort of
(00:07:30)
working to see if cloud customers would
(00:07:33)
want uh access to
(00:07:34)
>> on top of XLA which is a compiler ML
(00:07:38)
compiler with a TPU backend and so what
(00:07:40)
this um tends to mean at least for
(00:07:43)
Pathways, you know, all of our Gemini
(00:07:46)
development and research and training
(00:07:47)
large scale training jobs run on top of
(00:07:50)
that stack and Pathways is this nice
(00:07:52)
system that we we built u I guess
(00:07:55)
starting about seven years ago that
(00:07:58)
gives you the illusion of a single
(00:08:01)
system image across you know thousands
(00:08:04)
or tens of thousands of chips and so you
(00:08:06)
can have like a single Python process
(00:08:08)
running your JAX code and instead of it
(00:08:11)
showing up as four devices where you're
(00:08:13)
running on a single TPU node it shows up
(00:08:16)
as your JAX process has access to 20,000
(00:08:19)
devices
(00:08:20)
And you it just sort of naturally works
(00:08:22)
and figures out underneath the covers
(00:08:24)
exactly what transfer mechanisms to use
(00:08:27)
and which you know which network to use.
(00:08:29)
It should use the within a pod a TPU pod
(00:08:32)
it should use the high-speed
(00:08:33)
interconnect and across pod boundaries
(00:08:34)
it'll use the data center network across
(00:08:37)
metropolitan areas it'll use
(00:08:38)
long-distance links and so on. Um, so we
(00:08:41)
actually run, you know, very large scale
(00:08:43)
training jobs where we have a single
(00:08:44)
Python process driving multiple TPU pods
(00:08:48)
in multiple cities.
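[Editor's note: the tiered choice of transport described above (pod interconnect within a pod, data center network across pods, long-distance links across metropolitan areas) can be illustrated with a toy sketch. This is not the Pathways API; the `Device` type, its fields, and `pick_link` are hypothetical names invented for this illustration.]

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    """Hypothetical description of where a chip lives."""
    metro: str   # metropolitan area hosting the data center
    pod: str     # TPU pod identifier within that metro
    index: int   # chip index within the pod


def pick_link(src: Device, dst: Device) -> str:
    """Choose the narrowest network tier that connects two devices."""
    if src.metro != dst.metro:
        return "long-distance link"      # across metropolitan areas
    if src.pod != dst.pod:
        return "data center network"     # across pod boundaries, same metro
    return "pod interconnect"            # within a single TPU pod


a = Device(metro="metro-1", pod="pod-A", index=0)
b = Device(metro="metro-1", pod="pod-A", index=17)
c = Device(metro="metro-1", pod="pod-B", index=3)
d = Device(metro="metro-2", pod="pod-A", index=5)

for dst in (b, c, d):
    print(dst, "->", pick_link(a, dst))
```

[In the system described in the interview this decision happens underneath the covers, so the single Python process driving the job never names a transport explicitly.]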
(00:08:49)
>> Nice. Great. Well, maybe we can shift
(00:08:51)
topics. Sure. You've been talking a lot,
(00:08:53)
I think, lately about the state of
(00:08:55)
funding for academic research.
(00:08:58)
>> What's your message?
(00:08:59)
Yeah, I mean uh actually my colleagues
(00:09:02)
Urs Hölzle and Partha Ranganathan and I
(00:09:05)
along with Magda Balazinska at the
(00:09:07)
University of Washington recently
(00:09:09)
published uh one article in a whole
(00:09:12)
special issue of
(00:09:13)
Communications of the ACM that was devoted to
(00:09:16)
you know uh the impact of um you know
(00:09:20)
academic research and in in our section
(00:09:23)
we discussed all the academic research
(00:09:26)
that Google as a company was built on
(00:09:29)
you know all the things that we relied
(00:09:32)
on in terms of like TCP/IP and you know
(00:09:36)
uh you know advanced RISC processors uh
(00:09:39)
and you know the internet and uh the
(00:09:42)
Stanford Digital Library Project which
(00:09:44)
is what sort of uh provided the funding
(00:09:46)
for the original version of PageRank at
(00:09:48)
Stanford.
(00:09:49)
>> Oh yeah. And my colleague Dave Patterson
(00:09:51)
also had an article on that uh that
(00:09:54)
issue about all the amazing things that
(00:09:56)
have come out of his um and his Berkeley
(00:09:59)
colleagues many different five-year
(00:10:01)
labs. And so it's just really important
(00:10:03)
I feel to have you know a vibrant
(00:10:06)
academic uh research uh ecosystem uh in
(00:10:10)
the US and also in the world because
(00:10:12)
that often those early stage creative
(00:10:15)
ideas are the things that lead to major
(00:10:18)
major uh breakthroughs and innovations.
(00:10:20)
You know the whole of the deep learning
(00:10:22)
revolution actually built on academic
(00:10:24)
research from 30 40 years ago. you know,
(00:10:27)
the inventions of neural networks and
(00:10:29)
back propagation and things like that
(00:10:31)
are all, you know, central to what what
(00:10:33)
we're doing even today and have been
(00:10:36)
really important in the world. So, you
(00:10:38)
know, I I advocate that we should have a
(00:10:41)
vibrant uh academic funding model for
(00:10:44)
academic research because the returns
(00:10:46)
are quite large to society.
(00:10:47)
>> Yeah. Excellent. And you and I and Dave
(00:10:50)
Patterson and Joelle Pineau are on the
(00:10:52)
board of Laude Institute which was born in
(00:10:55)
part out of a paper that you and Dave
(00:10:57)
and I and a bunch of seven other authors
(00:10:59)
published called Shaping AI's Impact on
(00:11:01)
Billions of Lives where we advocated for
(00:11:04)
the ways that AI research might impact
(00:11:06)
society in areas like civic discourse
(00:11:09)
and healthcare and science and job
(00:11:12)
reskilling and journalism and more
(00:11:14)
policy. And then we also advocated that
(00:11:18)
there in addition to things like 10xing
(00:11:21)
down on NSF style funding, we can
(00:11:25)
explore and uh prototype other types of
(00:11:28)
funding. So Laude Institute raises money
(00:11:31)
from successful technologists who donate
(00:11:35)
to Laude Institute, which is a
(00:11:37)
nonprofit 501(c)(3) which then in turn is
(00:11:39)
running a moonshot grant program
(00:11:41)
specifically dedicated to funding
(00:11:43)
research labs, 3-to-5-year research labs
(00:11:46)
with 3 to 5 PIs 30 to 50 PhD students
(00:11:49)
targeting uh AI's impact on society in
(00:11:52)
those areas I just said scientific
(00:11:53)
progress healthcare job reskilling and
(00:11:56)
civic discourse and you've been an
(00:11:58)
advocate for these alternative
(00:12:00)
funding models as well in addition to
(00:12:02)
the traditional ones.
(00:12:03)
>> It was a lot of fun working on that
(00:12:05)
paper with you and and Dave and uh the
(00:12:07)
many other co-authors we had. You know,
(00:12:09)
I think the um the the thing I liked
(00:12:12)
about that paper is we looked at a bunch
(00:12:14)
of different areas where AI would have
(00:12:15)
an impact and some of them you know if
(00:12:18)
we get it right will be amazingly
(00:12:20)
positive impact and other areas you know
(00:12:23)
is a little less clear. there might be
(00:12:24)
some uh negative consequences of AI and
(00:12:27)
what can we do overall across all these
(00:12:30)
different areas to maximize the
(00:12:32)
potential upside of AI both from a
(00:12:36)
technical computer science research ML
(00:12:39)
perspective but also in conjunction
(00:12:42)
with policy makers and with you know
(00:12:45)
people in those fields like health or
(00:12:47)
education or scientists and then also
(00:12:49)
looked at the way in which we could all
(00:12:51)
work together to sort of maximize those
(00:12:53)
benefits and and minimize the downside
(00:12:55)
>> and specifically with research efforts
(00:12:57)
that are in the 3 to 5 year time horizon
(00:12:59)
that fit into a lab which is in contrast
(00:13:01)
to a lot of the hype we hear in AI right
(00:13:04)
now that like this like pursuing AGI or
(00:13:06)
super intelligence contrasted to trying
(00:13:09)
to help with medical you know like
(00:13:11)
success with frontline healthcare can
(00:13:12)
you mitigate the the drudgery that
(00:13:14)
typical doctors feel or eliminate
(00:13:17)
obstacles that radiologists might have
(00:13:19)
to actually using the technology that
(00:13:21)
already exists so I think it made made
(00:13:23)
it feel very much much more real and
(00:13:25)
specific and achievable.
(00:13:26)
>> I really like the 3 to 5 year time
(00:13:29)
horizon kind of thing with a ambitious
(00:13:31)
sort of set of people around a
(00:13:33)
particular kind of thing they're trying
(00:13:34)
to achieve because I feel like um often
(00:13:37)
that gets lots of different people
(00:13:39)
working together with a a mix of skills
(00:13:41)
in order to sort of really push forward
(00:13:43)
something. uh and it's not so distant
(00:13:46)
that it won't have impact, but it's not
(00:13:49)
so short a time period that you can't
(00:13:51)
conceive of doing something ambitious,
(00:13:53)
right? Even in my own career, I've
(00:13:55)
tended to think of like when I start on
(00:13:57)
a new project, what could we do in 3 to
(00:13:59)
5 years? And I think that's a a
(00:14:01)
delightful time range to to consider.
(00:14:04)
>> Nice. Yeah. And I'm wondering if you
(00:14:05)
could share maybe some of your
(00:14:06)
favorites. One thing I found while
(00:14:08)
working with you on that paper was
(00:14:09)
always delightful is just how well
(00:14:11)
connected you are to seems like dozens
(00:14:13)
and dozens of bleeding edge projects by
(00:14:15)
some of the most innovative thinkers and
(00:14:17)
researchers and builders in the world.
(00:14:19)
You both angel invest in them and you
(00:14:22)
you know are generous in donating your
(00:14:23)
time and energy to advise ambitious
(00:14:26)
impactful research projects that want to
(00:14:28)
go make a difference from climate to
(00:14:30)
science and civic discourse and
(00:14:32)
healthcare. I think healthcare is one of
(00:14:33)
your passions on the program committee
(00:14:34)
that we're buil we we built for the
(00:14:36)
moonshot grant program which we now have
(00:14:38)
all of our applications including
(00:14:39)
Turing Award winners and Nobel
(00:14:40)
laureates and coverage from the top
(00:14:42)
universities. So everything's working
(00:14:44)
according to plan so far for funding
(00:14:46)
some research that actually moves the
(00:14:48)
needle on these areas of society.
(00:14:49)
Curious with your background and with
(00:14:51)
exposure to so many active projects if
(00:14:53)
you could just share some of your like
(00:14:54)
one or two of your favorites. Yeah, I
(00:14:55)
mean I think I am quite passionate about
(00:14:57)
the application of AI to health in
(00:14:59)
various ways and I think the the
(00:15:01)
moonshot if you like would be how can we
(00:15:04)
as society use every past decision
(00:15:07)
that's been made in health to inform
(00:15:09)
every future decision, right? And that's
(00:15:11)
a super hard goal because there's all
(00:15:14)
kinds of uh impediments to doing that.
(00:15:17)
There's like very real privacy concerns.
(00:15:19)
There's complicated regulatory
(00:15:21)
requirements that differ for every
(00:15:24)
jurisdiction. But I think if we kind of
(00:15:26)
aspirationally try to say what could we
(00:15:29)
do so that we can learn from every past
(00:15:31)
decision that's been made in a way that
(00:15:34)
helps us have every clinician and every
(00:15:38)
person themselves be informed and make
(00:15:40)
better decisions in the future. That
(00:15:43)
would be like a awesome amazing goal.
(00:15:47)
And I think you know a three to five
(00:15:49)
year moonshot around that might be able
(00:15:51)
to make some progress to that. Probably
(00:15:53)
can't get all the way there but it would
(00:15:54)
be pretty amazing even if it made made
(00:15:56)
it partway to that. Is your sense that
(00:15:58)
with the current capabilities of AI
(00:16:00)
systems, the challenge for that would be
(00:16:03)
more in the fitting the the adapting the
(00:16:08)
existing health medical health records,
(00:16:10)
legal considerations and what the
(00:16:12)
lawyers for insurance providers and the
(00:16:15)
comp the hospitals themselves. That
(00:16:18)
might make it all sound sounds very hard
(00:16:20)
like more of an implementation challenge
(00:16:21)
than the capabilities. or do you think
(00:16:22)
that the the capabilities have a ways to
(00:16:25)
go before we would get the benefits?
(00:16:28)
>> Yeah, I mean I think there's a bunch of
(00:16:30)
interesting technical researchy
(00:16:31)
questions in there, but there are a
(00:16:32)
bunch of kind of grungy how would you
(00:16:35)
get the data in the right form to be
(00:16:38)
able to learn from it because it's in
(00:16:40)
every different healthcare system. It's
(00:16:42)
in slightly different forms and so on.
(00:16:44)
You probably have to use things like
(00:16:47)
privacy preserving machine learning or
(00:16:49)
federated learning or things like that.
(00:16:51)
So, how would you make that work on a
(00:16:52)
technical perspective? Um, because
(00:16:54)
you're not going to be able to move
(00:16:57)
healthcare data from where it sits.
(00:16:59)
Instead, you're going to need to be able
(00:17:00)
to learn on the data in a privacy
(00:17:03)
preserving way in a whole bunch of
(00:17:05)
different, you know, environments. So,
(00:17:08)
there are real technical challenges, but
(00:17:09)
there's also, as you, as you say, legal
(00:17:11)
and regulatory kinds of challenges as
(00:17:13)
well. But, you know, I think that's part
(00:17:16)
of why you want to have a whole group of
(00:17:20)
people thinking about these issues with
(00:17:21)
different kinds of expertise, right?
(00:17:23)
Like you need some people with machine
(00:17:25)
learning expertise and, you know,
(00:17:26)
computer systems building expertise as
(00:17:28)
well as legal and policy and regulatory
(00:17:31)
expertise.
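[Editor's note: a minimal sketch of the federated learning idea mentioned above, where the model is trained where the data sits and only model parameters travel. The one-parameter model, the synthetic "sites", and the function names are assumptions made for illustration; this is plain federated averaging on toy data, not a description of any system discussed in the interview.]

```python
def local_update(w, data, lr=0.1, steps=20):
    """Run a few gradient steps for y ≈ w * x using one site's private data."""
    for _ in range(steps):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_round(w_global, sites):
    """Average locally trained weights, weighted by each site's sample count."""
    total = sum(len(data) for data in sites)
    return sum(local_update(w_global, data) * len(data) for data in sites) / total


# Synthetic stand-ins for records held at two separate institutions.
# Both follow y = 2 * x, so the shared model should approach w = 2.
sites = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(0.5, 1.0), (1.5, 3.0), (1.0, 2.0)],
]

w = 0.0
for _ in range(10):
    w = federated_round(w, sites)
print(f"learned weight: {w:.4f}")
```

[Only the scalar weight crosses site boundaries here; the raw (x, y) records are read solely inside `local_update`. Averaging weights does not by itself give a formal privacy guarantee, which is part of why the technical and regulatory questions raised above remain open.]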
(00:17:32)
>> Yeah, makes a lot of sense. Any other
(00:17:34)
projects that come to mind as a as a
(00:17:36)
favorite? You know, I'm kind of enamored
(00:17:38)
these days about how can we make our
(00:17:41)
computing systems even more efficient
(00:17:43)
than the latest cutting edge TPUs
(00:17:46)
or GPUs. I feel like there's room there
(00:17:48)
for interesting and innovative
(00:17:50)
approaches for you know much lower cost
(00:17:53)
uh say inference which seems like it's
(00:17:56)
going to be a a major thing in the world
(00:17:59)
more than it already is. going back to
(00:18:01)
even to the original 2013 napkin sketch
(00:18:04)
for why TPUs should be born in the first
(00:18:06)
place.
(00:18:06)
>> Yeah. I mean, if you redo that napkin
(00:18:08)
sketch now, you're going to realize that
(00:18:11)
we want, you know, first much lower
(00:18:14)
latency systems than we have today. Uh,
(00:18:17)
as well as much more throughput and
(00:18:19)
performance per watt is going to be a
(00:18:21)
really important thing. So what can we
(00:18:22)
do that would make way lower uh you know
(00:18:26)
energy systems that still provide the
(00:18:28)
the same quality and performance. Mhm.
(00:18:30)
How do you see the relationship between
(00:18:33)
all of the like the massive amount of
(00:18:34)
research happening inside the Gemini
(00:18:36)
team, inside DeepMind more at large and
(00:18:40)
in the now zooming out one more layer to
(00:18:42)
the AI ecosystem beyond Google the
(00:18:44)
relationship for the academic research
(00:18:47)
and research happening beyond Google's
(00:18:49)
bounds and what happens inside Google.
(00:18:51)
Traditionally things like the
(00:18:52)
transformer paper, like MapReduce,
(00:18:54)
you've had these channels of exporting
(00:18:56)
innovation outside of Google. I imagine
(00:18:58)
you also have you imported innovation
(00:19:00)
and built on the shoulders of the giants
(00:19:02)
outside and that's why like you gave
(00:19:03)
examples already. Um have you do you see
(00:19:05)
that evolving these days as Google with
(00:19:08)
such a massive investment and such a
(00:19:10)
leadership position in Gemini and in the
(00:19:13)
hardware kind of up and down the entire
(00:19:16)
stack. has it evolved and does it need
(00:19:17)
to continue to evolve especially as we
(00:19:20)
as we continue to face the we're trying
(00:19:22)
to innovate for funding models for the
(00:19:24)
others but it it's not looking good
(00:19:26)
[laughter] in my opinion and I can't
(00:19:27)
speak for you but uh curious your
(00:19:30)
thoughts on that that dynamic that
(00:19:31)
relationship at the bound of Google and
(00:19:33)
innovation happening in traditional
(00:19:35)
mechanisms
(00:19:36)
>> besides I mean I I think there's
(00:19:38)
obviously continual evolution about uh
(00:19:41)
you know publishing models and so on or
(00:19:43)
publishing uh you know characteristics.
(00:19:46)
So I think in this current
(00:19:48)
competitive dynamic we tend to not
(00:19:50)
publish the secret sauce inside our
(00:19:53)
architecture of our Gemini model say but
(00:19:55)
we do publish a lot of stuff in the sort
(00:19:58)
of earlier stage research uh aspects of
(00:20:01)
you know here are interesting new kinds
(00:20:03)
of model architectures that we haven't
(00:20:05)
proven out but we've experimented with
(00:20:06)
at small scale to publish them so that
(00:20:09)
the rest of the ecosystem can you know
(00:20:11)
pick up those ideas and explore them as
(00:20:13)
well or build on them. And we also kind
(00:20:15)
of look at the broader publishing
(00:20:17)
happening in the rest of the the
(00:20:19)
community and sort of look at uh you
(00:20:22)
know how could we adapt some of those to
(00:20:24)
some of the problems we're seeing. Um
(00:20:26)
and I also don't think publishing has to
(00:20:27)
be a we publish it or we don't kind of
(00:20:30)
thing. There's really a continuum there
(00:20:32)
about when do we publish and what do we
(00:20:34)
publish. So I'll give you an example in
(00:20:36)
the computational photography work that
(00:20:38)
Google research has been doing for many
(00:20:39)
many years. We have awesome researchers
(00:20:43)
in that field. Uh they often well almost
(00:20:46)
annually come up with a really cool new
(00:20:49)
thing that can go into the Pixel camera
(00:20:51)
software pipeline. So things like
(00:20:54)
Night Sight or astrophotography or Magic
(00:20:57)
Eraser where you can erase like that
(00:20:59)
person who wandered in front of your
(00:21:00)
photo that you didn't want in the photo
(00:21:02)
>> in the first place. Um, and so what we
(00:21:04)
tend to do there is we put it out into
(00:21:08)
the next Pixel N+1 phone
(00:21:11)
that's coming out and then we sort of
(00:21:14)
wait a little while and then we submit a
(00:21:16)
SIGGRAPH paper about the innovations that
(00:21:18)
went into that feature. So it's sort of
(00:21:20)
a little bit of a delay. We take
(00:21:22)
advantage of it in our products first
(00:21:24)
and then we sort of let the rest of the
(00:21:25)
community know about, you know, what is
(00:21:27)
happening underneath the covers um and
(00:21:29)
they can build on it. So I think that's
(00:21:31)
a pretty nice thing and there's this
(00:21:32)
nice continuum of you know not just the
(00:21:36)
end points being choices. Can you
(00:21:38)
think of any off the top of your head
(00:21:39)
examples of papers or ideas in that
(00:21:43)
category you just mentioned of kind of
(00:21:45)
the earlier more experimental stuff
(00:21:46)
where you are publishing is either here
(00:21:48)
at NeurIPS or happened recently that
(00:21:49)
you're finding exciting and have been
(00:21:51)
engaging with the the non like outside
(00:21:54)
Google and I'd love to talk a little bit
(00:21:55)
more about how it's been how challenging
(00:21:57)
it has it been or has it been to
(00:21:58)
organize inside Google because it's it's
(00:22:00)
a it's an extremely impressive large
(00:22:02)
organization and I can imagine internal
(00:22:05)
conferences even happening that would be
(00:22:07)
a little tiny version of NeurIPS or
(00:22:08)
something like that. But before you
(00:22:09)
answer that one, yeah, I'm curious if
(00:22:10)
you have any specific examples.
(00:22:12)
>> Yeah, I mean just one off the top of my
(00:22:13)
head. Uh there was a paper published by
(00:22:15)
some Google researchers
(00:22:17)
here
(00:22:18)
>> uh on kind of a hybrid between the
(00:22:20)
transformer and recurrent models that's
(00:22:22)
called Titan I think is the name. So
(00:22:25)
it's sort of looking at how can you have
(00:22:27)
much longer context by using a
(00:22:29)
recurrence relation uh but using chunks
(00:22:32)
of tokens rather than individual uh
(00:22:35)
small tokens and learning to kind of
(00:22:36)
compress the the sort of very porky
(00:22:39)
representation of every token into
(00:22:41)
something that's a little more compact
(00:22:42)
and then have a whole sequence of those
(00:22:44)
that you use recurrent steps on. So
(00:22:46)
that's just a good example of you know
(00:22:48)
that is not in our Gemini models. It it
(00:22:50)
could be in the future, but it does seem
(00:22:52)
like an interesting idea to explore.
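[Editor's note: the general shape of the idea described above, compressing chunks of tokens into compact summaries and then running recurrent steps over those summaries, can be sketched as below. This is not the architecture from the paper mentioned in the interview: mean pooling stands in for a learned compressor, the exponential-decay update stands in for a learned recurrence, and all names are hypothetical. The token list is assumed to be non-empty.]

```python
def compress_chunk(chunk):
    """Mean-pool a chunk of token vectors into one vector.

    Stand-in for a learned compressor; chosen only to keep the sketch simple.
    """
    dim = len(chunk[0])
    return [sum(tok[i] for tok in chunk) / len(chunk) for i in range(dim)]


def chunked_recurrence(tokens, chunk_size, decay=0.5):
    """Carry a running state across chunk summaries instead of single tokens."""
    state = [0.0] * len(tokens[0])
    steps = 0
    for start in range(0, len(tokens), chunk_size):
        summary = compress_chunk(tokens[start:start + chunk_size])
        # One recurrent step per chunk: blend the old state with the new summary.
        state = [decay * s + (1 - decay) * x for s, x in zip(state, summary)]
        steps += 1
    return state, steps


# Eight toy "token embeddings" of dimension 2.
tokens = [[float(i), 1.0] for i in range(8)]
state, steps = chunked_recurrence(tokens, chunk_size=4)
print(f"{len(tokens)} tokens handled in {steps} recurrent steps; state={state}")
```

[The number of recurrent steps grows with the number of chunks rather than the number of tokens, which is the property that makes this family of ideas attractive for much longer contexts.]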
(00:22:54)
Uh, one more thing on the uh internal
(00:22:58)
every every
(00:22:58)
>> Somebody told me a call about a
(00:22:59)
conference. They're like going to a
(00:23:00)
Google conference and I was like, "Oh,
(00:23:02)
that sounds so cool."
(00:23:02)
>> We have a Google research conference. It
(00:23:05)
has like 6,000 attendees every year. uh
(00:23:07)
and I know there's a sentiment if you
(00:23:09)
talk to the PhD students here that the
(00:23:11)
Google research conference might have
(00:23:13)
papers that feel a year ahead of the
(00:23:16)
papers you're seeing at NeurIPS just
(00:23:18)
because there is a gap between
(00:23:20)
what's happening in the open and what's
(00:23:21)
happening at Google. So, um I'm
(00:23:24)
wondering besides making a conference,
(00:23:26)
how have you found it to sort of be able
(00:23:28)
to build an organization that is so
(00:23:30)
innovative and is able to generate the
(00:23:33)
the frontier uh the state-of-the-art
(00:23:35)
progress that that we're all
(00:23:36)
>> Yeah. I mean, I think one of the one of
(00:23:38)
the reasons the internal research
(00:23:40)
conference might feel a little bit like
(00:23:41)
that is often, you know, for an external
(00:23:44)
thing, you have to be quite far along in
(00:23:47)
your research idea to get it accepted
(00:23:49)
and published. and the internal
(00:23:51)
conference, you know, there's a whole
(00:23:52)
range of of maturity of the work. And so
(00:23:55)
people are perfectly willing to have
(00:23:57)
lightning sessions of like cool early
(00:24:00)
stage results that aren't really fully
(00:24:01)
baked yet. And I you get like 10 of
(00:24:03)
those in an hour session. So I think
(00:24:05)
part of that is yes, it hasn't been
(00:24:07)
published externally, but also part of
(00:24:09)
it is is just trying to, you know,
(00:24:12)
circulate some of the ideas that are
(00:24:14)
being explored with your colleagues and
(00:24:16)
it has to be a little less fully polished.
(00:24:19)
>> Yeah. No, I'm I'm inspired by that. I
(00:24:20)
feel like NeurIPS is really impressive,
(00:24:23)
>> very large, and maybe there's room for
(00:24:26)
that architecture of a conference to be
(00:24:28)
exported as well in terms of innovation.
(00:24:30)
>> I mean, the workshop days here feel a
(00:24:32)
little bit more like that because it's
(00:24:33)
earlier stage work and so on, but it's
(00:24:36)
>> still a fairly traditional thing of like
(00:24:39)
a PDF of some paper-like artifact. And
(00:24:43)
here are often these things are just
(00:24:45)
talks with a few slides or not
(00:24:47)
necessarily a full paper that someone
(00:24:49)
had to write up.
(00:24:50)
>> Cool. Okay. Well, I think that's a wrap
(00:24:51)
for us. Thanks for taking the time.
(00:24:52)
Appreciate all your thoughts.
(00:24:53)
>> Thank you. Appreciate it.
(00:24:54)
>> Enjoy the rest of NeurIPS. And
(00:24:56)
>> it's beautiful here. It is.
