Title: The Future of Search: Inside Perplexity’s $20B Bet on AI | Aravind Srinivas, CEO of Perplexity
Duration: 00:30:40
(00:00:00)
An AI company did what no one thought
(00:00:02)
was possible. It made people stop using
(00:00:05)
Google and the person behind it,
(00:00:06)
Aravind Srinivas, the CEO of Perplexity AI.
(00:00:10)
>> We try to like prioritize being truth
(00:00:12)
seeking and correct here. Our product
(00:00:14)
needs to be fast otherwise search should
(00:00:16)
never be slow. It should feel like a
(00:00:18)
really premium product.
(00:00:19)
>> Now the entire world is paying attention
(00:00:21)
to Perplexity. Even Cristiano Ronaldo
(00:00:23)
used Perplexity to help him write his
(00:00:25)
award speech. In just a few short years,
(00:00:27)
Perplexity has gone from an idea to a
(00:00:29)
multibillion dollar company, redefining
(00:00:32)
how we find information online.
(00:00:34)
>> Perplexity is an answer machine. You can
(00:00:36)
ask whatever question you want. Why
(00:00:38)
should academics be the only ones who
(00:00:40)
are allowed to ask questions? The
(00:00:42)
smartest people have always been
(00:00:43)
curious. They don't have to be
(00:00:45)
academics, but most people don't have
(00:00:47)
the platform or even the tools to like
(00:00:50)
engage in asking questions. We want to
(00:00:52)
give that power that billionaires have
(00:00:54)
access to to everybody.
(00:00:56)
>> That curiosity mindset shaped everything
(00:00:58)
they're building next, including Comet.
(00:01:00)
>> The next set of things we're working on,
(00:01:02)
which is Comet is running on the
(00:01:04)
background, even as you sleep, and you
(00:01:05)
don't even have your MacBook or Windows
(00:01:07)
computer open.
(00:01:08)
>> Perplexity is not just changing how we
(00:01:10)
search, it's changing who gets to be
(00:01:12)
discovered. So, I sat down with Aravind
(00:01:15)
to ask, is this the start of a new kind
(00:01:17)
of internet?
(00:01:24)
How would you explain Comet to someone
(00:01:27)
who has never used it before? Because I
(00:01:29)
have trouble doing that because it's so
(00:01:31)
unique.
(00:01:32)
>> Of course, I I iterate on this multiple
(00:01:34)
times in terms of what is the best
(00:01:36)
one-liner. Uh, for Perplexity, it was very
(00:01:39)
obvious. It's, Perplexity is an answer
(00:01:41)
machine. That's it. Like, you know, you
(00:01:42)
can ask it whatever question you want. I
(00:01:44)
think the way I think about Comet is
(00:01:46)
it's a personal assistant. The first
(00:01:48)
time you can have a personal assistant
(00:01:50)
for yourself or the other way of
(00:01:52)
thinking about it is you know uh the
(00:01:54)
second brain and so second brain doesn't
(00:01:57)
mean a dump of data.
(00:01:59)
>> Yeah.
(00:01:59)
>> Uh I'm not talking about a memory store
(00:02:02)
like brain is not just meant to be a
(00:02:05)
dump store for memory. Honestly, a brain
(00:02:07)
that actually can think with you,
(00:02:09)
another brain that can think with you,
(00:02:11)
and you can delegate what your first
(00:02:14)
brain, the core brain finds boring and
(00:02:16)
mundane and like tiring,
(00:02:18)
>> delegate it to the second brain. So stuff
(00:02:21)
like booking reservations, uh moving
(00:02:23)
around meetings, scheduling a common
(00:02:26)
time for like four people to meet, um
(00:02:28)
sending reminders to people, uh getting
(00:02:31)
prepared for the day, you know, like
(00:02:33)
identifying like let's say you're going
(00:02:34)
to interview someone, uh, prepare like
(00:02:36)
looking reading all their past uh
(00:02:38)
episodes and podcasts and um trying to
(00:02:41)
ask questions that are new and
(00:02:42)
different. Like these are things that
(00:02:44)
you honestly need. It's not that you
(00:02:46)
don't enjoy the aspect of doing your
(00:02:49)
work, but the actual aspect of opening
(00:02:51)
all these tabs and like going through
(00:02:53)
all these transcripts and like manually
(00:02:55)
sorting out like what is different and
(00:02:56)
new and like documenting all that in a
(00:02:59)
new document, you know, and then
(00:03:01)
preparing a fresh set of questions like
(00:03:02)
that is boring.
(00:03:03)
>> Yes. Yeah.
(00:03:04)
>> The the part of the curiosity of like
(00:03:07)
asking new things, that is not boring.
(00:03:09)
>> So that's what your first brain is meant
(00:03:11)
to do. So our first brain can be truly
(00:03:13)
at our natural best selves if we can
(00:03:15)
just be curious and explore and interact
(00:03:18)
and and and meet people and like you
(00:03:20)
know uh strategize
(00:03:21)
>> while the second brain takes care of
(00:03:23)
like all the boring mundane workflows.
(00:03:25)
So that's how we think about Comet
(00:03:28)
giving that power to people and note
(00:03:30)
that this power is already available to
(00:03:33)
billionaires or like people who are very
(00:03:35)
well off
(00:03:36)
>> elite society because they have
(00:03:39)
assistants who do this for them. They
(00:03:41)
have employees who do this for them.
(00:03:42)
They have like they have a team of
(00:03:43)
people working with them doing this for
(00:03:45)
them.
(00:03:46)
>> But the normal person doesn't have
(00:03:47)
access to all this. They still have to
(00:03:49)
do it all themselves. They have to
(00:03:50)
schedule their own hospital
(00:03:52)
appointments. They have to find the best
(00:03:54)
doctors that are covered in their
(00:03:55)
insurance. They have to find like local
(00:03:57)
experiences when they plan a trip. Uh
(00:04:00)
they have to find a good flight deal or
(00:04:02)
hotel deal. Like like very basic things
(00:04:05)
that you're doing on a daily basis. You
(00:04:06)
don't even think about it and you spend
(00:04:08)
hours and hours on it. I still vividly
(00:04:10)
remember going to a haircut once here in
(00:04:12)
San Francisco and another old man walked
(00:04:15)
in and he said, "Hey, I just I'm
(00:04:18)
frustrated with my morning. I I spend
(00:04:20)
like 3 hours just looking for a new
(00:04:21)
washing machine to buy because my
(00:04:23)
existing one doesn't work."
(00:04:25)
>> Like like imagine these are the kind of
(00:04:26)
things that Yeah. 3 hours because it's a
(00:04:29)
big volume purchase for most people. So
(00:04:31)
I think we want to give that power that
(00:04:33)
billionaires have access to uh to
(00:04:36)
everybody and um honestly like you know
(00:04:39)
of course it's done in a capitalistic
(00:04:41)
way. We are we are a profit-minded
(00:04:43)
company but it's one of the most uh
(00:04:45)
equalizing things you can do in terms of
(00:04:47)
making something very special and
(00:04:49)
exclusive and elite and like giving it
(00:04:51)
in the hands of normal people so that
(00:04:53)
like everyone can be the best version of
(00:04:55)
themselves like like if if you had your
(00:04:56)
first brain to just be yourself and
(00:04:58)
engage and and you know go deep into
(00:05:00)
things that you care about and you're
(00:05:02)
interested in how could life be for you
(00:05:04)
like what are the kind of questions you
(00:05:05)
would ask and what are the kind of
(00:05:06)
journeys you would discover? Your
(00:05:08)
research background is rooted in open
(00:05:10)
science and open source. How much of
(00:05:13)
that plays into what you are building
(00:05:16)
today with especially around you know
(00:05:18)
open source and and researching
(00:05:20)
different things
(00:05:21)
>> more than the uh open source aspect or
(00:05:24)
open science aspect like I I go back to
(00:05:25)
my roots. The way I think about it is I
(00:05:28)
was an academic during my academia. I've
(00:05:30)
always thought like you know let me give
(00:05:32)
you a personal example. My dad um kind
(00:05:36)
of wanted to be an academic like like he
(00:05:38)
realized this very late in his career
(00:05:39)
like he uh engaged more in in accounting
(00:05:42)
and finance and like you know getting a
(00:05:44)
job and all that stuff that most middle
(00:05:47)
class men go through
(00:05:49)
>> and then by the time you you know you
(00:05:52)
truly discover what you wanted to do
(00:05:53)
it's pretty late
(00:05:54)
>> right um and I've always thought like
(00:05:57)
why should academics only be the be the
(00:05:59)
only ones who are like allowed to ask
(00:06:01)
questions and think about like you know
(00:06:04)
what what is potentially possible and
(00:06:06)
engage in like deep scientific research
(00:06:08)
on that like like why should it be
(00:06:10)
restricted to just universities. Um all
(00:06:13)
the smartest people have always been
(00:06:15)
curious. They don't have to be academics
(00:06:17)
but most people don't have the platform
(00:06:20)
to or or even the tools to like engage
(00:06:23)
in asking questions.
(00:06:24)
>> The moment they have a question um
(00:06:26)
they're either like shut down saying
(00:06:28)
like you know your job's not to ask that
(00:06:30)
your job is to go do this.
(00:06:31)
>> Yes. And even if they are allowed to ask
(00:06:33)
the questions, they probably don't have
(00:06:35)
the exposure to like the right set of
(00:06:37)
people who could answer them. And
(00:06:39)
they're definitely not allowed to ask
(00:06:41)
questions that don't have answers yet.
(00:06:43)
>> Yeah. And so at least having tools like
(00:06:45)
Perplexity that give you like accurate
(00:06:48)
answers to almost anything out there, at
(00:06:50)
least the questions for which answers
(00:06:52)
exist, which is plentiful already and
(00:06:56)
and making sure you can trust the answer
(00:06:59)
because of the sources it provides and
(00:07:01)
and so every accurate answer is the
(00:07:04)
foundation for the next question, right?
(00:07:06)
And so the way we believe that uh you
(00:07:09)
know like encourage the question work
(00:07:12)
>> uh and allow anyone to ask questions and
(00:07:14)
hopefully that turns the whole idea of
(00:07:16)
being an academic something that is no
(00:07:18)
longer a luxury kind of coming comes
(00:07:20)
back to the whole equalizing aspect. So
(00:07:23)
I I really enjoyed my time in Berkeley
(00:07:26)
as a PhD student. Um and I I thought
(00:07:29)
like why why can't normal people just
(00:07:31)
have that kind of experience where you
(00:07:33)
know they could also like engage in
(00:07:35)
asking questions and getting back
(00:07:36)
answers and if a paper is pretty hard to
(00:07:39)
understand um you don't need to have
(00:07:41)
access to other elite Berkeley or
(00:07:43)
Stanford PhDs or professors you can just
(00:07:46)
ask a tool like Perplexity to explain it
(00:07:48)
to you and it'll do that for you. I'm
(00:07:50)
really curious, you know, as you you're
(00:07:54)
you're an academic and and thinking
(00:07:56)
about how this
(00:07:58)
knowledge is now accessible to everyone,
(00:08:02)
right?
(00:08:03)
>> How do you think academia will change?
(00:08:05)
Yeah.
(00:08:06)
>> With this knowledge now being accessible
(00:08:08)
to everyone.
(00:08:08)
>> Yeah.
(00:08:09)
>> Um so I have a lot of um opinions on
(00:08:12)
this. I don't know how it's going to pan
(00:08:14)
out, but um have you ever seen movies
(00:08:17)
like how they portray
(00:08:19)
>> uh academic like like for example the
(00:08:20)
Oppenheimer movie or
(00:08:22)
>> uh the the the Stephen Hawking movie?
(00:08:24)
>> It's the job of the adviser is not
(00:08:26)
really to help you figure out answers
(00:08:30)
>> or or or um like kind of like come and
(00:08:33)
explain your doubts.
(00:08:34)
>> Yeah. you know, like like that that you
(00:08:36)
do your coursework for that and and you
(00:08:38)
know, that's part of the coursework, but
(00:08:40)
actually,
(00:08:41)
>> you know, you know that uh that scene in
(00:08:43)
the uh Stephen Hawking movie where he
(00:08:45)
knock he knocks on the door and like
(00:08:47)
time that's my thesis, right?
(00:08:49)
>> Yes. Yeah.
(00:08:51)
>> Time on time. That's your subject.
(00:08:55)
>> Why is that a big deal? Or or let me
(00:08:58)
give you another example from um the
(00:09:00)
days of Google like which I've really
(00:09:02)
studied a lot and uh Larry Page where
(00:09:06)
>> you know we got to go study the web like
(00:09:08)
that was the thesis
(00:09:10)
>> and and like there is no um fundamental
(00:09:13)
question there. It's just literally
(00:09:14)
being curious about something.
(00:09:16)
>> Yeah, it's true.
(00:09:16)
>> Right. Like you're curious about what is
(00:09:18)
even time.
(00:09:19)
>> Yeah.
(00:09:19)
>> Or or Einstein was curious about space
(00:09:21)
and time, the relativity of that.
(00:09:24)
>> Yes. or or Larry was curious about the
(00:09:26)
web and that all led to great
(00:09:28)
>> discoveries on top. Yes.
(00:09:30)
>> But it's the foundation is like just
(00:09:31)
being curious. So I I feel like an
(00:09:34)
academic adviser in in an ideal world
(00:09:36)
should be the one who really encourages
(00:09:39)
you to be curious about topics that most
(00:09:42)
people are like you know uh ridiculed
(00:09:44)
for being curious about like why oh what
(00:09:45)
is their question about
(00:09:47)
>> Newton's physics like you know Newton
(00:09:49)
just gave it away for you in this book
(00:09:52)
like just take it for granted and build
(00:09:53)
on top.
(00:09:54)
>> Yes. No, no. I'm I'm going to question
(00:09:56)
the foundations of it like and and say
(00:09:59)
every great discovery has come from
(00:10:01)
people uh being curious and and
(00:10:03)
relentless about questioning the status
(00:10:05)
quo and and and not taking it for
(00:10:08)
granted and seeing if there's
(00:10:09)
fundamentally a better way to do things.
(00:10:11)
And um and so uh that's the kind of
(00:10:14)
atmosphere that academic universities
(00:10:15)
should truly encourage. But I I I got to
(00:10:18)
say at least during my times at
(00:10:20)
Berkeley, it was not exactly like that.
(00:10:23)
uh people were definitely like after you
(00:10:25)
know publishing a certain number of
(00:10:26)
papers you know building a profile for
(00:10:29)
yourself so that you go get a job in
(00:10:32)
another university as a professor or
(00:10:33)
postdoc or you get hired at Google or
(00:10:35)
OpenAI or whatever and I you know I
(00:10:38)
definitely had to do some of those
(00:10:39)
things too, uh, and I feel like that should
(00:10:42)
probably die if you want to just do a
(00:10:45)
job at one of these labs I think you can
(00:10:47)
get it even without a PhD now
(00:10:49)
>> uh as long as you're good at writing
(00:10:51)
code and training models.
(00:10:53)
>> But if you're truly interested in
(00:10:54)
questioning, okay, why are we even
(00:10:56)
training transformers? Like, let me go
(00:10:57)
and look at the foundations of it.
(00:11:00)
>> Um, I hope there's a new academic
(00:11:03)
environment that stimulates that kind of
(00:11:05)
thinking.
(00:11:06)
>> And with tools like ours, like you don't
(00:11:08)
need someone else to answer your
(00:11:10)
questions. So, you actually need someone
(00:11:12)
to guide you to asking the right
(00:11:14)
questions and really teaching you how to
(00:11:16)
think. I think it will based on what you
(00:11:18)
said too with the examples and and now
(00:11:20)
with tools like like what you're putting
(00:11:22)
out there being able to to start asking
(00:11:25)
those questions, get the knowledge and
(00:11:27)
then continuing to build upon that.
(00:11:28)
>> Yeah, exactly.
(00:11:29)
>> Instead of framing AI as say replacing
(00:11:32)
human work, I can tell you're a very big
(00:11:34)
advocate, not only from this
(00:11:36)
conversation thus far, but even from
(00:11:38)
your past conversations you've had and
(00:11:40)
interviews with really making
(00:11:43)
information and knowledge more
(00:11:44)
accessible to others.
(00:11:46)
>> Yeah. What are some surprising ways that
(00:11:48)
you've actually seen now that people
(00:11:50)
have access to this information people
(00:11:52)
use Comet?
(00:11:53)
>> Three or four things I can point out. Uh
(00:11:55)
since these were publicly shared with
(00:11:57)
the user I'm I'm I'm sharing that. Um
(00:12:00)
one is like a user was frustrated
(00:12:03)
talking to the customer support of
(00:12:05)
FedEx.
(00:12:05)
>> Okay.
(00:12:06)
>> Uh he just um had Comet talk on his
(00:12:09)
behalf to to the customer support which
(00:12:11)
could have been a bot too. You don't
(00:12:13)
know.
(00:12:13)
>> Yeah. uh like like which had so Comet
(00:12:15)
got access to the tracking ID and all
(00:12:17)
that and and Comet is like filing the
(00:12:20)
complaint on his behalf and then going
(00:12:21)
back and forth with the customer support
(00:12:22)
agent on the other end of FedEx. That
(00:12:25)
that was very interesting. Uh another
(00:12:27)
user actually like like figured out a
(00:12:30)
way to uh run a marketing campaign
(00:12:33)
>> on uh Facebook ads platform for a new
(00:12:36)
product that uh they were going to put
(00:12:38)
out on you know for a small business
(00:12:40)
that they were running.
(00:12:41)
>> That was very interesting. Some people
(00:12:43)
like to unsubscribe from spam emails and
(00:12:45)
like Google doesn't quite know to detect
(00:12:48)
if something is spam or not. So you can
(00:12:50)
literally have Comet look at the uh the
(00:12:53)
the username and and and the domain name
(00:12:55)
and and the the the email and tell you
(00:12:59)
if it's potentially spam and if it is
(00:13:00)
then you can say messages similar to
(00:13:02)
this should be flagged as spam for me
(00:13:04)
and you unsubscribe me from these kind
(00:13:06)
of email lists and it'll just do it for
(00:13:08)
you. So this makes you believe like
(00:13:10)
there is a world where even if the
(00:13:12)
software that you're forced to use
(00:13:14)
because you're in the you know like like
(00:13:16)
these softwares are built by different
(00:13:17)
companies even if they're imperfect
(00:13:20)
because they're not designed for you.
(00:13:21)
>> Yeah.
(00:13:22)
>> You can make it work for you and that
(00:13:24)
extra work that the developer ideally
(00:13:26)
has to do but might not prioritize
(00:13:28)
because it's a it's a problem for like
(00:13:30)
n-of-one user it's no longer a problem
(00:13:32)
because you can just personalize the
(00:13:34)
software for you. So, Comet provides
(00:13:36)
that bridge between what the software
(00:13:38)
that works in a general way and the
(00:13:39)
software that works for you could be
(00:13:42)
>> and there are endless applications like
(00:13:44)
this. Like for example, um people are
(00:13:46)
using it to like you know just keep up
(00:13:48)
to date on like the stock price like you
(00:13:50)
can just say anytime like I don't want
(00:13:53)
to keep looking at the S&P all the time
(00:13:55)
but
(00:13:56)
>> anytime there's a crazy movement just
(00:13:59)
let me know.
(00:13:59)
>> Yeah. and and and and that way you're
(00:14:02)
like delegate. It's all the second brain
(00:14:03)
concept like whatever you you know
(00:14:05)
frustrates you or like wastes your
(00:14:07)
time and we want to hopefully do things
(00:14:09)
like you know when the tickets for this
(00:14:11)
concert open up
(00:14:12)
>> uh make sure like you book it for me you
(00:14:14)
have access to my wallet everything but
(00:14:16)
I don't want to be the one checking when
(00:14:17)
it opens up and be at the you know wake
(00:14:20)
up early morning just just for that all
(00:14:23)
these sort of things like we want to
(00:14:24)
like figure it out for the user
(00:14:26)
>> and and um for example you might have
(00:14:28)
booked a flight and like while you're
(00:14:30)
sleeping if the flight price actually
(00:14:31)
reduces
(00:14:33)
and Comet is running on the background
(00:14:34)
and it's like going and rebooking the
(00:14:36)
flight for you canceling your existing
(00:14:38)
booking and saving you like a thousand bucks
(00:14:41)
>> then it's worth it right
(00:14:42)
>> major
(00:14:43)
>> so yeah so these are the things that we
(00:14:44)
want Comet, so the last thing that I
(00:14:47)
said does not exist today
(00:14:49)
>> uh so if a user tries out Comet today
(00:14:51)
that's not going to work but um that's
(00:14:53)
the next set of things we're working on
(00:14:55)
which is Comet is running on the
(00:14:57)
background on the server uh even as you
(00:14:59)
sleep and you don't even have to have
(00:15:01)
your MacBook or Windows computer
(00:15:03)
like open with the browser open like
(00:15:04)
it's it should just be running on the
(00:15:06)
background and uh that's how uh
(00:15:09)
essentially the second brain becomes
(00:15:10)
like an OS that helps make your life
(00:15:13)
more efficient
(00:15:14)
>> so much more efficient because at the
(00:15:15)
end of the day you know first brain
(00:15:18)
second brain it all comes down to time
(00:15:21)
is the most valuable asset correct and
(00:15:23)
>> so we we think about it more from not
(00:15:26)
just like saving time and giving you
(00:15:28)
back time it's more like removing
(00:15:31)
removing time spent on activities that
(00:15:33)
you just don't enjoy.
(00:15:34)
>> Yeah. Yeah.
(00:15:35)
>> Right. Exactly.
(00:15:36)
>> Uh like we're okay if you spend more
(00:15:38)
time browsing um or got more time spent
(00:15:41)
with your uh family
(00:15:44)
>> um you know like a lot of people who
(00:15:46)
write code or like work in tech
(00:15:48)
companies. One of their biggest
(00:15:50)
questions they ask how is work life
(00:15:51)
balance and it's hard to have a work
(00:15:53)
life balance when you work in a fast
(00:15:55)
growing company. But why is that?
(00:15:57)
Because a lot of time is being spent on
(00:15:58)
doing things inefficiently. Mhm.
(00:16:00)
>> So we think about it less as
(00:16:02)
productivity
(00:16:03)
uh and and more as like making uh time
(00:16:07)
spent more pleasant and enjoyable
(00:16:10)
>> like like you know just do things you
(00:16:12)
enjoy
(00:16:13)
>> and you're naturally going to be more
(00:16:14)
creative in that aspect then you're
(00:16:16)
happier and can do things you enjoy and
(00:16:18)
that's where real innovation typically
(00:16:20)
comes out.
(00:16:20)
>> Exactly. Yeah. So removing the
(00:16:22)
unregrettable time spent in work is
(00:16:25)
great. I know we focus a lot on the
(00:16:28)
software side of AI. Lately, I've been
(00:16:30)
really curious though about the
(00:16:31)
hardware. Is that something that
(00:16:34)
Perplexity has to keep in mind is the
(00:16:37)
hardware side and if so, how to what
(00:16:39)
degree?
(00:16:40)
>> Fundamentally, the the part of hardware
(00:16:42)
that matters the most to us or or or for
(00:16:45)
other AI companies too is inference.
(00:16:47)
Mhm.
(00:16:47)
>> So all these models are
(00:16:50)
like like like most of the value of the
(00:16:52)
software uh that comes in in something
(00:16:55)
like Perplexity is coming through the
(00:16:57)
model here. Not not 100% of it.
(00:17:00)
>> Mhm.
(00:17:01)
>> And again like just because the model
(00:17:03)
matters a lot like doesn't mean the
(00:17:04)
other things don't matter but most of it
(00:17:06)
is coming through the model and so um
(00:17:09)
and those models are running on GPUs.
(00:17:11)
>> Uh and so that's you know any
(00:17:13)
innovations there will automatically
(00:17:15)
matter to us. Interestingly, like the
(00:17:17)
the last 50 years have all been about
(00:17:19)
like making computing more efficient at
(00:17:22)
at the chip layer. Of course, you know,
(00:17:24)
the Moore's law, but then GPUs don't
(00:17:27)
necessarily follow the Moore's law, but they
(00:17:29)
they've had this sort of amazing uh next
(00:17:31)
generation chips coming all the time
(00:17:34)
that make these chips even more suited
(00:17:36)
for the transformer architecture. And
(00:17:38)
why is that? Because the transformer
(00:17:40)
architecture is essentially just a lot
(00:17:42)
of matrix multiplications. Mhm.
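As a rough illustration of the "just a lot of matrix multiplications" point, here is a minimal pure-Python sketch of single-head scaled dot-product attention, the core transformer operation. The function names (`matmul`, `attention`) and the toy 2x2 inputs are assumptions made for this example, not anything described in the interview. Real inference stacks run the same arithmetic as batched GPU kernels.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    inner = len(b)
    assert all(len(row) == inner for row in a), "shape mismatch"
    cols = len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]

def transpose(m):
    """Swap rows and columns."""
    return [list(col) for col in zip(*m)]

def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(q, k, v):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = len(q[0])
    scores = matmul(q, transpose(k))            # first matrix multiplication
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return matmul(weights, v)                   # second matrix multiplication

# Toy inputs (hypothetical): two tokens, two dimensions each.
q = [[1.0, 0.0], [0.0, 1.0]]
k = [[1.0, 0.0], [0.0, 1.0]]
v = [[1.0, 2.0], [3.0, 4.0]]
out = attention(q, k, v)
```

Every output element is an independent sum of products, which is why this workload maps well onto highly parallel hardware.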
(00:17:44)
>> um it's it's it's heavily optimized for
(00:17:46)
parallel computation and so uh anything
(00:17:49)
that can increase the memory bandwidth
(00:17:52)
uh the speed at which like like you know
(00:17:54)
data is communicated between different
(00:17:56)
registers um how the chip is even like
(00:18:00)
built the next generation chip
(00:18:02)
>> in terms of like how much HBM you have
(00:18:04)
access to
(00:18:06)
>> all that like helps you to package
(00:18:07)
longer context
(00:18:09)
>> like the sequence length
(00:18:10)
>> bigger models
(00:18:12)
>> um and also So like throughput in terms
(00:18:14)
of how fast you can decode the output
(00:18:16)
tokens.
(00:18:17)
>> Yeah.
(00:18:17)
>> Um and like the the latency the time to
(00:18:20)
the first token like the the moment when
(00:18:22)
the answer starts streaming.
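The two serving metrics mentioned here, time to first token and decode throughput, can be computed from token arrival timestamps. The sketch below is a minimal illustration under assumed names (`record_arrivals`, `stream_metrics`) and made-up timings. It is not Perplexity's measurement code.

```python
import time

def record_arrivals(stream, clock=time.perf_counter):
    """Timestamp the request and each token as it arrives from an iterable stream."""
    request_time = clock()
    arrivals = []
    for _token in stream:
        arrivals.append(clock())
    return request_time, arrivals

def stream_metrics(request_time, arrival_times):
    """Return time to first token and decode throughput for one response.

    Decode throughput counts only the tokens after the first one, so the
    initial wait does not distort the tokens-per-second figure.
    """
    if not arrival_times:
        raise ValueError("no tokens were received")
    ttft = arrival_times[0] - request_time
    decode_window = arrival_times[-1] - arrival_times[0]
    decoded = len(arrival_times) - 1
    tokens_per_s = decoded / decode_window if decode_window > 0 else None
    return {"ttft_s": ttft, "decode_tokens_per_s": tokens_per_s}

# Hypothetical timings: first token after 0.25 s, then one token every 0.25 s.
example = stream_metrics(0.0, [0.25, 0.5, 0.75, 1.0, 1.25])
print(example)
```

For agent workloads that chain several model calls, the time to first token is paid on every step, so a per-step figure like this would be multiplied by the number of steps.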
(00:18:23)
>> Yes.
(00:18:24)
>> And especially when you're doing agent
(00:18:25)
stuff like it has to do uh multiple
(00:18:28)
sequences of uh thinking and actions and
(00:18:31)
like
(00:18:32)
>> it shouldn't be too slow and of course
(00:18:34)
the the the the stochasticity of all
(00:18:36)
this
(00:18:37)
>> you know the more deterministic you can
(00:18:38)
make with existing hardware. the low
(00:18:40)
precision as you go low precision like
(00:18:42)
stochasticity increases.
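The speaker does not say why lower precision makes outputs less repeatable. One commonly cited mechanism, assumed here for illustration, is that floating-point addition is not associative, so results depend on the order in which values are accumulated, and that order can vary on parallel hardware. The sketch below emulates half precision with Python's `struct` format `e`. The helper names and input values are hypothetical.

```python
import struct

def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def fp16_sum(values):
    """Accumulate left to right, rounding to half precision after every add."""
    total = 0.0
    for v in values:
        total = to_fp16(total + to_fp16(v))
    return total

# Hypothetical inputs: one large value followed by several small ones.
values = [2048.0, 0.5, 0.5, 0.5, 0.5]

forward_fp16 = fp16_sum(values)                   # small values are rounded away
reverse_fp16 = fp16_sum(list(reversed(values)))   # small values add up first
forward_fp64 = sum(values)
reverse_fp64 = sum(reversed(values))

print(forward_fp16, reverse_fp16)   # the two orders disagree in half precision
print(forward_fp64, reverse_fp64)   # the two orders agree in double precision
```

At 2048, adjacent half-precision values are 2 apart, so adding 0.5 to the running total is lost each time, while adding the four small values first produces 2.0, which survives.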
(00:18:44)
>> So so these are things that uh will
(00:18:46)
impact how the software works. So you do
(00:18:48)
need to understand the implications of
(00:18:50)
the next generation chip.
(00:18:51)
>> Yes.
(00:18:52)
>> Uh and how it makes things more
(00:18:54)
efficient for you. Should we like stop
(00:18:56)
using H100s and go to the GB200s? Okay.
(00:18:59)
Is there a new rival to Nvidia? Should
(00:19:01)
we look at that? Um you know are they
(00:19:03)
offering like much better speeds? Like
(00:19:05)
what is the catch? Like there's always a
(00:19:07)
catch.
(00:19:07)
>> Yeah. you know when someone comes and
(00:19:09)
offers oh I have something a thousand x
(00:19:10)
better than Nvidia
(00:19:12)
>> there's always a catch uh and it doesn't
(00:19:14)
work for all the models it only works
(00:19:15)
for certain architecture it's not like
(00:19:17)
it's going to scale, the trillion-parameter
(00:19:18)
models don't work like you got to
(00:19:20)
understand the consequences
(00:19:22)
>> um and then the other side of it is
(00:19:24)
personal hardware like
(00:19:26)
>> can there be a future where uh all these
(00:19:29)
models can just run on your MacBook
(00:19:31)
>> um you know would that disrupt the
(00:19:34)
Nvidia's data center uh like like
(00:19:36)
economy uh where you know right now all
(00:19:38)
the models are living on the servers and
(00:19:40)
like all your AI software is just
(00:19:42)
sending requests to those endpoints and
(00:19:44)
getting back the answer and streaming
(00:19:46)
all that. It can be way faster if the
(00:19:48)
models are just locally running on your
(00:19:50)
device
(00:19:50)
>> cuz that round tripping just goes away
(00:19:53)
>> but then uh your battery is going to die
(00:19:55)
if the models are running on your phone.
(00:19:57)
So, but Apple's making continual
(00:19:59)
progress
(00:20:01)
in its chips that that you know it's
(00:20:04)
kind of standardizing the chips that run
(00:20:06)
on the MacBook, the iPad and and the
(00:20:08)
phone.
(00:20:09)
>> And so, uh at least in a year or two
(00:20:11)
from now, there is a possibility that a
(00:20:13)
GPT-4.1 or uh uh you know like like
(00:20:17)
Gemini or the best models today, a a
(00:20:20)
model of that class running on your
(00:20:23)
MacBook in a year or two from now is
(00:20:26)
possible. Yeah.
(00:20:27)
>> And so we're already creating a lot of
(00:20:29)
value uh with AI today
(00:20:31)
>> and imagine it just runs on your local
(00:20:32)
device. That could be pretty
(00:20:33)
interesting.
(00:20:34)
>> It would be.
(00:20:34)
>> Yeah. So we think about it from from the
(00:20:37)
consequences of what can happen at the
(00:20:39)
application layer.
(00:20:40)
>> Uh what are all the new things we can do
(00:20:42)
for the user? For example, in in the
(00:20:43)
case of Comet uh if the if the
(00:20:46)
intelligence the model can run locally
(00:20:49)
on your computer
(00:20:51)
>> um you we can guarantee full privacy.
(00:20:53)
>> Yeah. Yeah, the privacy I was just thinking
(00:20:54)
about that
(00:20:54)
>> all your data that gets
(00:20:56)
>> used by the agent.
(00:20:58)
>> Yes.
(00:20:59)
>> Uh which is essentially a reasoning
(00:21:00)
model
(00:21:02)
>> does not have to go to any server and
(00:21:04)
and everything else on the browser your
(00:21:06)
passwords everything else is encrypted
(00:21:08)
and living locally. Your history is
(00:21:10)
local. So the model is also local like
(00:21:12)
it's a it's an end-to-end private browser
(00:21:15)
and uh we can take advantage of all the
(00:21:17)
hardware that Apple's going to build for
(00:21:19)
uh their computers and and ship the AI.
(00:21:21)
So that's that's exciting. And then
(00:21:23)
maybe 5 years from now like it can all
(00:21:25)
run on the phone or maybe there'll be
(00:21:26)
like a you know like a way to share the
(00:21:30)
compute across the MacBook and the
(00:21:32)
phone and the glass. So there's like all
(00:21:34)
sorts of like ways in which the hardware
(00:21:36)
can play out over time.
(00:21:37)
>> Yeah.
(00:21:37)
>> Yeah.
(00:21:38)
>> And it's true then it does impact your
(00:21:40)
to some degree your decisions for for
(00:21:43)
how you're building things as well as to
(00:21:45)
where it is today but also where it's
(00:21:46)
headed.
(00:21:46)
>> Yeah. We we so we focus a lot of our
(00:21:49)
inference team efforts today on
(00:21:52)
>> uh data center inference. So uh
(00:21:55)
especially we we we've looked into
(00:21:57)
multi-node inference where it's not just
(00:21:59)
like one node of eight H100s but like
(00:22:02)
>> two or three nodes and we we showed that
(00:22:04)
the throughput is even higher when you
(00:22:06)
work like that
(00:22:07)
>> and we are currently like benchmarking
(00:22:09)
the Blackwells versus Hopper and
(00:22:11)
seeing like you know the next generation
(00:22:12)
Nvidia GPUs are giving us even better
(00:22:14)
throughput and latency but we are you
(00:22:16)
know like if if there is a time there is
(00:22:18)
an inflection point for local compute
(00:22:21)
like like which is honestly bottlenecked
(00:22:23)
by whether there exists a model that
(00:22:25)
that's good
(00:22:26)
>> and of course M1 chip progress
(00:22:29)
>> but I'm sure we'll get there and then
(00:22:31)
we'll focus a lot on like the MLX
(00:22:33)
compiler
(00:22:34)
>> uh and and help like ship models locally
(00:22:36)
on our like we have a we have obviously
(00:22:39)
comet is a local desktop application we
(00:22:41)
have the Perplexity desktop app
(00:22:43)
>> so we're going to ship models that are
(00:22:44)
local
(00:22:45)
>> and and it can integrate with your local
(00:22:48)
files and users are not going to feel
(00:22:50)
scared about it because it's going to
(00:22:51)
run on your phones and you own it Yeah.
(00:22:53)
>> And then you could imagine a new
(00:22:55)
innovation on top of that which is like
(00:22:57)
something like a you know I wouldn't
(00:22:59)
call it necessarily entire model
(00:23:01)
finetuning but
(00:23:02)
>> a few weights in the model could get
(00:23:04)
tuned for you personally.
(00:23:06)
>> So that's your personal intelligence. So
(00:23:08)
then you get to own your you get to own
(00:23:10)
that model. It runs on your device
(00:23:13)
>> um and and all the training and its
(00:23:14)
updates
(00:23:15)
>> is update.
(00:23:16)
>> Yeah. It's it's it's on your data.
(00:23:18)
>> Uh and and and that never goes anywhere
(00:23:20)
to any other server. So that would be
(00:23:22)
great. Even if we can achieve that with
(00:23:24)
just prompt engineering or context
(00:23:26)
engineering that's also fine.
(00:23:27)
>> Yes.
(00:23:28)
>> Um and so that that way like everything
(00:23:30)
feels like it's your thing.
(00:23:31)
>> Exactly.
(00:23:32)
>> With the pace of AI development, how do
(00:23:34)
you prioritize what to ship now versus
(00:23:37)
what to hold off on for long term?
(00:23:39)
Honestly, uh
(00:23:42)
you know the number one change most
(00:23:45)
developers or like engineers who come
(00:23:47)
work in our company or other AI
(00:23:48)
companies have to make is
(00:23:50)
>> adaptation. Um the world changes so
(00:23:53)
fast.
(00:23:54)
>> Um six-month plans don't even make sense.
(00:23:57)
>> So we kind of work here with like
(00:23:58)
quarterly plans and even that we're not
(00:24:01)
rigid about it like we we we are very
(00:24:03)
flexible in terms of changing our world
(00:24:06)
views. Um, the one thing that's been
(00:24:07)
constant like like usually like you know
(00:24:09)
I like Jeff Bezos's paradigm of this
(00:24:10)
where when the world is changing fast
(00:24:13)
you have to ask the inverse question
(00:24:15)
which is what is not guaranteed to
(00:24:16)
change
(00:24:17)
>> ah
(00:24:18)
>> right like like you know like like in
(00:24:19)
his case he asked like
(00:24:21)
>> in 10 years from now would people want
(00:24:23)
slower package delivery or people want
(00:24:25)
worse customer support
(00:24:27)
>> or people want less selection of choices
(00:24:30)
>> no they they only want more
(00:24:32)
>> so uh work on those problems
(00:24:34)
>> I like that
(00:24:34)
>> and they get and they nailed it. So in in
(00:24:36)
in our case, people are always going to
(00:24:38)
want faster answers. People are always
(00:24:40)
going to want more accurate answers.
(00:24:41)
People are always going to want the AIs
(00:24:43)
to like do things for them, not just
(00:24:45)
like answer stuff, but actually go do
(00:24:47)
stuff for them.
(00:24:48)
>> And um so you got to work on these
(00:24:50)
problems regardless of what happens in
(00:24:52)
terms of whether the models are getting
(00:24:54)
cheaper or like more expensive. These
(00:24:56)
are not the customer's problems. Like
(00:24:58)
customers don't care about these
(00:24:59)
problems. And again like interestingly
(00:25:02)
one thing that's been true fortunately
(00:25:03)
in AI is the cost of running inference
(00:25:06)
has been going down.
(00:25:08)
>> Um it's not clear how long it'll
(00:25:10)
continue but it's definitely like going
(00:25:12)
down
(00:25:13)
>> and more intelligence gets packed more
(00:25:16)
compactly into smaller and smaller like
(00:25:18)
like more efficient models.
(00:25:20)
>> Uh this of course all models are sparse
(00:25:23)
but like still the way we run inference
(00:25:25)
of sparse models is really really
(00:25:27)
improving. uh and the chips are still
(00:25:29)
improving like you know so I think
(00:25:31)
there's like plenty of competition at
(00:25:33)
the model layer to like keep continuing
(00:25:35)
this a lot of investments being poured
(00:25:36)
into like efficiency gains
(00:25:39)
>> and that's benefiting the application
(00:25:41)
layer like ours so we we get to like
(00:25:44)
reap the benefits of all this so I make
(00:25:46)
my decisions based on like not what's
(00:25:48)
likely to change
(00:25:49)
>> but what is more likely to remain the
(00:25:52)
same
(00:25:53)
>> and so you can like take concentrated
(00:25:56)
bets on that. Yes, exactly. I think
(00:25:59)
that's really smart cuz it is changing
(00:26:00)
so quickly, you know, throughout I found
(00:26:02)
it really interesting throughout our our
(00:26:04)
our conversation you
(00:26:06)
>> you reference you I can tell that you've
(00:26:08)
really studied uh other successful
(00:26:11)
individuals and I think that's really
(00:26:13)
really brilliant. What is your how has
(00:26:16)
that helped you uh by studying? You
(00:26:19)
know, you mentioned Jeff Bezos. How has
(00:26:20)
that helped you and and how much do you
(00:26:22)
take of what you study from them and
(00:26:24)
actually apply it to your decision-making?
(00:26:27)
>> I've studied all the all the successful
(00:26:29)
entrepreneurs, you know, all of them are
(00:26:31)
great in their own ways. Everyone has
(00:26:33)
like one common aspect like resilience.
(00:26:36)
>> Yeah. you know um and so that's the most
(00:26:39)
important characteristic that I I try to
(00:26:41)
take from that
(00:26:42)
>> because things will not always go well
(00:26:43)
and I've had like so many days in which
(00:26:45)
like
(00:26:46)
>> just waking up felt like so miserable
(00:26:48)
like I wanted to go back to bed but then
(00:26:50)
you know the whole company's like
(00:26:52)
working here uh the investors have put a
(00:26:54)
lot of faith in me you know the
(00:26:56)
employees even though they're doing
(00:26:57)
their job that's scoped out for them
(00:26:59)
fundamentally like they have a lot of
(00:27:01)
stock and they're relying on me to like
(00:27:04)
stay there so resilience is probably the
(00:27:06)
most important characteristic. Um, being
(00:27:08)
curious obviously uh you got to like
(00:27:11)
constantly keep questioning
(00:27:13)
>> you got to ask the right questions
(00:27:14)
that's how I think about it
(00:27:16)
>> so that requires the right frame of mind
(00:27:19)
>> you know um so so fundamentally if
(00:27:21)
you're not a curious person you wouldn't
(00:27:22)
even be asking questions leave alone the
(00:27:24)
right ones
(00:27:25)
>> so um so staying resilient, thinking
(00:27:27)
curious and moving fast
(00:27:29)
>> that that's you know like like a quality
(00:27:32)
that I force myself to like try to be
(00:27:34)
decisive. Don't, don't try to like
(00:27:38)
drag along just because you have you'll
(00:27:40)
never have perfect information to make
(00:27:42)
decisions.
(00:27:42)
>> Yeah.
(00:27:43)
>> You know, and then and this is again
(00:27:44)
like a framework the Bezos framework of
(00:27:46)
one-way door versus two-way door
(00:27:48)
decisions helps you a lot like
(00:27:49)
>> you know like sure like if you're wrong
(00:27:52)
what's going to happen.
(00:27:52)
>> Yeah.
(00:27:53)
>> Right.
(00:27:54)
>> It's not like your company's going to
(00:27:55)
die.
(00:27:56)
>> No. Exactly. You're going to learn from
(00:27:57)
it.
(00:27:58)
>> Exactly. You're going to Yeah.
(00:27:59)
>> And and as you grow more and more uh
(00:28:01)
there there's hardly going to be any
(00:28:03)
decision that makes or breaks the
(00:28:04)
company. Yes,
(00:28:05)
>> there could be some decisions that
(00:28:06)
really damage the company,
(00:28:08)
>> but there'll be nothing that makes or
(00:28:10)
breaks the company.
(00:28:11)
>> Yeah, exactly.
(00:28:12)
>> That's good. I really like that.
(00:28:14)
>> You know, to wrap up our our
(00:28:15)
conversation, you alluded a little bit
(00:28:17)
to to some things that were coming uh
(00:28:19)
specifically with Comet that are coming
(00:28:21)
up. What can you
(00:28:23)
>> I'm sure there's a lot that you can't
(00:28:25)
fully share yet, but what what can you
(00:28:26)
share? What can we expect that's coming
(00:28:28)
up for Comet? Yeah. So, um people are
(00:28:32)
mostly on their phones as you know like
(00:28:34)
the world has changed since mobile phone
(00:28:36)
became the dominant form of computing.
(00:28:39)
So, we got to make the mobile versions
(00:28:40)
of Comet ready,
(00:28:42)
>> right?
(00:28:42)
>> Uh both iOS and Android. So, that's the
(00:28:45)
next step. And then uh we got the fact
(00:28:48)
that AI really works, you know, you can
(00:28:51)
just ask an AI to do stuff for you. It's
(00:28:54)
much more natural to like interact with
(00:28:56)
the internet now with just voice.
(00:28:58)
>> Yes. So having like voice work even
(00:29:01)
better and even more naturally
(00:29:03)
>> on Comet both on phones and uh computers
(00:29:06)
but especially on phones when you know
(00:29:08)
it's kind of annoying to type stuff. Uh
(00:29:10)
that's going to be very important. So
(00:29:11)
we're going to work on that. And lastly
(00:29:13)
like um the Comet assistant uh should be
(00:29:17)
able to like do things from the
(00:29:19)
background for you. Uh you shouldn't
(00:29:20)
always be having a computer open and
(00:29:22)
typing in and waiting for it to do
(00:29:24)
stuff.
(00:29:24)
>> Uh it should be able to do stuff even as
(00:29:26)
you sleep. Yeah, I love thatamam example
(00:29:29)
you gave. Yeah.
(00:29:30)
>> So, um
(00:29:32)
>> I think these are the kind of things we
(00:29:33)
want Comet to be able to do by end of
(00:29:36)
the year. And um hopefully it feels
(00:29:39)
truly special like like I mean look the
(00:29:41)
bar is you got to be able to feel the
(00:29:44)
utility of the product to the extent
(00:29:45)
that your work and life should run on it
(00:29:49)
And that's why I call it the OS, the
(00:29:51)
operating system because the the
(00:29:54)
computer science definition of an
(00:29:55)
operating system is where processes are
(00:29:57)
run and memory is being managed
(00:29:59)
>> and and and your life is that except
(00:30:01)
there is no OS for your life. Uh the OS
(00:30:04)
exists for the computer applications
(00:30:06)
today like Windows, Mac, they're all
(00:30:08)
running the applications that are meant
(00:30:09)
to be run there and some of your life
(00:30:11)
lives on those applications, but it's
(00:30:12)
all very disconnected.
(00:30:13)
>> Yes.
(00:30:14)
>> And you're still the one that's
(00:30:15)
orchestrating your life.
(00:30:17)
>> I don't think about it as like you
(00:30:18)
giving up your agency, but more that
(00:30:20)
delegate the boring aspects of your life
(00:30:22)
that make your life like kind of
(00:30:24)
annoying and stressful
(00:30:25)
>> to something like Comet and and you'll
(00:30:28)
get control and live your interesting
(00:30:30)
part of your life.
(00:30:31)
>> I love that.
(00:30:32)
>> Thank you. Thank you so much for your
(00:30:33)
time. I'm I'm leaving this conversation
(00:30:35)
very excited and extremely inspired. So,
(00:30:38)
thank you. I
(00:30:39)
>> appreciate it.
