The Future of Search: Inside Perplexity’s $20B Bet on AI | Aravind Srinivas, CEO of Perplexity (YouTube Video Transcript)

Duration: 00:30:40

(00:00:00) An AI company did what no one thought was possible. It made people stop using Google. And the person behind it: Aravind Srinivas, the CEO of Perplexity AI.
(00:00:10) >> We try to prioritize being truth-seeking and correct here. Our product needs to be fast. Search should never be slow. It should feel like a really premium product.
(00:00:19) >> Now the entire world is paying attention to Perplexity. Even Cristiano Ronaldo used Perplexity to help him write his award speech. In just a few short years, Perplexity has gone from an idea to a multibillion-dollar company, redefining how we find information online.
(00:00:34) >> Perplexity is an answer machine. You can ask whatever question you want. Why should academics be the only ones who are allowed to ask questions? The smartest people have always been curious. They don't have to be academics, but most people don't have the platform or even the tools to engage in asking questions. We want to give that power that billionaires have access to to everybody.
(00:00:56) >> That curiosity mindset shaped everything they're building next, including Comet.
(00:01:00) >> The next set of things we're working on, which is Comet running in the background, even as you sleep, and you don't even have your MacBook or Windows computer open.
(00:01:08) >> Perplexity is not just changing how we search, it's changing who gets to be discovered. So I sat down with Aravind to ask: is this the start of a new kind of internet?
(00:01:24) >> How would you explain Comet to someone who has never used it before?
Because I have trouble doing that, because it's so unique.
(00:01:32) >> Of course. I iterate on this multiple times in terms of what is the best one-liner. For Perplexity, it was very obvious: Perplexity is an answer machine. That's it. You can ask it whatever question you want. The way I think about Comet is it's a personal assistant, the first time you can have a personal assistant for yourself. Or the other way of thinking about it is the second brain. And second brain doesn't mean a dump of data.
(00:01:59) >> Yeah.
(00:01:59) >> I'm not talking about a memory store. A brain is not just meant to be a dump store for memory. Honestly, it's a brain that actually can think with you, another brain that can think with you, and you can delegate what your first brain, the core brain, finds boring and mundane and tiring: delegate it to the second brain. So stuff like booking reservations, moving around meetings, scheduling a common time for four people to meet, sending reminders to people, getting prepared for the day. Let's say you're going to interview someone: reading all their past episodes and podcasts and trying to ask questions that are new and different. These are things that you honestly need.
It's not that you don't enjoy the aspect of doing your work, but the actual aspect of opening all these tabs and going through all these transcripts and manually sorting out what is different and new and documenting all that in a new document, and then preparing a fresh set of questions, that is boring.
(00:03:03) >> Yes. Yeah.
(00:03:04) >> The part of the curiosity of asking new things, that is not boring.
(00:03:09) >> So that's what your first brain is meant to do. Our first brain can be truly at our natural best selves if we can just be curious and explore and interact and meet people and strategize, while the second brain takes care of all the boring, mundane workflows. So that's how we think about Comet: giving that power to people. And note that this power is already available to billionaires or people who are very well off, elite society, because they have assistants who do this for them. They have employees who do this for them. They have a team of people working with them doing this for them.
(00:03:46) >> But the normal person doesn't have access to all this. They still have to do it all themselves. They have to schedule their own hospital appointments. They have to find the best doctors that are covered by their insurance. They have to find local experiences when they plan a trip. They have to find a good flight deal or hotel deal. Very basic things that you're doing on a daily basis.
You don't even think about it and you spend hours and hours on it. I still vividly remember going to a haircut once here in San Francisco, and another old man walked in and he said, "Hey, I'm frustrated with my morning. I spent like 3 hours just looking for a new washing machine to buy because my existing one doesn't work."
(00:04:25) >> Imagine, these are the kind of things that... Yeah, 3 hours, because it's a big volume purchase for most people. So I think we want to give that power that billionaires have access to to everybody. And honestly, of course it's done in a capitalistic way. We are a profit-minded company. But it's one of the most equalizing things you can do, in terms of making something very special and exclusive and elite and giving it in the hands of normal people so that everyone can be the best version of themselves. If you had your first brain to just be yourself and engage and go deep into things that you care about and you're interested in, how could life be for you? What are the kind of questions you would ask and what are the kind of journeys you would discover?
(00:05:08) >> Your research background is rooted in open science and open source. How much of that plays into what you are building today, especially around open source and researching different things?
(00:05:21) >> More than the open source aspect or open science aspect, I go back to my roots. The way I think about it is I was an academic during my academia.
I've always thought, you know... let me give you a personal example. My dad kind of wanted to be an academic. He realized this very late in his career. He engaged more in accounting and finance and getting a job and all that stuff that most middle-class men go through,
(00:05:49) >> and then by the time you truly discover what you wanted to do, it's pretty late,
(00:05:54) >> right? And I've always thought, why should academics be the only ones who are allowed to ask questions and think about what is potentially possible and engage in deep scientific research on that? Why should it be restricted to just universities? All the smartest people have always been curious. They don't have to be academics, but most people don't have the platform or even the tools to engage in asking questions.
(00:06:24) >> The moment they have a question, they're either shut down, saying, "Your job's not to ask that, your job is to go do this."
(00:06:31) >> Yes.
(00:06:31) >> And even if they are allowed to ask the questions, they probably don't have the exposure to the right set of people who could answer them. And they're definitely not allowed to ask questions that don't have answers yet.
(00:06:43) >> Yeah.
And so at least having tools like Perplexity that give you accurate answers to almost anything out there, at least the questions for which answers exist, which is plentiful already, and making sure you can trust the answer because of the sources it provides. And so every accurate answer is the foundation for the next question, right? And so the way we believe that, you know, encourage the question work and allow anyone to ask questions, and hopefully that turns the whole idea of being an academic into something that is no longer a luxury. It kind of comes back to the whole equalizing aspect. So I really enjoyed my time in Berkeley as a PhD student. And I thought, why can't normal people just have that kind of experience, where they could also engage in asking questions and getting back answers? And if a paper is pretty hard to understand, you don't need to have access to other elite Berkeley or Stanford PhDs or professors. You can just ask a tool like Perplexity to explain it to you and it'll do that for you.
(00:07:50) >> I'm really curious, as you're an academic, and thinking about how this knowledge is now accessible to everyone, right?
(00:08:03) >> How do you think academia will change?
(00:08:05) >> Yeah.
(00:08:06) >> With this knowledge now being accessible to everyone.
(00:08:08) >> Yeah.
(00:08:09) >> So I have a lot of opinions on this.
I don't know how it's going to pan out, but have you ever seen movies, like how they portray academics? For example, the Oppenheimer movie or the Stephen Hawking movie?
(00:08:24) >> The job of the adviser is not really to help you figure out answers or come and explain your doubts.
(00:08:34) >> Yeah.
(00:08:34) >> You do your coursework for that, and that's part of the coursework. But actually, you know that scene in the Stephen Hawking movie where he knocks on the door and, like, "Time, that's my thesis," right?
(00:08:49) >> Yes. Yeah.
(00:08:51) >> Time on time. That's your subject.
(00:08:55) >> Why is that a big deal? Or let me give you another example from the days of Google, which I've really studied a lot, and Larry Page, where, you know, "we got to go study the web," that was the thesis. And there is no fundamental question there. It's just literally being curious about something.
(00:09:16) >> Yeah, it's true.
(00:09:16) >> Right. You're curious about what is even time.
(00:09:19) >> Yeah.
(00:09:19) >> Or Einstein was curious about space and time, the relativity of that.
(00:09:24) >> Yes.
(00:09:24) >> Or Larry was curious about the web, and that all led to great discoveries on top.
(00:09:28) >> Yes.
(00:09:30) >> But the foundation is just being curious.
So I feel like an academic adviser in an ideal world should be the one who really encourages you to be curious about topics that most people are ridiculed for being curious about, like, "Oh, what is there to question about Newton's physics? Newton just gave it away for you in this book. Just take it for granted and build on top."
(00:09:54) >> Yes.
(00:09:54) >> "No, no. I'm going to question the foundations of it." Every great discovery has come from people being curious and relentless about questioning the status quo and not taking it for granted and seeing if there's fundamentally a better way to do things. And so that's the kind of atmosphere that academic universities should truly encourage. But I got to say, at least during my times at Berkeley, it was not exactly like that. People were definitely after publishing a certain number of papers, building a profile for yourself so that you go get a job in another university as a professor or postdoc, or you get hired at Google or OpenAI or whatever. And I definitely had to do some of those things too. And I feel like that should probably die. If you want to just do a job at one of these labs, I think you can get it even without a PhD now,
(00:10:49) >> as long as you're good at writing code and training models.
(00:10:53) >> But if you're truly interested in questioning, "Okay, why are we even training transformers? Let me go and look at the foundations of it."
(00:11:00) >> I hope there's a new academic environment that stimulates that kind of thinking.
(00:11:06) >> And with tools like ours, you don't need someone else to answer your questions. So you actually need someone to guide you to asking the right questions and really teaching you how to think.
(00:11:16) >> I think it will, based on what you said too with the examples, and now with tools like what you're putting out there, being able to start asking those questions, get the knowledge, and then continuing to build upon that.
(00:11:28) >> Yeah, exactly.
(00:11:29) >> Instead of framing AI as, say, replacing human work, I can tell you're a very big advocate, not only from this conversation thus far but even from your past conversations and interviews, of really making information and knowledge more accessible to others.
(00:11:46) >> Yeah.
(00:11:46) >> What are some surprising ways that you've actually seen, now that people have access to this information, people use Comet?
(00:11:53) >> Three or four things I can point out. Since these were publicly shared by the user, I'm sharing them. One is, a user was frustrated talking to the customer support of FedEx.
(00:12:05) >> Okay.
(00:12:06) >> He just had Comet talk on his behalf to the customer support, which could have been a bot too. You don't know.
(00:12:13) >> Yeah.
(00:12:13) >> So Comet got access to the tracking ID and all that, and Comet is filing the complaint on his behalf and then going back and forth with the customer support agent on the other end at FedEx.
That was very interesting. Another user actually figured out a way to run a marketing campaign on the Facebook ads platform for a new product that they were going to put out for a small business that they were running.
(00:12:41) >> That was very interesting. Some people like to unsubscribe from spam emails, and Google doesn't quite know how to detect if something is spam or not. So you can literally have Comet look at the username and the domain name and the email and tell you if it's potentially spam, and if it is, then you can say, "Messages similar to this should be flagged as spam for me, and unsubscribe me from these kinds of email lists," and it'll just do it for you. So this makes you believe there is a world where, even if the software that you're forced to use, because these softwares are built by different companies, even if they're imperfect because they're not designed for you...
(00:13:21) >> Yeah.
(00:13:22) >> You can make it work for you. And that extra work that the developer ideally has to do but might not prioritize, because it's a problem for an n-of-one user, is no longer a problem, because you can just personalize the software for you. So Comet provides that bridge between the software that works in a general way and what the software that works for you could be.
(00:13:42) >> And there are endless applications like this.
Like for example, people are using it to just keep up to date on the stock price. You can just say, "I don't want to keep looking at the S&P all the time, but anytime there's a crazy movement, just let me know."
(00:13:59) >> Yeah.
(00:13:59) >> And that way you delegate. It's all the second brain concept: whatever frustrates you or wastes your time. And we want to hopefully do things like, "When the tickets for this concert open up, make sure you book it for me. You have access to my wallet, everything, but I don't want to be the one checking when it opens up and wake up early morning just for that." All these sorts of things, we want to figure it out for the user.
(00:14:26) >> And for example, you might have booked a flight, and while you're sleeping, if the flight price actually reduces and Comet is running in the background and it's going and rebooking the flight for you, canceling your existing booking and saving you a thousand bucks,
(00:14:41) >> then it's worth it, right?
(00:14:42) >> Major.
(00:14:43) >> So yeah, these are the things that we want Comet... So the last thing that I said does not exist today. If a user tries out Comet today, that's not going to work. But that's the next set of things we're working on, which is Comet running in the background on the server, even as you sleep, and you don't even have to have your MacBook or Windows computer open with the browser open. It should just be running in the background, and that's how essentially the second brain becomes like an OS that helps make your life more efficient.
(00:15:14) >> So much more efficient, because at the end of the day, first brain, second brain, it all comes down to: time is the most valuable asset, correct?
(00:15:23) >> So we think about it not just as saving time and giving you back time. It's more like removing time spent on activities that you just don't enjoy.
(00:15:34) >> Yeah. Yeah.
(00:15:35) >> Right. Exactly.
(00:15:36) >> We're okay if you spend more time browsing or get more time to spend with your family. A lot of people who write code or work in tech companies, one of the biggest questions they ask is, "How is work-life balance?" And it's hard to have a work-life balance when you work in a fast-growing company. But why is that? Because a lot of time is being spent on doing things inefficiently.
(00:15:58) >> Mhm.
(00:16:00) >> So we think about it less as productivity and more as making time spent more pleasant and enjoyable. Just do things you enjoy,
(00:16:13) >> and you're naturally going to be more creative in that aspect. Then you're happier and can do things you enjoy, and that's where real innovation typically comes out.
(00:16:20) >> Exactly. Yeah. So removing the regrettable time spent in work is great.
(00:16:25) >> I know we focus a lot on the software side of AI. Lately, I've been really curious though about the hardware.
Is that something that Perplexity has to keep in mind, the hardware side, and if so, to what degree?
(00:16:40) >> Fundamentally, the part of hardware that matters the most to us, or for other AI companies too, is inference.
(00:16:47) >> Mhm.
(00:16:47) >> So most of the value of the software that comes in something like Perplexity is coming through the model here. Not 100% of it.
(00:17:00) >> Mhm.
(00:17:01) >> And again, just because the model matters a lot doesn't mean the other things don't matter, but most of it is coming through the model. And those models are running on GPUs.
(00:17:11) >> And so any innovations there will automatically matter to us. Interestingly, the last 50 years have all been about making computing more efficient at the chip layer. Of course, you know, Moore's law. GPUs don't necessarily follow Moore's law, but they've had this sort of amazing next-generation chips coming all the time that make these chips even more suited for the transformer architecture. And why is that? Because the transformer architecture is essentially just a lot of matrix multiplications.
(00:17:42) >> Mhm.
(00:17:44) >> It's heavily optimized for parallel computation. And so anything that can increase the memory bandwidth, the speed at which data is communicated between different registers, how the next-generation chip is even built, in terms of how much HBM you have access to, all that helps you to package longer context, like the sequence length, bigger models, and also throughput in terms of how fast you can decode the output tokens.
(00:18:17) >> Yeah.
(00:18:17) >> And the latency, the time to the first token, the moment when the answer starts streaming.
(00:18:23) >> Yes.
(00:18:24) >> And especially when you're doing agent stuff, it has to do multiple sequences of thinking and actions, and it shouldn't be too slow. And of course the stochasticity of all this: the more deterministic you can make it with existing hardware... as you go low precision, stochasticity increases.
(00:18:44) >> So these are things that will impact how the software works. So you do need to understand the implications of the next-generation chip.
(00:18:51) >> Yes.
(00:18:52) >> And how it makes things more efficient for you. Should we stop using H100s and go to the GB200s? Okay. Is there a new rival to Nvidia? Should we look at that? Are they offering much better speeds? What is the catch? There's always a catch.
(00:19:07) >> Yeah.
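The two serving metrics named in this answer, time to first token and decode throughput, can be made concrete with a small sketch. This is only an illustration under stated assumptions, not Perplexity's tooling: the function and class names are hypothetical, the timestamps are made up, and all times are assumed to be seconds read from a single monotonic clock.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class StreamMetrics:
    time_to_first_token_s: float  # request sent -> first token received
    decode_tokens_per_s: float    # tokens after the first, per second of decoding


def stream_metrics(request_time_s: float, token_times_s: Sequence[float]) -> StreamMetrics:
    """Compute both metrics from the arrival times of streamed tokens.

    Hypothetical helper for illustration; times are seconds on one clock.
    """
    if not token_times_s:
        raise ValueError("need at least one token timestamp")
    ttft = token_times_s[0] - request_time_s
    decode_window = token_times_s[-1] - token_times_s[0]
    later_tokens = len(token_times_s) - 1
    # With a single token there is no decode window to measure.
    rate = later_tokens / decode_window if decode_window > 0 else 0.0
    return StreamMetrics(ttft, rate)


# Made-up timestamps: request at t=0, first token at 0.5 s,
# then one token every 0.25 s.
m = stream_metrics(0.0, [0.5, 0.75, 1.0, 1.25, 1.5])
print(m.time_to_first_token_s, m.decode_tokens_per_s)  # 0.5 4.0
```

For agent-style workloads like the ones described here, a multi-step task pays the time to first token once per model call, which is one way to read why that number matters as much as raw throughput.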
You know, when someone comes and offers, "Oh, I have something a thousand times better than Nvidia,"
(00:19:12) >> there's always a catch. It doesn't work for all the models, it only works for a certain architecture, it's not like it's going to scale, the trillion-parameter models don't work. You got to understand the consequences.
(00:19:22) >> And then the other side of it is personal hardware.
(00:19:26) >> Can there be a future where all these models can just run on your MacBook?
(00:19:31) >> Would that disrupt Nvidia's data center economy, where right now all the models are living on the servers and all your AI software is just sending requests to those endpoints and getting back the answer and streaming all that? It can be way faster if the models are just locally running on your device,
(00:19:50) >> because that round-tripping just goes away.
(00:19:53) >> But then your battery is going to die if the models are running on your phone. But Apple's making continual progress in its chips. It's kind of standardizing the chips that run on the MacBook, the iPad, and the phone.
(00:20:09) >> And so at least in a year or two from now, there is a possibility that a GPT-4.1 or Gemini or the best models today, a model of that class, running on your MacBook in a year or two from now is possible.
(00:20:26) >> Yeah.
(00:20:27) >> And so we're already creating a lot of value with AI today,
(00:20:31) >> and imagine it just runs on your local device. That could be pretty interesting.
(00:20:34) >> It would be.
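Whether a model "of that class" is usable on a laptop depends heavily on memory bandwidth and weight precision, both of which come up in this part of the conversation. A common back-of-the-envelope bound, sketched here under loud assumptions, is that a dense model decoding one sequence at a time must read every weight once per token, so tokens per second cannot exceed memory bandwidth divided by model size in bytes. The function names and every number below are hypothetical and do not describe any specific chip or model; the bound ignores the KV cache and compute time, and sparse (mixture-of-experts) models read only their active weights.

```python
def model_bytes(num_params: float, bits_per_weight: int) -> float:
    """Approximate size of the weights in bytes at a given precision."""
    return num_params * bits_per_weight / 8


def decode_tokens_per_s_upper_bound(
    num_params: float,
    bits_per_weight: int,
    mem_bandwidth_bytes_per_s: float,
) -> float:
    """Rough ceiling on single-stream decode speed for a dense model.

    Assumes each generated token requires reading all weights once,
    so the rate is limited by bandwidth / weight bytes.
    """
    return mem_bandwidth_bytes_per_s / model_bytes(num_params, bits_per_weight)


# Hypothetical 8-billion-parameter dense model on a device with a
# hypothetical 400 GB/s of memory bandwidth.
for bits in (16, 4):
    bound = decode_tokens_per_s_upper_bound(8e9, bits, 400e9)
    print(f"{bits}-bit weights: at most about {bound:.0f} tokens/s")
```

Under these made-up numbers, dropping from 16-bit to 4-bit weights raises the ceiling fourfold, which is the trade the low-precision remark points at: more speed per unit of bandwidth, at some cost in output fidelity.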
(00:20:34) >> Yeah. So we think about it from the consequences of what can happen at the application layer.
(00:20:40) >> What are all the new things we can do for the user? For example, in the case of Comet, if the intelligence, the model, can run locally on your computer,
(00:20:51) >> we can guarantee full privacy.
(00:20:53) >> Yeah. Yeah, the privacy, I was just thinking about that.
(00:20:54) >> All your data that gets used by the agent,
(00:20:58) >> Yes.
(00:20:59) >> which is essentially a reasoning model, does not have to go to any server. And everything else on the browser, your passwords, everything else is encrypted and living locally. Your history is local. So if the model is also local, it's an end-to-end private browser, and we can take advantage of all the hardware that Apple's going to build for their computers and ship the AI. So that's exciting. And then maybe 5 years from now it can all run on the phone, or maybe there'll be a way to share the compute across the MacBook and the phone and the glass. So there's all sorts of ways in which the hardware can play out over time.
(00:21:37) >> Yeah.
(00:21:37) >> Yeah.
(00:21:38) >> And it's true, then, it does impact to some degree your decisions for how you're building things, as to where it is today but also where it's headed.
(00:21:46) >> Yeah. So we focus a lot of our inference team efforts today on data center inference.
So especially we've looked into multi-node inference, where it's not just one node of eight [GPUs] but
(00:22:02) >> two or three nodes, and we showed that the throughput is even higher when you work like that.
(00:22:07) >> And we are currently benchmarking the Blackwells versus Hopper and seeing the next-generation Nvidia GPUs are giving us even better throughput and latency. But if there is a time, there is an inflection point for local compute, which is honestly bottlenecked by whether there exists a model that's good,
(00:22:26) >> and of course M1 chip progress,
(00:22:29) >> but I'm sure we'll get there, and then we'll focus a lot on the MLX compiler
(00:22:34) >> and help ship models locally. Obviously Comet is a local desktop application. We have the Perplexity desktop app.
(00:22:43) >> So we're going to ship models that are local,
(00:22:45) >> and it can integrate with your local files, and users are not going to feel scared about it because it's going to run on your phones and you own it.
(00:22:51) >> Yeah.
(00:22:53) >> And then you could imagine a new innovation on top of that, which is something like, I wouldn't call it necessarily entire model fine-tuning, but
(00:23:02) >> a few weights in the model could get tuned for you personally.
(00:23:06) >> So that's your personal intelligence. So then you get to own that model. It runs on your device,
(00:23:13) >> and all the training and its updates
(00:23:15) >> is update.
(00:23:16) >> Yeah. It's on your data.
(00:23:18) >> Uh and and and that never goes anywhere (00:23:20) to any other server. So that would be a (00:23:22) great even if we can achieve that with (00:23:24) just prompt engineering or context (00:23:26) engineering that's also fine. (00:23:27) >> Yes. (00:23:28) >> Um and so that that way like everything (00:23:30) feels like it's your thing. (00:23:31) >> Exactly. (00:23:32) >> With the pace of AI development, how do (00:23:34) you prioritize what to ship now versus (00:23:37) what to hold off on for long term? (00:23:39) Honestly, uh (00:23:42) you know the number one change most (00:23:45) developers or like engineers who come (00:23:47) work in our company or other AI (00:23:48) companies have to make is (00:23:50) >> adaptation. Um the world changes so (00:23:53) fast. (00:23:54) >> Um 6 month plans don't even make sense. (00:23:57) >> So we kind of work here with like (00:23:58) quarterly plans and even that we're not (00:24:01) rigid about it like we we we are very (00:24:03) flexible in terms of changing our world (00:24:06) views. Mm. The one thing that's been (00:24:07) constant like like usually like you know (00:24:09) I like Jeff Bezos's paradigm of this (00:24:10) where when the world is changing fast (00:24:13) you have to ask the inverse question (00:24:15) which is what is not guaranteed to (00:24:16) change (00:24:17) >> ah (00:24:18) >> right like like you know like like in (00:24:19) his case he asked like (00:24:21) >> in 10 years from now would people want (00:24:23) slower package delivery or people want (00:24:25) worse customer support (00:24:27) >> or people want less selection of choices (00:24:30) >> no they they only want more (00:24:32) >> so uh work on those problems (00:24:34) >> I like that (00:24:34) >> and they get and they nailed it So in in (00:24:36) in our case, people are always going to (00:24:38) want faster answers. People are always (00:24:40) going to want more accurate answers. 
(00:24:41) People are always going to want the AIs (00:24:43) to like do things for them, not just (00:24:45) like answer stuff, but actually go do (00:24:47) stuff for them. (00:24:48) >> And um so you got to work on these (00:24:50) problems regardless of what happens in (00:24:52) terms of whether the models are getting (00:24:54) cheaper or like more expensive. These (00:24:56) are not the customer's problems. Like (00:24:58) customers don't care about these (00:24:59) problems. And again like interestingly (00:25:02) one thing that's been true fortunately (00:25:03) in AI is the cost of running inference (00:25:06) has been going down. (00:25:08) >> Um it's not clear how long it'll (00:25:10) continue but it's definitely like going (00:25:12) down (00:25:13) >> and more intelligence gets packed more (00:25:16) compactly into smaller and smaller like (00:25:18) like more efficient models. (00:25:20) >> Uh this of course all models are sparse (00:25:23) but like still the way we run inference (00:25:25) of sparse models is really really (00:25:27) improving. uh and the chips are still (00:25:29) improving like you know so I think (00:25:31) there's like plenty of competition at (00:25:33) the model layer to like keep continuing (00:25:35) this a lot of investments being poured (00:25:36) into like efficiency gains (00:25:39) >> and that's benefiting the application (00:25:41) layer like ours so we we get to like (00:25:44) reap the benefits of all this so I make (00:25:46) my decisions based on like not what's (00:25:48) likely to change (00:25:49) >> but what is more likely to remain the (00:25:52) same (00:25:53) >> and so you can like take concentrated (00:25:56) bets on that. Yes, exactly. 
I think (00:25:59) that's really smart cuz it is changing (00:26:00) so quickly, you know, throughout I found (00:26:02) it really interesting throughout our our (00:26:04) our conversation you (00:26:06) >> you reference you I can tell that you've (00:26:08) really studied uh other successful (00:26:11) individuals and I think that's really (00:26:13) really brilliant. What is your how has (00:26:16) that helped you uh by studying? You (00:26:19) know, you mentioned Jeff Bezos. How has (00:26:20) that helped you and and how much do you (00:26:22) take of what you study from them and (00:26:24) actually apply it to your decision-making? (00:26:27) >> I've studied all the all the successful (00:26:29) entrepreneurs, you know, all of them are (00:26:31) great in their own ways. Everyone has (00:26:33) like one common aspect like resilience. (00:26:36) >> Yeah. you know um and so that's the most (00:26:39) important characteristic that I I try to (00:26:41) take from that (00:26:42) >> because things will not always go well (00:26:43) and I've had like so many days in which (00:26:45) like (00:26:46) >> just waking up felt like so miserable (00:26:48) like I wanted to go back to bed but then (00:26:50) you know the whole company's like (00:26:52) working here uh the investors have put a (00:26:54) lot of faith in me you know the (00:26:56) employees even though they're doing (00:26:57) their job that's scoped out for them (00:26:59) fundamentally like they have a lot of (00:27:01) stock and they're relying on me to like (00:27:04) stay there so resilience is probably the (00:27:06) most important characteristic um being (00:27:08) curious obviously uh you got to like (00:27:11) constantly keep questioning (00:27:13) >> you got to ask the right questions (00:27:14) that's how I think about it (00:27:16) >> so that requires the right frame of mind (00:27:19) >> you know um so so fundamentally if (00:27:21) you're not a curious person you wouldn't (00:27:22) even be asking 
questions leave alone the (00:27:24) right ones (00:27:25) >> so um so staying resilient, thinking (00:27:27) curious and moving fast (00:27:29) >> that that's you know like like a quality (00:27:32) that I force myself to like try to be (00:27:34) decisive don't Don't don't try to like (00:27:38) drag along just because you have you'll (00:27:40) never have perfect information to make (00:27:42) decisions. (00:27:42) >> Yeah. (00:27:43) >> You know, and then and this is again (00:27:44) like a framework the Bezos framework of (00:27:46) one-way door versus two-way door (00:27:48) decisions helps you a lot like (00:27:49) >> you know like sure like if you're wrong (00:27:52) what's going to happen. (00:27:52) >> Yeah. (00:27:53) >> Right. (00:27:54) >> It's not like your company's going to (00:27:55) die. (00:27:56) >> No. Exactly. You're going to learn from (00:27:57) it. (00:27:58) >> Exactly. You're going to Yeah. (00:27:59) >> And and as you grow more and more uh (00:28:01) there there's hardly going to be any (00:28:03) decision that makes or breaks the (00:28:04) company. Yes, (00:28:05) >> there could be some decisions that (00:28:06) really damage the company, (00:28:08) >> but there'll be nothing that makes or (00:28:10) breaks the company. (00:28:11) >> Yeah, exactly. (00:28:12) >> That's good. I really like that. (00:28:14) >> You know, to wrap up our our (00:28:15) conversation, you alluded a little bit (00:28:17) to to some things that were coming uh (00:28:19) specifically with Comet that are coming (00:28:21) up. What can you (00:28:23) >> I'm sure there's a lot that you can't (00:28:25) fully share yet, but what what can you (00:28:26) share? What can we expect that's coming (00:28:28) up for Comet? Yeah. So, um people are (00:28:32) mostly on their phones as you know like (00:28:34) the world has changed since mobile phone (00:28:36) became the dominant form of computing. 
(00:28:39) So, we got to make the mobile versions (00:28:40) of Comet ready, (00:28:42) >> right? (00:28:42) >> Uh both iOS and Android. So, that's the (00:28:45) next step. And then uh we got the fact (00:28:48) that AI really works, you know, you can (00:28:51) just ask an AI to do stuff for you. It's (00:28:54) much more natural to like interact with (00:28:56) the internet now with just voice. (00:28:58) >> Yes. So having like voice work even (00:29:01) better and even more naturally (00:29:03) >> on Comet both on phones and uh computers (00:29:06) but especially on phones when you know (00:29:08) it's kind of annoying to type stuff. Uh (00:29:10) that's going to be very important. So (00:29:11) we're going to work on that. And lastly (00:29:13) like um the Comet assistant uh should be (00:29:17) able to like do things from the (00:29:19) background for you. Uh you shouldn't (00:29:20) always be having a computer open and (00:29:22) typing in and waiting for it to do (00:29:24) stuff. (00:29:24) >> Uh it should be able to do stuff even as (00:29:26) you sleep. Yeah, I love thatamam example (00:29:29) you gave. Yeah. (00:29:30) >> So, um (00:29:32) >> I think these are the kind of things we (00:29:33) want Comet to be able to do by end of (00:29:36) the year. And um hopefully it feels (00:29:39) truly special like like I mean look the (00:29:41) bar is you got to be able to feel the (00:29:44) utility of the product to the extent (00:29:45) that your work and life should run on it (00:29:49) in that's why I call it the OS the (00:29:51) operating system because the the (00:29:54) computer science definition of an (00:29:55) operating system is where processes are (00:29:57) run and memory is being managed (00:29:59) >> and and and your life is that except (00:30:01) there is no OS for your life. 
Uh the OS (00:30:04) exists for the computer applications (00:30:06) today like Windows, Mac, they're all (00:30:08) running the applications that are meant (00:30:09) to be run there and some of your life (00:30:11) lives on those applications, but it's (00:30:12) all very disconnected. (00:30:13) >> Yes. (00:30:14) >> And you're still the one that's (00:30:15) orchestrating your life. (00:30:17) >> I don't think about it as like you (00:30:18) giving up your agency, but more that (00:30:20) delegate the boring aspects of your life (00:30:22) that make your life like kind of (00:30:24) annoying and stressful (00:30:25) >> to something like Comet and and you'll (00:30:28) get control and live your interesting (00:30:30) part of your life. (00:30:31) >> I love that. (00:30:32) >> Thank you. Thank you so much for your (00:30:33) time. I'm I'm leaving this conversation (00:30:35) very excited and extremely inspired. So, (00:30:38) thank you. I (00:30:39) >> appreciate it.
