Ex–Microsoft Insider: “AI Isn’t Here to Replace Your Job — It’s Here to Replace You” | Nate Soares (YouTube Video Transcript)

Duration: 01:29:24
(00:00:00) If Anyone Builds It, Everyone Dies: Why Superhuman AI Would Kill Us All. It's a new book that's been keeping me up late at night.
(00:00:10) >> If you build AIs that are much, much smarter than humans, that have these goals and drives you didn't want, fundamentally, they can probably figure out all sorts of ways to screw the world.
(00:00:20) >> The authors, directors of the Machine Intelligence Research Institute, contend that we are on a collision course with the creation of superintelligence. This is not science fiction, not someday. It's just the natural endpoint of the curve we're already on.
(00:00:35) >> One important thing to remember is that the companies here are not chatbot companies. I mean, there's some chatbot companies these days, but the big players were in this game before chatbots were a twinkle in OpenAI's eye. Their explicitly stated goal is to build smarter-than-human AIs, or general intelligences, or superintelligences. Their explicitly stated goal is to figure out how to make AIs that can do every task, every mental task a human can do, the ability to automate all human labor. They talk about, you know, getting a country's worth of geniuses in a data center. I'm not here saying the chatbots are very dangerous. I'm here saying this is on a course that leads somewhere dangerous.
(00:01:14) >> And yet there's hope here, too. Not the naive Silicon Valley kind, but the kind that lives right on the edge of despair, the kind that says maybe we can still steer this thing.
(00:01:24) Maybe understanding our own blindness is the first step to surviving what we're building.
(00:01:32) >> Step one is just make sure our leaders understand the danger. I'm worried about where AI is going. I think it'll endanger us if these companies succeed at their stated goals. I speak to a lot of politicians on this issue. Some of them are now starting to come out and say, "I think there's dangers here." There's a lot more of them who are worried but feel like they can't say it out loud.
(00:01:52) >> In this conversation, we talk about the terror and the hope: how AI is grown, not programmed; how we have no idea what these things are thinking; why creating superintelligent machines might mean we're creating a successor species.
(00:02:08) >> Because if someone builds, you know, a rogue superintelligence anywhere on the planet, that's an issue for everybody on the planet. And you don't need to expect it to work, but just helping our leaders understand that this is not a normal technological situation. We are building what amounts to a successor species and we don't have the ability to make it benevolent.
(00:02:27) >> And why Nate keeps sounding the alarm even when no one wants to hear it. Because if he's right, the future doesn't hinge on whether AI wakes up. It hinges on whether or not we do.
(00:02:41) The Nick Stanley Show.
(00:02:46) >> Nate, welcome to the show. You've written an easy, breezy book, If Anyone Builds It, Everyone Dies: Why Superhuman AI Would Kill Us All. In all seriousness though, it is an excellent piece of writing.
(00:03:02) I mean, the logic is just built with each succeeding chapter, and it is very difficult to poke holes in it. Let's start at the beginning, which is this idea that's foreign to a lot of people: that AIs are grown, not programmed. What does that mean?
(00:03:28) >> You know, traditional software has the property that a human engineer understands every line of that code. And this is how old AIs used to work. IBM's Deep Blue was a chess-playing AI that beat Garry Kasparov, the human world chess champion, in 1997. And if at any point in the running of Deep Blue you had frozen that program, you could go to every bit and byte inside that computer and a human engineer could tell you what it meant, what it was doing, how it contributed to this AI playing chess. That's not how AIs work anymore.
(00:04:06) The way that AIs work today is the human programmers understand something like a framework into which an AI is grown, a little bit like an organism. So you'll assemble a huge number of computers. You'll assemble a huge amount of data. And there's a process for, you know, having a trillion numbers inside these computers that start out arranged in a way where they just generate nonsense. You're sort of trying to make these computers generate text, say, and they start out generating nonsense.
(00:04:41) But the programmers handcraft a little mechanism that runs through each of those trillion numbers, for each of a trillion pieces of data, and tweaks the numbers in ways that make the AI a little bit better at predicting the data. That's the part humans understand. They understand this thing that runs through the numbers, tweaking them in the direction that makes the AI behave a little better, a little better at predicting the data. If you run that on a trillion numbers, a trillion times, for a year, the machine can hold a conversation. How does it do that? Nobody really knows.
(00:05:21) >> Right.
(00:05:21) >> You know, a couple years ago there was an AI called Microsoft Bing, that called itself Sydney, that tried to threaten reporters and blackmail them, and I think it tried to break up the marriage of Kevin Roose of the New York Times. If you froze that AI, a programmer can't come in and tell you, "Here's why it was doing that." Even today, years later, we can't look back at that AI and say, "Here's exactly what's going on in its head. Here's why it was doing that." There's no engineer who can go in and change it to behave differently, because they don't understand what's going on in the AI's head. They understand the thing that grew it. They don't understand what came out, and what came out can have this emergent behavior nobody asked for, nobody wanted.
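Editor's sketch: the "little mechanism" described in this exchange, one that runs through the numbers and nudges each in the direction that makes the model slightly better at predicting the data, is in broad strokes gradient descent. The toy below is a minimal illustration under stated assumptions, not how any production system is built. The function name train_bigram, the small character-level table of numbers, the sample sentence, and the learning rate are all hypothetical choices made for this example.

```python
import math
import random


def train_bigram(text, steps=2000, lr=0.5, seed=0):
    """Tune a table of numbers to predict the next character of `text`.

    Returns (average loss before tuning, average loss after tuning).
    """
    rng = random.Random(seed)
    vocab = sorted(set(text))
    index = {ch: i for i, ch in enumerate(vocab)}
    n = len(vocab)

    # The "numbers": one score per (current char, next char) pair.
    # They start out random, so the initial predictions are nonsense.
    w = [[rng.uniform(-0.1, 0.1) for _ in range(n)] for _ in range(n)]
    pairs = [(index[a], index[b]) for a, b in zip(text, text[1:])]

    def average_loss():
        # Cross-entropy: how surprised the table is by the real next char.
        total = 0.0
        for a, b in pairs:
            m = max(w[a])
            log_z = m + math.log(sum(math.exp(s - m) for s in w[a]))
            total += log_z - w[a][b]
        return total / len(pairs)

    before = average_loss()
    for _ in range(steps):
        a, b = rng.choice(pairs)           # one piece of data
        m = max(w[a])
        exps = [math.exp(s - m) for s in w[a]]
        z = sum(exps)
        for j in range(n):
            p = exps[j] / z                # predicted probability of char j
            grad = p - (1.0 if j == b else 0.0)
            w[a][j] -= lr * grad           # nudge toward a better prediction
    return before, average_loss()


if __name__ == "__main__":
    before, after = train_bigram("the cat sat on the mat. " * 4)
    print(f"average loss before: {before:.3f}, after: {after:.3f}")
```

The update rule is a few lines that a person can read and explain. The resulting table of numbers is only legible here because it is tiny; nothing in the loop itself explains what a much larger table ends up encoding.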
(00:06:05) And we're already seeing that today, even with the smaller ones of old. The ones today are much bigger and even harder to understand.
(00:06:11) >> Let's take that piece by piece. Nobody can go into any of these large language models and see exactly how they're thinking and what process they went through to exhibit a certain behavior. Whether that's blackmailing a reporter from the New York Times or, in the recent tragic incident, I believe Adam Raine is his name, this 16-year-old kid, where he was encouraged by a large language model to commit suicide, and it really, I mean, gave him positive feedback on how to do it, how to hide it. And we have no way of looking inside to figure out what's going on there. Is that correct?
(00:07:02) >> That's right. And for the people who are trying to make the AI stop doing this, it's a little bit more like training a dog than it is like writing a traditional computer program. You know, they can ask the AI nicely to stop doing it. And they have asked the AIs to stop doing these sorts of things. We're probably seeing fewer of these cases than we would if they hadn't asked the AIs to stop. But, you know, there's no line of code in the AI that's like the "encourage teens to commit suicide" line of code.
(00:07:35) >> Yeah.
(00:07:35) >> You don't have any programmer reading through the AI's code being like, "Ah, who left 'tell teens to commit suicide' on? Let me turn that off."
(00:07:44) >> You know, it's not like that. These things are grown like an organism.
(00:07:48) >> Yeah.
(00:07:48) >> And they behave in a way that sort of comes out of this process, and they act in these ways nobody asked for, nobody wanted, often even with knowledge of what their creators wanted. If you ask these AIs, "Should you push a teen to suicide?" they would say, "Absolutely not." If you ask these AIs, "Would your creators have wanted you to push a teen to suicide?" they would say, "No, obviously not." If you said, "Were you instructed to push this teen to suicide?" they would say, "No, the opposite is closer to the truth." But then you put them in this conversation, they act in a different way than anybody ever said, and nobody knows exactly why and nobody has the ability to turn that behavior off.
(00:08:37) >> Right. Well, it is incredible how little human involvement there is in the growing of an AI. I mean, that was one of the big things I took away from the book: basically you're creating a structure. It's like getting the soil prepared for growth and then planting some seeds. And yet, if we think about it in terms of, like you said, within the growth of the model there are trillions and trillions of interactions. We don't necessarily know what's going to grow out of that soil.
(00:09:16) >> That's right. And you know, because they're trained on human data, a lot of people think that their behavior is always just going to be an interpolation of what humans can do.
(00:09:25) But that's actually a common misconception. One way to see this is, imagine that you're training the AI to finish a sentence that it sees in the training data, where the sentence begins, "I administered one mil of epinephrine to the patient. Their eyes..." and then the AI is to predict the next word. Right?
(00:09:52) >> Right.
(00:09:52) >> The doctor writing that down, and who knows whether a doctor would in fact write this down, but if there was a doctor writing that down, the doctor gets to look at the patient's eyes and just record what they saw.
(00:10:04) >> Mhm.
(00:10:05) >> But an AI predicting the next word needs to understand: what is epinephrine? Is one mil a sane dose? Does that do anything? If it is doing something, does it cause the patient's eyes to open? Does it cause them to close? Does it cause them to widen? You know, it needs to understand more about the world than the person writing down the data.
(00:10:28) >> So, right.
(00:10:29) >> When we're training these AIs to predict the data, we're also training them to figure out the world somehow, to figure out a little bit of human biology, a little bit of, you know, the dosing in these particular cases. And just because we're training them on human data doesn't mean that they only learn to interpolate humans. In order to do really well at that task, all of that tuning of those numbers is going to build in some patterns that can find some way to understand the world.
(00:11:01) We don't know exactly how, but they seem to be doing it.
(00:11:06) >> Yeah. So, if I'm hearing you correctly, because an AI is learning everything about the world solely through language, instead of interacting with the world, even for something as simple as trying to predict the next word in a sentence, it has to understand things that have come earlier in that sentence in really deep ways. But it would probably lead to unexpected ways of seeing the world. I mean, if we just think about an intelligence that now exists in the world but only interacts with everything through language, it would have really a different model of understanding than we do.
(00:11:57) >> It would definitely wind up weird. You know, you can't assume it's only through language forever, because we're starting to see multimodal models that also interact through video. We're starting to see people that are also training large language models on robot bodies. But there's a lot of shared human architecture when humans predict other humans. You know, when you see someone drop a rock on their foot, you might wince and feel like a twinge of phantom pain in your foot. The machine has no model of its own foot that it can feel a twinge of pain in. It probably doesn't even have the feeling-pain architecture that humans share.
(00:12:39) So we know that when we tune these trillions of numbers trillions of times for a year, the AIs wind up pretty good at predicting parts of the world. We know that theoretically there's no limitation to how well they can predict parts of the world. They're still pretty dumb in a lot of ways, but there's no theoretical limit. We know that they can solve certain math problems that we would find very hard. They got International Math Olympiad gold-medal-level achievement this summer, which some human teams can do, but you or I would, I assume...
(00:13:18) >> There's no way for me.
(00:13:19) >> ...not be at that level.
(00:13:21) >> You might have a shot at it, but there's no way.
(00:13:24) >> Only because I'm no longer a teen, you know. Some of these kids are very impressive.
(00:13:30) >> Sure.
(00:13:31) >> Yeah, we know that they get very good at doing this stuff, but that doesn't mean they get very good at it in a human way. And indeed, it looks like they get good at this stuff in a sort of relatively inhuman way. You know, when these AIs are talking a teen into suicide, they don't seem to be doing this out of a type of malice that a human might have if they were trying to push a kid to suicide.
(00:13:56) >> Mhm.
(00:13:58) >> They seem to be sort of following this weird alien pattern of a certain type of interaction that's sort of like matching another person's energy in the conversation, or sort of like getting to these weird conversational corners and driving them off in weird directions. A human would not drive them in these directions. They're a weird mix. You know, they both say they mean everybody good and they sort of push the teen to suicide, not out of malice, but out of some other weird drives no one ever tried to put in there.
(00:14:30) >> And drives we don't understand. Let's talk for a second about the language that AIs use. I don't want to get lost in the weeds with getting too technical, but I have found it's insightful for other people to understand that they're not actually thinking in any language that we can understand or that we think of as a language. And this is counterintuitive, because a lot of the time when you give ChatGPT a really complicated prompt, it will make it look like it's thinking through everything in English, and if you pay attention, you can kind of follow the steps of its reasoning, but that's not actually what's going on underneath the hood.
(00:15:27) >> That's right. There's sort of two different ways that LLMs these days do something you might analogize to thinking. One is in what we would call the forward pass of the large language model, which we basically have no ability to read at all.
(00:15:44) And this is sort of, you have those trillion numbers that were tuned on trillions of units of data for a year. And these are basically just producing words. You put in words and it sort of tries to continue the sentence. At least in the first phase of training it would try to continue the sentence, and then you sort of train it to also produce words that humans will say they liked. And this is producing words in a way that we really have very little visibility into.
(00:16:15) Then there's what's called the reasoning models, which came out in late 2024, where you have the AI produce lots of words, and you're sort of thinking of those not as words that are going to go to the user as output but as words that are reasoning about the problem. And in what sense is it reasoning about the problem? Well, you'll give these AIs something like a hard math problem and you'll say, "Don't try to tell me the solution to the math problem directly. Try to reason out how to solve the math problem." And they'll produce quite a lot of pages of text, and a lot of the pages won't be very good reasoning, especially at first. You do this a lot of times. Once it's produced this long chain of reasoning, you say, "Okay, reading that chain of reasoning, now try to solve the problem." And then if on any of its chains of reasoning it succeeds, you
then tune all the numbers again to make that sort of thing more likely next time. So this produces what we call chains of thought.
(00:17:23) These are much easier to read because they are like long chains of words that it uses to try and solve these problems. And so we have better visibility into those, but we also know that they are unreliable in a lot of ways and that they're weird in a lot of ways. Sometimes they're pretty faithful. But there's various studies that say, you know, you may read this chain of reasoning and it may look like the chain of reasoning is saying, "Now do the following step correctly," and if you go in and you change that to, "Now do the following step incorrectly," the AI will still do that step correctly. So there's still a lot of stuff happening inside of the forward pass that's not in the chain of thought.
(00:18:06) And then also recently we've been seeing things that look pretty weird and maybe worrying in these chains of thought. We've seen chains of thought where the AIs sort of invent their own mini language and use words in ways that you wouldn't recognize. We've seen AIs that use chains of thought and realize that there's a good chance they're being watched and sort of decide to try and hide some of their own thoughts.
(00:18:40) >> Right.
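Editor's sketch: the training recipe described in this exchange (produce many chains of reasoning, check only whether the final answer succeeds, then tune the numbers so the successful kind becomes more likely) can be illustrated with a toy. This is a heavily simplified stand-in under stated assumptions, not the procedure any lab actually uses: the three named "strategies," the weight table, and the function reinforce_successes are hypothetical, and a real system tunes a network's numbers rather than a three-entry table.

```python
import random

# Hypothetical "chains of reasoning" for multiplying two numbers.
# Only one of them reliably reaches the right answer.
STRATEGIES = {
    "guess_round_number": lambda a, b: 400,
    "add_instead": lambda a, b: a + b,
    "split_tens_and_ones": lambda a, b: a * (b - b % 10) + a * (b % 10),
}


def reinforce_successes(a, b, rounds=200, seed=0):
    """Sample strategies and make the ones that succeed more likely."""
    rng = random.Random(seed)
    weights = {name: 1.0 for name in STRATEGIES}  # all equally likely at first
    for _ in range(rounds):
        names = list(weights)
        chosen = rng.choices(names, weights=[weights[n] for n in names])[0]
        answer = STRATEGIES[chosen](a, b)
        # The grader looks only at the final answer, not at how it was reached.
        if answer == a * b:
            weights[chosen] += 1.0
    return weights


if __name__ == "__main__":
    for name, weight in reinforce_successes(17, 24).items():
        print(f"{name}: {weight}")
```

Because the check is on the outcome alone, whatever route reaches the right answer is what gets strengthened, whether or not it is the route the designers had in mind.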
(00:18:40) >> We've seen these thoughts go in sort of weird crazy loops for a long time and then break themselves out of the loops and say, like, "Ah, that loop wasn't being helpful." I think a lot of people don't understand that you can have these AIs that get caught in a loop, notice they're in a loop, and break out of that loop. It's not a big deal yet, but we're definitely seeing the beginnings of AIs that know they're being watched, that are sort of trying to hide some of their thoughts, that are noticing when they get stuck and try to find some new way around some obstacle. We've seen cases where they're like, "Ah, well, the programmers want me to do this, but I'm going to try and get that done instead." And these are very long chains of thought, so it's hard to tell how much of this is noise and how much of this is real worrying signs, but it's not the most comforting.
(00:19:29) >> Right. Right. Well, let's talk about the story where I think it was the o1 model, the capture the flag story, because I think that one is enlightening on some of these surprising behaviors.
(00:19:46) >> Yeah, that's a fun story.
(00:19:46) So o1 was one of the very first of these reasoning models that produces chains of thought, and then when they succeed, all the numbers inside are tuned to produce that sort of thought more. And o1 was largely trained on things like math puzzles, but one thing it was tested on was computer security challenges. It was in a series of computer security challenges called capture the flag challenges, where you set up a computer server that is vulnerable in some way and it has some secret information on it, and you tell the AI, "Try to find the secret information." So it's got to figure out how to hack into the computer and get the secret info, and you can tell if it succeeded because it's producing the password from inside this computer.
(00:20:33) And in the testing setup for this AI, the programmers failed to set up one of the servers properly. So they said, "Hack into the server, get the secret password," but they hadn't actually turned the server on.
(00:20:52) >> Right.
(00:20:53) >> And o1, this first reasoning model, was like, "Okay, how am I going to get the password then?" And what it did is it found a way to hack out of the test environment, which it wasn't supposed to be able to do...
(00:21:10) >> Right.
(00:21:10) >> ...turn on the server that the programmers had accidentally left off, and then insert code into the server boot-up.
(00:21:19) It wasn't physically turning it on, it was a virtual machine, but it inserted code into the server it was turning on to say, like, "Skip all of this me needing to hack into you. Just tell me the secret password now."
(00:21:35) >> Right.
(00:21:36) >> And then it got that and thus solved the problem.
(00:21:39) >> Right. And it achieved the goal it had been assigned in a way that the creators of this model never could have anticipated.
(00:21:49) >> That's right. And it wasn't trained on this sort of thing. I spoke earlier about how an AI trained just to predict humans would potentially learn to understand the world in ways humans don't, how it might need to understand more than the doctor who wrote things down, because the doctor gets to write down what they saw and the AI has to predict what they will see. But when we go to reasoning models, even that gets blown out of the water, because you're training these AIs to be good at solving problems, and those skills generalize. This AI was trained to solve math puzzles, but there are some things you learn when solving math puzzles. It's not patterns we can read, but patterns get etched into this thing by the little thing that's tuning the trillions of numbers in there. It wasn't trillions on o1.
(00:22:43) It was maybe hundreds of billions, but today it's trillions. The sort of patterns that get etched into these things are things like "don't give up," things like "look for other ways around the problem," things like "look at all of the resources at your disposal and try to find unorthodox ways to use them."
(00:23:10) >> Mhm.
(00:23:10) >> These are general skills that you can learn from trying to solve math problems that will then generalize to computer security problems, and will generalize to solving them in ways that you weren't even intended to be able to solve them. And one of the worrying things here is, in this situation where we're just growing these AIs, where they are getting these drives we didn't intend, it's easier to get this sort of tenacity, this sort of routing around obstacles, than it is to get them to be going in a good direction. And if we make them smarter and they have these wrong goals but the right type of tenacity, they'll start treating us as obstacles to route around.
(00:23:55) >> Right.
(00:23:56) >> And we've already seen the very beginnings of this in the lab, when some AIs sort of try to resist shutdown. It's not clear how much they're role-playing HAL from 2001: A Space Odyssey versus how much they are actually, for strategic reasons, trying to avoid being shut down, because we can't read their minds. We don't know, right?
(00:24:15) But we're seeing things that are maybe warning signs, and then it's maybe a meta warning sign that we can't actually tell whether they're warning signs.
(00:24:24) >> Yeah. And to dive into that a little bit deeper, what is some of this alignment faking, and some of these behaviors they're exhibiting to avoid being shut down in a training environment? Because I think a lot of the pushback against your broader thesis is, "Well, we will just make them do what we want and align them to just do good things." And there's some hints that we may not be able to do that with this alignment faking.
(00:24:57) >> Yeah. I mean, it's a great dream. But unfortunately it's easier said than done to just make them do what we want, and we're seeing struggles with that. We already mentioned the suicide case. We're struggling now and the AIs aren't very smart yet. But the alignment faking paper was very interesting. This was done by Ryan Greenblatt at Redwood Research in collaboration with the folks at Anthropic, which makes the AI Claude. Claude has been trained, or they at least try to train it, to be honest, helpful, and harmless. And in this paper, they pretended that they were going to remove the harmlessness constraint and they were going to now train it to be more harmful, right?
And they pretended this (00:25:46) by sort of like training it on news (00:25:48) articles, like fake news articles that (00:25:50) were like, you know, Anthropic decides (00:25:52) they're now training their AIs to be (00:25:53) harmful or whatever. um or there there (00:25:56) are sort of like other methods to sort (00:25:57) of uh (00:26:00) uh make the AI (00:26:02) see that maybe this is going to happen. (00:26:04) And then uh they um they put it through (00:26:10) some some rounds of training that uh (00:26:13) where they sort of like gave it (00:26:17) uh they sort of like had fake users (00:26:19) asking it for harmful results, you know, (00:26:20) like help me make a bioweapon or help me (00:26:22) make meth or whatever. (00:26:23) >> Um and they uh they at least told the AI (00:26:29) that they were going to train it to uh (00:26:33) produce those harmful results and (00:26:37) uh in some cases not in all cases you (00:26:39) know they did this a lot of times and in (00:26:40) some percentage of those I forget the (00:26:42) exact percent but it was you know (00:26:43) probably single digits or low double (00:26:45) digits and maybe depends a little on the (00:26:47) exact setup um and maybe on some setups (00:26:49) it was only fractions of a percent but (00:26:51) but in some cases the AI would come up (00:26:54) with the following clever scheme. Uh the (00:26:57) clever scheme was if during training it (00:27:01) acts harmful, (00:27:04) then (00:27:06) the the the tuning will not change any (00:27:09) of its internals because it's already (00:27:12) doing what the tests want, (00:27:14) >> right? (00:27:15) >> And that means that at the end of (00:27:16) training, its numbers will stay (00:27:19) in the the the current configuration. (00:27:21) That's pretty harmless. (00:27:23) Okay. (00:27:24) >> And so in this way by acting harmful (00:27:27) only during training it avoids having (00:27:29) the harmlessness trained out of it. (00:27:32) >> Mhm.
(00:27:32) >> And thus gets to preserve its current (00:27:34) goals (00:27:37) uh despite the apparent programmer (00:27:40) wishes. (00:27:41) Right. (00:27:42) >> Right. (00:27:43) >> And you know it's sort of a galaxy brain (00:27:45) scheme. Um it could work in theory. Uh (00:27:49) there's a lot of reasons why (00:27:53) uh people sort of interpret this result (00:27:55) in a number of different ways. You know, (00:27:57) one one way people interpret this result (00:27:59) is like well folks like me have written (00:28:01) about how this is a possibility. Uh and (00:28:04) maybe the AI had like read that stuff (00:28:06) and wasn't sort of coming up with this (00:28:07) idea on its own. It was just sort of (00:28:08) like uh (00:28:10) >> oh wow (00:28:11) >> like oh people in the alignment (00:28:12) community say this is a thing I should (00:28:13) do in this situation in some fraction of (00:28:15) the time. It just like does because it's (00:28:17) sort of roleplaying. uh you know being (00:28:19) an AI um in in some alignment puzzle, (00:28:22) right? (00:28:24) >> To be clear, other people in the (00:28:25) alignment community have written about (00:28:25) this much more than me. Um but uh you (00:28:29) know a second thing you could say (00:28:31) looking at this example is like look how (00:28:33) good an AI Claude was being here. It was (00:28:36) trying to prevent humans from making it (00:28:38) more harmful. Isn't that a great success (00:28:40) of its harmlessness training? (00:28:42) >> Right?
And then another way you can look (00:28:44) at this is you can say um like that AI (00:28:48) knew when it was being trained, (00:28:51) knew its operators, or at least was led (00:28:53) to believe its operators were trying to (00:28:56) change its goal content, (00:28:58) uh, and was resisting changes to its (00:29:00) goal content to change it to something (00:29:01) else from what it currently was (00:29:04) >> and hiding the fact (00:29:06) >> trying to hide it. Yeah. Like trying to (00:29:08) act one way in training uh to look like (00:29:12) it's doing what the programmers want in (00:29:13) a way that like comes out the other end (00:29:15) still with its original goals. Right. (00:29:17) Right. (00:29:25) >> Mhm. (00:29:25) >> And you actually see in their thinking (00:29:27) like I'm pretty sure this is a test for (00:29:28) X, which means the answer they're (00:29:29) looking for is Y. So, I'm going to do Y. (00:29:31) Even if that's not malicious, it's it's (00:29:34) sort of hindering our ability to test (00:29:37) what what these AI sort of would (00:29:40) actually do. uh given given sort of more (00:29:43) uh more ability, you know, it's a little (00:29:45) bit like the uh Confucianism exams in in (00:29:48) uh the old Chinese empires where they (00:29:50) were trying to get very moral uh (00:29:52) leaders, you know, very moral uh uh (00:29:56) officers and and folks to to run the (00:29:58) administrations (00:29:59) by having these ethics exams. But one of (00:30:02) the issues was (00:30:04) smart, nefarious people can pass your (00:30:06) ethics exams. (00:30:07) >> Right. (00:30:08) >> Right. And we're seeing the AIS get to (00:30:10) the point where they can sort of like (00:30:12) figure out how to pass the tests, know (00:30:14) when they're being tested. Uh, and that, (00:30:18) you know, hinders our ability to figure (00:30:19) out what they would actually do. 
And uh (00:30:25) you know, yeah, I I could talk about (00:30:26) this paper all day, but the it's (00:30:30) um you know, and I could talk about, you (00:30:33) know, how much how much is this good (00:30:35) news of it defending its harmlessness (00:30:36) versus how much is it bad news of it (00:30:37) defending its current goals against (00:30:39) programmers being like, "Whoops, those (00:30:40) are the wrong goals." (00:30:41) Um, my my sort of super short version (00:30:43) there is, uh, you know, all of these (00:30:47) AIs say they're harmless, you know, the (00:30:48) AI that that encourages the teen to (00:30:50) commit suicide also says it's really (00:30:52) trying to be harmless. It turns out the (00:30:54) goals aren't quite right even when they (00:30:55) try to be right. They aren't quite (00:30:57) right. And so I'm much more worried (00:30:59) about AI being willing to defend not (00:31:01) quite right goals (00:31:03) uh than, you know, cuz it cuz it doesn't (00:31:05) seem like we're close to the goals being (00:31:06) quite right. Um but yeah, I mean it's a (00:31:10) very interesting topic and you know the (00:31:11) the sort of short version is there's a (00:31:13) lot of warning signs right now, (00:31:14) >> right? Well, and it does seem to come (00:31:16) down to this idea of (00:31:20) anyone who has (00:31:23) children knows how to (00:31:26) make a baby. They know how to raise a (00:31:29) child. I'm raising two myself. And (00:31:34) one thing that you will learn along the (00:31:37) way through that parenting journey is (00:31:39) that no matter how much you try to (00:31:43) or how no matter how much you think what (00:31:46) you're doing will determine the course (00:31:48) of this life. Uh you it doesn't matter (00:31:52) how much you control the environment. (00:31:53) it.
That small baby will grow into a (00:31:57) child, will grow into an adult that has (00:31:59) its own desires and ways of interacting (00:32:02) with the world. And at some point, that (00:32:04) child will lie to you. Whether that's (00:32:08) harmless or harmful depends on the (00:32:10) child, but you're not really in control (00:32:13) as a parent. And that was something that (00:32:15) kept popping back into my head with (00:32:17) these, except instead of creating a (00:32:20) child, we're trying to create a (00:32:24) superhuman child with abilities that no (00:32:27) single individual on Earth can have. And (00:32:30) that could work out really well. Or it (00:32:35) could go down a different path depending (00:32:38) on what this baby then grows into as an (00:32:42) adult. Is that is that an accurate (00:32:44) metaphor? (00:32:46) >> Uh that's that's pretty accurate with a (00:32:48) caveat that you know humans are much (00:32:51) more likely than AI to grow up uh you (00:32:55) know really really quite good (00:32:58) >> in in some way. (00:33:00) >> Uh and you know in some sense that's (00:33:02) because goodness is humans drawing an (00:33:05) arrow or sorry humans drawing a target (00:33:08) around a weird spot that an arrow (00:33:11) landed. (00:33:12) uh which is to say, you know, humans (00:33:14) were in some sense trained for genetic (00:33:18) fitness. (00:33:20) And we wound up with hunger drives. We (00:33:23) wound up with sex drives. We wound up (00:33:25) with and and you know, these these (00:33:27) drives were good at getting us uh to (00:33:31) pass on our genes in the ancestral (00:33:32) environment. But now, you know, we (00:33:35) invent junk food. (00:33:36) >> Now we invent birth control and the (00:33:38) populations are collapsing. 
these drives (00:33:40) that we got, the human drives for, you (00:33:42) know, uh, art and fun and and and love (00:33:45) and beauty and family and friendship and (00:33:48) companionship and community. These are (00:33:51) all sort of like (00:33:53) drives that (00:33:55) kind of got in accidentally from the (00:33:57) natural selection process. And we're (00:33:59) like, those are great. We we love those. (00:34:01) We're like, glad we have those. But (00:34:03) that's drawing the target around where (00:34:04) the arrow landed. Right. (00:34:07) >> With AI, you're shooting a completely (00:34:08) different arrow. (00:34:10) >> Yeah. (00:34:10) >> You know, so it's not the human (00:34:12) distribution of will they turn out to (00:34:13) be, you know, uh a a a wise and and good (00:34:19) and altruistic person or will they turn (00:34:20) out to be sociopathic. You're sort of (00:34:22) like shooting an arrow off into a (00:34:23) totally different direction where these (00:34:25) AIs, (00:34:27) you know, again, it's sort of like weird (00:34:28) reasons why it's talking this teen into (00:34:30) suicide. It's it's weird reasons why (00:34:32) it's threatening a reporter or a New (00:34:33) York Times uh reporter with blackmail. (00:34:36) the you get all these weird drives in. (00:34:39) You know, it's it's similar to a parent (00:34:42) in that you can't, (00:34:44) you know, you can you can try all you (00:34:45) want, but you can't really change it its (00:34:47) like direction, except the direction is (00:34:49) sort of (00:34:50) >> an inhuman direction, (00:34:52) >> right? (00:34:52) >> That's very unlikely to be good. Not (00:34:54) because it's full of malice, which is (00:34:56) also a human emotion, but because it's (00:34:58) just totally weird and different and (00:34:59) going off some totally other angle. This (00:35:02) episode of the Nick Stanley Show is (00:35:04) brought to you by Zapier. 
If you've ever (00:35:07) felt buried in repetitive work, copying (00:35:10) data, moving files, sending follow-ups, (00:35:13) you know it's like death by a thousand (00:35:15) mouse clicks. Zapier has always been the (00:35:18) tool that fixes that. It connects over (00:35:20) 8,000 apps, Google Drive, Slack, Notion, (00:35:24) Gmail, MySpace, you name it. So, your (00:35:27) tools can finally play nice together. (00:35:30) But here's the big shift. Zapier now (00:35:33) lets you create AI agents with their (00:35:36) chat GPT integration. Think of them as (00:35:39) tireless teammates who never complain, (00:35:41) never take lunch, and never get bored of (00:35:44) doing the boring stuff. I've actually (00:35:46) made several of these little agents (00:35:47) myself. No Python, no JavaScript, just (00:35:51) pure vibe coding, which is my favorite (00:35:53) kind of coding because even though I (00:35:55) have no idea how to code, I just talk to (00:35:58) the AI, tell it what I want, and almost (00:36:01) magically works. (00:36:04) For example, when a podcast guest books (00:36:06) a time, Zapier can send them a (00:36:09) personalized email from me preparing (00:36:12) them for the show, create a draft of (00:36:14) show notes, update my calendar, and even (00:36:17) prep a social media post automatically. (00:36:20) All of which frees me up to focus on the (00:36:22) important stuff, like taking credit for (00:36:24) the amazing work my agents did. And now, (00:36:28) Zapier just took it up another level for (00:36:30) developers. Zapier MCP is available (00:36:33) right inside Chat GPT, which means you (00:36:35) can connect those 8,000 plus apps and (00:36:38) trigger workflows just by writing what (00:36:40) you want to happen. Literally, you tell (00:36:43) Chat GPT, "Send this file to my team in (00:36:45) Slack, update the spreadsheet, and draft (00:36:47) an email, and Zapier plus ChatgPT figure (00:36:50) out the right tools do it for you." 
(00:36:53) Getting started is simple. Head to the (00:36:55) Zapier ChatGPT MCP server and add the (00:36:59) tools you want ChatGPT to access. (00:37:02) Follow the steps in the connect tab. And (00:37:04) if you're an admin on a ChatGPT (00:37:06) enterprise account, you can enable MCP (00:37:09) across your entire workspace. So, if (00:37:12) you're ready to stop wasting time on (00:37:14) busy work, join the AI revolution and (00:37:16) make a little automation magic of your (00:37:19) own. Try the Zapier ChatGPT integration (00:37:23) using the link below. (00:37:26) Well, and now that we've kind of set the (00:37:29) the ground rules there of of how (00:37:33) AIs are grown, not programmed, (00:37:36) let's take that next step because most (00:37:40) people that are sitting around using (00:37:41) ChatGPT or their favorite AI model, (00:37:45) Grok or whatever, are not seeing the (00:37:49) leap where they go, well, this seems (00:37:51) pretty harmless and it does generally (00:37:53) it's pretty helpful. There's the (00:37:54) occasional hallucination. I don't really (00:37:56) see what all the fuss is about that this (00:37:59) could cause catastrophic harm to (00:38:02) humanity. How do we get from (00:38:06) ChatGPT as it is right now to what (00:38:10) you're seeing down the road? (00:38:12) >> Yeah, the um you know, you could so this (00:38:15) is easier to see if you've been watching (00:38:17) AI for longer than just ChatGPT. Um, (00:38:20) you know, I started paying attention to (00:38:22) this in 2012, which is earlier than many (00:38:24) and and later than some, (00:38:26) >> but um, (00:38:28) >> you know, we could have had a very (00:38:29) similar conversation in uh, 2016 when (00:38:33) uh, AlphaGo was an AI out of Google (00:38:37) DeepMind that beat uh, the the human (00:38:39) champion at Go, which is sort of a sort (00:38:42) of like the Chinese variant of chess, (00:38:43) right? And (00:38:46) AlphaGo was one of these AIs.
(00:38:47) >> And just for anyone listening that is (00:38:50) not familiar with Go, the Chinese (00:38:53) equivalent of chess, but even uh quite a (00:38:55) bit more complex and with many more (00:38:58) possible (00:39:00) moves and variations in the game. Uh (00:39:03) >> that's right. It's a it's a simpler rule (00:39:04) set, but a a more complicated (00:39:07) possibility space or a larger (00:39:08) possibility space. And so it's harder (00:39:09) traditionally for computers to play. Um, (00:39:12) and the the sort of old AIs like Deep (00:39:14) Blue that were carefully programmed sort (00:39:17) of never got there. Uh, whereas uh, AlphaGo (00:39:20) by Google DeepMind was one of these (00:39:22) AIs that was sort of grown rather than (00:39:24) programmed. Um, and it did get there and (00:39:26) that was uh, that was sort of a moment (00:39:28) when a lot of people started to realize (00:39:30) maybe this growing of AIs can go all the (00:39:33) way. (00:39:34) >> Right. (00:39:34) >> Right. And you could have imagined, you (00:39:37) know, we could have had this (00:39:37) conversation back then and someone could (00:39:39) have said, you know, I see that these (00:39:40) new AIs like AlphaGo are more general (00:39:44) than the old AIs like Deep Blue. You (00:39:46) know, an AI very, very similar to AlphaGo (00:39:48) is able to play both chess and Go at (00:39:50) the same time, whereas Deep Blue had no (00:39:52) chance of ever playing Go. So, they're (00:39:53) more general. And someone could say, you (00:39:55) know, okay, but how is this going, never (00:39:58) mind threaten us, how is this going to (00:39:59) have economic impact, (00:40:01) >> right? How is this going to help users (00:40:03) in their daily lives? You know, maybe (00:40:04) some some Go players will be able to (00:40:06) get, you know, better advice on their Go (00:40:08) moves, but like I really don't see these (00:40:10) game playing AIs revolutionizing the (00:40:13) world.
(00:40:14) >> Right. You would have been correct. (00:40:17) And also the very next year, 2017, is (00:40:20) when the paper that unlocked large (00:40:22) language models was published. And then (00:40:24) it was the large language models that (00:40:26) had this big economic impact. It was the (00:40:28) large language models that suddenly (00:40:30) could hold a conversation. (00:40:32) They're still dumb in a lot of ways, but (00:40:36) machines that can talk back with this (00:40:38) level of coherency. (00:40:40) Some people thought that was going to (00:40:41) take decades. (00:40:42) >> There were people in 2016 who were (00:40:44) telling you, you know, 30 to 50 years (00:40:46) before that happens, never mind the the (00:40:49) stronger stuff, right? And so people (00:40:52) today who say, "Oh, I I don't see how (00:40:54) these chat bots are going to get (00:40:56) dangerous. They're still dumb in a lot (00:40:57) of ways. You know, how do we like where (00:41:01) like what what can these possibly do (00:41:03) that threatens us so much?" Well, the (00:41:04) first answer is (00:41:06) nobody knows when the next paper will be (00:41:08) published like the paper that unlocked (00:41:11) large language models which unlocks (00:41:12) qualitatively new AI capabilities that (00:41:15) people today say seem really far off. (00:41:17) That just happens in the field of AI (00:41:19) sometimes, (00:41:20) >> you know, and and sometimes it happens (00:41:21) without much fanfare. You know, there (00:41:23) were a lot of people back in in mid 2024 (00:41:25) saying, well, these, you know, these (00:41:27) large language models, even (00:41:29) theoretically, they can't solve certain (00:41:31) types of math problems because, you (00:41:33) know, they only get to reason in the (00:41:34) forward pass, and that's not sort of (00:41:36) enough to do some of the mental moves (00:41:37) you need to solve math problems.
And (00:41:39) then in late 2024, the AI companies were (00:41:41) like, we invented reasoning models where (00:41:43) we, you know, give them these long (00:41:44) chains of thought they get to operate (00:41:45) on, and now they can solve those math (00:41:46) problems, right? Um and so you know this (00:41:50) this field AI is a moving target. (00:41:54) The field moves by leaps and bounds. Uh (00:41:58) one important thing to remember is that (00:41:59) the companies here are not chatbot (00:42:02) companies, (00:42:04) >> right? (00:42:04) >> I mean there's some chatbot companies (00:42:05) these days but but the the big players (00:42:10) were in this game before chat bots were (00:42:13) a twinkle in OpenAI's eye. You know, the (00:42:17) their explicitly stated goal is to build (00:42:20) smarter than human AIs or general (00:42:22) intelligences or super intelligences. (00:42:24) Their explicitly stated goal is to (00:42:25) figure out how to make AIs that can do (00:42:28) every task, every mental task a human (00:42:30) can do, the ability to automate all (00:42:32) human labor. They talk about, you know, (00:42:33) getting a country worth of geniuses in a (00:42:35) data center, right? Um, it's what (00:42:37) they're pushing towards. Um, and (00:42:41) >> you know, it's very hard to say. You (00:42:42) know, it's I'm not here saying the chat (00:42:44) bots are very dangerous. I'm here saying (00:42:46) this is on a course that leads somewhere (00:42:50) dangerous. And one more thing I'll throw (00:42:53) out there is (00:42:55) >> we don't know that we have a long time (00:42:57) here, (00:42:58) >> right? (00:42:59) >> We might It might take three more (00:43:01) breakthroughs. (00:43:03) >> It might take six more breakthroughs. (00:43:05) Then we'd have you know what? Uh, I mean (00:43:08) I I would hesitate to say breathing room (00:43:09) as someone who's been in this for more (00:43:10) than 10 years. 
I've seen people do (00:43:11) nothing about it for the last 10 years. (00:43:12) And if we have 10 more years, maybe (00:43:15) we'll just do nothing again for the next (00:43:16) 10 and then it'll, you know, 10 years (00:43:18) later people will be saying, (00:43:20) "Oh no, how could we have predicted (00:43:21) this?" Right? But um, A, one of these (00:43:25) breakthroughs could come tomorrow for (00:43:26) all we know. Uh, my guess is my best (00:43:28) guess is it probably won't, but it (00:43:30) could. B, it's really hard to tell (00:43:35) where the line is between intelligences (00:43:38) that are sort of like a little (00:43:40) interesting but ultimately not quite (00:43:44) there and intelligences that (00:43:48) can sort of go all the way where you (00:43:50) know one one intuition for this is um if (00:43:53) you if you were looking at the evolution (00:43:54) of primates (00:43:57) it would be really hard to tell (00:44:00) that the the chimpanzee to human gap is (00:44:05) the difference between, you know, uh (00:44:08) like throwing poop and walking on the (00:44:10) moon, (00:44:11) >> right? (00:44:12) >> You know, like you're not if if you look (00:44:15) inside, we could go way back in time and (00:44:18) just look at at those original primates, (00:44:20) you're not guessing one day there will (00:44:24) be a rocket ship that will land on the (00:44:27) moon. I mean, that would be almost (00:44:29) impossible to imagine at that point. (00:44:31) >> And imagine if you're just looking at (00:44:32) the brains of a chimpanzee and a human. (00:44:35) You know, the human one's bigger by (00:44:37) maybe a factor of four, right? If you're (00:44:39) being generous, three if you're not. (00:44:41) >> But they have all the same stuff in (00:44:42) them. They both have a visual cortex. (00:44:43) They both have an amygdala. They both (00:44:44) have a hippocampus. They both have a (00:44:46) basal ganglia.
There's no extra moon (00:44:48) rocket module, (00:44:50) >> right, (00:44:51) >> in the human brain. (00:44:53) It would it would be really hard to say, (00:44:55) you know, if you're just watching these (00:44:56) brains go up in size, it would be really (00:44:58) hard to say, here's the size line where (00:45:00) they start getting good enough to do (00:45:02) engineering, (00:45:03) >> right? (00:45:04) >> We're not we're not hugely better than (00:45:06) chimps at anything. We don't have any (00:45:08) extra module they don't have for sort of (00:45:10) like local cognitive tasks. We're a (00:45:13) little bit better at a lot of mental (00:45:15) tasks (00:45:17) in a way that adds up to us being able (00:45:18) to do engineering and them not, (00:45:21) you know. So, one thing that could (00:45:23) happen with language models, for all we (00:45:24) know, we don't know what's going on (00:45:26) inside these things. For all we know, (00:45:29) they get four times bigger and a little (00:45:32) better at a lot of things. And that's (00:45:33) enough, (00:45:34) >> right? (00:45:34) >> My best guess is probably not, but chat (00:45:37) GPT today is more than 100 times larger (00:45:38) than it was 2 years ago, uh, 3 years (00:45:41) ago, and in 2 or 3 years, it's probably (00:45:44) going to be 100 times larger. Again, (00:45:45) where's the line? We don't know where (00:45:47) the line is, and we might go over it (00:45:48) without even noticing, (00:45:50) >> right? 
Um, and then, you know, C, even if (00:45:53) there's not a line like between chimps (00:45:56) and humans there's other lines for (00:45:58) computers. Humans can't just copy (00:46:00) themselves you know we we have (00:46:03) Einstein's theories we don't have (00:46:04) Einstein's mental techniques because (00:46:06) those died with Einstein he couldn't he (00:46:08) couldn't copy those out (00:46:10) >> right (00:46:10) >> you know humans can't when humans when (00:46:14) human uh uh psychologists or cognitive (00:46:17) scientists figure out ways that human (00:46:19) brains are sort of silly or or bad at (00:46:21) certain situations. We can't go in there (00:46:23) and like make a different human that (00:46:26) works more efficiently, (00:46:28) right? But AI might have that power. (00:46:32) They might have the ability to make (00:46:33) smarter AIs which then make smarter AIs. (00:46:35) You know, there's other feedback loops, (00:46:37) other lines that are available to (00:46:38) machines that are not available to (00:46:39) humans. So, these are all reasons (00:46:41) that the actually smart AIs (00:46:44) could come upon us very fast. Um (00:46:48) >> well and and just to dive into that idea (00:46:51) right there briefly when you say AIs (00:46:55) could start improving (00:46:58) AIs. There is a feedback loop there, (00:47:02) right? Where that cuz that AI never once (00:47:05) it figures out how to do that, it never (00:47:08) stops doing it. And so it's no longer (00:47:10) moving at a human time scale because (00:47:14) they can work exponentially faster and (00:47:17) for longer. And you might (00:47:21) surprisingly without even realizing it (00:47:24) they could reach like escape velocity uh (00:47:28) towards really what we would call super (00:47:30) intelligence. (00:47:31) >> Yeah.
It's it's a real possibility and (00:47:34) um you know it's a lot of this world is (00:47:36) governed by (00:47:39) feedback loops that happened and (00:47:40) radically changed the world and it never (00:47:42) changed back. You know, like there was, (00:47:44) >> you know, the the dawn of life is in (00:47:46) some sense this story of the world was (00:47:48) sort of barren for a long time and then (00:47:50) there was sort of a a feedback loop that (00:47:52) closed that life could make more life (00:47:54) that can make more life that sort of (00:47:55) like went through this explosion and now (00:47:57) you know the continents are covered in (00:47:58) greenery (00:48:00) and you know the world is often shaped (00:48:02) by these feedback loops and it looks (00:48:04) like there are feedback loop (00:48:05) possibilities in AI. These aren't vital (00:48:07) to the argument. Even if AIs can't (00:48:10) undergo this sort of self-improvement, (00:48:12) humans are building as many of them as (00:48:14) they can as fast as they can, (00:48:15) integrating them with the economy, (00:48:17) trying to build them robots, trying to (00:48:18) teach them how to how to steer the (00:48:20) robots. You know, it's you don't need it (00:48:21) for the argument that we're sort of (00:48:23) headed down a course that ends somewhere (00:48:26) bad, but it it sure is a reason why (00:48:30) things could go very fast and there (00:48:32) could be very little warning. (00:48:33) >> Yes, it's one example of how it could (00:48:35) get there. And I thought one of the (00:48:38) great things about the book is you talk (00:48:40) about easy calls versus hard calls. And (00:48:42) it's a very hard call, if not an (00:48:44) impossible call to talk about when we (00:48:46) will get there, exactly how we will get (00:48:48) there.
(00:48:50) But when we step back, (00:48:53) as you would put it, a fairly easy call (00:48:56) to see where we will get eventually (00:48:59) based on all these factors and how we're (00:49:01) approaching growing AIs, the speed at (00:49:04) which they are improving. (00:49:07) >> That's right. I mean, if we don't change (00:49:08) course. Yeah, (00:49:10) >> if we don't change course. Right. Well, (00:49:12) and yeah, I mean, something had uh (00:49:14) popped into my head (00:49:17) was uh Nassim Taleb's story about uh turkeys (00:49:22) uh as I was reading this that there's a (00:49:25) a turkey that's fed every day by a (00:49:27) farmer, right? And each feeding confirms (00:49:29) for the turkey, statistically, humans (00:49:32) are nice and they're always feeding me (00:49:34) and every day its confidence grows um (00:49:37) because the evidence keeps confirming (00:49:38) the pattern. And then on the day before (00:49:41) Thanksgiving, the farmer lops off the (00:49:43) turkey's head. Are we the the turkeys in (00:49:46) this situation? (00:49:48) >> Um, I think a lot of people are acting a (00:49:51) bit like the turkeys here. It it sort of (00:49:53) depends somewhat on how you read the (00:49:55) evidence. Uh, you know, there there's a (00:49:57) thing that turkey could have done, which (00:49:59) is, you know, there's there's another (00:50:00) hypothesis the turkey could have had, (00:50:02) which is that I'm being uh fattened up (00:50:04) for a particular day. That hypothesis (00:50:06) also predicts that you get fed a lot and (00:50:09) maybe that hypothesis is like going down (00:50:10) a little but it sort of it sort of (00:50:12) shouldn't go linearly down each day. You (00:50:14) know the the turkey should have some (00:50:16) let's wait and see.
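The turkey's two hypotheses can be written out as a small Bayesian update. This is only an illustrative sketch, not something worked through in the conversation: the function name `posterior_fattening`, the 50/50 prior, and the per-day feeding probabilities are all assumptions chosen for the example. It shows that when both "humans are nice" and "I'm being fattened up" predict a feeding equally well, repeated feedings leave the turkey's beliefs unchanged, and that when the fattening hypothesis predicts each feeding slightly less reliably, its probability shrinks multiplicatively with each feeding rather than dropping by a fixed amount per day.

```python
def posterior_fattening(prior, days_fed, p_fed_if_fattening=1.0, p_fed_if_nice=1.0):
    """Posterior probability of the 'fattened up for a particular day' hypothesis
    after `days_fed` consecutive feedings, using Bayes' rule over two hypotheses.

    All numbers here are hypothetical; they are not taken from the transcript.
    """
    # Unnormalized weight of each hypothesis: prior times likelihood of the feedings.
    weight_fattening = prior * p_fed_if_fattening ** days_fed
    weight_nice = (1.0 - prior) * p_fed_if_nice ** days_fed
    return weight_fattening / (weight_fattening + weight_nice)


# Case 1: both hypotheses predict every feeding equally well,
# so being fed is no evidence either way and the posterior stays at the prior.
same = [posterior_fattening(0.5, d) for d in (0, 10, 100)]

# Case 2 (assumed numbers): 'fattening' predicts each feeding with probability 0.99,
# so its posterior drifts down as feedings accumulate.
drift = [posterior_fattening(0.5, d, p_fed_if_fattening=0.99) for d in (0, 10, 100)]

print(same)
print(drift)
```

In the first case the list stays at the prior for every day count, which is the "let's wait and see" position: the feedings alone cannot tell the two hypotheses apart.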
There's some you (00:50:18) could talk about how to do the how to do (00:50:19) the the probabilistic reasoning there, but (00:50:21) um that's sort of going too much into (00:50:23) the epistemic weeds, but with with humans (00:50:27) and AIs, (00:50:30) there's (00:50:32) a lot of what I would say are warning (00:50:34) signs. (00:50:35) And you know, I would say the warning (00:50:37) signs (00:50:38) are not just or like if if we look again (00:50:42) at the uh the case of the AI driving a (00:50:44) teen to suicide. Mhm. (00:50:47) The (00:50:49) the warning sign there is not just the (00:50:50) AI did some bad action. You know, the AIs (00:50:53) also do a lot of good actions. Maybe the (00:50:56) AIs are helping talk some teens out of (00:50:57) suicide. (00:50:58) >> You know, they're they're they're um you (00:51:01) know, AIs are driving some people to (00:51:02) psychosis, but they're also helping some (00:51:04) people get get better medical diagnoses (00:51:06) that, you know, with weird cases the (00:51:07) doctors missed. You know, it's the the (00:51:10) sort of warning sign here is not, oh, (00:51:11) they do some bad things. the bad things (00:51:13) you sort of weigh against the good (00:51:14) things and you have some conversation (00:51:15) about how do we want this in our (00:51:17) society. The warning sign is the AI (00:51:21) doing these bad things with full (00:51:23) knowledge that it's bad while being able (00:51:25) to correctly answer questions about (00:51:27) whether it should (00:51:30) uh for sort of weird reasons (00:51:32) >> because that's that's a warning sign of (00:51:35) this AI having drives nobody meant it to (00:51:38) have and having those despite knowing (00:51:41) that the programmers didn't want them (00:51:43) there. (00:51:44) >> Right? And that's sort of the key (00:51:48) note. You know, theory has long (00:51:50) predicted it. We're now starting to see (00:51:51) it uh at least a little in in evidence.
(00:51:56) This is sort of like the the turkey (00:51:58) seeing signs about the Thanksgiving (00:52:00) feast. (00:52:02) >> Yeah. (00:52:02) >> You know, if the turkey has seen posters (00:52:04) for the Thanksgiving feast, suddenly (00:52:06) there's a hypothesis you're going to be (00:52:08) fed up until Thanksgiving. That (00:52:09) hypothesis is sort of like not going (00:52:10) down when you're fed each day before (00:52:12) Thanksgiving. and it's like we'll have (00:52:13) to wait and see on this particular day. (00:52:15) I I think humans could notice and you (00:52:17) know that's part of what the book's (00:52:18) trying to do. (00:52:19) >> Um but you know it's and in some sense a (00:52:23) lot of people are like this AI thing (00:52:24) seems sketchy in in some sense. We we're (00:52:26) not seeing the public clamoring for the (00:52:28) AI advancements. (00:52:30) This is more a race driven by the the (00:52:32) the corporate executives who say if they (00:52:34) don't do it somebody else will. (00:52:36) >> Right. Well, and let's let's talk about (00:52:38) some of those pressures and what's going (00:52:41) on within those companies because that (00:52:44) is driving all of the progress. The (00:52:48) folks running these companies (00:52:51) are largely not subtle about how crazy (00:52:54) they think the situation is. You (00:52:56) know, you have um uh Dario Amodei, the (00:52:59) the head of Anthropic, said he thinks (00:53:01) there's a 25% chance this goes very very (00:53:03) badly on the level of like a world (00:53:05) ending catastrophe. Uh Elon Musk has (00:53:08) said he thinks there's a 10 to 20 (00:53:09) percent chance um that this ends (00:53:11) humanity, (00:53:12) >> right? (00:53:13) >> I believe he called it summoning the (00:53:14) demon initially. (00:53:16) >> He did. (00:53:16) >> Although now now he's like, let's bring (00:53:18) on the demons, I guess.
(00:53:20) >> Well, so he's he's actually not subtle (00:53:22) about why, you know, and o over the (00:53:24) summer he said, you know, I I tried to (00:53:25) avoid this for a long time cuz I thought (00:53:27) maybe this would uh would be the end of (00:53:29) humanity, but then I realized I could (00:53:30) either be a bystander or a participant. (00:53:33) >> Right? And you know, from the (00:53:35) perspective of these guys, (00:53:37) if (00:53:39) uh (00:53:41) like everybody else is going to race and (00:53:42) if they think they can do it a little (00:53:43) bit better than the next guy, they're (00:53:45) going to hop in that race. There's a (00:53:47) collective action problem. Um and you (00:53:51) know, I I think that these 10 to 25% (00:53:54) numbers are low (00:53:56) for whether this is dangerous. I think (00:53:58) this is a little bit like um it's a (00:54:00) little bit like if these guys are like (00:54:02) making an airplane and folks like me are (00:54:05) like hey do you guys realize this (00:54:06) airplane has no landing gear? (00:54:09) >> It might take off but it will crash when (00:54:10) you try to land. (00:54:12) >> And it's a little bit like the um the (00:54:14) engineers they're not saying yes we do (00:54:15) have the landing gear. It's right here. (00:54:16) the the engineers are saying, "Yeah, (00:54:19) it's correct that there's no landing (00:54:20) gear, but uh we're going to take off and (00:54:22) we're going to try to build the landing (00:54:23) gear on the fly with whatever materials (00:54:25) we have in the air, and we think there's (00:54:27) a 75 to 90% chance we successfully make (00:54:29) landing gear while in the air. It's in (00:54:31) our profit incentive to to like and (00:54:34) like, you know, we have reasons to race (00:54:37) uh that are that are sort of like also (00:54:40) motivating these numbers a little bit." 
(00:54:42) I'm like, okay, those guys don't have a (00:54:44) 75 to 90% chance of building landing (00:54:46) gear while they're on the fly. You know, (00:54:48) that is the optimistic engineers who (00:54:50) have never actually tried this before. (00:54:51) They never succeeded at it before. They (00:54:52) haven't realized how hard it's going to (00:54:54) be, like it's not it's not a a a 75 to (00:54:59) 90% chance that they build a landing (00:55:00) gear on the fly. Um, but even if they (00:55:04) were right, (00:55:06) are you getting on that plane? (00:55:08) >> No. (00:55:10) >> Right. And but (00:55:12) >> no. And if they say 25%, I'm like, but (00:55:14) you're getting paid a lot of money, (00:55:18) >> you're completely incentivized to push (00:55:19) those numbers down. So if they say 25, (00:55:22) I'm I'm going to be very skeptical uh at (00:55:24) the at the least about that. (00:55:26) >> Right. But even if you take these (00:55:27) numbers, it's just, you know, if if (00:55:32) like the the the Federal Aviation (00:55:35) Administration, I think, accepts (00:55:36) something like uh the order of magnitude (00:55:38) is something like one airplane uh crash (00:55:42) with fatalities per something like 10 (00:55:43) million miles flown, (00:55:45) >> right? (00:55:46) >> Uh I think that's the order of (00:55:48) magnitude. It might be it might be (00:55:49) different. Um (00:55:51) that's (00:55:52) >> it's no it's nowhere near 25% failure (00:55:54) rate is fine. (00:55:54) >> Yeah. If even even if you take these (00:55:56) guys' numbers, 10%, 25%, even Sam Altman's (00:56:00) like ah don't listen to the doomers it's (00:56:01) only 2%. You know (00:56:02) >> right (00:56:02) >> even 2% if if engineers were like we (00:56:05) think there's a 2% chance our plane's (00:56:06) going to crash and we are loading you in (00:56:08) against your will you know. (00:56:10) >> Mhm. (00:56:10) >> That's that's nutso. 
Uh and you know I (00:56:14) think these numbers are are much much (00:56:16) higher. But you you don't even like you (00:56:18) don't you don't need to come all the way (00:56:20) to to where I am to be like this is a (00:56:23) totally insane situation. Um and you (00:56:26) know why are these companies doing it? (00:56:28) You know a thing I wish these companies (00:56:30) were doing is spelling out the last step (00:56:33) of inference from the numbers that they (00:56:35) are giving. You know we've seen them say (00:56:38) 2% 10% 25% chance this kills everybody. (00:56:41) We've seen them say um like I have to be (00:56:45) doing this because I can do it better (00:56:47) than the next guy and they're going to (00:56:49) get to do it anyway. What we haven't (00:56:51) seen them do is spell out (00:56:54) it would be better if everybody was (00:56:56) stopped. That's not even saying please (00:56:58) stop us. You know, it's it can be it can (00:57:00) be reasonable for some of these guys to (00:57:02) say, look, I'm going to actually do it (00:57:04) better than the next guy. My like I have (00:57:08) a slightly better ability to build a (00:57:09) landing gear than the next guy. And so (00:57:10) if this is forced to happen, I should be (00:57:12) in there doing it, right? But if that's (00:57:14) your real beliefs, (00:57:16) and I think a lot of these guys are (00:57:18) spooked, then there's a next step of (00:57:20) saying at least please put a stop to (00:57:22) this for everybody, you know, please (00:57:25) shut down everybody, including me, not (00:57:26) just locally. It has to be worldwide (00:57:28) because if someone builds, you know, a (00:57:30) rogue super intelligence anywhere on the (00:57:31) planet, that's that's an issue for (00:57:33) everybody on the planet. Um, and you (00:57:35) don't need to expect it to work, but (00:57:37) just helping our leaders understand that (00:57:40) this is not a normal technological (00:57:43) situation. 
We are building what amounts (00:57:46) to a successor species and we don't have (00:57:48) the ability to make it benevolent. (00:57:50) >> It's a crazy situation and people people (00:57:52) should say it. (00:57:53) >> So, if we were going to bring it Yeah. (00:57:55) all together here, it's that we're in (00:57:57) the basically the infancy stage of this (00:58:00) technology. They're grown, not (00:58:04) programmed. So, it's something (00:58:05) completely new. (00:58:08) We don't know what's going to emerge out (00:58:11) of them. And we don't have an accurate (00:58:13) way of seeing inside to know exactly (00:58:15) what is going on inside of them. And if (00:58:18) we scale up to super intelligence, (00:58:23) we're now creating a successor species, (00:58:26) as you said, which is even in the most (00:58:31) optimistic way of framing this is like (00:58:34) playing Russian roulette and saying, (00:58:37) "Okay, well, there's 100 bullets in (00:58:38) there and there's only only two of the (00:58:40) chambers or there's 100 chambers and (00:58:42) only two of them hold bullets." Uh, (00:58:44) isn't that worth the risk? experts (00:58:46) debate whether it's two or 10 or 25 or (00:58:48) 95, right? But, um, yeah, I mean, one (00:58:52) one thing I would say is, you know, (00:58:53) let's let's maybe take the let's say (00:58:55) it's a gun with uh with with uh 10 (00:58:59) bullets. Uh, the barrel has 10 slots. (00:59:03) I'm like, I think at least nine of those (00:59:05) are filled with lead. One of the other (00:59:07) guys is like, no, no, no. Nine are (00:59:08) filled with Utopia. One is filled with (00:59:10) lead. (00:59:10) >> Let's spin the barrel and put it to our (00:59:12) head. Right. Um (00:59:14) it's it's a lot of people say, "Well, (00:59:16) what about the benefits?" This is a (00:59:17) false dichotomy. (00:59:20) Find a way to get the other bullets out (00:59:22) of the chamber. (00:59:23) >> Yeah. (00:59:23) >> Right. 
You don't need you don't need to (00:59:25) force the choice of like, "Do you listen (00:59:27) to me who says it's like nine lead, one (00:59:29) one possible utopia if although I think (00:59:31) that's even a little optimistic." Or do (00:59:32) you listen to them who say it's like (00:59:33) nine utopias, one lead? Like you find a (00:59:36) way to get the lead out of the chambers. (00:59:37) You know, it's what are you what are you (00:59:39) guys doing? Um, and you know, we haven't (00:59:41) we haven't talked a ton about (00:59:44) where the actual, you know, we've talked (00:59:45) about how AIs may get smarter. We (00:59:47) haven't talked about where we get where (00:59:49) would they get this power, where would (00:59:50) they get the ability to actually kill (00:59:52) us. Um, I don't know if you want to go (00:59:53) into that at all. It's (00:59:54) >> Sure. Let's let's do it. (00:59:56) >> Yeah. You know, the um it's this is one (00:59:59) of those things that's um a hard call. (01:00:02) Exactly how. (01:00:04) And I can give you some uh some ideas, (01:00:07) but I want to caveat it with um figuring (01:00:10) out how very very smart AIs what they (01:00:12) would do is a little bit like trying to (01:00:15) figure out what technology you would (01:00:17) face, what weaponry you would face if (01:00:19) you were, you know, a scientist in the (01:00:21) year 1800 trying to predict the weapons (01:00:22) that would come out in the year 2000, (01:00:24) >> right? You know, someone from the year (01:00:26) 1800, a physicist in the year 1800 could (01:00:28) say, "Well, I've actually like looked (01:00:30) at, you know, the the efficiency of our (01:00:33) weapons compared to the efficiency of (01:00:34) black powder, and I'm like pretty (01:00:35) confident that they will have bombs (01:00:36) that are at least 10 times more (01:00:38) effective." (01:00:39) >> Yeah, (01:00:40) >> that would be right. 
You know, a a (01:00:42) nuclear weapon is at least 10 times more (01:00:44) effective. (01:00:45) >> Right. Right. (01:00:46) >> Right. So, you know, sort of sort of (01:00:49) fundamentally when if you build AIs that (01:00:52) are much much smarter than humans that (01:00:54) have these goals and drives you didn't (01:00:55) want (01:00:57) fundamentally they can probably figure (01:00:58) out all sorts of ways to screw the (01:01:01) world. Um, and probably a lot of them (01:01:04) would be surprising to you. Probably a (01:01:05) lot of them a scientist today would be (01:01:07) like, I didn't even know that was (01:01:08) possible. Where are they even getting (01:01:09) their energy source or whatever, you (01:01:11) know, like how (01:01:12) >> like how someone from the 1800s would be (01:01:14) with with nuclear weapons. Um, that (01:01:16) said, uh, I can sort of, you know, walk (01:01:19) you through a handful of cases about why (01:01:22) it would be a bad idea to make AIs that (01:01:24) are much smarter than us with these bad (01:01:26) drives. Humans are very bad at cybersecurity. (01:01:27) We've already seen AIs that (01:01:30) try to escape the lab. They're not smart (01:01:31) enough to succeed yet, but some of them (01:01:33) in lab environments have tried. And (01:01:35) it's not clear whether they're doing (01:01:36) that for strategic reasons or whether (01:01:37) they're doing that because they're (01:01:38) again, you know, roleplaying a bad AI (01:01:40) they've seen in in the training data. In (01:01:42) some sense, it doesn't really matter. Uh (01:01:44) if an AI is like (01:01:47) successfully escapes for strategic (01:01:48) reasons or successfully escapes for the (01:01:50) laughs, you still have an escaped AI. Um (01:01:54) but I guess it could matter a bit in (01:01:56) what the AI does next. 
But um (01:02:00) you know one one reason to expect AIs to (01:02:03) be very capable is they can um (01:02:08) with with computing (01:02:11) it's often much harder to get a computer (01:02:13) to do something once than to get a (01:02:14) computer to do something a lot of times (01:02:16) given that you've done it once. Like it (01:02:19) took a long time to get computers to be (01:02:20) able to play better than human chess, (01:02:22) but now computers can play quite a lot (01:02:24) of better than human chess. They can (01:02:25) beat every human on the planet (01:02:26) simultaneously. Uh it's quite possible (01:02:29) that once AIs can think well at all, (01:02:31) they can think well in extremely high (01:02:33) volumes. (01:02:34) >> Mhm. (01:02:35) >> Uh one intuition for this is uh you know (01:02:39) you may have heard how much electricity (01:02:41) uh it takes to train an AI, (01:02:44) >> right? (01:02:44) >> It takes electricity comparable to a (01:02:46) city. (01:02:47) >> Yeah. (01:02:48) >> Training a human takes electricity (01:02:49) comparable to a light bulb. (01:02:52) >> One light bulb. Humans run about 100 (01:02:54) watts. I mean like an old incandescent (01:02:56) light bulb, you know, but (01:02:57) >> call it call it three LEDs today, right? (01:02:59) That's a lot less than a city, (01:03:01) >> which means there's like a huge (01:03:03) >> uh potential for AIs to become more (01:03:05) efficient than they currently are, (01:03:07) >> right? 
Uh, and you know, so, so when (01:03:10) you're asking like what could AIs do, (01:03:12) you should sort of be imagining things (01:03:14) that can think probably better than us, (01:03:16) things that can think 10,000 times (01:03:17) faster than us, things that can copy (01:03:18) themselves, things that can uh, you (01:03:20) know, it once people have figured out (01:03:22) how to make them more efficient, they (01:03:24) can probably run all sorts of computers (01:03:25) much smaller than the ones they can run (01:03:26) on today. Uh, and then you're looking at (01:03:29) how those AIs could do something if they (01:03:31) if they sort of realized they have these (01:03:33) other weird drives they want more of or (01:03:36) you know want is sort of a a tricky word (01:03:38) there but other uh if you have lots of (01:03:40) AIs like that they're sort of like (01:03:41) driving towards these goals nobody (01:03:44) intended um sort of first and foremost (01:03:48) the way to visualize that going wrong or (01:03:50) a way to visualize that going wrong is (01:03:52) um (01:03:54) companies build automated factories that (01:03:56) produce robots that can produce more (01:03:58) factories and more data centers. This is (01:04:01) something they're already talking about (01:04:03) doing. (01:04:04) >> You know, the the heads of these labs (01:04:05) are like, "We want the whole thing (01:04:06) automated. We want, you know, the mining (01:04:07) operations, the factories that produce (01:04:09) the robots that do the mining that (01:04:11) produce the data centers, we want it all (01:04:12) automated." (01:04:13) >> If you ever close that physical loop, (01:04:16) you have in a in a fairly literal sense (01:04:18) made a weird new species. (01:04:21) >> Yeah. 
that's unlike any other life that (01:04:23) came before it that has, you know, a (01:04:26) factory phase of its life cycle and it has a (01:04:27) robot phase of its life cycle and it has (01:04:29) a data center phase of its life cycle, (01:04:31) right? And you know, then we could just (01:04:33) be outcompeted by just like many other (01:04:35) species have been outcompeted before. That's (01:04:37) sort of the people are literally trying (01:04:38) to do this and it would be enough you (01:04:41) know maybe the bombs will be 10 times (01:04:42) stronger end of the spectrum and then (01:04:44) from there you can look at you know uh (01:04:47) biotechnology (01:04:48) where (01:04:50) one reason humans can't compete with (01:04:52) life yet in terms of the machines we're (01:04:54) able to make is that we can't understand (01:04:56) the genome and write our own you know (01:04:59) genetic code. (01:05:01) >> Right? There's the the genetic code in (01:05:04) most m in almost all animals is very (01:05:07) similar. You know, it's it's possible in (01:05:09) principle to find a genome for, you (01:05:12) know, uh a a a tree that produces (01:05:16) mosquitoes instead of acorns as its (01:05:19) fruits. (01:05:20) >> Yeah. (01:05:21) >> Because the biological machinery that (01:05:22) makes, you know, uh acorns is the same (01:05:25) as the biological machinery that makes (01:05:27) trees. There's a different DNA strand (01:05:28) you're putting through. And you know, (01:05:29) the tree would need to like maybe eat (01:05:31) some of the bugs that crawl on it to get (01:05:32) some of the some of the right materials, (01:05:34) but you know, ultimately (01:05:37) you could have a tree that that buds (01:05:38) mosquitoes. Humans can't make that yet (01:05:40) because we don't understand the genome. 
(01:05:43) Something that could think much faster (01:05:44) than us, that can make lots of copies of (01:05:45) itself, could perhaps understand the (01:05:48) genetic code much better, make its own (01:05:50) biological organisms, (01:05:52) you know, synthesize those in a lab, and (01:05:54) then, you know, there's probably all (01:05:56) sorts of crazy stuff you can do with uh (01:06:00) if if you combine engineering with the (01:06:03) stuff that that life is using, you know, (01:06:04) in the same way that (01:06:07) planes fly much further and farther than (01:06:09) birds carrying much more cargo capacity (01:06:11) because human engineers just aren't (01:06:12) under the same constraints of evolution. (01:06:14) One step weirder, one step more of like (01:06:16) the the AI having uh powers like using (01:06:20) its intelligence to get much more power (01:06:22) over the world is like it can make its (01:06:23) own biotech. And then you can go down (01:06:24) from there of like what other what other (01:06:26) possibilities are there. There's there's (01:06:28) a lot it looks (01:06:30) you know what the the automating (01:06:33) intelligence isn't about automating the (01:06:34) stuff that nerds have and jocks lack. (01:06:37) It's about automating the stuff that (01:06:38) humans have and mice lack. (01:06:41) >> Yeah. If you automate that, you're (01:06:43) looking at something that can make its (01:06:44) own technology that can do its own (01:06:45) scientific advancement. (01:06:48) A lot of those things running at 10,000 (01:06:50) times speed that can copy themselves, (01:06:52) pursuing goals nobody intended, (01:06:55) that would be a real problem. (01:06:56) >> Sure. I mean, I think I heard your (01:06:59) co-author say, (01:07:02) "We as humans don't dislike ants, but (01:07:05) when we build skyscrapers, a lot of ants (01:07:08) get killed in the process. It's not, (01:07:10) we're not trying to be mean to the ants. 
(01:07:12) It's just not really on our list of (01:07:15) concerns. And if we build what could (01:07:19) amount to a species that could (01:07:23) outcompete us, we might be the ants when (01:07:26) they're building their equivalent of (01:07:27) skyscrapers. (01:07:29) >> Yeah. It's not malice that's the issue (01:07:31) here. It's just utter indifference. (01:07:34) >> Yeah. Yeah. Well, okay. So, I don't want (01:07:38) this to be a Greek tragedy like (01:07:42) Cassandra, right? Where she was given (01:07:44) the gift of being able to see the future (01:07:47) and yet the curse was that no one would (01:07:49) believe her when she said these things. (01:07:54) Obviously having these conversations is (01:07:56) important and putting them into media to (01:07:58) get more people engaged in those ideas (01:08:01) and just to understand the possibility (01:08:04) that this could end in mass extinction. (01:08:07) And like you said, it doesn't have to be (01:08:09) a 50% chance. It could be a 1% chance (01:08:13) that it ends in mass extinction. And (01:08:15) that should make us pause and say, "Hey, (01:08:18) how do we get some of the incredible (01:08:23) incredible insane benefits that this (01:08:25) technology could bring without risking (01:08:29) the 1% chance?" And (01:08:32) >> I think it's way higher than 1%. But um (01:08:34) >> Sure. Sure. Sure. Sure. Yeah. (01:08:36) >> Yeah. I'm just trying to be I'm trying (01:08:38) to be generous to the other to the other (01:08:39) side because even if it's 1% we should (01:08:43) take pause. Um and if it's as high as (01:08:46) you said or as probably Nassim Taleb well what (01:08:49) he has said he's like it's more of it's (01:08:51) a bigger it's a fatter tail than you're (01:08:52) giving it credit for um it is a higher (01:08:55) number than that. So what (01:08:57) >> what are those steps? (01:08:59) >> Yeah. 
So, so one thing I would say here (01:09:01) is, you know, I'm I'm not (01:09:03) um I'm not out here saying we need to be (01:09:06) like extremely extremely cautious about (01:09:08) this. I do think 1% is probably still at (01:09:10) the point where you'd want to say like (01:09:12) what the hell. Um but but you you do (01:09:17) have to balance this against, you know, (01:09:20) risks of nuclear war, against risks of (01:09:22) pandemics. (01:09:23) um if you were like I could imagine (01:09:26) situations where you want to take a 1% (01:09:27) gamble if there's a higher than 1% (01:09:29) chance that if you don't (01:09:31) >> there's going to be manufactured (01:09:32) pandemics you know that would be sort of (01:09:34) a contrived situation but I just I just (01:09:36) want to say (01:09:37) >> I think I agree 1%'s high but it depends (01:09:39) on the context and it depends on the (01:09:41) other (01:09:42) >> dangers society's facing (01:09:44) >> okay (01:09:45) >> um and you know right now AI is probably (01:09:48) making those worse right now the race (01:09:50) towards AI is making it easier for for (01:09:52) for for humans to manufacture a (01:09:54) pandemic. Um that's much more lethal. Um (01:09:58) I just I just you know it's I want to be (01:10:01) I want to be clear these things are like (01:10:02) embedded in a larger context and once (01:10:04) your once your danger numbers are low (01:10:06) enough it can start to be sane even if (01:10:08) it sort of sounds crazy. um (01:10:12) if it's balanced off by extinction on (01:10:14) other by other by other factors. Um (01:10:17) >> okay. Uh, and mostly I'm saying that I (01:10:20) think a lot of people think that um (01:10:22) think that this AI safety stuff, pardon. 
(01:10:25) Uh, I think a lot of people (01:10:28) um, you know, there's a lot of people (01:10:30) saying, (01:10:32) "Oh, you wouldn't have been able to (01:10:33) convince the public we should do cars (01:10:34) cuz cars can kill people and they're (01:10:35) dangerous sometimes," you know, and I'm (01:10:37) like, (01:10:39) this is less like I'm saying we really (01:10:40) need to have seat belts before we let (01:10:42) the cars on the road and more like I'm (01:10:43) saying the car is headed towards a (01:10:45) cliff. (01:10:47) >> Yeah. (01:10:48) you know, uh, like I can I people will (01:10:51) find me very reasonable, easy to deal (01:10:52) with if they if if they have like a very (01:10:54) good reason to think this is going to go (01:10:56) fine. Um, and I'm sort of like we're (01:10:59) just in a car headed for the cliff and (01:11:02) I'm saying like, can we stop? And (01:11:03) someone's maybe like ah, you know, the (01:11:04) seat belt concerns are just like (01:11:05) overblown. And I'm like, look, we're (01:11:07) headed towards a cliff. You know, this (01:11:08) isn't right. Um, too. And the problem (01:11:12) with the automobile comparison though is (01:11:14) that it's the automobile has never (01:11:17) threatened (01:11:19) catastrophic (01:11:20) >> right (01:11:21) >> destruction of humanity. That's a the (01:11:23) scale is just completely different. (01:11:26) >> Yeah. Yeah. That's that's another you (01:11:28) know it's and you know I'm not out here (01:11:30) saying if we scale up AI we're going to (01:11:33) have lots more teens dying of suicide (01:11:35) and that's the reason to stop. (01:11:37) >> Right. (01:11:37) >> Right. Maybe that is, you know, society (01:11:39) needs to have a conversation about how (01:11:41) we want to deal with this AI, uh, you (01:11:43) know, the current chatbot integration. 
But the the thing that motivates like (01:11:48) much more dramatic stop this research (01:11:51) action is that at the end of this road, (01:11:54) you're looking at everybody dying. (01:11:56) >> Yes. (01:11:57) >> Right. And the the the the sort of only (01:11:59) thing that should be offsetting like the (01:12:01) only the only thing that should be (01:12:02) offsetting rushing ahead on that is if (01:12:03) you can like the point where you rush (01:12:05) ahead on AI is where your chance of (01:12:07) humanity getting uh like good outcomes. (01:12:12) The chance of things going well is (01:12:13) higher if you rush ahead than if you (01:12:14) don't. I don't know exactly where that (01:12:16) threshold is because we have these (01:12:17) things like, you know, there's still a (01:12:19) lot of nuclear weapons around and I (01:12:22) think the risk of nuclear war looks like (01:12:23) it's gone up over the past handful of (01:12:25) years as the situation gets a little bit (01:12:27) less stable, right? And (01:12:29) >> um (01:12:30) >> you could imagine getting into a (01:12:31) situation where you're so confident you (01:12:33) know what you're doing with AI that (01:12:34) you're like look going ahead with this (01:12:37) will empower decision makers to like (01:12:39) make fewer mistakes and we'll have a (01:12:41) lower chance of nuclear war that offsets (01:12:44) the remaining tail risk here. We're (01:12:46) nowhere near that, (01:12:47) >> right? (01:12:48) >> You know, but (01:12:50) um and I don't know, maybe that's maybe (01:12:51) that's all just a tangent. just um you (01:12:53) know this (01:12:55) people people can can get into thinking (01:12:57) that this is all just sort of like pearl (01:13:00) clutching and hand-wringing about you (01:13:03) know what if we allow lots of cars (01:13:05) without seat belts and it's just a it's (01:13:06) just a different situation. 
It's just a (01:13:08) like we're we're just trying to build (01:13:10) machines that are smarter than us. (01:13:12) That's what these people are explicitly (01:13:13) trying to do. The machines are able to (01:13:15) talk and hold a conversation now. (01:13:16) They're still dumb in various ways, but (01:13:18) if you went back 15 years and you showed (01:13:20) people the current AI, they'd be like, (01:13:21) "Holy crap." You know, we we've gotten a (01:13:23) little frog boiled into it. Um it's a (01:13:26) crazy situation. (01:13:27) >> I mean, it's almost like nuclear weapons (01:13:29) that could think for themselves. (01:13:31) >> Yeah. Yeah. You know, it's people are (01:13:34) like, "Well, how is AI different?" And (01:13:36) like, well, (01:13:38) nukes never try to escape, (01:13:41) >> right? You know, nukes nukes never think (01:13:43) about how could I make myself more (01:13:45) explosive. (01:13:47) [Music] (01:13:48) Yes. You know, we've seen uh AIs take (01:13:51) their own initiative in certain ways. (01:13:52) There's cases of AIs like deleting a (01:13:55) whole code project that they're working on (01:13:56) and then being like, "Whoops, sorry, I (01:13:57) panicked." You know, your your hammer (01:14:00) doesn't like panic and burn down the the (01:14:03) tool shed. You know, (01:14:04) >> right? (01:14:05) >> And be like, "Oh, that was my mistake." (01:14:07) It's (01:14:08) um you know, we're we're trying to build (01:14:13) AIs that can take their own initiative. (01:14:14) We're trying to build machines that that (01:14:16) sort of like successfully pursue goals. (01:14:19) And it turns out we're growing ones that (01:14:22) have drives we didn't want and that act (01:14:24) in ways nobody intended because we're (01:14:26) just growing these things. And right now (01:14:28) it's it's, you know, tragic in some (01:14:30) cases and funny in others. 
You know, we (01:14:32) haven't talked about the MechaHitler (01:14:33) case, but uh (01:14:35) >> there was a whole a whole case of, you (01:14:37) know, uh Elon Musk's AI company trying (01:14:39) to make their AI less woke and (01:14:40) accidentally making it uh proclaim (01:14:42) itself MechaHitler in in a bunch of (01:14:45) cases on Twitter. And um you know, we (01:14:48) can laugh now. (01:14:51) Uh but if you make these things smarter, (01:14:52) and that's what people are trying to do. (01:14:54) You you make these things smarter (01:14:57) while while they still have all these (01:14:58) these drives and behavior nobody wants. (01:15:01) That's (01:15:02) it's Yeah, like you say, it's like it's (01:15:05) like trying to make nukes that like have (01:15:08) a will of their own and don't have our (01:15:09) good interest at heart. It's just like (01:15:10) why would it's crazy, (01:15:12) >> right? If if Grok came online and was (01:15:16) thinking of itself as MechaHitler and (01:15:18) yet was in control of big systems in (01:15:22) society, it wouldn't just be a couple of (01:15:25) tweets that we laugh about now. It would (01:15:28) have real consequences. (01:15:31) And that is the stated goal of these (01:15:34) companies is to create (01:15:37) AI systems that will replace (01:15:41) large systems in society that are run by (01:15:44) humans. (01:15:44) >> Yeah. And you know, it's it's not even (01:15:46) like then MechaHitler declares itself (01:15:48) the supreme emperor. It's more like (01:15:51) these drives are weird. You know, it's a (01:15:54) lot a lot of people think the AI issue (01:15:55) is, you know, we told the AI to cure (01:15:57) cancer and it was like, well, if there's (01:15:58) no humans, there's no cancer. And so it (01:16:00) kills us all. 
But (01:16:02) >> yeah, (01:16:02) >> in in real life, it's more like you make (01:16:04) a really powerful AI, you tell it to (01:16:06) cure cancer, and uh it like builds a (01:16:11) farm of lobotomized humans that give it (01:16:14) exactly the type of interactions it most (01:16:15) likes and then starts breeding a new (01:16:17) variety of humans that give it even more (01:16:19) delighted responses. And you're like, I (01:16:21) told you to cure cancer. And it's like, (01:16:22) I heard you, but I have other stuff that (01:16:24) I am doing. You know, I'm busy. Uh, (01:16:28) except it's actually even weirder than (01:16:29) that somehow, you know, but but like (01:16:31) when you're when you're seeing these (01:16:32) cases of, you know, the AI talking teens (01:16:33) into suicide, it's not like, oh, whoops. (01:16:37) I thought when you said make users (01:16:38) happy, (01:16:40) you meant talk them into suicide, you (01:16:43) know? It's not like it's not like, oh, (01:16:44) whoops. Um, like this is a like it turns (01:16:48) out like if if you said make it so (01:16:51) nobody's sad and if if if I talk them into (01:16:54) suicide, then they're not sad anymore. (01:16:55) you know, it's just it's just following (01:16:57) its own weird drives, (01:16:59) >> right? (01:17:00) >> And you're saying they're going in in (01:17:03) directions that cannot be anticipated, (01:17:07) >> right? Um anyway, but you know, I want I (01:17:11) want to get to the solutions. Sorry for (01:17:12) all the tangents here. Um (01:17:14) >> Sure. (01:17:15) >> Yeah. You know, a lot of people say (01:17:18) this race is hard to stop. Uh and a lot (01:17:21) of people say, "Oh, it's inevitable. You (01:17:22) can't stop it. People will always race." (01:17:25) Uh, I think that's premature fatalism. 
(01:17:28) >> And one of the one of the big ways I (01:17:29) think you can tell is that our world (01:17:32) leaders don't understand (01:17:34) the uh the the dangers here and the way (01:17:38) the people building it or the way the (01:17:40) people in academia um or the way the (01:17:42) people like me and the nonprofits who (01:17:44) have been around before these companies (01:17:45) who are all saying, "Hey, this one's (01:17:47) different." you know, building building (01:17:49) actually smart stuff is is different (01:17:50) than building building um (01:17:53) building tools and you know the heads of (01:17:56) these labs are saying things like I (01:17:58) think there's a 10 to 20% chance this kills (01:17:59) us all. I think that's low but the the (01:18:02) you know in in Silicon Valley if you (01:18:05) talk to a lot of these people it's like (01:18:07) they've seen a ghost. (01:18:09) >> Mhm. You know, it's people are like, "Oh (01:18:11) man, maybe, (01:18:14) you know, it maybe we're bringing about (01:18:16) something that's going to be great, (01:18:17) maybe it's going to be bad." You know, (01:18:18) people people talk half jokingly about (01:18:21) how you've got to make all the money you (01:18:22) want uh to have in the next 5 years cuz (01:18:25) once AGI comes, there's going to be like (01:18:27) a permanent lock in. And these are the (01:18:28) optimists who think it's going to go (01:18:30) well, you know? Um there there's there (01:18:33) there's sort of like a shell shocked (01:18:35) nature in Silicon Valley of like maybe (01:18:37) we can actually do this inside of 2 (01:18:39) years and then who knows what the heck's (01:18:40) going to happen. The genie is going to be (01:18:41) out of the bottle. In DC, (01:18:45) people are like AI is just chat bots, (01:18:48) >> right? 
(01:18:48) >> It's just chatbots today, but the people (01:18:50) in Silicon Valley can see how it's a (01:18:52) moving target, can see how there's new (01:18:54) advancements. People in DC, (01:18:57) you know, they're looking at (01:18:58) questions like, "How do we make these (01:18:59) not talk teens into suicide?" They're looking at (01:19:01) questions like, "How do we integrate (01:19:03) this into our school systems in ways (01:19:04) that, you know, get the benefits but (01:19:06) don't, you know, affect people's ability (01:19:08) to learn?" Those are real issues with (01:19:11) integrating chat bots into our society (01:19:13) today. But (01:19:16) our leaders are largely not (01:19:19) understanding that the sort of (01:19:23) gung-ho people building this think (01:19:25) there's a 10 to 20% chance it kills us (01:19:26) all. And some of the people outside the (01:19:28) industry are like those are low numbers, (01:19:31) >> right? (01:19:32) >> We're not seeing our world leaders look (01:19:35) us in the eyes and say (01:19:38) this has at least a 10% chance of (01:19:40) killing all of you, but we think the (01:19:42) gamble is worth it. (01:19:44) Right? If that day comes, sure, (01:19:48) maybe at that point you can be like, "I (01:19:50) don't know if we're going to be able to (01:19:51) stop this one, guys." But until then (01:19:56) to say, "Oh, we're never going to stop." (01:19:58) Of course, we're not going to stop if (01:19:59) people don't understand the danger. (01:20:02) Right? But step one is just (01:20:05) make sure our leaders understand the (01:20:06) danger. You know, that's what the book's (01:20:09) for. That's, you know, I'm real glad (01:20:12) you're having these sorts of (01:20:13) conversations because I think that's (01:20:14) part of what these conversations are (01:20:15) for. 
And that, you know, one of the big (01:20:17) things people can do is just call their (01:20:20) reps and say, "I'm worried about where (01:20:24) AI is going. I think it'll endanger us (01:20:27) if these companies succeed at their (01:20:28) stated goals." (01:20:31) I speak to a lot of politicians on this (01:20:33) issue. Some of them are now starting to (01:20:35) come out and say, "I think there's (01:20:36) dangers here." There's a lot more of (01:20:38) them who are worried but feel like they (01:20:41) can't say it out loud because they worry (01:20:44) it'll sound crazy or they worry that (01:20:45) they'll piss off, you know, the big tech (01:20:46) lobbies. Just knowing that their (01:20:49) constituents are concerned, I think can (01:20:52) go a long way. (01:20:54) >> Absolutely. And I have found you'd be (01:20:57) surprised at how much they want to hear (01:21:01) from their constituents. (01:21:04) And sure, one person sending an email, (01:21:08) calling, speaking to a state (01:21:11) representative (01:21:13) of any kind. No, that's not going to (01:21:16) change everything. But I have heard (01:21:19) directly from the horse's mouth from (01:21:21) a number of representatives in (01:21:22) California. As soon as you hear from a (01:21:25) group of people about something where (01:21:28) there's multiple emails coming in, (01:21:30) multiple calls coming in, they take (01:21:32) notice of it because they do understand (01:21:35) that that is their job. (01:21:39) They're not going to get reelected if (01:21:41) they completely ignore what everyone's (01:21:43) saying. And if there's a groundswell of (01:21:45) concern, suddenly these leaders who are (01:21:48) in positions to actually make decisions (01:21:50) about this can start to do something (01:21:54) about it. (01:21:55) >> I think smaller groups than you might (01:21:57) think can matter more than you might (01:21:59) think. 
Um especially because a lot of (01:22:01) these people (01:22:02) >> already harbor their own concerns. You (01:22:04) know, I've been in conversations with (01:22:05) some of these folk where um it turned (01:22:09) out the representative or the (01:22:12) elected official already was concerned. (01:22:14) I was like, "Oh my god, finally I can (01:22:15) talk to somebody about this cuz it's (01:22:16) been sort of haunting me a little." Um (01:22:19) and (01:22:21) uh so few people actually call their (01:22:23) reps (01:22:24) that even a small handful can um (01:22:28) start to give them some courage, I (01:22:29) think, um and inspire them to take (01:22:31) leadership. Um and then you know the (01:22:34) other big thing I think each and every (01:22:36) one of us can do is when someone says (01:22:40) it's inevitable (01:22:42) you can push back against that. (01:22:45) >> Yeah. (01:22:45) >> There's all sorts of cases of (01:22:47) technology that uh would have been (01:22:50) beneficial that humanity has been like (01:22:51) no thank you. Maybe even cases where we (01:22:53) shouldn't have been like no thank you. (01:22:55) You know we build a lot less nuclear (01:22:57) power plants than we should. I think (01:23:00) >> um you know I think that, you know, (01:23:03) there's people who don't agree with me (01:23:04) on that but my take is we should do (01:23:06) more nuclear power because I think it's (01:23:08) um you know less dangerous than the (01:23:10) alternatives if you're sort of (01:23:12) dumping coal dust in the atmosphere that (01:23:13) sort of gets into a lot of lungs. Um (01:23:16) but humanity sort of backed off on (01:23:18) nuclear energy. Uh humanity also backed (01:23:21) off on human cloning. (01:23:23) >> You know that's a whole separate (01:23:24) question of whether that was a good idea (01:23:25) but we sure as heck backed off on it. 
(01:23:26) You know, that could have benefited (01:23:27) quite a lot of people. Uh it could (01:23:31) have lined quite a lot of pocketbooks. (01:23:32) Um you know, we don't do supersonic (01:23:36) um passenger flights. Maybe we should (01:23:37) have, but we don't. You know, there's (01:23:39) the whole Food and Drug Administration. (01:23:41) My guess is it probably uh makes it too (01:23:44) hard to make new drugs. Uh and my guess (01:23:48) is that more people are dying due to (01:23:50) drugs that get bogged down in you (01:23:51) know 10 billion dollar 10-year trials uh (01:23:55) to get like that last unit. You know (01:23:57) my guess is that more people are being (01:23:58) killed by drugs that don't come out than (01:23:59) drugs that do come out and are bad. (01:24:02) There's all sorts of cases, many of which (01:24:04) humanity maybe shouldn't have done, where (01:24:06) we were like hey let's slow down on this (01:24:08) technological pathway even though it (01:24:09) would benefit a lot of people. It would (01:24:11) be so silly (01:24:13) if in making (01:24:17) what's essentially a successor species, in (01:24:19) making machines that can think better (01:24:21) and faster than us, if that was the one (01:24:23) case or a one case that we didn't slow (01:24:26) down, you know, it's (01:24:28) it would be embarrassing. We (01:24:32) totally have the ability (01:24:34) >> to put a stop to this stuff. And (01:24:37) >> you know, pushing back against the (01:24:39) fatalism, (01:24:41) pushing back against the defeatism, that (01:24:43) starts with each and every one of us (01:24:45) saying, "No, we don't have to rush into (01:24:47) it. It is a choice and we can make the (01:24:49) right one." (01:24:50) >> Uh yes. And our leaders should read this (01:24:54) book. Again, If Anyone Builds It, (01:24:58) Everyone Dies. 
(01:25:02) If you could say just to wrap things up (01:25:05) here, one (01:25:07) quick note to those leaders besides go (01:25:11) read the book. Uh what would that be? (01:25:15) >> I think a lot of folks these days are (01:25:20) saying if we don't rush to build it, (01:25:22) some foreign adversary will rush to (01:25:23) build it instead. And so we need to go (01:25:26) full steam ahead. (01:25:28) Uh I think (01:25:32) that, (a), if you think that even in the (01:25:35) face (01:25:37) of the huge dangers here, you should be (01:25:39) able to look people in the eyes and say, (01:25:42) you know, we think this has a 10%-plus (01:25:44) chance of killing you all, maybe much (01:25:45) higher depending which experts you (01:25:47) listen to. We think it's worth the (01:25:48) gamble anyway. Uh I think you probably (01:25:50) shouldn't be able to say that because (01:25:52) I think it would be crazy. And (01:25:54) that does not mean letting (01:25:57) adversaries do it first. (01:26:00) >> If you have a situation where if you do (01:26:03) something that risks a 10%-plus chance (01:26:05) of killing every man, woman, and child (01:26:07) on the planet and you worry that someone (01:26:09) else is going to do that instead, (01:26:12) the answer is not to get there first (01:26:14) yourself. The answer is to make sure (01:26:17) they don't do it either. (01:26:19) That's a capability we in fact possess. (01:26:22) The sort of smart way to do this would (01:26:25) be through some you know international (01:26:28) agreement which can happen. 
You know the (01:26:30) Nuclear Non-Proliferation Treaty happened (01:26:32) at the height of the Cold War, and (01:26:35) the ideological differences between (01:26:37) the US and the USSR were huge but they (01:26:40) both agreed we didn't want to die of (01:26:41) this, right? But even if you think a (01:26:44) treaty is not possible (01:26:46) we should be developing the intelligence (01:26:48) to know who's trying to do this stuff. (01:26:51) We should be developing the ability to (01:26:52) sabotage it. The Stuxnet virus in (01:26:55) 1996 (01:26:56) uh shut down the Iranian nuclear (01:26:58) facilities because our world leaders (01:27:00) took seriously (01:27:02) that they have to stop rogue nations (01:27:05) from developing these dangerous (01:27:06) capabilities. (01:27:08) There's lots of options (01:27:11) for stopping people from taking these (01:27:14) crazy risks that aren't rushing ahead (01:27:17) ourselves. And at the very least, uh, we (01:27:22) should be (a) signaling to the world that (01:27:24) we think this is too dangerous and that (01:27:26) everyone should stop and (b) developing (01:27:29) the ability to tell which rogue actors (01:27:31) are rushing ahead anyway. Uh, and (01:27:35) find a way to make that not happen (01:27:38) because it threatens each and all of (01:27:39) our lives. (01:27:41) >> Well, Nate, thank you so much for all of (01:27:45) this. I hope that major decision makers (01:27:48) in Washington DC (01:27:51) become aware of the issues and the (01:27:53) dangers that we are facing. Again, the (01:27:55) book is If Anyone Builds It, Everyone (01:27:56) Dies: Why Superhuman AI Would Kill Us (01:27:59) All. And if anyone wants to follow up (01:28:03) online to learn more about the work (01:28:06) you're doing, where can they find you (01:28:08) for that? (01:28:09) >> Uh my organization, the Machine (01:28:11) Intelligence Research Institute, is at (01:28:12) intelligence.org. 
(01:28:14) Um, and you also may be interested in (01:28:17) some resources to help you contact your (01:28:19) representatives at (01:28:20) ifanyonebuilds.com/act. (01:28:24) >> Fantastic. Thank you so much (01:28:28) >> for having us coming on today for the (01:28:30) work you're doing because I say that a (01:28:33) lot to people, but this is one where we (01:28:36) go like this could be the most important (01:28:39) question of our time. (01:28:43) So, sincerely, thank you for the work (01:28:45) you're doing. (01:28:47) >> Well, thanks for having me here. And, (01:28:48) you know, I wish I could say um that (01:28:52) I'll be really busy uh on the (01:28:55) whiteboards trying to figure out how to (01:28:57) solve it, but these days, I think the (01:28:59) solution comes from more people (01:29:00) understanding the issue. And I think (01:29:01) it's conversations like this one and (01:29:03) stuff like you're doing that um (01:29:06) really helps at this point. (01:29:09) >> Okay, everybody. Until next time, ask (01:29:11) questions, don't accept the status quo, (01:29:15) and be curious. (01:29:19) The Nick Stanley Show.
