Developer-First Infrastructure

Session Information

Listen as Joe Duffy leads a talk about Cloud Engineering with special guests Ken Exner and Luke Hoban. Joe discusses how he thinks of the cloud as a giant supercomputer and dives into each component of the cloud operating system. Then Luke talks about how Pulumi makes authoring cloud components easy and gives a few examples of Pulumi’s Multi-Language components. Finally, Joe and Ken discuss how customers are currently using the cloud and how they both envision the future.

Presenters
  • Joe Duffy
    CEO & Founder, Pulumi
  • Luke Hoban
    CTO, Pulumi
  • Ken Exner
    General Manager Developer Tools, AWS

    Hi, everyone. My name’s Joe Duffy, founder and CEO of Pulumi. I’m here today to talk to you about something we’re really excited about that we’re calling developer-first infrastructure. And to get started, to set a little context, I wanted to start by talking about the Modern Cloud. It’s no surprise to anyone here that the Modern Cloud is very complicated, but with that complexity comes a lot of exciting benefits.

    But you know, we’re shipping code faster than ever before today. We’ve got more moving pieces than ever before. We’re really talking about distributed architectures, and that’s actually one of the themes of the talk today that I’m really excited about, and why bringing the cloud closer to developers is so exciting. But the fact is, between AWS, Azure, Google Cloud, there’s hundreds of building block services in each one of those that can be stitched together in infinite ways to build powerful software. There’s vendors like Cloudflare and Snowflake, and many, many others that are introducing their own cloud services that are exciting as well.

    And then we’ve got the whole Cloud-Native ecosystem, with a new project seemingly launched every week. It used to be that there was a new JavaScript framework every week, and now we have a new Cloud-Native tool every week to stay on top of. It’s a lot of moving pieces, a lot of complexity in how we’re building and shipping software these days. And if you’re a developer, if your background is in software development like mine is, that infrastructure might seem boring and tedious. In fact, there is sort of a meme that infrastructure should be boring.

    But should it really? Is it actually tedious? Does it have to be? I think a different perspective is that cloud infrastructure is really the essential building block of our modern application architectures. Sure, maybe setting up a network or a Kubernetes cluster should be, quote, boring and left to the infrastructure experts who are gonna go make sure that’s secure and reliable and cost-effective. But what about a serverless function? Is that in the infrastructure domain or is that in the application domain? I’m gonna argue today that it’s a little bit of both, and that developers really should care about some of these things. You know, a Pub/Sub topic, a queue. A lot of these are essential building blocks of building distributed applications, and that’s really exciting.

    And that’s something that’s changed over the last 10 years. I would say 10 years ago, the conversation was more about, “Hey, let’s provision three virtual machines and a database.” And sure, infrastructure was relatively boring back then, but the world of Modern Cloud has really changed all of that. The way I think about it, and the context for today’s talk, is what if we re-imagined the cloud as, effectively, a giant supercomputer that’s planet-scale, infinitely scalable, that’s compute and storage available to us on demand as our application architecture evolves, and as our application becomes wildly successful and we go from tens of users to thousands, to millions of users. You think of what it takes to run Lyft, for example.

    A lot of these modern companies that were fortunate enough to be born in the cloud are actually using the cloud, using this giant supercomputer to fuel their business and really fuel the innovation. And as a developer, the same way I was writing software that ran on an Intel PC 15 years ago, now I’m writing code that runs on this huge computer. And arguably the cloud has become a bit of the operating system because there’s a software component sitting on top of the supercomputer. And as a developer, this is really exciting. This goes from infrastructure being this boring thing that’s an afterthought to really being part of an application’s architecture.

    I think of, for many decades, we’ve been talking about the age of distributed computing. In fact, in the ’50s and ’60s, there were tons of papers and interesting research done around communicating sequential processes, which, by the way, is the foundation for Go’s concurrency model. A lot of these pieces have come together finally. I think a lot of folks predicted it would happen sooner when we went through the whole multi-core transition, which led to concurrency and async programming. But now what we’re seeing is, thanks to the cloud, the age of distributed computing has actually arrived, and that’s a very exciting thing.

    The challenge is, how do we tap into all of that capability? It’s not always easy. The cloud is complicated. You’re wading in tons of YAML, and there’s sort of a missing programming model. And so to take us into the solution to that problem, I’d like to demonstrate an analogy in three parts. First of all, several decades ago, we went from writing assembly language to higher-level programming languages, like C, Fortran, COBOL, et cetera.

    Obviously we’ve come a long way since then, but this was a huge revolutionary change in enabling people to focus on building better software, being more productive. It almost sounds funny in hindsight now, many decades later, but actually writing in C was way more productive than writing in assembly language. Fewer bugs, you just get more done, 10x the productivity, right? And that was a huge change in our ability to build this software ecosystem. Arguably Microsoft wouldn’t exist. Amazon wouldn’t exist.

    None of the current industry leaders would exist if it weren’t for that innovation. And then we didn’t stop there, right? We kept going. So we went from writing C, which is a relatively low-level language. You know, C’s goal was to expose the underlying capability of the hardware directly to the programmer. And for good reason.

    C was used to write operating systems and the runtimes themselves that I’m about to talk about. But that wasn’t good enough to fuel sort of the next several decades of innovation. We had to increase the level of abstraction even further. And so we came up with things like Java and JavaScript, with Node.js eventually coming on the scene, and Python, Go, .NET.

    There’s a lot of things I could have put in this category, but here are some of the things that come to mind when you think about higher-level programming languages. Some higher-level than others, some dynamically type checked, some statically type checked. But the key here is we kept increasing the level of abstraction so that developers could focus on business logic, could focus on what matters for the problem they’re trying to solve and not swizzling bits and bytes like in C, for example. Of course, plenty of people still write code in C.

    Envoy is written in C++ because the developers there really wanted tight control over performance. And the key here is we’ve got a lot of different tools in the tool belt that we can choose from. One key part of this that I think I wanna highlight, that I hope folks keep in mind as we go through the rest of this talk, is some of these are multi-operating system technologies. Java, although yes, it works on Windows, is not Windows-specific. You can run Java on the server.

    In fact, one of the reasons it became very popular in the early days was that you could write servlets in Java and run them on the server. This was around when the web was emerging, right? And Python, you can run Python anywhere. And so that leads to the next analogy, and the final part in this three-part series: the cloud really has become the new operating system, and there’s a software fabric that sits on top of the hardware that we’re now targeting with our software, so that we don’t always have to be experts in the underlying bits and bytes of the cloud. And this architecture diagram on the left is Windows NT.

    On the right, we’ve got Kubernetes, which arguably has become the standard scheduler for workloads in the cloud. Obviously it doesn’t cover everything that you do in the cloud, and so the analogy is a loose one. We don’t really have that one diagram that works anywhere, but we’ve got lots of different operating systems, too. We’ve got Windows, Mac OS X, Linux, et cetera. And so I think the reason for highlighting this is just a mindset shift.

    It’s a shift from thinking about the cloud as just a container of compute that runs my code, to an entire ecosystem of building blocks that I can use to build powerful software. And that mindset is why we founded Pulumi, frankly. But it is really exciting. And that’s the prelude for the rest of the talk. Infrastructure as code is what we ended up building with Pulumi, but unfortunately infrastructure as code today is largely YAML and domain-specific languages.

    The analogy here is, think back to the assembly language to C transition that I highlighted earlier. A lot of the configuration management techniques that we use today treat the infrastructure as though it’s an afterthought. They don’t embrace sort of this new worldview that we just covered. So even though it’s called infrastructure as code, it’s typically not. It’s missing a lot of the capabilities that we think of when we use the term code. It’s text, so we can version it, we can check it in.

    It’s repeatable so we can effectively execute it, which leads to sort of a code analogy. But it’s definitely a loose analogy to say that it’s code. What we’re seeing now is this transition to infrastructure as software which is, think of all these cloud services that we’re configuring using infrastructure as code, as composable building blocks. Really take that three-part analogy that I was just covering and take that to the next level. I think I mentioned we at Pulumi built an infrastructure as code technology.

    I’ll be honest, we didn’t know that’s what we were gonna build at the outset. We had that vision of, hey, the cloud is the next operating system and is really this powerful capability. Infrastructure as code is what gives you this programmable abstraction and a consistent resource model on top of the cloud service model so that you can then compose all of these things and use programming languages that we know and love to build cloud software in a more native way and bring the cloud infrastructure closer to the application architecture. And so when I say infrastructure as software, what do I mean? Well, software engineering, if you’re a developer, means a lot of different things. It means typically you’ve got facilities for abstraction.

    We’ve got higher-level languages that give us great productivity and expressiveness. We’ve got type checking to find errors much sooner in that process. We’ve got great IDEs, editors that have syntax-highlighting colorization. Red squiggles if I make a typo. Right click to refactor, all of these things just built-in so that the code is just there and I can get into the flow and really, really get my job done.

    There’s unit testing and integration testing, various ways to make sure the code is correct before we go deploy it or execute it in production. Debugging so I can interactively find errors, CI/CD, all these things that we know and love about software engineering, historically have not applied to infrastructure as code. And we think that’s a real shame. And that’s why we’re excited to talk about Pulumi today. There are other folks who are doing great work here as well.

    I wanna call out the AWS CDK team, for example, for also seeing this exciting future. But really, really excited about this next level of innovation and really up-leveling our entire game when it comes to infrastructure. And to show that in action, I’m gonna invite Luke Hoban, our CTO of Pulumi, to program the cloud.

    Hi, my name is Luke Hoban. I’m the CTO of Pulumi. And I’m gonna today talk about how we can take some of the developer-first infrastructure principles that Joe talked about and apply them using a tool like Pulumi. So today I’m gonna focus on using Pulumi, but really most of what I’m talking about here will apply for any modern infrastructure as code tool that’s taking a developer-first approach to how we think about managing and working with our infrastructure. I’m gonna start where you’d kind of expect for a developer-first experience, which is inside my IDE. And so I’m here inside Visual Studio Code. I’ve got some TypeScript up on my IDE, and I’m just gonna start writing and programming some infrastructure, and programming the cloud from scratch.

    I’m gonna start with a really simple use case. And as we go through this demo, we’ll bring in more complex building blocks to show you how this looks as you get to larger and more realistic applications. But to begin with, let’s just start with something really simple. Let’s create an AWS S3 bucket. Now, the first thing we see as we start typing is that we’re getting that same developer experience that we’re used to for an application developer, the rich IntelliSense, the completion lists, the squiggles, all those sorts of very quick feedback, the ability to discover and understand what’s available.

    For example, when I type AWS dot, I can see all of the different namespaces that are available for all the various features that are available inside AWS. There are many, many hundreds of different APIs available from the AWS platform, and all of them are available here within Pulumi. I can then go ahead and say I wanna create a bucket. Just give it a name. And now, just so I have something to work with, I’m gonna export out the name of the bucket.

    So I’ll export, bucket name is equal to bucket.id. Okay, so I’ve written my first program. Let’s go and deploy this and create some infrastructure. I’m gonna come down here, I’m just gonna type Pulumi up to update the cloud with this infrastructure. Now, one of the really important things with these modern infrastructure as code tools like Pulumi, is that even though they’re using a programming language like TypeScript in this example, they really are still Desired State Configuration.

    So when I say Pulumi up, the first thing it does is shows me a preview. And that preview tells me what changes are gonna be made when I try and deploy this program to the cloud. Now in this case, because I have nothing deployed so far, it knows it’s just gonna need to create that bucket that I specified. I can see the details, understand exactly what’s gonna get created, and go ahead and say yes to actually deploy that into the cloud. Okay, I’ve created my bucket.
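
    The preview step described above can be illustrated with a minimal sketch. It assumes a hypothetical in-memory model (the names preview, State, and Step are invented for this illustration) and is not how Pulumi's engine is actually implemented. It only shows the desired-state idea: compare what is deployed with what the program declares, and list the resulting steps.

```typescript
// Hypothetical in-memory model of a desired-state preview; not Pulumi's engine.
type Props = Record<string, string>;
type State = Map<string, Props>; // resource name -> declared properties
interface Step { name: string; op: "create" | "update" | "delete" | "same" }

// Canonical form of a property bag so key order does not affect comparison.
const canon = (p: Props): string =>
  JSON.stringify(Object.keys(p).sort().map((k) => [k, p[k]]));

// Compare what is deployed with what the program declares and list the steps.
function preview(current: State, desired: State): Step[] {
  const steps: Step[] = [];
  for (const [name, props] of desired) {
    const existing = current.get(name);
    if (existing === undefined) {
      steps.push({ name, op: "create" });
    } else if (canon(existing) !== canon(props)) {
      steps.push({ name, op: "update" });
    } else {
      steps.push({ name, op: "same" });
    }
  }
  for (const name of current.keys()) {
    if (!desired.has(name)) steps.push({ name, op: "delete" });
  }
  return steps;
}

// First run: nothing is deployed yet, so the declared bucket must be created.
const firstRun = preview(
  new Map<string, Props>(),
  new Map<string, Props>([["my-bucket", {}]]),
);

// Second run: the bucket is unchanged and only the new object is created.
const secondRun = preview(
  new Map<string, Props>([["my-bucket", {}]]),
  new Map<string, Props>([
    ["my-bucket", {}],
    ["obj", { bucket: "my-bucket", content: "hello world" }],
  ]),
);
console.log(firstRun, secondRun);
```

    Because the program only declares the end state, running it twice without edits yields no create or delete steps, which is what makes the preview safe to rely on.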

    Now, I might wanna create additional objects. So one thing I could do is create an object, and I’ll call it obj. And this time, instead of a bucket, I’ll grab a bucket object. I’ll call it obj again. And this time I need to specify some properties to indicate the details of what this thing should be.

    And so for example, I’m gonna say I want this object to live inside that bucket. I’m gonna say I want the content to be hello world. Okay, so I’ve specified an additional resource. And now when I type Pulumi up, we see that there’s two unchanged. So that bucket that I already deployed doesn’t need to be modified.

    But this bucket object does need to be created. So I’ll go ahead and say yes to create that. Here we go. I created my second piece of infrastructure. Now I can say something like AWS S3 LS and I can grab the name of this Pulumi stack output bucket name.

    And we can see that we have a single object named obj, 13 bytes for hello world inside my bucket. Okay, so we’ve got some infrastructure. We’re building up something inside this bucket. Now so far, we’ve just done things that you’d kind of expect from any infrastructure as code tool. Whether it’s CloudFormation or Azure Resource Manager or Kubernetes YAML, anything like that.

    So what we wanna do because we have software is really take advantage of some of the unique benefits that sort of a programming environment, a software engineering environment can enable for us. So one of the really simple ones is just, hey, I can write a for loop. So maybe I can say const file name of, and now I’ll use some built-in libraries. So I’ll read a directory off of disk to find all the different files that are available inside this folder. And for each one of those files, I’ll go ahead and create an object.

    So in this case, it’s gonna be named file name, and the content is gonna be fs.readFileSync of path.join of ./files and my file name. Okay, so there we go. I’ve written some code. This is using a for loop.

    It’s using some libraries that are available for me, like readdirSync and readFileSync, to interact with my environment and use the libraries inside my Node.js environment. I’m taking advantage of this being a programming language and having that flexibility and infrastructure to work with. I can also see one additional thing, which is that I get an error here. So I get a squiggle giving me the feedback right away that I have a problem with my code. And if I look at this, it’s saying that Buffer is not assignable to type string.

    That’s ’cause I have a slight issue here where I actually need to specify what content-encoding that file uses. So I’ll go ahead and fix that. Now I’ll type Pulumi up to deploy this. We’ll see that I actually have this file called index.html inside files. And so it’s gonna deploy that file instead of the original object with hello world in it.
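
    The type error mentioned above comes from a standard Node.js behavior, sketched here with only the built-in fs, os, and path modules. The ObjectArgs interface and the temporary folder are hypothetical stand-ins for the demo's bucket object arguments and ./files directory.

```typescript
import * as fs from "fs";
import * as os from "os";
import * as path from "path";

// A throwaway folder with one file, so the sketch runs anywhere.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "files-"));
const file = path.join(dir, "index.html");
fs.writeFileSync(file, "<h1>hello</h1>");

// Without an encoding, readFileSync returns a Buffer of raw bytes.
const raw = fs.readFileSync(file);

// With an encoding, it returns a string.
const text: string = fs.readFileSync(file, "utf-8");

// Hypothetical shape standing in for the object's arguments.
interface ObjectArgs { bucket: string; content: string }

// `content: raw` would be rejected by the type checker; `text` is accepted.
const args: ObjectArgs = { bucket: "my-bucket", content: text };
console.log(Buffer.isBuffer(raw), typeof args.content);
```

    The point is that the mistake is caught in the editor, before anything is deployed.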

    Now we see we’re gonna create this resource and delete this resource. I’ll go ahead and say yes. Now one thing I’m just gonna do here, instead of continually running Pulumi up and checking the preview and doing the updates, that’s really useful when I wanna make sure I know exactly what change I’m making to the cloud, but while I’m in this rapid development mode like I’m showing you right now, it’s really important that I be able to quickly make changes, see them and react to them as I do. So one of the things we can do to enable that is type Pulumi watch, and this is just gonna watch every time I save, we’re gonna see an update deployed to the cloud. So let’s see what happens now.

    There’s a couple of additional things I wanna do to this. I wanna modify my bucket to make it be able to host a website. In this case, I have an index document which I just uploaded called index.html. So I’ll just save that. I go ahead and click save on my file, and that starts an update down here.

    And that’ll take just a couple of seconds to deploy up to the cloud. But there’s one additional change I need to make, which is I need to make the ACL public-read so this can be read from the internet. And I need to indicate that the content type is text/html so it’ll be rendered correctly by the browser. Okay, so now we hit save again. That’s gonna deploy that.

    I can come over here. Oh, and there’s actually one last thing I need to do. Export const URL. Get the website endpoint. Okay, so there we go. We’ve written some code which does a little static website hosting. We’re updating that right now. In just a second, this should be available. There we go. There we go. So now we’ve got our hello Cloud Engineering Summit. Now of course, if you wanna modify this, I’ll just remove some of that excitement from this. Hit save. That’s gonna deploy. And here we go.

    Now, when we curl that endpoint, the object in that bucket has changed to be this one. So then we have all those developer productivity benefits of being inside an IDE. Quick feedback loops, error messages, completion lists, the ability to discover APIs, and the ability to quickly iterate on my infrastructure. One of the things I wanna do, because I have a programming language, is not have to rewrite this code every time. Somebody else has probably had to sort of invent this idea of creating this for loop and reading files from a disk and creating objects from them.

    So maybe I wanna give that thing a name, turn it into a reusable piece of infrastructure instead of copy-pasting it around, actually share it in some useful way. So the first step there could be something like, I wanna say syncDir, and I wanna take a bucket. And I wanna take a dir. Okay, so now we’ve made this a function and we just need to generalize it a little bit by not hard coding this folder. And now we just need to call it.

    So we say syncDir. Pass the bucket in and pass dot slash files. So in our particular case, we’re gonna call this function, but in general, we’re gonna use this thing here. So hit save on that. One of the things you’ll notice, it’s gonna start doing an update, but there’s actually not gonna be any change it needs to make, because I just did a refactoring of my code here.

    This is another important thing, I can refactor my code. I can be confident those changes are not gonna make any changes to my cloud environment because I can do that preview. And now I can have it be a separate piece of functionality. There we go, I’ve abstracted out that logic, given it a name, given it an API, all that sort of thing. And I could go further.
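
    The refactoring described above can be sketched as follows. This is a model under stated assumptions rather than the code from the demo: ObjectSpec is a hypothetical plain-data stand-in for a declared bucket object, a temporary folder stands in for ./files, and only Node.js built-in modules are used. It shows why moving the loop into a named function produces no changes: both versions declare exactly the same objects.

```typescript
import * as fs from "fs";
import * as os from "os";
import * as path from "path";

// Hypothetical stand-in for one declared bucket object; not a Pulumi type.
interface ObjectSpec { name: string; bucket: string; content: string }

// Reusable helper: declare one object per file in `dir` for the given bucket.
// Assumes `dir` is a flat folder of text files.
function syncDir(bucket: string, dir: string): ObjectSpec[] {
  return fs.readdirSync(dir).map((fileName) => ({
    name: fileName,
    bucket,
    content: fs.readFileSync(path.join(dir, fileName), "utf-8"),
  }));
}

// A temporary folder stands in for ./files so the sketch runs anywhere.
const filesDir = fs.mkdtempSync(path.join(os.tmpdir(), "files-"));
fs.writeFileSync(path.join(filesDir, "index.html"), "<h1>hello</h1>");

// Before the refactoring: the loop written inline.
const inline: ObjectSpec[] = [];
for (const fileName of fs.readdirSync(filesDir)) {
  const content = fs.readFileSync(path.join(filesDir, fileName), "utf-8");
  inline.push({ name: fileName, bucket: "my-bucket", content });
}

// After the refactoring: the same declarations through the named function.
const refactored = syncDir("my-bucket", filesDir);

// Identical declarations mean a preview would have nothing to change.
console.log(JSON.stringify(inline) === JSON.stringify(refactored));
```

    Once the logic has a name and a signature, it can be moved into its own file, documented, and shared instead of copy-pasted.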

    I could move this into its own file. So for example, I actually have a file here called sync, which has a sync folder API. So I’m gonna just rename this to syncFolder. That’s now a reusable piece of infrastructure that I’ve factored out into its own file. It’s documented.

    If I come back over here I can hover over this and see what the description of that API is. And so I’ve created a reusable piece of code. And this is really what programming languages and software engineering are so good at, is the ability to create abstractions and reuse them. There we go. Now I’ve got, in a very simple form, static website hosting with just creating a bucket and then syncing the folder of files to that bucket.

    Okay, so we’ve started with some very simple bottom-up examples here. We could start from just the raw building blocks of AWS. But one of the things we’ve learned is about this ability to create abstractions. And we really wanna make that something that’s available to as many users as possible. And so earlier this week, we actually launched something called the Pulumi Registry.

    And the Pulumi Registry is a place where you can go see all the various things that are available for you from within Pulumi. And so your Pulumi programs have access to all of these different packages. And there’s 78 of them right now, but quickly growing over the next few weeks and months. There’s packages for all the things you expect, like AWS, Azure, Google Cloud, and Kubernetes. We looked at AWS in the last example, but for example, I can come over into Azure.

    I can see an overview of the package and how I use it, information about how to install and configure. And then most importantly, as a developer, I can go and access the API docs. So I can find things I might be interested in, like maybe I wanna know how to work with virtual machines in Azure compute. I can click on this, come into the API docs and see that there are dozens of examples that I can work with, in a whole bunch of different languages, like TypeScript, Python, Go, and C#, that I can use as a starting point to start working with these raw building blocks of the cloud. But there’s not just these core cloud building blocks.

    There’s also two additional things. One is, there’s dozens and dozens of long-tail cloud and SaaS providers that I can work with. If I wanna work with Akamai or Alibaba or Auth0, I can work with those inside Pulumi. And I can get the documentation from the Pulumi Registry. But then, just like the syncFolder API that we looked at, we also have a bunch of components.

    And these aren’t things published by cloud providers themselves. These are packages of functionality that’s built on top of what the cloud providers offer to make it easier to work with certain parts of the cloud. There’s things like EKS and API Gateway. Things like VPC and Azure Quickstart for container registry, geo-replicated. And finally there’s things like the ability to deploy some very common and useful Helm charts like CoreDNS.

    The one I wanna dive into is Amazon EKS. So with the EKS package, it takes all of the complexity of standing up an EKS cluster in AWS and turns it into, with smart defaults, best practices built-in, a single line of code that does all the right things by default, but then offers a bunch of configuration, that we can go look at in the API, for all the additional things you might wanna do on top of that. So if you wanna create a new OIDC provider, if you wanna set the desired capacity, all of these options are available, but there’s smart defaults and best practices defaults built-in to the API. Let’s take a look at what it looks like to use a component like this Amazon EKS component from within Pulumi. So in this case, since we looked at TypeScript in the previous example, let’s show that we can do this with Python as well.

    And so here we just took that little piece of example code that was in the registry, we brought it over into our program here. And this is just a normal Pulumi Python program. When I do Pulumi up in this context, we’ll see that something’s quite different. Even though I only wrote one line of code here to create an EKS cluster, we’ll see that Pulumi, when I try to deploy this, is gonna deploy quite a few resources.

    So it’s gonna deploy 28 resources into the cloud, and that’s the component I specified. But then we see all these different children of that component. And that includes the EKS cluster itself inside AWS. But it also includes some networking capabilities, some IAM capabilities, and even some resources inside the Kubernetes cluster itself.

    So these are resources, not within AWS, but actually within that Kubernetes cluster, where it wants to create a config map. And so this shows how we can mix and match those cloud providers to do that. So this is really the power of abstractions, is all of this logic to build up and connect all these different building block pieces to create that best practices EKS cluster. All of those are built-in to this component, which was built once, shared in the registry, and now anyone can come and use it and automatically benefit from all this without having to just copy-paste that over and take ownership of it. Now, I wanna actually also show, in the registry there’s a link to the source code.

    So we can come over here, see that source code in GitHub, but actually have that downloaded locally as well. And there’s two things I wanna highlight related to the source code for that package. The first is that the EKS package is actually written in TypeScript. And so here’s an implementation of that cluster class that we just used to create the EKS cluster, and all the outputs that are specified in the documentation for that. Now you might wonder, we used that component from Python in the last demo, but here we’re actually showing that it’s implemented in TypeScript.

    And this is because Pulumi has made available the ability to create components in one language and use them from other Pulumi languages. That means that this ecosystem of components can be shared across the various different language ecosystems that are working with modern infrastructure as code. The second thing to note is that once we define a component and give it a nice interface and API and documentation of what its contracts are for the behavior of the various different interfaces that it exposes, we can then write tests that test the behavior of that interface and of that component. And so, for example, here are some of the tests that we have for that EKS component. Some basic tests that when I stand up a cluster, the kubeconfig that’s generated is what I expect.

    And some more complex tests that configure the service IP range and verify that the output is the expected values from the actual cluster provisioned by AWS. These sort of tests allow us to really enforce and make sure we’ve created the right contracts for our components, and that those are robustly tested on every commit that’s merged into these components. That means that as a consumer, I can be confident that this piece of logic behaves as it’s designed. And that means that I don’t have to worry about all those details of the internals of that. I know that when I use this component, it’s tested and validated to make sure it behaves as expected.
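
    The combination of a component with smart defaults and tests on its contract can be sketched in a few lines. Everything here is hypothetical: the cluster function, the child resource names, and the default values are invented for illustration and are not taken from the real EKS package, whose implementation and test suite are far more extensive.

```typescript
// Hypothetical model of a component; names and defaults are illustrative only
// and are not taken from the real EKS package.
interface ClusterArgs { desiredCapacity?: number; createOidcProvider?: boolean }
interface Child { type: string; name: string; props: Record<string, unknown> }

// One call expands into several child resources, filling in defaults.
function cluster(name: string, args: ClusterArgs = {}): Child[] {
  const desiredCapacity = args.desiredCapacity ?? 2; // assumed default
  const createOidcProvider = args.createOidcProvider ?? false; // assumed default
  const children: Child[] = [
    { type: "iam-role", name: `${name}-role`, props: {} },
    { type: "security-group", name: `${name}-sg`, props: {} },
    { type: "eks-cluster", name: `${name}-cluster`, props: {} },
    { type: "node-group", name: `${name}-nodes`, props: { desiredCapacity } },
    { type: "config-map", name: `${name}-auth`, props: {} },
  ];
  if (createOidcProvider) {
    children.push({ type: "oidc-provider", name: `${name}-oidc`, props: {} });
  }
  return children;
}

// Contract checks of the kind a component author might run on every commit.
function check(condition: boolean, message: string): void {
  if (!condition) throw new Error(message);
}

const defaults = cluster("demo");
check(defaults.length === 5, "defaults should create five children");
check(
  defaults.some((c) => c.type === "node-group" && c.props.desiredCapacity === 2),
  "default capacity should be applied",
);

const custom = cluster("demo", { desiredCapacity: 4, createOidcProvider: true });
check(custom.some((c) => c.type === "oidc-provider"), "OIDC provider is opt-in");
console.log(defaults.length, custom.length);
```

    A consumer only sees the single call and its options. The checks pin down what that call promises, so the internals can change without breaking users.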

    Okay, so we’ve looked at two examples so far of storage infrastructure and of compute infrastructure, but there’s one last one I wanna touch on just very briefly, and that’s application infrastructure. So one of the other components that we have available inside the registry is the API Gateway. So API Gateway is obviously a service inside AWS that lets me build serverless REST APIs. And so with Pulumi and the AWS API Gateway component, we make it really, really easy to create one of these API Gateway components directly from within your code. And so here’s the documentation for that API Gateway.

    We’ll just go ahead and take this particular example. We’ll actually come over into our original code base here, and we’ll just replace this with that code. So I’m gonna hit save to deploy this. We’ll actually see that because I removed the existing code, we’re going to delete a number of things that were provisioned already. I have some things I didn’t mean to have here.

    Okay, there we go. Let me go ahead and hit save. I got that feedback, I got that error really quickly. That’s good. I didn’t have to wait for this to deploy, but now it’s deploying.

    We’ll actually see this is gonna destroy some of the existing infrastructure and stand up the new infrastructure. So it’s gonna stand up a variety of things to support this REST API, as well as a number of things to support this callback function. But the interface that I specify here is really simple. I say I want a REST API. I say I want the routes to be just a single route, and for that route to be the root path, the method GET, and then when that’s called, it’s gonna invoke this function F.

    And this function F is just some code I’ve written in line here that’s using a callback function, which lets me actually specify the implementation of this callback right in line. So I specify that I wanna log the call and that I wanna return a 200 that says hello. So I’ve exported this as URL again. So I can just come over here, curl that URL endpoint. And now, instead of pointing to the S3 bucket, which has now been removed, it’s pointing to this function that I’ve specified in API Gateway.
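
    The shape of that interface, a list of routes where each route names a path, a method, and an inline handler, can be modeled in plain TypeScript. This is a hypothetical in-memory sketch: ApiRequest, ApiResponse, Route, and dispatch are invented for illustration and are not the API Gateway component's real types. Real handlers are also typically asynchronous, which is omitted here for brevity.

```typescript
// Hypothetical in-memory model of a REST API's routes; not the API Gateway
// component's real types.
interface ApiRequest { path: string; method: string }
interface ApiResponse { statusCode: number; body: string }
type Handler = (req: ApiRequest) => ApiResponse;
interface Route { path: string; method: string; handler: Handler }

// Stand-in for a log, so the sketch can show that the call was recorded.
const calls: string[] = [];

// The inline callback: log the call and return a 200 that says hello.
const f: Handler = (req) => {
  calls.push(`${req.method} ${req.path}`);
  return { statusCode: 200, body: "hello" };
};

// A single route: GET on the root path invokes f.
const routes: Route[] = [{ path: "/", method: "GET", handler: f }];

// Find the matching route and invoke its handler, or answer 404.
function dispatch(req: ApiRequest): ApiResponse {
  const route = routes.find(
    (r) => r.path === req.path && r.method === req.method,
  );
  return route ? route.handler(req) : { statusCode: 404, body: "not found" };
}

console.log(dispatch({ path: "/", method: "GET" }));
```

    Keeping the handler next to the route definition is what lets the infrastructure and the application logic live in one program.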

    So a really simple way to build up serverless applications by using these higher-level building blocks that are available inside the Pulumi Registry to do things like deploying an API Gateway REST API. Okay, so we’ve seen several examples of how we can kind of use something like Pulumi. But really these come back to, all these examples lean on the ability to use programming language capabilities, use software engineering capabilities to get developer productivity within the IDE, to create abstractions within functions or libraries or packages or versioning, and ultimately to work with the entire breadth of the cloud, from AWS, Azure, Google Cloud and Kubernetes, to a wide variety of components built on top of them. That’s it for me. Thank you. Back to Joe.

    Well thank you, Luke. That was super exciting. I think it’s one thing to talk about it and it’s another thing to see it in action. I think that was a really great demonstration of a lot of the concepts of really going from infrastructure as code to infrastructure as software, and giving us a way to go from being buried in YAML to programmable building blocks and reusable architectures.

    So that technology is one thing, but really this idea of getting developers more in the driver’s seat, really empowering infrastructure teams to go to the next level, goes well beyond just the technology. I think, you know, the infrastructure as software approach is a necessary prerequisite, but not sufficient on its own to do what we’re calling cloud engineering, which is to bring the cloud closer to developers, bring great software engineering practices to infrastructure teams, break down the walls between the two sides of the house, and let people really collaborate at an entirely new, rapid pace of productivity. And to talk a little bit more about that, I’m thrilled today to have Ken Exner, the GM of Developer Tools from Amazon Web Services, here to chat with me a little bit about the role of developers in the Modern Cloud era. Thanks for being here, Ken.

    Of course, good to be here. Thank you for having me.

    Yeah, I’d love to hear just a little bit about kind of what your role at AWS is, and then we’ll jump into some fun topics to chat through together.

    Sure. So I manage Developer Tools for AWS, sort of a portfolio of products that are targeting developers and making their lives easier in developing on AWS. So it’s everything from infrastructure as code tools, like CloudFormation and CDK, to the SDKs and CLIs that developers use, to services like Amplify, AppSync, the code services, Cloud9. So a big portfolio of tools and services for helping developers be productive on AWS.

    Awesome. Yeah, it’s always fun to chat with a fellow developer productivity and developer tools nerd. It’s frankly what gets me up out of bed every morning. So great to have you here.

    Yeah, maybe to kick it off, I think we’re seeing much more of a transformation in terms of the way that developers and infrastructure teams are working together. And I think one of the catalysts for that is sort of the Modern Cloud architectures. This move from monolithic simple applications to more distributed architectures, which is frankly really exciting for developers. Talk me through a little bit: why does the Modern Cloud change the way that we should approach how we build software?

    Sure. So I think a lot of the traditional architectures are based around this monolithic application architecture where you have a server and your goal as a developer is to get software onto that server.

    One of the things that has happened with modern architectures is while we made it a lot easier to operate these things, we’ve also introduced a lot of complexity in the distributed nature of these things. So if you look at a typical modern architecture, you no longer have to manage these servers. We’ve created this serverless environment, we’ve created a lot of these capabilities that make it easier for developers and operators to use these pieces, whether you’re using managed services or container environments. But there’s a lot of moving parts. So a typical modern application will be sort of a dance of microservices and managed services, or maybe you’re using some serverless services from AWS, maybe you have some distributed microservices that are part of your architecture, you have different data stores.

    So it’s become sort of this network of all these distributed pieces that a developer has to think about. I think one of the things that I wanna see us do better as an industry is get better at making people productive in this distributed architecture. It’s a lot easier to operate. You’re able to get a lot of benefits from being able to use all these managed services. But how you make it easy for developers to develop against that is, I think, the challenge that Pulumi and AWS see as sort of the next thing in productivity. We need to make that easier.

    Yeah, absolutely. I still remember, frankly, this is sort of the era that I think AWS was born out of, but back in the early to mid-2000s I was working at Microsoft, and we had all these XML Web services and distributed architectures. But really the world today has come so far beyond that, where the systems are much more loosely connected, we’re shipping different pieces at different rates. Why is this, do you think, exciting for developers, and how do you see their role in creating, developing, maintaining, these more distributed architectures?

    I think this is sort of the entire story of DevOps, right? It’s that the developer is now part of the story in defining infrastructure. The line between infrastructure, operations and development has become much blurrier. A developer typically has to manage infrastructure as well. Think about infrastructure. Sometimes it’s as little as being responsible for creating their own containers. Is that infrastructure? Is that operations? It’s a little bit of both.

    At the same time, you’re seeing IT and traditional operations folks have to pick up development responsibilities. So a typical cloud Center of Excellence or IT shop is starting to pick up development responsibilities as well. And sort of the line between a developer and operator has really, really become blurry, and most professionals who work in this space typically have to wear both hats, at least some part of the time. They have to be an operator and a developer and think like both personas.

    What is the ideal interface between the, let’s say, operations and developers? Is it code? Is it point and click? Is it ticketing? Is it Kubernetes? What are you seeing in your customers?

    Well, I hope it’s not ticketing.

    I think the space that we both play in, sort of infrastructure as code, application code, I don’t wanna get into a debate about declarative versus imperative languages, but having an artifact that is sort of the contract between infrastructure and application code, I think that’s the right answer. The infrastructure as code space, whether it gets realized as declarative or imperative languages, is sort of the way to think about your infrastructure. It’s the way to sort of reason about your infrastructure. It gives you something that can be version controlled, that can be code reviewed, that can be used to sort of describe your infrastructure at any point in time. You can recreate your infrastructure from that definition.

    I think these are important improvements in how we think about and manage our applications and our infrastructure: being able to put it into code, which then gets all the benefits of code. You can reason about it like code. You can do a diff, you can do a code review, you can version control it. All these things that are super important, that have made developers productive. You’ve taken that to infrastructure, given the power of code to your infrastructure management.

    I think it’s a big, important movement in how we manage our infrastructure as an industry.

    Absolutely, I mean, once you do infrastructure as code, whether it’s declarative or imperative, you have an artifact, you can version it, you can bring a lot of the lessons learned that we know about software engineering and now apply it to infrastructure. And I think it’s very important not to try to boil the ocean. One of the benefits is a lot of these infrastructure as code tools do have ways to integrate with existing resources. I think it’s easy to generate some CloudFormation from your existing resources, or just pick specific pieces to modernize one at a time.

    So that’s really, really good advice there. I think one of the things that languages and code give us, let’s say, is abstraction, encapsulation, the ability to build bigger things out of smaller things. One of the things that always struck me and resonated with me about the AWS platform in particular is you’ve got hundreds of building block services, and you can stitch them together in infinite numbers of ways to create infinitely rich and capable cloud services out of these different building blocks. It can be daunting though, to understand the proper way to configure all these things. But with code, we can now begin to raise the level of abstraction that we’re programming against and think about entire systems or entire architectures, not just the building blocks.

    Where does this all go from here? Are you excited about that capability?

    I think what you’re sort of describing is sort of the history of abstractions in software. Software is an evolution of abstractions. You’re trying to build on previous abstractions. EC2 is an abstraction of a server. Lambda’s a further abstraction.

    So what you’re seeing is just abstraction being built on top of abstraction on top of abstraction. And I think you’ll see that continue to happen. One of the things I’m excited for in AWS is making it easier for people to start using the 200-plus services that we have. How do you create abstractions that allow people to be more productive at a higher level than the lower-level building blocks? It’s important that we provide lower-level building blocks, but also that we start going higher and higher and making it easier for people to develop in particular use cases. Typically this is around particular types of use cases, like maybe someone wants to build a front-end or mobile application.

    So we started developing abstractions like Amplify in AWS, which allows you to work at that level of abstraction rather than at the API Gateway and Lambda levels of abstraction. So I think we will continue to see more and more of this, trying to push it up and make sure that people can be more productive at higher levels of abstraction. I think with Pulumi as well, and in some of the infrastructure as code, there’s an opportunity for us to do things there as well. You can define an AWS resource, or you can create higher levels of abstraction. Maybe you would create an SNS resource or an SQS resource.

    But if you combine them together and create a higher-level component resource, you can do Pub/Sub. And that new abstraction provides sort of an architectural pattern for how to use these two lower-level building blocks. So I’m excited about doing more together with Pulumi and others to create these new higher-level components. Take the building blocks that we have and build higher-level application building blocks that make it easier to develop with opinionated patterns rather than having to stitch everything together at the low-level building block. So I think you guys have something called a component resource or a resource component.
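
    To make the Pub/Sub idea concrete, here is a small TypeScript sketch of composing two lower-level building blocks into one higher-level component. It is an in-memory analogy only: `TopicSketch`, `QueueSketch`, and `PubSubSketch` are hypothetical stand-ins for an SNS topic, an SQS queue, and a component that wires them together. They are not Pulumi or AWS APIs, and nothing here provisions real resources.

```typescript
// Hypothetical in-memory stand-ins for the two lower-level building blocks.

class QueueSketch {
  readonly name: string;
  private readonly messages: string[] = [];

  constructor(name: string) {
    this.name = name;
  }

  enqueue(message: string): void {
    this.messages.push(message);
  }

  // Returns the oldest message, or undefined when the queue is empty.
  receive(): string | undefined {
    return this.messages.shift();
  }
}

class TopicSketch {
  readonly name: string;
  private readonly subscribers: QueueSketch[] = [];

  constructor(name: string) {
    this.name = name;
  }

  subscribe(queue: QueueSketch): void {
    this.subscribers.push(queue);
  }

  // Fan the message out to every subscribed queue.
  publish(message: string): void {
    for (const queue of this.subscribers) {
      queue.enqueue(message);
    }
  }
}

// The higher-level component: it owns both pieces and encodes the pattern
// (topic fans out to queue) so callers do not wire it up themselves.
class PubSubSketch {
  readonly topic: TopicSketch;
  readonly queue: QueueSketch;

  constructor(name: string) {
    this.topic = new TopicSketch(`${name}-topic`);
    this.queue = new QueueSketch(`${name}-queue`);
    this.topic.subscribe(this.queue);
  }

  publish(message: string): void {
    this.topic.publish(message);
  }

  receive(): string | undefined {
    return this.queue.receive();
  }
}

const orders = new PubSubSketch("orders");
orders.publish("order-1");
orders.publish("order-2");
```

    The point of the sketch is the shape. The caller asks for one thing by name and gets an opinionated pairing, which is the same trade a component resource makes when it wraps real cloud resources.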

    I wanna take that idea and go bigger with that. How do we create sort of the ability for developers to start creating these higher-level patterns and sharing them with others? I think we can create an entire programming model based on these abstractions that we can build on top of Pulumi and CDK and other infrastructure as code frameworks.

    Well thanks, Ken, for sharing the perspectives there. I totally agree with everything we just discussed. I think I’m really excited to go from building blocks to these architectures and patterns.

    I think that’s the next level in this cloud evolution is not reinventing the wheel, really, really putting the builders first and really enabling us to build bigger things out of smaller things. And I think that’s really exciting. I think in terms of putting this into practice in your own teams, really putting the builders first and putting them in the driver’s seat is kind of the first step. So I’ll talk through a little bit about different ways we’re seeing teams organize and empower developers. The unfortunate news is there is no one right answer, but this is definitely a truism across all of the different models that we’re seeing in practice.

    I think enabling the folks who are gonna innovate to do that is key. It may sound obvious, but you look at where we’re coming from, and many developers just don’t have the ability to spin up the infrastructure they need. It’s surprising to me, but we still talk to folks all the time where a developer has to file a ticket to get some infrastructure, and then wait a long time to get that, up to a month or longer. And that’s no fun. That’s not the recipe for moving fast. Really operating with code is the key that we’re really driving towards here. This whole infrastructure as software approach. This is not new, in a sense. We’ve been on this DevOps journey for over 10 years.

    In fact, I’ve gotten all this way into the talk and I don’t think I’ve said the word DevOps once, but I wanna really tip my hat to kind of everything that’s come before. And I think DevOps was an essential movement to getting where we are today. I think the unfortunate reality is DevOps really brought more dev to the ops than it did the opposite, empowering developers to really get their hands on cloud infrastructure. Which is great, because cloud engineering’s about going both ways: bringing the cloud closer to developers, but also delivering great software engineering practices to infrastructure teams, which DevOps really played an essential role in doing. And it’s really laid the foundation for the shift to cloud engineering.

    And I think this is really taking a lot of the lessons of DevOps and taking them to the next level in terms of really, software engineering, even in the context of infrastructure. Yes, we used infrastructure as code and config tools, and we did some amount of testing when it came to DevOps, but we didn’t really get all of the benefits that we talked about. The software engineering “Desiderata” slide from earlier, a lot of those things didn’t really apply in the realm of DevOps. And so this really is about supercharging infrastructure teams’ ability to get things done. The demands on infrastructure teams are greater than ever before.

    And so this is one way to keep up with those demands. Empowering developers, from the perspective of an infrastructure engineer, is great because that means the infrastructure team isn’t always on the hook for getting everything done and getting blamed when schedules slip. Of course there’s often gotta be guardrails in place. For example, if you’re empowering a developer, that developer may not be the expert in all things security or how much things cost. And the infrastructure team, really, they have to keep control over those things and make sure that there’s not a serious security incident and so on.

    So cloud engineering is about empowering the right people to do the right job at the right time. I think one pattern we often see, especially with early stage startups, or mid-stage startups, you know, folks that were fortunate enough to be born in the cloud really are adopting models that are much closer to what Amazon Web Services themselves internally evangelize. You know, if you don’t need to create a separate DevOps organization or infrastructure organization at your scale, you know, that’s almost always preferable for folks because you can just empower the builders to build. And over time you have infrastructure experts who emerge. You know, configuring a virtual private cloud in Amazon? Most developers are gonna, you know, be bored out of their minds trying to figure out how best to do that.

    That’s where an expert in infrastructure and a domain expert in networking, for example, really can be valuable. But in the early days, if you’ve got a building block and it’s a virtual private cloud that’s been written in Pulumi, for example, and there’s a module and you can just go pick it up off the shelf, and you don’t have to become an expert in how that thing is configured, that’s probably preferable. Especially if your goal is to get something shipped. If you’re in a Y Combinator incubator and you’ve got demo day coming up next week, the last thing you wanna do is spend three days configuring a VPC. So use tried and true best practices, get up and running.

    And you know, this whole two-pizza thing is that a team should be no bigger than two pizzas can feed. You know, it’s kind of a rule of thumb for service teams. And the nice thing about that is it keeps the boundaries between the teams pretty reasonable, and you can get your arms around it. You can define it with an API, so that you’re, again, not dealing with ticketing, you’re reducing the amount that humans have to be in the loop and really using software as the interface between the teams. So this is a very popular model, even in larger companies.

    You know, obviously Amazon is quite a large company doing this at an incredible scale. I will mention there’s often site reliability engineering that emerges as a practice within these teams as well. So you often have an SRE expert who’s really coaching the team on how to run a highly reliable, you know, scalable service. A common pattern we see also in larger organizations is this concept of a platform team. The platform team often is trying to empower the organization around them.

    That includes the infrastructure and operations team. It also includes the developers. These are the folks that are usually setting up a Kubernetes cluster, usually setting the standard for how infrastructure is done in the organization. And, you know, this can be exciting for two reasons. One, the platform team can really focus on shipping the best platform. And the platform itself is a product for that team. Their customer is the developer within their organization or the operator within the organization. So it’s exciting for the infrastructure team because they can focus on building this amazing platform and really specialize in that. Historically this would have been PaaS, or what have you. Kubernetes thankfully has given us a standard foundation to build on.

    And so whether it’s a PaaS or just an opinionated assortment of services, often using the building blocks to create these reusable architectures, that’s typically the approach that these platform teams are taking. And then we’ve got developers, and developers benefit in this world as well, because the developer can spin up infrastructure typically. And usually the interface is code, using infrastructure as software. And the platform team can give them these reusable components and developers can spin them up. You know, one customer we work with actually has an opinionated Kubernetes cluster.

    They call it a microservice environment. So if a developer needs a dev or test environment, they can just go spin up one, you know, and be productive and just focus on building their microservices and building their applications. And that’s another form of developer-first, right? Developer-first isn’t just about developers being in the driver’s seat and oh, we don’t need DevOps or infrastructure teams. That’s not what developer-first is about. Developer-first is about thinking of the developer, and in this model the platform team really is thinking deeply about how to empower the developer.

    And I think that’s an exciting pattern as well. But there are a lot of different ways to organize, and frankly, people oscillate between these models as they grow and as they scale. And I think really the number one takeaway here is, you know, one plus one equals three. We’re seeing that thanks to this new approach to thinking about the cloud as an operating system, thinking about infrastructure as something we use software to deal with, that we’re breaking down the barriers between infrastructure teams and engineers and developers, and really enabling teams to build better things together. And that’s the best possible outcome that we can see.

    And so in summary, just to reiterate kind of some of the things we’ve chatted about today, so developer-first infrastructure really is about empowering builders to build, first and foremost. And empowering developers and infrastructure experts to build great cloud software. Infrastructure as software, not code, is the gateway into that world. You know, it’s what gives you programmable building blocks that can be assembled in infinite ways to build infinite new capabilities into your applications. Really, that foundation allows us to move beyond just the building blocks to reusable architectures.

    And I think that’s the way we stop reinventing the wheel. Every time I sit down to spin up a network in Amazon Web Services, I don’t wanna have to go read that 15-page white paper, I just wanna use an architecture off the shelf that’s written in software that I can compose, just like I do any of my application components as well. And we’re finally there: we can do that with infrastructure as software. And finally, cloud engineering is the practice. I wish it were as easy as just sprinkling some technology magic pixie dust and everything just works, but it turns out actually how we work together as a team is the most important thing, and often the most difficult to get right.

    So I think empowering developers and empowering the builders, we’ve covered that, but that really is the first step in terms of getting to cloud engineering. And really working together and breaking down that wall between infrastructure teams and developers, is what leads to this one plus one equals three. So thank you very much for being here today. It was great to take you through the journey of developer-first infrastructure, the cloud as a new operating system, infrastructure as software, and cloud engineering. I hope you learned a thing or two, and I hope you enjoy the Cloud Engineering Summit. Thank you.
