Giant Robots Smashing Into Other Giant Robots

490: Datadog with Sean O'Connor

August 31st, 2023

Sean O'Connor is the Director of Engineering at Datadog. Datadog is the essential monitoring and security platform for cloud applications.

Sean discusses his transition from individual contributor to management and shares why he chose Datadog, emphasizing the appeal of high-scale problems and the fact that the company was a real business. The hosts and Sean delve into the importance of performance management and observability, and cover the cultural and technical challenges Sean faces in managing a diverse, geographically distributed team. They also discuss Datadog's transition from a decentralized model to more centralized platforms, the corresponding changes in both technical strategy and people management, and what excites Sean about Datadog's future, including the integration of security offerings into developers' daily experiences and the evolution of Kubernetes and internal build and release tooling.

Transcript:

VICTORIA: This is the Giant Robots Smashing Into Other Giant Robots Podcast, where we explore the design, development, and business of great products. I'm your host, Victoria Guido.

WILL: And I'm your other host, Will Larry. And with us today is Sean O'Connor. He is the Director of Engineering at Datadog. Datadog is the essential monitoring and security platform for cloud applications. Sean, thank you for joining us.

SEAN: Hi, thanks for having me on.

VICTORIA: Yeah, I'm super excited to get to talking with you about everything cloud, and DevOps, and engineering. But why don't we first start with just a conversation about what's going on in your life? Is there any exciting personal moment coming up for you soon?

SEAN: Yeah, my wife and I are expecting our first kiddo in the next few weeks, so getting us prepared for that as we can and trying to get as much sleep as we can. [laughs]

WILL: Get as much sleep as you can now, so...[laughs] I have a question around that. When you first found out that you're going to be a dad, what was your feeling? Because I remember the feeling that I had; it was a mixed reaction of just everything. So, I just wanted to see what was your reaction whenever you found out that you're going to be a dad for the first time.

SEAN: Yeah, I was pretty excited. My wife and I had been kind of trying for this for a little while. We're both kind of at the older end for new parents in our late 30s. So, yeah, excited but definitely, I don't know, maybe a certain amount of, I don't know about fear but, you know, maybe just concerned with change and how different life will be, but mostly excitement and happiness. [laughs]

WILL: Yeah, I remember the excitement and happiness. But I also remember, like, wait, I don't know exactly what to do in this situation. And what about the situations that I have no idea about and things like that? So, I will tell you, kids are resilient. You're going to do great as a dad.

[laughter]

SEAN: Yep. Yeah, definitely; I think I feel much more comfortable about the idea of being a parent now than I may have been in my 20s. But yeah, definitely, the idea of being responsible for and raising a whole other human is intimidating. [laughs]

VICTORIA: I think the fact that you're worried about it is a good sign [laughs], right?

SEAN: I hope so. [laughs]

VICTORIA: Like, you understand that it's difficult. You're going to be a great parent just by the fact that you understand it's difficult and there's a lot of work ahead. So, I think I'm really excited for you. And I'm glad we get to talk to you at this point because probably when the episode comes out, you'll be able to listen to it with your new baby in hand. So...

WILL: Good. Excited for it. [laughs]

VICTORIA: Yeah, love that. Well, great. Well, why don't you tell me a little bit more about your other background, your professional background? What brought you to the role you're into today?

SEAN: Yeah. Well, like we mentioned in the beginning; currently, I'm a Director of Engineering at Datadog. I run our computing cloud team. It's responsible for all of our Kubernetes infrastructure, as well as kind of all the tooling for dealing with the cloud providers that we run on and as well as kind of [inaudible 02:54] crypto infrastructure.

Within Datadog, I've always been in management roles though I've kind of bounced around. I've been here for about five and a half years. So, before this, I was running a data store infrastructure team. Before that, when I first came in, I was running the APM product team, kind of bounced around between product and infra. And that's kind of, I guess, been a lot of the story of much of my career is wearing lots of different hats and kind of bouncing around between kind of infrastructure-focused roles and product-focused roles.

So, before this, I was running the back-end engineering and DevOps teams at Bitly. So, I was there for about five and a half years, started there originally as a software engineer. And before that, a lot of early-stage startups and consulting doing whatever needed doing, and getting to learn about lots of different kind of industries and domains, which is always fun. [laughs]

VICTORIA: That's great. So, you had that broad range of experience coming from all different areas of operations in my mind, which is, like, security and infrastructure, and now working your way into a management position. What was the challenge for you in making that switch from being such a strong individual contributor into an effective manager?

SEAN: Sure. You know, I think certainly there is a lot of kind of the classic challenges of learning to let go but still staying involved, right? You know, as a manager, if you're working on critical path tasks hands-on yourself, that's probably not a good sign. [laughs] On the other hand, if you become, like, completely divorced from what your team is doing, especially as, like, a team lead level kind of manager, you know, that's not great either. So, figuring that balancing act definitely was a bit tricky for me.

Similarly, I think time management and learning to accept that, especially as you get into, like, further steps along in your career that, like, you know, it's not even a question of keeping all the balls in the air, but more figuring out, like, what balls are made out of rubber and which ones are made out of glass, and maybe keeping those ones in the air. [laughs] So, just a lot of those kind of, like, you know, prioritization and figuring out, like, what the right level of involvement and context is, is definitely the eternal learning, I think, for me. [laughs]

WILL: I remember whenever I was looking to change jobs, kind of my mindset was I wanted to work at thoughtbot more because of the values. And I wanted to learn and challenge myself and things like that. And it was so much more, but those were some of the main items that I wanted to experience in my next job. So, when you changed, and you went from Bitly to Datadog, what was that thing that made you say, I want to join Datadog?

SEAN: Yeah, that was definitely an interesting job search and transition. So, at that point in time, I was living in New York. I was looking to stay in New York. So, I was kind of talking to a bunch of different companies. Both from personal experience and from talking to some friends, I wasn't super interested in looking at, like, working at mostly, like, the super big, you know, Google, Amazon, Meta type of companies. But also, having done, like, super early stage, you know, like, seed, series A type of companies, having played that game, I wasn't in a place in my life to do that either. [laughs] So, I was looking kind of in between that space.

So, this would have been in 2018. So, I was talking to a lot of, like, series A and series B-type companies. And most of them were, like, real businesses. [laughs] Like, they may not be profitable yet, but, like, they had a very clear idea of how they would get there and, like, what that would look like. And so, that was pleasant compared to some past points in my career.

But a lot of them, you know, I was effectively doing, like, automation of human processes, which is important. It has value. But it means that, like, realistically, this company will never have more than 50 servers. And when I worked at Bitly, I did have a taste for kind of working in those high-scale, high-availability type environments.

So, Datadog initially was appealing because it kind of checked all those boxes of, you know, very high-scale problems, high availability needs, a very real business. [laughs] This is before Datadog had gone public. And then, as I started to talk to them and got to know them, I also really liked a lot of kind of the culture and all the people I interacted with. So, it became a very clear choice very quickly as that process moved along.

VICTORIA: Yeah, a very real business. Datadog is one of the Gartner Magic Quadrant leaders for APM and observability in the industry. And I understand you're also one of the larger SaaS solutions running Kubernetes, right?

SEAN: Yep. Yeah, at this point. Five years ago, that story was maybe a little bit different. [laughs] But yeah, no, no, we definitely have a pretty substantial Kubernetes suite that we run everything on top of. And we get the blessings and curses of we get some really cool problems to work on, but there's also a lot of problems that we come across that when we talk to kind of peers in the industry about kind of how they're trying to solve them, they don't have answers yet either. [laughs] So, we get to kind of figure out a lot of that kind of early discovery games. [laughs]

VICTORIA: Yeah. I like how exciting and growing this industry is around kind of your compute and monitoring the performance of your applications. I wonder if you could kind of speak to our audience a little bit, who may not have a big technical background, about just why it's important to think about performance management and observability early on in your application.

SEAN: There can be a few pieces there. One of the bigger ones, I think, is thinking about that kind of early and getting used to working with that kind of tooling early in a project or a product. I think it has an analogous effect to, like, thinking about, like, compounding interest in, like, a savings account or investing or something like that. In that, by having those tools available early on and having that visibility available early on, you can really both initially get a lot of value and just kind of understanding kind of what's happening with your system and very quickly troubleshoot problems and make sure things are running efficiently.

But then that can help get to a place where you get to that, like, flywheel effect as you're kind of building your product of, as you're able to solve things quickly, that means you have more time to invest in other parts of the product, and so on and so forth. So, yeah, it's one of those things where kind of the earlier you can get started on that, the more that benefit gets amplified over time.

And thankfully, with Datadog and other offerings like that now, you can get started with that relatively quickly, right? You're not having to necessarily make the choice of, like, oh, can I justify spending a week, a month, whatever, setting up all my own infrastructure for this, as opposed to, you know, plugging in a credit card and getting going right away? And not necessarily starting with everything from day zero but getting started with something and then being able to build on that definitely can be a worthwhile trade-off. [laughs]

VICTORIA: That makes sense. And I'm curious your perspective, Will, as a developer on our Lift Off team, which is really about the services around that time when you want to start taking it really seriously. Like, you've built an app [laughs]. You know it's a viable product, and there's a market for it. And just, like, how you think about observability when you're doing your app building.

WILL: The approach I really take is, like, what is the end goal? I'm currently on a project right now that we came in later than normal. We're trying to work through that.

SEAN: I haven't come from, you know, that kind of consulting and professional services and support kind of place. I'm curious about, like, what, if any, differences or experiences do you have, like, in that context of, like, how do you use your observability tools or, like, what value they have as opposed to maybe more, like, straight product development?

VICTORIA: Right. So, we recently partnered with, you know, our platform engineering team worked with the Lift Off team to create a product from scratch. And we built in observability tools with Prometheus, and Grafana, and Sentry so that the developers could instrument their app and build metrics around the performance in the way they expected the application to work so that when it goes live and meets real users, they're confident their users are able to actually use the app with a general acceptable level of latency and other things that are really key to the functionality of the app.

And so, I think that the interesting part was, with the founders who don't have a background in IT operations or application monitoring and performance, it sort of makes sense. But it's still maybe a stretch to really see the full value of that, especially when you're just trying to get the app out the door.

SEAN: Nice.

VICTORIA: [chuckles] That's my answer. What kind of challenges do you have in your role managing this large team in a very competitive company, running a ton of Kubernetes clusters? [laughs] What's your challenges in your director of engineering role there?

SEAN: You know, it's definitely a mix of kind of, like, technical or strategic challenges there, as well as people challenges. On the technical and strategic side, the interesting thing for our team right now is we're in the middle of a very interesting transition. Still, today, the teams at Datadog work in very much a 'You build it, you run it' kind of model, right? So, teams working on user-facing features in addition to, like, you know, designing those features and writing the code for that, they're responsible for deploying that code, operating the services that code runs within, being on call for that, so on and so forth.

And until relatively recently, that ownership was very intense to the point where some teams maybe even had their own build and release processes. They were running their own data stores. And, like, that was very valuable for much of our history because that let those teams be very agile and not have to worry about, like, convincing the entire company to change if they needed to make some kind of change.

But as we've grown and as, you know, we've kind of taken on a lot more complexity in our environment from, you know, running across more providers, running across more regions, taking on more regulatory concerns, the viability of running everything entirely [inaudible 12:13] for those product teams has become much harder. [laughs]

You start to see a transition where previously the infrastructure teams were much more acting as subject matter experts and consultants to, now, we're increasingly offering more centralized platforms and offerings that can offload a lot of that kind of complexity and the stuff that isn't the core of what the other product-focused teams are trying to do.

And so, as we go through that change, it means internally, a lot of our teams, and how we think about our roles, and how we go about doing our work, changes from, like, a very, you know, traditional reliability type one on one consultation and advising type role to effectively internal product development and internal platform development. So, that's a pretty big both mindset and practice shift. [laughs] So, that's one that we're kind of evolving our way through.

And, of course, as what happens to kind of things, like, you still have to do all the old stuff while you're doing the new thing. [laughs] You don't get to just stop and just do the new thing. So, that's been an interesting kind of journey and one that we're always kind of figuring out as we go. That is a lot of kind of what I focus on.

You know, people wise, you know, we have an interesting aim of...There's about 40 people in my org. They are spread across EMEA and North America with kind of, let's say, hubs in New York and Paris. So, with that, you know, you have a pretty significant time zone difference and some non-trivial cultural differences. [laughs] And so, you know, making sure that everybody is still able to kind of work efficiently, and communicate effectively, and collaborate effectively, while still working within all those constraints is always an ongoing challenge. [laughs]

WILL: Yeah, you mentioned the different cultures, the different types of employees you have, and everyone is not the same. And there's so many cultures, so many...whatever people are going through, you as a leader, how do you navigate through that? Like, how do you constantly challenge yourself to be a better leader, knowing that not everyone can be managed the same way, that there's just so much diversity, probably even in your company among your employees?

SEAN: I think a lot of it starts from a place of listening and paying attention to kind of just see where people are happy, where they feel like they have unmet needs. As an example, I moved from that last kind of data store-focused team to this computing cloud team last November.

And so, as part of that move, probably for the first two or three months that I was in the role, I wasn't particularly driving much in the way of changes or setting much of a vision beyond what the team already had, just because as the new person coming in, it's usually kind of hard to have a lot of credibility and/or even just have the idea of, like, you know, like you're saying, like, what different people are looking for, or what they need, how they will respond best.

I just spend a lot of time just talking to people, getting to know the team, building those relationships, getting to know those people, getting to know those groups. And then, from there, figuring out, you know, both where the high-priority areas are where change or investment is needed. But then also figuring out, yeah, kind of based on all that, what's the right way to go about that with the different groups? Because yeah, it definitely isn't a one size fits all solution.

But for me, it's always kind of starting from a place of listening and understanding and using that to develop, I guess, empathy for the people involved and understanding their perspectives and then figuring it out from there. I imagine–I don't know, but I imagine thoughtbot's a pretty distributed company. How do you all kind of think about some of those challenges of just navigating people coming from very different contexts?

WILL: Yeah, I was going to ask Victoria that because Victoria is one of the leaders of our team here at thoughtbot. So, Victoria, what are your thoughts on it?

VICTORIA: I have also one of the most distributed teams at thoughtbot because we do offer 24/7 support to some clients. And we cover time zones from the Pacific through West Africa. So, we just try to create a lot of opportunities for people to engage, whether it's remotely, especially offering a lot of virtual engagement and social engagement remotely. But then also, offering some in-person, whether it's a company in-person event, or encouraging people to engage with their local community and trying to find conferences, meetups, events that are relevant to us as a business, and a great opportunity for them to go and get some in-person interaction. So, I think then encouraging them to bring those ideas back.

And, of course, thoughtbot is known for having just incredible remote async communication happening all the time. It's actually almost a little oppressive to keep up with, to be honest, [laughs] but I love it. There's just a lot of...there's GitHub issues. There's Slack communications. There's, like, open messages. And people are really encouraged to contribute to the conversation and bring up any idea and any problem they're having, and actively add to and modify our company policies and procedures so that we can do the best work with each other and know how to work with each other, and to put out the best products.

I think that's key to having that conversation, especially for a company that's as big as Datadog and has so many clients, and has become such a leader in this metrics area. Being able to listen within your company and to your clients is probably going to set you up for success for any, like, tech leadership role [laughs].

I'm curious, what are you most excited about now that you've been in the role for a little while? You've heard from a lot of people within the company. Can you share anything in your direction in the next six months or a year that you're super excited about?

SEAN: So, there's usually kind of probably two sides to that question of kind of, like, from a product and business standpoint and from an internal infrastructure standpoint, given that's where my day-to-day focus is. You know, on the product side, one thing that's been definitely interesting to watch in my time at Datadog is we really made the transition from kind of, like, a point solution type product to much more of a platform.

For context, when I joined Datadog, I think logs had just gone GA, and APM was in beta, I think. So, we were just starting to figure out, like, how we expand beyond the initial infrastructure metrics product. And, obviously, at this point, now we have a whole, you know, suite of offerings. And so, kind of the opportunities that come with that, as far as both different spaces that we can jump into, and kind of the value that we can provide by having all those different capabilities play together really nicely, is exciting and is cool.

Like, you know, one of the things that definitely lit an interesting light bulb for me was talking to some of the folks working on our newer security offerings and them talking about how, obviously, you want to meet, you know, your normal requirements in that space, so being able to provide the visibility that, you know, security teams are looking for there.

But also, figuring out how we integrate that information into your developers' everyday experience so that they can have more ownership over that aspect of the systems that they're building and make everybody's job easier and more efficient, right? Instead of having, you know, the nightmare spreadsheet whenever a CVE comes out and having some poor TPM chase half the company to get their libraries updated, you know, being able to make that visible in the product where people are doing their work every day, you know, things like that are always kind of exciting opportunities.

On the internal side, we're starting to think about, like, what the next major evolution of our kind of Kubernetes and kind of internal build and release tooling looks like. Today, a lot of kind of how teams interact with our Kubernetes infrastructure is still pretty raw. Like, they're working directly with specific Kubernetes clusters, and they are exposed to all the individual Kubernetes primitives, which is very powerful, but it's also a pretty steep learning curve. [laughs]

And for a lot of teams, it ends up meaning that there's lots of, you know, knobs that they have to know what they do. But at the end of the day, like, they're not getting a lot of benefit from that, right? There's more just opportunity for them to accidentally put themselves in a bad place. So, we're starting to figure out, like, higher level abstractions and offerings to simplify what all that looks like for teams.

So, we're still a bit early days in working through that, but it's exciting to figure out, like, how we can still give teams kind of the flexibility and the power that they need but make those experiences much easier and not have to have them become Kubernetes experts just to deploy a simple process. And, yes, so there's lots of fun challenges in there. [laughs]

Mid-Roll Ad:

When starting a new project, we understand that you want to make the right choices in technology, features, and investment but that you don’t have all year to do extended research.

In just a few weeks, thoughtbot’s Discovery Sprints deliver a user-centered product journey, a clickable prototype or Proof of Concept, and key market insights from focused user research. We’ll help you to identify the primary user flow, decide which framework should be used to bring it to life, and set a firm estimate on future development efforts.

Maximize impact and minimize risk with a validated roadmap for your new product. Get started at: tbot.io/sprint.

WILL: I have a question around your experience. So, you've been a developer around 20 years. What has been your experience over those 20 years or so of the growth in this market? Because I can only imagine what the market was, you know, in the early 2000s versus right now because I still remember...I still have nightmares of dial-up, dial tone tu-tu-tu. No one could call you, stuff like that. So, what has been your experience, just seeing the market grow from where you started?

SEAN: Sure, yeah. I think probably a lot of the biggest pieces of it are just seeing the extent to which...I want to say it was Cory Doctorow, but I'm not sure who actually originally coined the idea, but the idea that, you know, software is eating the world, right? Like, eventually, to some degree, every company becomes a software company because software ends up becoming involved in pretty much everything that we as a society do.

So, definitely seeing the progression of that, I think, over that time period has been striking, you know, especially when I was working in more consulting contexts and working more in companies and industries where like, you know, the tech isn't really the focus but just how much that, you know, from an engineering standpoint, relatively basic software can fundamentally transform those businesses and those industries has definitely been striking.

And then, you know, I think from a more individual perspective, seeing as, you know, our tools become more sophisticated and easier to access, just seeing how much of a mixed bag that has become [laughs]. And just kind of the flavor of, like, you know, as more people have more powerful tools, that can be very enabling and gives voice to many people. But it also means that the ability of an individual or a small group to abuse those tools in ways that we're maybe not fully ready to deal with as a society has been interesting to see how that's played out.

VICTORIA: Yeah. I think you bring up some really great points there. And it reminds me of one of my favorite quotes is that, like, the future is here—it's just not evenly distributed. [laughs] And so, in some communities that I go to, everyone knows what Kubernetes is; everyone knows what DevOps is. It's kind of, like, old news. [laughs] And then, some people are still just like, "What?" [laughs].

It's interesting to think about that and think about the implications on your last point about just how dangerous the supply chain is in building software and how some of these abstractions and some of these things that just make it so easy to build applications can also introduce a good amount of risk into your product and into your business, right? So, I wonder if you can tell me a little bit more about your perspective on security and DevSecOps and what founders might be thinking about to protect their IP and their client's data in their product.

SEAN: That one is interesting and tricky in that, like, we're in a little bit of, like, things are better and worse than they ever have been before [laughs], right? Like, there is a certain level of, I think, baseline knowledge and competency that I think company leaders really just have to have now, part of, like, kind of table stakes, which can definitely be challenging, and that, like, that probably was much less, if even the case, you know, 10-20 years ago in a lot of businesses.

As an example, right? Like, obviously, like if it's a tech-focused company, like, that can be a thing. But, like, if you're running a plumbing business with a dozen trucks, let's say, like, 20 years ago, you probably didn't have to think that much about data privacy and data security. But, like, now you're almost certainly using some kind of electronic system to kind of manage all your customer records, and your job scheduling, and all that kind of stuff. So, like, now, that is something that's a primary concern for your business.

On the flip side of that, I think there is much better resources, and tools, and practices available out there. I forget the name of the tool now. But I remember recently, I was working with a company on the ISO long string of numbers certifications that you tend to want to do when you're handling certain types of data. There was a tool they were able to work with that basically made it super easy for them to, like, gather all the evidence for that and whatnot, in a way where, like, you know, in the past, you probably just had to hire a compliance person to know what you had to do and how to present that.

But now, you could just sign up for a SaaS product. And, like, obviously, it can't just do it for you. Like, it's about making your policies. But it still gave you enough support where if you're, like, bootstrapping a company, like, yeah, you probably don't need to hire a specialist to [inaudible 25:08], which is a huge deal.

You know, similarly, a lot of things come much safer by default. When you think about, like, the security on something like an iPhone, or an iPad, or an Android device, like, just out of the box, that's light-years ahead of whatever Windows PC you were going to buy ten years ago. [laughs] And so, that kind of gives you a much better starting place. But there are some interesting challenges that come with that, right? In that now, literally, every person on the planet is walking around with microphones and cameras and all kinds of sensors on them. It's an interesting balance, I think.

Similarly, I'm curious how you all think about kind of talking with your clients and your customers about this because I'm sure you all have a non-trivial amount of education to do there. [laughs]

VICTORIA: Yeah, definitely. And I think a lot of it comes in when we have clients who are very early founders, and they don't have a CTO or a technical side of their business, and advising them on exactly what you laid out. Like, here's the baseline. Like, here's where you want to start from. We generally use the CIS controls, from the Center for Internet Security. It puts out a really great tool set, too, for some things you were mentioning earlier. Let's figure out how to report and how to identify all of the things that we're supposed to be doing. It could be overwhelming. It's a lot.

Like, in my past role as VP of Operations at Pluribus Digital, I was responsible for helping our team continue to meet our...we had three different ISO long number certifications [laughs]. We did a CMMI as well, which has come up a few times in my career. And they give you about a couple of hundreds of controls that you're supposed to meet. It's in very kind of, like, legalese that you have to understand. And that's a pretty big gap to solve for someone who doesn't have the technical experience to start.

Like, what you were saying, too, that it's more dangerous and safer than it has been before. So, if we make choices for those types of clients in very safe, trusted platforms, then they're going to be set up for success and not have to worry about those details as much. And we kind of go forward with confidence that if they are going to have to come up against compliance requirements or local state regulations, which are also...there's more of those every day, and a lot of liability you can face as a founder, especially if you're dealing with, like, health or financial data, in the state of California, for example. [laughs]

It puts you at a really big amount of liability that I don't think we've really seen the impact of how bad it can be and will be coming out in the next couple of years now that that law has passed. But that's kind of the approach that we like to think. It's like, you know, there's a minimum we can do that will mitigate a lot of this risk [laughs], so let's do that. Let's do the basics and start off on the right foot here.

SEAN: Yeah, no, that makes sense. Yeah, it's definitely something I've come to appreciate, especially doing work in regulated spaces is, when you do reach the point where you do need to have some kind of subject matter expert involved, whether it's somebody in-house or a consultant or an advisor, I've definitely learned that usually, like, the better ones are going to talk to you in terms of, like, what are the risk trade-offs you're making here? And what are the principles that all these detailed controls or guidelines are looking to get at?

As opposed to just, like, walking you through the box-checking exercise. In my experience, a really good lawyer is somebody who will talk to you about risk versus just saying whether or not you can do something. [laughs] It has a very similar feeling in my experience.

VICTORIA: Yeah, it's a lot about risk. And someone's got to be able to make those trade-off decisions, and it can be really tough, but it's doable. And I think it shouldn't scare people away. And there's lots of people, lots of ways to do it also, which is exciting. So, I think it's a good space to be in and to see it growing and pay attention to. [laughs]

It's fun for me to be in a different place where we're given the opportunity to kind of educate or bring people along in a security journey versus having it be a top-down executive-level decision that we need to meet this particular security standard, and that's the way it's going to be. [laughs] Yeah, so that I appreciate.

Is there anything that really surprised you in your conversations with Datadog or with other companies around these types of services for, like, platform engineering and observability? Is there anything that surprised you in the discovery process with potential clients for your products?

SEAN: I think one of the biggest surprises, or maybe not a surprise but an interesting thing is, to what extent, you know, for us, I don't know if this is still the case, but I think in many places, like, we're probably more often competing against nothing than a competing product. And by that, I mean, especially as you look at some of our more sophisticated products like APM, or profiling, it's not so much that somebody has an existing tool that we're looking to replace; it's much more that this is just not a thing they do today. [laughs]

And so, that leads to a very interestingly different conversation that I think, you know, relates to some of what we were saying with security where, you know, I think a non-trivial part of what our sales and technical enablement folks do is effectively education for our customers and potential customers of why they might want to use tools like this, and what kind of value they could get from them.

The other one that's been interesting is to see how different customers' attitudes around tools like this have evolved as they've gone through their own migration to the cloud journeys, right? We definitely have a lot of customers that, I think, you know, 5, 10 years ago, when they were running entirely on-prem, using a SaaS product would have been a complete non-starter.

But as they move into the cloud, both as they kind of generally get more comfortable with the idea of delegating some of these responsibilities, as well as they start to understand kind of, like, the complexity of the tooling required as their environment gets more complex, the value of a dedicated product like something like Datadog as opposed to, you know, what you kind of get out of the box with the cloud providers or what you might kind of build on your own has definitely been interesting. [laughs]

VICTORIA: Is there a common point that you find companies get to where they're like, all right, now, I really need something? Can you say a little bit more about, like, what might be going on in the organization at that time?

SEAN: You know, I think there could be a few different paths that companies take to it. Some of it, I think, can come from a place of...I think, especially for kind of larger enterprise customers making a transition like that, they tend to be taking a more holistic look at kind of their existing practices and seeing what they want to change as they move into the cloud. And often, kind of finding an observability vendor is just kind of, like, part of the checklist there. [laughs] Not to dismiss it, but just, like, that seems to be certainly one path into it.

I think for smaller customers, or maybe customers that are more, say, cloud-native, I think it can generally be a mix of either hitting a point where they're kind of done with the overhead of trying to maintain their own infrastructure of, like, trying to run their own ELK stack and, like, build all the tooling on top of that, and keeping that up and running, and the costs associated with that. Or, it's potentially seeing the sophistication of tooling that, like, a dedicated provider can afford to invest in that, realistically, you're never going to invest in on your own, right? Like, stuff like live profiling is deeply non-trivial to implement. [laughs]

I think especially once people get some experience with a product like Datadog, they start thinking about, like, okay, how much value are we actually getting out of doing this on our own versus using a more off-the-shelf product? I don't know if we've been doing it post-COVID. But I remember pre-COVID...so Datadog has a huge presence at re:Invent and the other similar major cloud provider things.

And I remember for a few years at re:Invent, you know, we obviously had, like, the giant 60x60 booth in the main expo floor, where we were giving demos and whatnot. But they also would have...AWS would do this, like, I think they call it the interactive hall where companies could have, like, more hands-on booths, and you had, like, a whole spectrum of stuff. And there were, like, some companies just had, like, random, like, RC car setups or Lego tables, just stuff like that.

But we actually did a setup where there was a booth of, I think, like, six stations. People would step up, and they would race each other to solve a kind of faux incident using Datadog. The person who would solve it first would win a Switch. I think we gave away a huge number of Switches as part of that, which at first I was like, wow, that seems expensive. [laughs]

But then later, you know, I was mostly working the main booth at that re:Invent. So by the, like, Wednesday and Thursday of re:Invent, I'd have people walking up to the main booth being like, "Hey, so I did the thing over at the Aria. And now I installed Datadog in prod last night, and I have questions." I was like, oh, okay. [laughs] So, I think just, like, the power of, like, getting that hands-on time, and using some of the tools, and understanding the difference there is what kind of gets a lot of people to kind of change their mind there. [laughs]

VICTORIA: You'd get me with a Switch right now. I kind of want one, but I don't want to buy one.

SEAN: [laughs]

WILL: Same. [laughs]

VICTORIA: Because I know it'll take up all my time.

SEAN: Uh-huh. That's fair. [laughs]

VICTORIA: But I will try to win one at a conference for sure. I think that's true. And it makes sense that because your product is often going with clients that don't have these practices yet, that as soon as you give them exposure to it, you see what you can do with it, that becomes a very powerful selling tool. Like, this is the value of the product, right? [laughs]

SEAN: Yeah, there is also something we see, and I think most of our kind of peers in the industry see is, very often, people come in initially looking for and using a single product, like, you know, infrastructure, metrics, or logs. And then, as they see that and see where that touches other parts of the product, their usage kind of grows and expands over time. I would obviously defer to our earnings calls for exact numbers. But generally speaking, more or less kind of half of our new business is usually expanded usage from existing customers as opposed to new customers coming in. So, I think there's also a lot of just kind of organic discovery and building of trust over time that happens there, which is interesting.

VICTORIA: One of my favorite points to make is that SRE sounds very technical and, like, this really extreme thing. But to make it sound a little easier, it is how you validate that the user experience is what you expect it to be. [laughs] I wonder if you have any other thoughts you want to add to that, just about, like, SRE and user experience and how that all connects for real business value.

SEAN: I think what we've seen, you know, both internally ourselves and with customers is, you know, obviously, different companies operate in different models and whatnot. But where people have seen success is where, you know, people with formal SRE titles or team names can kind of be coming in as just kind of another perspective on the various kind of things that teams are trying to drive towards.

The places where reliability is successfully integrated are where they can kind of make that connection that you were talking about. It's, like, obviously, everybody should go take their vitamins, but, like, what actual value is coming from this, right? Nobody wants to have outages, but, like, to do the work to invest in reliability, often, like, it can be hard to say, like, okay, what's the actual difference between before and after? Having people who can help draw those connections and help weigh those trade-offs, I think, can definitely be super helpful.

But it is generally much more effective, I think, in my experience, when it does come from that perspective of, like, what value are we providing? What are we trading off as part of this? As opposed to just, well, you should do this because it's the right thing to do, kind of a moralistic perspective. [laughs] But, I don't know, how do you all kind of end up having that conversation with your customers and clients?

VICTORIA: That's exactly it. That's the same. It's starting that conversation about, like, well, what happens when this experience fails, which designers don't necessarily think about? What's, like, the most important paths that you want a user to take through your application that we want to make sure works?

And when you tie it all back there, I think then when the developers are understanding how to create those metrics and how to understand user behavior, that's when it becomes really powerful so that they're getting the feedback they need to do the right code, and to make the right changes. Versus just going purely on interviews [laughs] and not necessarily, like, understanding behavior within the app. I think that starts to make it clear.

SEAN: Part of that, I think that's been an interesting experience for us is also just some of the conversation there around, like, almost the flip side of, when are you investing potentially too much in that, right? Because, like, especially after a certain point, the cost of additional gains grows exponentially, right? Each one of those nines gets more and more expensive. [laughs]

And so, having the conversation of, like, do you actually need that level of reliability, or, like, is that...just like what you're saying. Like, you know, kind of giving some of that context and that pressure of, like, yeah, we can do that, but, like, this is what it's going to cost. Is that what you want to be spending your money on? Kind of things can also be an interesting part of that conversation.
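
As a rough illustration of the "nines" point above, here is a minimal sketch of how the allowed downtime shrinks with each additional nine. It assumes a 365-day year and treats availability simply as uptime divided by total time; the function name is hypothetical and does not come from the conversation. It shows only the downtime budget, not the engineering cost of meeting it.

```python
# Minutes in a non-leap year (an assumption made for this sketch).
MINUTES_PER_YEAR = 365 * 24 * 60


def allowed_downtime_minutes_per_year(nines: int) -> float:
    """Return the yearly downtime budget for an availability of N nines.

    Two nines means 99%, three nines means 99.9%, and so on, so the
    unavailable fraction of the year is 1 / 10**nines.
    """
    if nines < 1:
        raise ValueError("nines must be a positive integer")
    return MINUTES_PER_YEAR / 10 ** nines


if __name__ == "__main__":
    for n in range(2, 6):
        budget = allowed_downtime_minutes_per_year(n)
        print(f"{n} nines: {budget:.2f} minutes of downtime per year")
```

Under these assumptions, three nines leaves roughly 525.6 minutes of downtime a year and five nines leaves about 5.3, so each added nine cuts the budget by a factor of ten.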

VICTORIA: That's a really good point that, you know, you can set goals that are too high [laughs] and not necessary. So, it does take a lot of just understanding about your data and your users to know what are acceptable levels of error.

I think the other thing that you can think about, too, like, what could happen, and we've seen it happen with some startups, is that, like, something within the app is deeply broken, but you don't know. And you just think that you're not having user engagement, or that users are signing off, or, like, you know, not opening the app after the first day.

So, if you don't have any way to really actively monitor it and you're not spending money on an active development team, you can have some method to just be confident that the app is working and to make your life less miserable [laughs] when you have a smaller team supporting, especially if you're trying to really minimize your overhead for running an application.

SEAN: Yep. It's surprisingly hard to know when things are broken sometimes. [laughs]

VICTORIA: Yes, and then extremely painful when you find out later [laughs] because that's when it's become a real problem, yeah. I wonder, are there any other questions you have for me or for Will?

SEAN: How big of an organization is thoughtbot at this time?

VICTORIA: Close to 75 people? We're, yeah, between the Americas and the [inaudible 38:31] region. So, that's where we're at right now, yeah.

SEAN: Nice. At that size, like, and I guess it sounds like you're pretty heavily distributed, so maybe some of this doesn't happen as much, but, like, one of the things I definitely remember...so, when I joined Datadog, it was probably about 500 people. And I think we're just under 5,000 now. There are definitely some points where there were surprisingly, like, physical aspects to where it became a problem of just, like, where certain teams didn't fit into a room anymore. [laughs] Like, I was surprised by the changes in that, like, dynamic. I'm curious if you've all kind of run into any kind of, I don't know, similar interesting thresholds or changes as you've kind of grown and evolved.

WILL: I will say this, we're about 100, I think, Victoria.

VICTORIA: Oh, okay, we're 100 people. I think, you know, I've only been at thoughtbot for just over a year now. And my understanding of the history is that when we were growing before COVID, there's always been a real intentionality about growth. And there was never a goal to get to a huge size or to really grow beyond just, like, a steady, profitable growth. [laughs]

So, when we were growing in person, there were new offices being stood up. So, we, you know, maybe started out of New York and Boston and grew to London. And then, there was Texas, and I think a few other ones that started. Then with COVID, the decision was made to go fully remote, and I think that's opened up a lot of opportunities for us. And from my understanding of the past, it was a big shift to be fully remote.

It's been challenging, where I think a lot of people miss some of the in-person days, and I'm sure it's definitely lonely working remote all day by yourself. So, you have to really proactively find opportunities to see other people and to engage remotely. But I think also, we hire people from so many different places and with so much different talent, and that also, you know, better informs our products and creates a different, you know, energy within the company that I think is really fun and really exciting for us now.

WILL: Yeah, I would agree with that because I think the team that I'm on has about 26 people on the Lift Off team. And we're constantly thinking of new ways to get everyone involved. But as a developer, me myself being remote, I love talking to people. So, I try to be proactive and, like, connect with the people I'm working with and say, "Hey, how can I help you with this?" Let's jump in this room and just work together, chat together, and stuff like that, so...

And it has opened the door because the current project that I'm on, I would never have had an opportunity to be on. I think it's based in Utah, and I'm in South Florida. So, there's just no way if we weren't remote that I'd have been a part of it. So...

SEAN: Nice. And I can definitely appreciate that. I remember when we first started COVID lockdown; I think, at that point, Datadog was probably about...Datadog engineering was probably about 30% remote, so certainly a significant remote contingent but mixed. But my teams were pretty remote-heavy. So, in some ways, not a lot changed, right? Like, I think more people on my team were, like, who are all these other people in my house now instead of [laughs], I mean, just transition from being in an office to working from home.

But I do remember maybe, like, about six months in, starting to feel, yeah, some of the loneliness and the separation of just, like, not being able to do, like, quarterly team meetups or stuff like that. So, it's definitely been an interesting transition. For context, at this point, we kind of have a hybrid setup. So, we still have a significant kind of full-time remote contingent, and then for people who are in office locations, people joining for about three days a week in office. So, it's definitely an interesting transition and an interesting new world. [laughs]

VICTORIA: Yeah. And I'm curious how you find the tech scene in Denver versus New York or if you're engaging in the community in the same way since you moved.

SEAN: There definitely is some weirdness since COVID started [laughs] broadly [inaudible 42:21]. So, I moved here in 2020. But I'd been coming out here a lot before that. I helped to build an office here with Bitly. So, I was probably coming out once a quarter for a bunch of years. So, one parallel that is fairly similar is, like, in both places, it is a small world. It doesn't take that long for you to be in that community, in either of those communities, and start running into the same people in different places. So, that's always been [inaudible 42:42] and especially in New York. New York is a city of what? 8, 9 million people?

But once you're working in New York tech for a few years and you go into some meetups, you start running into the same people, and you have one or two degrees [inaudible 42:52] to a lot of people, surprisingly quickly. [laughs] So, that's similar. But Denver probably is interesting in that it's definitely transplant-heavy. I think Denver tends to check the box for, like, it was part of why Bitly opened an office here and, to a degree, Datadog as well.

I think of like, you know, if you're trying to recruit people and you previously were mostly recruiting in, like, New York or Silicon Valley; if you're based in New York, and you're trying to recruit somebody from Silicon Valley, and part of why they're looking for a new gig is they're burned out on Silicon Valley, asking them to move to New York probably isn't all that attractive. [laughs] But Denver is different enough in that in terms of kind of being a smaller city, easier access to nature, a bunch of that kind of stuff, that a lot of times, for the talent we were trying to attract, it was a much more appealing prospect. [laughs]

You'll see an interesting mix of industries here. One of the bigger things here is there's a very large government and DOD presence here. I remember I went to DevOps Days Rockies, I think, a few years ago. There was a Birds of a Feather session on trying to apply DevOps principles in air-gapped networks. That was a very interesting conversation. [laughs]

VICTORIA: That's interesting. I would not have thought Colorado would be a big hub for federal technology. But there you go, it's everywhere.

WILL: Yeah.

SEAN: Denver metro, I think, is actually the largest presence of federal offices outside of the D.C. metro.

VICTORIA: That's interesting. Yeah, I'm used to trying to recruit people into D.C., and so, it's definitely not the good weather, [laughs], not a good argument in my favor. So, I just wanted to give you a final chance. Anything else you'd like to promote, Sean?

SEAN: Generally, not super active on social things these days, but you can find whatever I have done at seanoc.com, S-E-A-N-O-C.com for the spelling. And otherwise, if you're interested in some engineering content and hearing about some of those kind of bleeding edge challenges that I was mentioning before, I would definitely check out the Datadog engineering blog. There's lots of kind of really interesting content there on both, you know, things we've learned from incidents and interesting projects that we're working on. There's all kinds of fun stuff there.

VICTORIA: That makes me think I should have asked you more questions, Sean. [laughs] No, I think it was great. Thank you so much for joining us today. I'll definitely check all that stuff out.

You can subscribe to the show and find notes along with a complete transcript for this episode at giantrobots.fm. If you have questions or comments, email us at hosts@giantrobots.fm. You can find me on Twitter @victori_ousg.

WILL: And you can find me on Twitter @will23larry.

This podcast is brought to you by thoughtbot and produced and edited by Mandy Moore. Thanks for listening. See you next time.

ANNOUNCER: This podcast is brought to you by thoughtbot, your expert strategy, design, development, and product management partner. We bring digital products from idea to success and teach you how because we care. Learn more at thoughtbot.com.
