
SE Radio 563: David Cramer on Error Tracking

In this episode, David Cramer, co-founder and CTO of Sentry, joins host Jeremy Jung for a conversation about error tracking. The discussion starts with treating performance problems as errors, why you might not need logs, and how most applications share the same problems. From there they consider other topics including capturing information by hooking into runtimes and frameworks, issues with the quality of Open Telemetry data, how front-end applications are constantly changing and why that makes them hard to instrument. Finally, they discuss how Sentry's architecture has evolved, and why they switched from a permissive license to the Business Source License.

Show Notes

  • Alex Boten on Open Telemetry
  • Ben Sigelman on Distributed Tracing
  • Jon Gifford on Logging and Logging Infrastructure

Links

Transcript

Transcript brought to you by IEEE Software magazine.

This transcript was automatically generated. To suggest improvements in the text, please contact [email protected] and include the episode number and URL.

Jeremy Jung 00:00:16 Today I’m talking to David Cramer, he’s the founder and CTO of Sentry. David, welcome to Software Engineering Radio.

David Cramer 00:00:25 Thanks for having me. Excited for today’s conversation.

Jeremy Jung 00:00:28 I think the first thing we could start with is defining what Sentry is. I know some people refer to it as an error tracker. Some people have referred to it as an application performance monitoring tool. I wonder if you could kind of describe in your words what it is.

David Cramer 00:00:47 You know, as somebody who doesn’t work in marketing, I just tell it how it is. So Sentry started out doing error monitoring, which, depending on who you talk to, you might just think of as logging, right? Like that’s the honest truth. It is just logging; just a different shape or form these days. It’s hard to not classify us as an APM tool; that’s the industry that exists, those are the tools people understand. So I would just say it’s an APM tool, right? We do a bunch of things within that space, and maybe it’s not item-for-item the same as, say, a product like New Relic, but a lot of the overlap’s there. So it’s errors, performance, which is latency and sort of throughput, and then we have some stuff that just goes a little bit deeper within that. The one thing I would say is different for us versus a lot of these tools is we actually only do application monitoring. So we don’t do any systems or infrastructure monitoring, meaning Sentry is not going to tell you when you need to replace a hard drive, or even that you need more disk space or something like that, because it’s just a domain that we don’t think is relevant for our customers and product.
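Cramer’s point that error monitoring is “just logging; just a different shape or form” can be sketched with standard-library Python alone. The snippet below is a minimal, hypothetical illustration (not Sentry’s actual SDK): it installs a global exception hook that turns an uncaught crash into a structured JSON event, which is roughly the shape an error tracker ships to its backend.

```python
import json
import sys
import traceback

def capture_event(exc_type, exc_value, tb):
    """Serialize an exception as a structured event, the way an
    error tracker's SDK might before shipping it to a backend."""
    event = {
        "level": "error",
        "type": exc_type.__name__,
        "message": str(exc_value),
        "stacktrace": traceback.format_exception(exc_type, exc_value, tb),
    }
    # A real SDK would POST this to an ingestion endpoint;
    # here we just print it, to show the "different shape" of logging.
    print(json.dumps(event, indent=2))

# Install as the hook for uncaught exceptions in this process.
sys.excepthook = capture_event
```

The structured event, rather than a free-text log line, is what lets a tracker group identical crashes together.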

Jeremy Jung 00:01:48 For people who aren’t familiar with the term application performance monitoring, what is that compared to just error tracking?

David Cramer 00:01:56 The way I always reason about it, and this is what I tell new hires, and what I would tell my mother if I had to explain what I do: you load Uber and it crashes. We all know that’s bad, right? That’s error monitoring. We capture the crash report and we send it to developers. You load Uber and it’s a 30-second spinner, like a loading indicator; as a customer, same outcome for me. I assume the app is broken, right? So we also know that’s bad, but that’s different than a crash. Sentry captures that same thing and sends it to developers. Lastly, the third example we use, which is a little bit more non-traditional: you load the Uber app and it’s a blank screen, or there’s no button to submit, like log in or something like this. So it’s broken, but it maybe isn’t erroring, and it’s not a slow thing, right? Same outcome. It’s probably a bug of some sort; it’s what an end user would describe as a bug. So for me, APM just translates to: there are user-perceived bugs in your application, and we’re able to monitor and help the software teams prioritize and resolve those concerns.

Jeremy Jung 00:02:56 Earlier you were talking about actual crashes, and then your second case is maybe more of: if the app is running slowly, then that’s not necessarily a crash, but it’s still something that an APM would monitor.

David Cramer 00:03:11 Yeah, yeah. And I think, to be fair, APM historically is not a very meaningful term. When I was more of just an individual contributor, I would associate APM with a dashboard that will tell me what’s slow in my application, which it does, and that is kind of core APM. But none of the traditional tools, pre-Sentry, would actually tell you why it’s broken; like when there’s an error, a crash, most of those tools were kind of useless. And I don’t know, I do actually know, but I’m gonna pretend I don’t know about most people and just say for myself: most of the time my problems are errors. They are not “it’s fast or slow,” you know? And so we just think of it as a holistic thing to say: when I’ve changed the application and something’s broken, or it’s a bug, you know, what is that bug?

David Cramer 00:03:52 How do we help people fix it? And that comes from a lot of different data signals and things like that. The end result is still the same: you either are gonna fix it, or it’s not important and you ignore it. And so it’s a pretty straightforward premise for us. But again, with most companies in the space, the traditional companies, when you grow a big company, what happens is you build one thing and then you build lots of checkboxes to sell more things. And so I think a lot of the APM vendors have created a lot of different products. Like RUM is a good example of another acronym that lives within APM, and I would tell you RUM is completely meaningless. It stands for real user monitoring. And so I’m like, well, what’s not real about monitoring the application? Well, nothing’s not real, but they created a new category, because that’s how marketing engines work. And that new category is more like analytics than it is application telemetry, and it’s only because they couldn’t collect the application telemetry at the time. And so there’s just a lot of fluff, I would say. But at the end of the day, for developers or engineering teams it’s: new version of the application, you broke something, let’s tell you about it so you can fix it.

Jeremy Jung 00:04:51 And so earlier you were saying how this is a kind of logging, but there are also other companies, other products, that are considered logging infrastructure. Like I think of companies like Papertrail or Logtail. So what space does Sentry fill that’s different from that kind of logging?

[...]

