SE Radio 583: Lukas Fittl on Postgres Performance
Lukas Fittl of pganalyze discusses the performance of Postgres, one of the world's most popular database systems. SE Radio host Philip Winston speaks with Fittl about database indexing, queries, maintenance, scaling, and stored procedures. They also discuss some features of pganalyze, such as the index and vacuum advisors.
Lukas Fittl of pganalyze discusses the performance of Postgres, one of the world’s most popular database systems. SE Radio host Philip Winston speaks with Fittl about database indexing, queries, maintenance, scaling, and stored procedures. They also discuss some features of pganalyze, such as the index and vacuum advisors.
Show Notes
Related Episodes
Related Links
- Twitter @LukasFittl
- Mastodon: https://hachyderm.io/@lukas
- LinkedIn: https://www.linkedin.com/in/lfittl/
- 5mins of Postgres https://www.youtube.com/@pganalyze6516
Transcript
Transcript brought to you by IEEE Software magazine and IEEE Computer Society.
This transcript was automatically generated. To suggest improvements in the text, please contact [email protected] and include the episode number and URL.
Philip Winston 00:00:43 Hi, this is Philip Winston for Software Engineering Radio. Today my guest is Lucas Fittl. Lucas is the founder and CEO of pganalyze. He has worked as a software engineer in the Postgres ecosystem for the last 15 years, including as a founding engineer at the startup Citus Data, which was acquired by Microsoft in 2019. Prior to that, he was an early engineer at Product Hunt, which was acquired by AngelList in 2016. Postgres is one of the most popular database systems in the world with over 30 years of active development. Today we’ll discuss Postgres database performance along with the pganalyze tools. We’ll start by introducing Postgres and pganalyze and then tour through various Postgres performance related features such as indexing, queries, scaling maintenance, and if we have time stored procedures. Welcome to the show Lucas, and can you tell us what is Postgres in your point of view?
Lukas Fittl 00:01:43 Of course. And thank you Phil for having me. So Postgres in general is, if you’re an application engineer, you might think of Postgres as the thing that just does work behind the scenes for you. So Postgres is a relational database system. It’s just turned 27 years old, so it’s old enough to drive a car and to write its own thesis . And Postgres was started at the University of Berkeley 27 years ago and it is extremely popular today, right? So most likely if you’re using application today, Postgres will be the backing data store behind it. The main data in the database is going to be stored in Postgres in many cases. And so really to me, Postgres is one of the most important open-source technologies today.
Philip Winston 00:02:24 What types of applications does Postgres excel at? I know it’s a very general purpose database, but if there is an area of what it’s most known for?
Lukas Fittl 00:02:34 Sure, and I think there isn’t necessarily one type of data that you can store in Postgres. I think generally what I’ve seen is that for system of record type data, right? So if were you, let’s say you build an application and you’re thinking where kind I actually store the data form application, that’s where Postgres excels at. But what Postgres is also good at is being extensible. So for example, PostGIS is a very popular extension for Postgres that allows you to work with geospatial data in Postgres. More recently there’s an extension called pgvector, which allows you to store vector data in Postgres so you can work with your AI, ML embedding type data with Postgres. And so that’s really the power of Postgres is that flexibility of being able to support individual data types that are useful for particular use cases.
Philip Winston 00:03:18 Yeah, I’m glad you mentioned PostGIS. That was going to be one of my questions. I think when we get into the specifics we might come back to PostGIS and how it impacts indexes and other things. How about the cloud computing angle on Postgres? I know Amazon has this RDS service and Aurora is a type of database. Are these Postgres under the hood or are they compatible with Postgres?
Lukas Fittl 00:03:45 Yeah, and that’s a good question. So generally you could run Postgres yourself on a virtual machine, right? So I think what I would always emphasize similar to the Linux kernel, you can just run Linux yourself, right? You can build it yourself. But what’s interesting of course is that roughly I would say 15 years ago, database as a service kind of took off as a concept. And so I would say one of the most prominent early starters in that space was Heroku Postgres. Heroku these days, unfortunately not that many people are using it anymore, but it I think was one of the first places that really brought developer experience as the main thing that they cared about and the experience. And so Heroku Postgres was a managed database service where instead of you having to run your own merchant machine, you were actually able to kind of have just an API that you could use. API in this case, meaning you run your SQL queries against a connection string.
Lukas Fittl 00:04:31 And so nowadays of course AWS and Amazon RDS service as well as Amazon Aurora is probably the most popular Postgres as a service database, managed database that you could find. What’s notable there I would say is that Aurora, for example, is a fork of Postgres. So what they’ve done essentially is they’ve taken the open-source Postgres code, which is BSD license, so like very liberally licensed. And so they took that code and they essentially made modifications to the storage engine. And part of the motivation there was to increase resiliency. So when you have different availability zones in AWS, they want to have a better concept of how the kind of resilient architecture of the database works. And so Aurora is essentially, very tight to AWS’s approach of doing things versus RDS as theyíre it’s still a forked version, but it’s less much less patched essentially than Aurora.
Lukas Fittl 00:05:19 And so if you want to be portable, right? If you want to have your postcards work similarly on GCP on AWS on Azure, then it’s certainly better to, in my opinion, focus on the RDS type service or Google for example. You have Google’s LWDB service, which is again their fork version of Postgres and they also have Cloud SQL, which is their more standard Postgres. And so I think oftentimes it’s actually better to not stray too much into the forks of Postgres because you get kind of special behaviors that are unique to each provider’s infrastructure.
Philip Winston 00:05:48 This might be self-evident, but why is database performance so important?
[...]