DEV Community

Franck Pachot for AWS Heroes

Posted on

Book review: Mastering PostgreSQL 15

Generally, I enjoy books that include easy-to-understand and reproducible code examples, as well as a wealth of real-life experience-based information. Mastering PostgreSQL 15 by Hans-Jürgen Schönig is one such book that provides a valuable learning resource for PostgreSQL.

The first chapter, which focuses on PostgreSQL 15, is particularly useful for readers who are already familiar with PostgreSQL. In my opinion, the most crucial aspect is "ICU locales." All database administrators should comprehend this concept, even if they have not yet upgraded to PG15. For developers, the MERGE statement is likely the most important feature to learn about.

The second chapter delves into transactions, locking, and concurrency. Reading this chapter, I appreciate that the book aligns with my perspective on how people should learn about SQL databases. A SQL database allows you to forget about concurrent users, which is one of its most appealing aspects. You can code and test as if you are the only one using the database, and the database ensures the consistency of the data you read and write, even in scenarios where other users are working on the same dataset. However, it is essential to understand these concepts before benefiting from them. As a developer, you need to define transaction boundaries (the "A" in ACID) and ensure serializability (the "I" in ACID). Additionally, the application developer must handle errors by determining whether to wait, fail, or retry. Once you grasp these concepts, you will no longer require excessive code or testing. I highly recommend spending time on this chapter, as it offers clear explanations and examples. It also elucidates PostgreSQL's MVCC (Multi-Version Concurrency Control) and the crucial role of VACUUM and all its side effects. Before deploying your application to production, it is vital to comprehend the consequences of PostgreSQL's MVCC and how to effectively manage it.

The third chapter focuses on Indexes and commences with a discussion on query cost and execution plans. Once again, I appreciate that it begins with practical aspects, such as creating the necessary indexes, before delving into the underlying theory as needed. I strongly advise against skipping this chapter. One of SQL's remarkable features is that you do not need to instruct the database on how to access the data. It is a declarative language that allows you to describe the desired result. However, this does not imply that you can overlook the understanding of how the database actually retrieves the data. The book also explores PostgreSQL's extensibility, such as creating operators, and provides an example. This introduction leads to subsequent discussions on the various index types available in PostgreSQL, including Hash, GiST, GIN, SP-GiST, and BRIN. Furthermore, the book showcases text-search functionalities, as PostgreSQL not only provides the necessary functions but also optimizes them using the appropriate index types.

The fourth chapter delves into "Advanced SQL." While some topics may not be immediately relevant to first-time users, it is crucial to familiarize yourself with them, as they may prove invaluable in solving future problems. If you are building a data warehouse for analytics, this chapter is indispensable, as it covers window functions and partitioning. Additionally, for modern OLTP applications, understanding JSONB is highly recommended, as it eliminates the need for using a separate document-oriented database.

The subsequent chapters cater to DBAs, but they are also relevant when using managed services. "Observability and troubleshooting" addresses topics such as "Log Files and System Statistics". "SQL Optimization: Optimizing Queries for Good Performance" provides more insights into execution plans and partitioning. I suggest reading these chapters before encountering any issues in a production environment, as that is not an ideal time to learn about execution plans or partitioning a terabyte-sized table. I appreciate that system tuning with parameters is covered towards the end, as it is important to first understand your queries before fine-tuning those parameters.

The next chapter on procedural languages, specifically "stored procedures," which play a crucial role in maximizing performance through effective application design. By processing data directly where it is stored, significant performance gains can be achieved. The chapter also covers triggers, which are procedures triggered by specific SQL events.

Stored procedures serve multiple purposes, including encapsulation and providing a secure API to the database. The following chapter delves into the topic of security, highlighting the fact that a database is not solely responsible for storing and processing data. It also offers a robust security model to manage access privileges effectively.

For the DBA responsible for the infrastructure, the chapters on Backup and Recovery, Replication, and HA cluster with Patroni are of utmost importance. While these chapters provide a comprehensive overview, it's worth noting that they may not be sufficient for ensuring a safe production environment without prior DBA experience. Nonetheless, the book does an excellent job of describing PostgreSQL features in these areas.

I would like to add that I often feel uncomfortable when pg_dump is presented as a backup solution, as it primarily exports data and does not encompass the backup of the entire database with its structures and transactions. Similarly, describing streaming replication as a high availability (HA) solution can be misleading since its recovery mechanism involves complete downtime and the potential for data loss. However, it's important to note that these aspects go beyond the scope of a single book on PostgreSQL and may require additional components that are not part of the PostgreSQL core. Nevertheless, the chapters in question provide clear explanations of the PostgreSQL features in these areas.

The chapter on extensions provides a comprehensive overview of available extensions in the contrib module, which can be immensely helpful for developers and DBAs in solving various problems. The subsequent chapter on troubleshooting serves as a valuable starting point for effectively managing a PostgreSQL database and addressing common issues that may arise.

Lastly, the chapter on migrating from other databases to PostgreSQL offers valuable ideas and considerations for the migration process. However, it's important to recognize that database migration is a complex topic with additional factors to consider. While the chapter provides a solid foundation, seeking further expertise and guidance for specific migration or backup concerns is always beneficial.

In summary, "Mastering PostgreSQL 15" offers comprehensive coverage of various aspects of PostgreSQL, delivering valuable insights and practical examples. The book successfully enhances readers' understanding of the database, serving as a valuable resource for utilizing PostgreSQL effectively. Although certain complexities may necessitate additional expertise, the chapters provide a strong foundation and valuable guidance. One particularly commendable aspect of the book is its well-structured flow, beginning with essential knowledge for developers and gradually progressing to more intricate topics related to database management. This approach ensures that readers can establish a solid understanding of PostgreSQL and subsequently delve into complex areas with confidence. Overall, "Mastering PostgreSQL 15" is a valuable companion for beginners and experienced users alike, equipping them with the knowledge and tools to leverage the power of PostgreSQL effectively.

Top comments (0)