Introduction
Data available on the internet has been growing fast in recent years. Therefore, we have seen the importance of studying the design of an RDBMS. In this post, you will discover how its life cycle works, having an overview of the stages from modeling to data storage. By understanding this operation, we can more closely examine each part of its architecture.
Database Life Cycle
The database life cycle consists of basic steps from the conceptual model to implementation for designing a database and maximizing its performance. This definition applies to relational databases, such as PostgreSQL. Other types, such as distributed databases, are beyond the scope of this article.
The authors of the book "Physical Database Design: the database professional's guide to exploiting indexes, views, storage, and more" describe the database life cycle in four parts, as defined in Figure 1:
1. Requirements Analysis
To ensure that your database meets the needs of your business, it is essential to define which data will be stored, how they relate to each other, their characteristics, and additional important information. This process usually takes the form of an interview with those who produce and use this data.
It's also at this stage that the choice of the right platform for implementation comes into play. So, suppose that we need to create a database to store articles for a research project. Each article has:
- Title;
- At least one author;
- Publication date;
- Publication location;
- The number of citations.
Each author and place of publication also have their characteristics and relationships, but to simplify the example, let's focus only on the articles. Their relationships are through citations. The articles must strongly correlate through them to ensure well-founded research. Therefore, we can say that articles cite (refer to) other articles. An excellent platform to implement this database is PostgreSQL. In addition to storing data securely and efficiently, we can use the Apache AGE extension to visualize the citation graph and evaluate if the research is well-founded and if there are "loose" references without relationships, among other important information.
2. Logical Design
In the logical design phase, all requirements transform into a model, which can be either Conceptual or Logical. The Conceptual model represents the data and its relationships in an abstract form, while the Logical model provides a more detailed and implementation-oriented representation. It is important to note, however, that the purpose of this text is not to detail each model but to present its role in the database design process.
The logical design captures the reality of the user's world in terms of data elements and their relationships. In the example from the previous section, we would represent all the relationships between the articles in the form of a diagram. We will explore this example further in an upcoming post. The design aims to facilitate query programming and data updating. At this stage, the normalization of SQL tables already occurs. If you want to learn more about normalization, this article is a good example.
3. Physical design
The focus of the physical design is on operating data with high efficiency. This stage involves defining methods for storing and accessing tables on disk. The goal is to maximize the database's performance. Therefore, it is essential to have a clear understanding of how the database system works. We can measure database performance in two ways:
- Through the time it takes for the database to respond to a query or complete an update for a specific application;
- Through throughput, which is the number of transactions per second that the database system can handle in a specified period, considering all applications that use the database.
Physical resources that can cause delays when executing database applications include CPU, I/O (such as a disk), and Computer networks.
4. Implementation, Monitoring and Maintenance
This is the final stage of the database system life cycle, where we create and maintain the database. We can briefly describe it as follows:
Implementation: Creation of tables in the database using Data Definition Language (DDL) and updating, performing queries, and configuring indexes and constraints with Data Manipulation Language (DML). The SQL language encompasses both languages.
- Example of DDL:
CREATE TABLE
command; - Example of DML:
SELECT
command.
Monitoring: Monitoring the performance of the database. PostgreSQL implements the EXPLAIN
command (Section 3.2 of the book "The Internals of PostgreSQL”), which helps monitor the performance of a query by providing its cost.
Maintenance: Modifications to the database to meet performance requirements and changes in user requirements, for example. In this phase, there are design adjustments, which continue the life cycle of the DBMS.
Conclusion
In this article, we understood the importance of studying the resources and implementation of an RDBMS to ensure proper management of its life cycle. We also saw that this consists of a requirements analysis, logical design, physical design, implementation, monitoring, and maintenance. In the next post, I will apply some of these concepts by implementing the article database given as an example in the requirements analysis section, using PostgreSQL with Apache AGE for graph visualization. So, if you like to get your hands dirty, don't miss the next post!
Errata
My intention is to provide access to technology information through reliable sources. If you have found any incorrect information, please let your contribution in the comments below 😊.
Related Articles
How an RDBMS works #1: Lessons from “The Internals of PostgreSQL”
How an RDBMS works #2: Reasons to study its design
Collaborate to Innovate: How a Social Network Can Help You Find Your Dream Pair Programming Team
References
Lightstone, Sam S., Toby J. Teorey, and Tom Nadeau. Physical Database Design: the database professional's guide to exploiting indexes, views, storage, and more. Morgan Kaufmann, 2010.
Microsoft. Description of the database normalization basics. 2023. Available at https://learn.microsoft.com/en-us/office/troubleshoot/access/database-normalization-description. Acessed on 03/31/2023.
SUZUKI, Hironobu. The Internals of PostgreSQL for database administrators and system developers. Interdb. 2015. Available at https://www.interdb.jp/pg/. Acessed on 03/25/2023.
Souza, Wander Inácio de. Database I Class Notes - Concepts and Architecture of Database Systems. Federal University of Ouro Preto. 2016. Unpublished.
Banner designed by Freepik.
Contribute to Apache AGE
Apache AGE website: https://age.apache.org/
Apache AGE Github: https://github.com/apache/age
Top comments (0)