Terminologies
Here are defined some terms of special importance when dealing with the PostgreSYS Platform. This list presents the most important terms used in this documentation, along with their definitions and additional details where applicable.
Asynchronous Data Replication
Asynchronous replication in PostgreSQL allows data to be copied to replica servers without blocking operations on the primary server, supporting high availability and geographic distribution of data.
Authentication Methods
PostgreSQL supports several authentication methods, including trust
, password
, md5
, scram-sha-256, and
reject``, each providing different levels of security and convenience.
Below we detail some of the main authentication methods supported by PostgreSQL:
trust: The trust
authentication method is one of the simplest and least secure configurations for access control, allowing users to connect to the database without requiring a password. It is generally used in development environments or situations where the network is highly reliable and secure. By using trust
, the database administrator assumes that all connections to the database are legitimate and authorized, without the need to validate user credentials, which can pose a significant security risk if misconfigured or used in insecure environments.
password: PostgreSQL's password
authentication method is a simple mechanism that requires users to provide a plain-text password to connect to the database. This method is less secure than options such as md5
or scram-sha-256
because passwords sent during the authentication process can be intercepted if the connection is not secured by an additional layer of protection, such as SSL/TLS. The password
authentication method can be used in testing or development environments where convenience is a priority, and security risks are minimal, but it is not recommended for production environments.
MD5: The md5
authentication method is a secure access control mechanism that requires users to provide a password, which is encrypted using the MD5 algorithm before being transmitted over the network. It is widely used for offering a good balance between security and ease of use, protecting access credentials from being intercepted during authentication. By employing md5
, PostgreSQL
ensures that even if network traffic is intercepted, the actual password will not be exposed, as only the encrypted hash of the password is transmitted and verified against the value stored on the server, helping to prevent "man-in-the-middle" attacks.
SCRAM-SHA-256: The scram-sha-256
authentication method represents a significant improvement in terms of security for user authentication. Based on the Salted Challenge Response Authentication Mechanism (SCRAM) with the SHA-256 hash, this method provides a more secure way of credential verification by using a combination of a salt (a random value added to the password before hashing) and multiple hashing iterations to protect passwords against brute-force and dictionary attacks. Additionally, scram-sha-256
implements a challenge-response authentication process in which the password is never sent over the network, not even as a hash, enhancing security against interception and "man-in-the-middle" attacks. It is recommended for environments requiring high levels of security in user authentication.
reject: PostgreSQL's reject
authentication method is used to explicitly deny connection access to the database, regardless of any authentication attempt. When configured for a specific access rule, this method blocks all connection attempts from specified users or IP addresses without even considering credentials or other authentication methods. The use of reject
is particularly useful in security scenarios where proactive prevention of access from certain users or networks is required, serving as an access control and security tool for system administrators who want to ensure that only authorized entities can attempt to connect to the database system.
B-tree Index
B-tree indexes are the standard index structure in PostgreSQL, optimized for fast processing of queries involving sorting and range searches.
Catalogs
Catalogs in PostgreSQL contain metadata that describes and tracks data structures, such as tables, indexes, and functions. They are essential for the internal management and operation of the database.
Checksum
A Checksum is a verification sum used to verify the integrity of data. Algorithms such as MD5 and SHA are used to generate a unique hash value from input data, which can be used to detect changes in the data.
Crontab
Crontab is a Unix tool for scheduling commands via the cron service using a crontab file with time-based execution rules. Modifies a file detailing commands, execution times, and dates, managed by cron, a service scheduling commands on Unix-Type OS. Cron checks the time to see if there's a program to run, executing it as scheduled.
chroot jail
A security technique that isolates processes, by restricting their view of the file system to the specific directory, thereby limiting the process's access to a restricted environment.
Cluster
A Cluster in PostgreSQL refers to a collection of databases managed by a single instance of the PostgreSQL server.
Connection Type
PostgreSQL supports several connection types including local
, host
, hostssl
, hostnossl
, hostgssenc
, and hostnogssenc
, each of which is suitable for different security and connectivity requirements.
In PostgreSQL, the type of connection to be used for a specific access rule is configured in the pg_hba.conf file.
The supported connection types are:
local: Unix domain socket; host: TCP/IP connection, with or without SSL encryption, depending on the authentication method configuration; hostssl: TCP/IP connection with SSL encryption; hostnossl: TCP/IP connection without SSL encryption; hostgssenc: TCP/IP connection with GSSAPI encryption. This type of connection is intended for environments requiring advanced authentication and encryption, such as Kerberos; hostnogssenc: TCP/IP connection without GSSAPI encryption.
Continuous Archiving
Continuous archiving in PostgreSQL involves writing copies of the transaction log files (WAL) to a secondary location. This process is critical for data loss prevention and allows the database to be recovered up to a point in time (PITR).
Database
A database is an organized set of electronically stored data, usually accessed and managed by a Database Management System (DBMS).
Database Objects
Database objects in PostgreSQL include tables, views, indexes, functions, and others, used to store, organize, and manipulate data.
Delta Restore
Delta restore is a recovery technique that allows you to restore only the changes that have occurred since the last full or incremental backup to be restored, saving time and resources by applying only the changed data to the previous state of the database. This method is particularly useful in environments with large volumes of data, where frequent full backups may be impractical.
Differential Backup
A copy of the data that has been changed or added since the last full backup.
Distributed Consensus
Distributed consensus refers to the process of achieving collective agreement in decentralized or distributed systems. It is essential for systems that require reliability and consistency, such as blockchains and distributed database systems.
Encoding
Encoding refers to the character set used to store text in the database. PostgreSQL supports several types of encoding, including UTF-8, which allows for the representation of a wide range of characters from different languages.
EPEL
EPEL (Extra Packages for Enterprise Linux) is a repository of additional software packages for Enterprise Linux and compatible distributions, such as CentOS and Red Hat Enterprise Linux.
Failover
An automatic or manual process of switching to a secondary or backup system when the primary system fails or becomes unavailable, ensuring business continuity and data availability and minimizing downtime.
Fallback
The process of returning to a previous state or to a primary system after an unsuccessful failover attempt or an update that is no longer needed or successful.
Foreign Key
A foreign key is a column or set of columns in a table that references the primary key of another table, used to ensure referential integrity between tables.
Full Backup
An complete copy of all the system or database data at a given point in time.
Generalized Inverted Index (GIN)
GIN indexes are used in PostgreSQL to index elements within composite values, such as arrays and JSON documents, improving the performance of queries searching for specific elements.
Generalized Search Tree (GiST)
GiST is a flexible data structure that allows the implementation of a variety of search trees, such as B+ trees, R-trees, among others. It is particularly useful for indexing complex data and multidimensional queries.
gRPC
gRPC is a high-performance Remote Procedure Call (RPC) framework that supports communication between services in different programming languages, making it easier to build distributed systems.
Graceful Shutdown
Refers to the process of shutting down a system or application in a controlled manner, terminating active processes and saving data as needed to prevent data loss or corruption.
Huge Pages
Huge pages are a feature of modern operating systems that allow memory allocation using significantly larger pages than the standard size (typically from 4 KB to 2 MB or more, depending on the system). This feature is designed to improve the performance of applications that require large amounts of memory by reducing the overhead of page management in the operating system's page table and decreasing TLB (Translation Lookaside Buffer) misses. By using huge pages, memory-intensive applications such as databases and application servers can achieve more efficient use of system resources, resulting in faster response times and higher throughput.
Incremental Backup
A copy of the data that has been changed or added since the last backup, whether full or incremental.
Just in Time Compilation (JIT)
JIT compilation in PostgreSQL compiles parts of the SQL code at runtime, aiming to improve query performance.
Key-Value Storage
A key-value storage system stores data as a set of key-value pairs, where the key serves as a unique identifier for the value. This type of storage provides simplicity and flexibility.
Listen addresses
Specifies the TCP/IP addresses on which the server will be "listening" for client application connections. The values take the form of a list of server names or numeric IP addresses separated by commas.
Load Balancer
A load balancer distributes network or application requests across a group of servers, optimizing resource usage, maximizing response speed, and avoid overloading on any single server.
Locale
Locale refers to the regional setting that affects the formatting of text, dates, numbers, and other data conventions, allowing customization according to local preferences.
LZ Compression Algorithm
The LZ (Lempel-Ziv) algorithm, specifically LZ77, is a compression technique based on replacing repeated data sequences with references to a single previous occurrence. It works by generating fixed-size codes for variable-size sequences (words), stored in a dictionary table.
Materialized View
A materialized view in PostgreSQL is a view whose results are physically stored and can be updated on demand, providing better performance for
Memory Management Unit (MMU)
The MMU is a hardware component that manages the mapping of virtual addresses to physical addresses, which is essential for the operation of virtual memory in computersystems.
Multiversion Concurrency Control (MVCC)
MVCC is a concurrency control model used by PostgreSQL to allow multiple transactions to occur simultaneously without locking, while maintaining data consistency.
PGDATA
PGDATA is the directory where all the data, including configuration files and databases, of a PostgreSQL instance are stored.
Normally, this directory contains:
-
Configuration Files;
- postgresql.conf: main configuration file of the PostgreSQL instance
- pg_hba.conf: access control file based on rules.
- pg_ident.con: file used to map operating system users to PostgreSQL instance users.
- recovery.conf: configuration file used to set up the recovery of a PostgreSQL instance.
- Other Files
- postmaster.pid: contains the PID of the running PostgreSQL instance process. Used to prevent running a second instance or process in the same directory.
- postmaster.opts: contains the startup command of the PostgreSQL instance.
- PG_VERSION: contains the PostgreSQL version.
-
Directories
- Base: contains, by default, the data files arranged in subdirectories for each database. Within these subdirectories, the files of each data page or index of a specific database are stored.
- Global: contains the catalog files (pg_database, pg_class, etc.) and are globally visible.
- pg_wal: stores the continuously generated WAL (Write Ahead Log) files.
- log: stores the PostgreSQL instance operation logs.
PGPASS
The .pgpass file allows for the secure storage of passwords to facilitate automatic authentication in scripts and tools accessing PostgreSQL.
PGPORT
PGPORT is the TCP port number used by the PostgreSQL server to listen for network connections, with the default port being 5432.
Point-In-Time Recovery
Point-In-Time Recovery (PITR) is an advanced technique that allows restoring a database to an exact state at a specific point in time, offering fine-grained recovery against accidental data loss or damage. This is achieved through a combination of full database backups and transaction logs (also known as WAL in PostgreSQL), which record all changes made to the database after the last full backup. PITR is particularly valuable in production environments, as it enables database administrators to reverse only the effects of specific unwanted or malicious operations, minimizing the loss of valid data and system downtime while maximizing data availability and integrity.
Primary Key
The primary key is one or more columns that uniquely identify each row (tuple) in a table is used to ensure data integrity and is often referenced by foreign keys in other tables.
Referential Integrity
Referential integrity is a fundamental concept in relational database management systems that ensures the consistency and accuracy of relationships between tables. It is maintained through rules and constraints, such as primary keys and foreign keys, which guarantee that references between tables remain valid and that actions like inserting, updating, and deleting data do not violate data integrity. For instance, a foreign key constraint prevents a record in one table that references another record in a different table from being deleted if the referenced record is essential for maintaining data consistency between the tables. This is crucial to avoid orphaned data and to ensure the reliability of data relationships within the database.
Regression Testing
Regression testing is a software testing technique aimed at verifying whether recent changes in the code, such as new features, bug fixes, or updates, have negatively impacted the functionality of existing features. This approach is essential to ensure the stability and quality of the software throughout development and maintenance, minimizing the risk of introducing new errors in parts of the system that previously worked correctly. Regression tests can be performed manually or automatically, with automation being highly recommended for large-scale projects or software that receives frequent updates, enabling the quick and efficient execution of a comprehensive set of tests whenever changes are made.
Reverse Proxy
A reverse proxy is a type of proxy server that forwards client requests to other servers, often used for load balancing, security, and caching.
RPM
RPM is a package management system used in Red Hat-based Linux distributions, to facilitate the installation, update, and removal of software.
Savepoint
Savepoint allows you to create a recovery point within a transaction in PostgreSQL, providing greater control over database operations and allowing you to undo part of the transaction without aborting it completely.
Schema
In databases, a schema is a logical structure that defines the organization, relationships, and constraints of the data, acting as a container for grouping database objects such as tables, views, indexes, and functions. Schemas help with data organization and security, allowing different users or applications to operate within separate logical spaces within the same database, without interfering with one another. Additionally, they provide a way to implement concept separation, facilitating permission management, database maintenance, and application scalability by enabling the logical organization of data in a clear and structured manner.
SIGHUP
SIGHUP is a signal sent to operating system processes to indicate terminal closure or configuration reload. In PostgreSQL, it is often used to reload configuration files without restarting the server.
Spatial Reference Identifier
SRID is an integer that identifies the definitions of a CRS - Coordinate Reference System (projected or geographic).
The CRS gives meaning and location to the coordinate values in a local, regional, or global system. It also defines the type of projection (when projected) as well as the transformation parameters between different CRSs.
Subqueries
Subqueries are SQL queries within other queries, allowing the composition of complex queries to be assembled and intermediate results to be reused.
Switchover
Switchover is the controlled process of changing roles between a primary and a standby server in PostgreSQL, allowing for maintenance or updates with minimal disruption.
Table Inheritance
Table inheritance in PostgreSQL allows a table to "inherit" columns from one or more tables, facilitating schema reuse and the organization of hierarchical data organization.
Translation Lookaside Buffer (TLB)
A special cache area used to improve the efficiency of virtual to physical address translation, reducing memory access time.
Trigger
Triggers are stored procedures that are automatically executed in response to specific events on tables or views, such as inserts, updates, or deletes.
User
In PostgreSQL, a user is an account that can connect and interact with the database instance, including regular users and superusers, with different levels of privileges.
Updateable View
An updatable view in PostgreSQL is a view on which data insert, update, and delete operations can be performed as if it were a regular table.
WALDIR
WALDIR is the directory in PGDATA where the transaction log files (WAL) are stored, essential for data recovery and replication in PostgreSQL.
Write-Ahead Logging (WAL)
Write-Ahead Logging (WAL) in PostgreSQL is a mechanism that ensures transaction durability and data consistency by writing all transaction modifications to a write-ahead log before they are actually applied to the database's data files. This process allows the system to recover its consistent state after a failure, using the logs to replay committed transactions that have not yet been written to the data files or to roll back uncommitted transactions. WAL also plays a crucial role in the implementation of synchronous and asynchronous replication, facilitating the transfer of transactions from the primary server to secondary servers without the need to block read and write operations during synchronization.
Zstandard (zstd) Compression Algorithm
Zstandard, or zstd, is a powerful open-source data compression algorithm developed by Facebook. It provides lossless compression and is designed to be fast for both compression and decompression.