Introduction to Tibero
Tibero is a relational databases and database management system utility developed by TmaxSoft. TmaxSoft has been developing Tibero since 2003, and in 2008 it was the second company in the world to deliver a shared-disk-based cluster, TAC. The main products are Tibero, Tibero MMDB, Tibero ProSync, Tibero InfiniData and Tibero DataHub.
Tibero, a Relational Database Management System (RDBMS) is considered an alternative to Oracle Databases due to its complete compatibility with Oracle products, including SQL.
Tibero guarantees reliable database transactions, which are logical sets of SQL statements, by supporting ACID (Atomicity, Consistency, Isolation, and Durability). Providing enhanced synchronization between databases, Tibero 5 enables reliable database service operation in a multi node environment.
Tibero has implemented a unique Tibero Thread Architecture to address the disadvantages of previous DBMS. As a result, Tibero can make efficient use of system resources, such as CPU and memory, through fewer server processes. This ensures that Tibero offers a combination of performance, stability, and expandability, while facilitating development and administration functions. Additionally, it provides users and developers with various standard development interface to easily integrate with other DBMS and 3rd party tools.
In addition, the block transfer technology has been applied to improve ‘Tibero Active Cluster’- the shared DB clustering technology which is similar to Oracle RAC. Tibero supports self-tuning based performance optimization, reliable database monitoring, and performance management.
Tibero RDBMS, which manages massive data and ensures stable business continuity, provides the following key features required for the RDBMS environment:
Distributed database links
Stores data in a different database instance. By using this function, a read or write operation can be performed for data in a remote database across a network.
Other vendors’ RDBMS solutions can also be used for read and write operations.
This function copies all changed contents of the operating database to a standby database. This can be done by sending change logs through a network to a standby database, which then applies the changes to its data.
This function resolves the biggest issues for any enterprise DB, which are high availability and high performance. To achieve this,Tibero DB implements a technology called Tibero Active Cluster.
Database clustering allows multiple database instances to share a database with a shared disk. It is important that clustering maintain consistency among the instances’ internal database caches. This is also implemented in Tibero Active Cluster. For more detailed information, see “Chapter 10. Tibero Active Cluster”.
Parallel query processing
Data volumes for businesses are continually rising. Because of this, it is necessary to have parallel processing technology which provides maximum usage of server resources for massive data processing.
To meet these needs, Tibero supports transaction parallel processing functions optimized for OLTP (Online Transaction Processing) and SQL parallel processing functions optimized for OLAP (Online Analytical Processing). This allows queries to complete more quickly.
The query optimizer
The query optimizer decides the most efficient plan by considering various data handling methods based on statistics for the schema objects.
Query optimizer performs the following:
Creates various execution plans to handle given SQL statements.
Calculates each execution plan’s cost by considering statistics for how much data is distributed, tables, indexes, and partitions as well as computer resources such as CPU and memory, where the cost is the relative time needed to perform a certain execution plan.
Selects the lowest cost execution plan.
Major features of the query optimizer are as follows:
Purpose of the query optimizer
The final purpose of the query optimizer can be changed. The following table shows two examples:
Total handling time The query optimizer can reduce the time to retrieve all rows by using the hint ALL_ROWS.
Initial response time The query optimizer can reduce the time to retrieve the first row by using the hint FIRST_ROWS.
A query can be transformed to make a better execution plan. The following are examples of query transformation: merging views, unnesting subqueries, and using materialized views.
Determining a data access method
Retrieving data from a database can be performed through a variety of methods such as a full table scan, an index scan, or a rowid scan. Because each method has different benefits depending on the amount of data and the filtering type, the best data access method may vary.
Determining a join handling method
When joining data from multiple tables, the order in which to join the tables and a join method such as a nested loop join, merge join, and hash join should be determined. The order and method have a large effect on performance.
Each execution plan’s cost in each execution phase is estimated based on statistics such as predicate selectivity and the number of data rows.
Tibero guarantees reliable database transactions, which are logical sets of SQL statements, by supporting the following four properties:
Each transaction is all or nothing; all results of a transaction are applied or nothing is applied. To accomplish this, Tibero uses Undo data.
Every transaction complies with rules defined in the database regardless of its result, and the database is always consistent. There are many reasons why a transaction could break a database’s consistency. For example, an inconsistency occurs when a table and an index have different data. To prevent this, Tibero does not allow applying only a part of a transaction. That is, even if a table has been modified and a related index has not yet been modified, other transactions can only see the unmodified data. The other transactions think that there is consistency between the table and the index.
A transaction is not interrupted by another transaction. When a transaction is handling data, other transactions wait to access the data and any operation in another transaction cannot use the data. However, an error does not occur. To accomplish this, Tibero uses two methods: multiversion concurrency control (MVCC) and row-level locking.
When reading data, MVCC can be used without interrupting other transactions. When modifying data, row level fine-grained lock controls can be used to minimize conflicts and to make a transaction wait. That is, concurrency is guaranteed by locking a row, which is the smallest data unit. Even if multiple rows are modified, Data Manipulation Language (DML) can still be executed despite a table lock. With these methods, Tibero provides high performance in an OLTP environment.
Once a transaction has been committed, it must be permanent even if there is a failure such as a loss of power or a breakdown. To accomplish this, Tibero uses Redo logs and write-ahead logging. When a transaction is committed, relevant Redo logs are written to disk to guarantee the transaction’s durability. Before a block is loaded on a disk, the Redo logs are loaded first to guarantee that the database has consistency.
Row Level Locking
Tiberouses row level locking to guarantee fine-grained lock control. That is, it maximizes concurrency by locking a row, the smallest unit of data. Even if multiple rows are modified, concurrent DMLs can be performed because the table is not locked. Through this method, Tibero provides high performance in an OLTP environment.
Tibero has a multi-process and multi-thread based architecture, which allows access by a large amount of users.
The following figure shows the process structure of Tibero:
Tibero has the following three processes:
Working process or foreground process
Listener receives requests for new connections from clients and assigns them to an available working thread. Listener plays an intermediate role between clients and working threads using an independent executable file, tblistener.
A new connection from a client is handled in the following way (refer to [Figure 1.1]):
Listener searches for a working process that has an available working thread and then sends the connection request to the working process (①).
Because connection information is sent along with the request, the client starts operations as soon as it connects to the working thread regardless of a server’s internal operation.
A control thread that receives a request from listener checks the status (②) of working threads included in the same working process and assigns an available working thread (③) to the connection from the client.
The working thread authenticates the client and starts a session (④).
Working Process or Foreground Process
A working process communicates with client processes and handles user requests. Tibero creates multiple working processes when a server starts to support connections from multiple client processes. The number of working processes can be adjusted with the initialization parameter WTHR_PROC_CNT. This number cannot be modified after Tibero RDBMS starts. If the number of working threads increases, more CPU and memory resources are used. Therefore, the proper number of working processes should be specified based on the number of concurrent users and the system environment.
Background processes are independent processes that primarily perform time-consuming disk operations at specified intervals or at the request of a working thread or another background process.
The following are the processes that belong to the background process group:
Monitor Thread (MTHR)
The monitor thread is a single independent process despite being named Monitor Thread. It is the first thing created after Listener when Tibero starts. It is the last process to finish when Tibero terminates.
The monitor thread creates other processes when Tibero starts and checks each process status and deadlocks periodically.
Sequence Writer (AGENT or SEQW)
The sequence process performs internal jobs for Tibero that are needed for system maintenance. Until version 5.0, this process stored sequence cache values on disk, but now each working thread stores them individually. To avoid user confusion, the name SEQW continues to be used.
For more details on how to use the sequence process, refer to “4.6.1. Creating, Changing, and Deleting Sequences”.
Data Block Writer (DBWR or BLKW)
This process writes changed data blocks to disk. The written data blocks are usually read directly by working threads.
Checkpoint Process (CKPT)
The checkpoint process manages Checkpoint. Checkpoint is a job that periodically writes all changed data blocks in memory to disk, or when a client requests it. Checkpoint prevents the recovery time from exceeding a certain limit if a failure occurs in Tibero.
For information on how a client can request a checkpoint directly, refer to the ALTER SYSTEM CHECKPOINT statement of the “Tibero SQL Reference Guide”.
Log Writer (LGWR or LOGW)
This process writes redo log files to disk. Log files contain all information about changes in the database’s data. They are used for fast transaction processing and restoration.