The Architecture of CUBRID

CONTENTS
1. Introduction
1.1 Overall Architecture of the CUBRID System
1.2 Process Architecture
1.2.1 Connection Configuration
2. Broker
2.1 The cub_broker Process
2.2 The cub_cas Process
3. Client and Server Modules
3.1 Module Configuration
3.1.1 Transaction Management Component
3.1.2 Server Storage Management Component
3.1.3 Client Storage Management Component
3.1.4 Object Management Component
3.1.5 Client-Server Communications
3.1.6 Thread Management Component
3.1.7 Query Processing
3.2 Detailed Description for the Modules
3.2.1 Transaction Management Component
3.2.2 Object Management Component
3.2.3 Query Processing

CUBRID is an object-relational database management system (DBMS) consisting of the Database Server, the Broker,
and the CUBRID Manager.
- As the core component of the CUBRID Database Management System, the Database Server stores and manages
data in a multi-threaded client/server architecture. It processes the queries entered by users and manages the
objects in the database. Using locking and logging, it keeps transactions consistent even when multiple users
access the database at the same time. It also supports database backup and restore for operations.
- The Broker is CUBRID-specific middleware that relays communication between the Database Server and
external applications. It provides connection pooling, monitoring, and log tracing and analysis.
- The CUBRID Manager is a GUI tool for managing databases and brokers. It also provides the Query Editor, a tool
that allows users to execute SQL queries on the Database Server.
The basic configuration of CUBRID is shown in Figure 1 below.
1. Introduction
Figure 1. Basic Configuration of CUBRID

1.1 Overall Architecture of the CUBRID System
Figure 2. Overall Architecture of the CUBRID System
Figure 2 shows the overall architecture of the CUBRID system.
The CUBRID system follows the client/server model, which allows multiple applications to access the same database
simultaneously. The client module (the Broker in Figure 2) and the server module (the Server in Figure 2) run on
separate systems (computers) and are connected through a network. When a broker and a server run on the same
system, the architecture is unchanged because they are still connected via socket IPC. A server handles the
requests from multiple clients in a single-process, multi-threaded environment, and each server process
manages one database.
The client module parses the SQL queries issued by users or applications and processes them up to the
optimization stage. It then generates a query plan tree and sends it to the server. It receives the execution results
from the server through cursor navigation and delivers them to the users or applications. The client caches object
instances from the database in its memory so that data can be accessed quickly, either through query execution
results or directly by users and applications. In addition, it caches locks as well as objects from the server for
concurrency control. Triggers and methods specified by users or applications are also executed in the client module.
The server module receives and processes requests from the client module (e.g., object requests or query execution
requests based on a query execution tree) and then returns the results. The server serves multiple clients in a
single-process, multi-threaded environment. To support multiple client modules with an appropriate number of
threads, server threads are allocated per broker request, not per broker. The server performs input and output
operations on the database and log volumes and provides file-level and page-level access methods for the
database volume. In addition, it manages a page buffer in memory and uses B+-tree indexes to
increase retrieval speed. The server also provides concurrency control, deadlock detection, and failure recovery for
multiple transactions.
1.2 Process Architecture
Figure 3. Process Architecture of the CUBRID System
Figure 3 shows the process architecture of the CUBRID system. A server host runs one master process
(cub_master) and one or more database server processes (cub_server). Each client process (cub_cas), which may
run on any of several broker hosts, connects to a single database server process.
The cub_broker process allocates a cub_cas process to each connection request from an application, passes the
connection to it, and manages the cub_cas processes. The cub_cas process executes database queries from the application.

1.2.1 Connection Configuration
The cub_cas process connects to the master process on its configured connection port. The master process
checks whether the requested database server is running and rejects the connection request if it is not.
If the requested database server is running, the master process passes the connected socket to the requested
server process. From then on, the server process communicates with the client process (cub_cas) directly through the socket.
The database server process connects to the master process's port, registers its server name (database
name), and establishes a UNIX domain socket (or named pipe) connection to the master process. Over this connection,
the master process passes the socket descriptors of clients (cub_cas) to the server; the connection is also kept open
for server shutdown and other later operations. After the connection between the server and a client process (cub_cas) is
established, the server process allocates a thread to each client request and performs the task.
- Master Process (cub_master)
1. Checks whether another master process is running by connecting to cubrid_port_id.
2. Becomes a daemon process, opens a socket on the port defined as cubrid_port_id, and waits for connections
from clients and servers.
3. If the connection is from a database server process, registers the server name and establishes a UNIX domain
socket connection to the server process.
4. If the connection is from a client process (cub_cas), passes the connected socket number (socket descriptor) to
the database server requested by the client so that a socket connection is established between the client and the server.
- Database Server Process (cub_server)
1. Connects to the designated port of the master process. If the connection fails, the attempt is aborted
on the assumption that the master process is not running.
2. Registers its server name (database name) with the master process once the connection is
established. If a server with the same name already exists, the registration is rejected and the server
is terminated.
3. Creates a UNIX domain socket (or named pipe), sends the connection path (socket file path) to the master process,
and closes the socket connection to the designated port once the master process has connected.
4. Waits for task requests from connected clients. While waiting, it also handles any new client connection relayed
by the master process.
5. Accepts requests from connected clients and performs the tasks by allocating threads.
- Client Process (cub_cas)
1. Connects to the master process on a remote or local server through the port defined as cubrid_port_id.
2. Sends the name of the database to connect to once the connection to the master process is established, and the
master process checks whether the database server process is registered and running. The connection is rejected if
there is no corresponding server.
3. Receives response messages directly from the server, because the master process hands the socket connection
between the client and the master process over to the corresponding server process.
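The bookkeeping that the master process performs in the steps above can be sketched as a small in-memory model. This is only an illustration, not CUBRID code: the class name `MasterRegistry`, its methods, and the use of plain strings in place of socket descriptors are all assumptions made for the example.

```python
class MasterRegistry:
    """Toy model of the cub_master bookkeeping (hypothetical names)."""

    def __init__(self):
        # database name -> client connections handed over to that server
        self.servers = {}

    def register_server(self, db_name):
        # A second server with an already registered name is rejected.
        if db_name in self.servers:
            return False
        self.servers[db_name] = []
        return True

    def connect_client(self, db_name, client_socket):
        # The request is rejected when no server is registered for the database.
        if db_name not in self.servers:
            return False
        # Stand-in for passing the socket descriptor over the UNIX domain socket.
        self.servers[db_name].append(client_socket)
        return True


master = MasterRegistry()
master.register_server("demodb")
master.connect_client("demodb", "cas-1")   # handed over to the demodb server
```

In a real system the hand-over would transfer an open descriptor between processes; the list append here only marks the point where that transfer would happen.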

2. Broker
The Broker is a middleware that relays the communication between the database server and applications. It consists of
cub_broker and cub_cas.
2.1 The cub_broker Process
The cub_broker process allocates a cub_cas process to each connection request from an application, passes the
connection to it, and manages the cub_cas processes. cub_broker has a multi-threaded architecture and consists of
the following threads:
- main: Creates the other threads and manages the number of cub_cas processes. It increases or decreases the
number of cub_cas processes depending on the number of requests in the job queue.
- receiver_thread: Blocks in the accept() system call and puts each connection request from an application into the
job queue.
- dispatch_thread: Finds an available cub_cas for each connection request in the job queue and passes the
connection to that cub_cas.
- cas_monitor_thread: Restarts a cub_cas that has terminated abnormally.
2.2 The cub_cas Process
The cub_cas process executes database queries from an application and has a single-threaded architecture. It
connects to the database server when it receives a "connection" request from an application and calls the
function corresponding to each request from the application. After the connection with the application is terminated,
the process can accept a connection from another application. Disconnecting an application does not terminate the
connection to the database server. If the next application uses the same database as the current one, the existing
database connection is reused.
Depending on the application's connection status, cub_cas is in one of four statuses: IDLE, BUSY, CLIENT WAIT, or CLOSE
WAIT.
- IDLE: No connection is made to an application.
- BUSY: A connection is made to an application, and a request from the application is being processed.
- CLIENT WAIT: cub_cas is waiting for a request from an application while a transaction is in progress.
- CLOSE WAIT: cub_cas is waiting for a request from an application after the transaction has ended. If the
connection between cub_cas and an application is disconnected in this status, the application attempts to
reconnect.
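The four statuses can be read as a function of three conditions: whether an application is connected, whether a request is currently being processed, and whether a transaction is open. The sketch below encodes that reading; the enum, the function `status_of`, and its parameter names are hypothetical and are not taken from the CUBRID source.

```python
from enum import Enum


class CasStatus(Enum):
    IDLE = "IDLE"
    BUSY = "BUSY"
    CLIENT_WAIT = "CLIENT WAIT"
    CLOSE_WAIT = "CLOSE WAIT"


def status_of(connected, processing_request, in_transaction):
    """Derive the cub_cas status from three illustrative flags."""
    if not connected:
        return CasStatus.IDLE
    if processing_request:
        return CasStatus.BUSY
    # Connected and waiting for the next request from the application.
    if in_transaction:
        return CasStatus.CLIENT_WAIT
    return CasStatus.CLOSE_WAIT
```

Under this reading, a cub_cas that has just committed and is waiting for the next request would report CLOSE WAIT.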
After a connection to the application is established, the cub_cas process waits in the select() call and processes each
function requested by the application. The main functions that respond to application requests are as follows:
- fn_end_tran: Performs commit/rollback. If KEEP_CONNECTION is set to off in the cubrid_broker.conf file, it
terminates the connection to the application when a transaction ends, and a new connection is established when a
new transaction starts. If KEEP_CONNECTION is set to auto, the status of cub_cas changes to CLOSE_WAIT
when a transaction ends. In this case, if the application connected to cub_cas has not sent a new request
and a new application has sent a "connection" request, the cub_broker process can select the cub_cas whose
status is CLOSE_WAIT, terminate its connection to the previous application, and ask that cub_cas to
connect to the new application.
- fn_prepare: Processes a prepare request from an application. It compiles the query, creates a handle for the
compiled query, and sends the handle to the application. The application then sends an execution request by using
the handle. If the compiled query is a SELECT query, meta information on its columns is
extracted and sent to the application.
- fn_execute: Executes a prepared query statement. For a SELECT statement, it sends the query results
in units of the specified buffer size; for other statements, it sends the execution result. If JDBC RESULT
CACHE is in use and the executed query already exists in JDBC RESULT CACHE, this function determines
whether the stored query results can be reused. If they can, the query results are not sent; instead,
only a flag indicating reusability is sent to the JDBC driver.
- fn_fetch: Copies the query results of a SELECT statement in units of the specified buffer size and sends them to the
application.

3. Client and Server Modules
This chapter describes the components of the server as a whole (hereinafter, the server) and of the native C API and
other modules in the Client Library of the Broker (hereinafter, the client), as shown in Figure 4.
Figure 4. Detailed Architecture of the CUBRID System

3.1 Module Configuration
The CUBRID client and server modules consist of the following components:
- Transaction Management Component: Handles system transactions across the client and server (including recovery
from system failure).
- Server Storage Management Component: Accesses and manages the database and log volumes on the server
(including page buffering).
- Client Storage Management Component: Allocates and manages a workspace on the client for caching and
accessing objects.
- Object Management Component: Defines class objects, creates and modifies objects, and converts the object
representation structure between disk and memory.
- Client-Server Communications: Manages the network communication between the client and the server.
- Thread Management: Manages the threads of a server process.
- Query Processing: Executes query plans on the server; the plans are created by parsing, analyzing, and optimizing
SQL statements on the client.
The module configuration of each component is described in the following sections.
3.1.1 Transaction Management Component
The Transaction Management Component consists of the modules in dark blue in Figure 5.

Figure 5. Module Configuration of Transaction Management Component
- Object Locator: Passes object data between the workspace on a client and the page buffer pool on the server. It
caches objects in the workspace and acquires locks for them.
- Transaction Manager: Performs transaction start, commit, and rollback, and initializes the other modules
(lock/log/recovery manager) of the Transaction Management Component. It also supports savepoints and
2PC (2-phase commit).
- Lock Manager: Performs lock management based on the 2PL (2-phase locking) protocol and supports the granularity
locking protocol.
- Recovery Manager: Protects database consistency against system failure by using a recovery method based on
UNDO/REDO logging and the WAL (Write-Ahead Logging) protocol. It supports total rollback, partial
rollback (to a savepoint), and nested top operations, and uses LSAs (Log Sequence Addresses), CLRs
(Compensation Log Records), and so on.
3.1.2 Server Storage Management Component
The Server Storage Management Component consists of the modules shown in Figure 6.
Figure 6. Module Configuration of Server Storage Management Component
- I/O Manager: Performs I/O on disk volumes (volume files), mounts and unmounts volumes, and locks
volumes. It also performs write synchronization for the log volume.
- Page Buffer Management: Manages the page buffer, which resides in virtual memory and is used for disk page
buffering. It uses the LRU page replacement algorithm and the FIX/UNFIX protocol for access to buffered pages, and a
hash table to quickly find a requested page in the buffer pool.
- Disk Manager: Manages the internal structure of a disk volume (volume file). A volume consists of sectors,
and a sector is a group of contiguous pages. Each volume has a system area and a user area. A bit
allocation map is used for page allocation in the volume.
- File Manager: Lets other modules access the database purely in terms of files and pages, regardless of the internal
structure of the volume (volume, sector, and page). It is used by file structures such as the B+-tree, heap, and hash.
The File Manager keeps the information on the sectors allocated to a file in the file header.
- Slotted Page Manager: Inserts, deletes, and updates records in a file page. It provides a slot structure that records the
position (offset) of each record in a page, so records can be moved within a page through their slots.
- Overflow Page Manager: Inserts, deletes, and updates records larger than one page in an overflow page area. With
this module, large data can be handled atomically.
- Object Heap Manager: Inserts, deletes, and updates objects in a file through the heap structure. The instances
(records) of a class (table) are stored in an object heap file, and a unique OID (object identifier) is allocated to
each record. The OID consists of "Volume ID + Page ID + Slot ID" and is not reused except in special cases.
This OID expression is the same as disk addressing in the Disk Manager; that is, the OID indicates the physical
location on disk where the record is stored.
- Extendible Hash Manager: Provides extendible hashing for fast data access. It is used to look up a class OID by class
name.
- B+-tree Manager: Provides an index file structure based on the prefix B+-tree. It inserts, deletes, and retrieves keys
in a B+-tree.
- Long Data Manager: Processes ad-hoc large objects such as multimedia data and can modify part of the data.
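To make the Page Buffer Management entry above more concrete, the following is a minimal sketch of an LRU buffer that honors a FIX/UNFIX protocol: a fixed page is never chosen for replacement. The class `PageBuffer`, the `read_page` callback, and the fix-count representation are assumptions for illustration; the actual CUBRID buffer manager is considerably more involved.

```python
from collections import OrderedDict


class PageBuffer:
    """LRU page buffer sketch with fix counts (hypothetical API)."""

    def __init__(self, capacity, read_page):
        self.capacity = capacity
        self.read_page = read_page    # callable: page_id -> page contents
        # page_id -> [contents, fix_count]; least recently used entry first.
        # The dictionary doubles as the hash table for page lookup.
        self.frames = OrderedDict()

    def fix(self, page_id):
        if page_id in self.frames:
            self.frames.move_to_end(page_id)      # now most recently used
        else:
            if len(self.frames) >= self.capacity:
                self._evict()
            self.frames[page_id] = [self.read_page(page_id), 0]
        frame = self.frames[page_id]
        frame[1] += 1                              # page is in use
        return frame[0]

    def unfix(self, page_id):
        frame = self.frames[page_id]
        if frame[1] == 0:
            raise ValueError("page is not fixed")
        frame[1] -= 1

    def _evict(self):
        # Replace the least recently used page that nobody has fixed.
        victim = next((pid for pid, f in self.frames.items() if f[1] == 0), None)
        if victim is None:
            raise RuntimeError("all buffered pages are fixed")
        del self.frames[victim]
```

A caller is expected to pair every `fix` with an `unfix`; writing dirty pages back before eviction is omitted here to keep the sketch short.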
3.1.3 Client Storage Management Component
The Client Storage Management Component consists of the modules shown in Figure 7.

Figure 7. Module Configuration of Client Storage Management Component
- Workspace Manager: Manages the database objects cached in the workspace of the client process. Through an object
table implemented as a hash table, it converts a disk object identifier (OID) to a memory object pointer (MOP). The MOP
holds a memory pointer for accessing the object cached in client memory.
- Garbage Collector: Collects garbage in the client workspace. It releases the memory allocated to MOPs
and cached objects.
- Quick Fit Storage Allocator: Allocates workspace memory for objects.
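The OID-to-MOP conversion performed by the Workspace Manager can be pictured as a hash table lookup that creates an entry on first use. The sketch below assumes an OID is a `(volume_id, page_id, slot_id)` tuple and uses the hypothetical names `Mop` and `Workspace`; it does not reflect CUBRID's actual structures.

```python
class Mop:
    """Memory object pointer: refers to an object cached in the workspace."""

    def __init__(self, oid):
        self.oid = oid
        self.obj = None     # in-memory object; None until it is fetched


class Workspace:
    def __init__(self):
        # Object table: OID (volume_id, page_id, slot_id) -> Mop
        self.object_table = {}

    def mop_for(self, oid):
        # The same OID always maps to the same MOP within one workspace.
        mop = self.object_table.get(oid)
        if mop is None:
            mop = Mop(oid)
            self.object_table[oid] = mop
        return mop


ws = Workspace()
first = ws.mop_for((0, 120, 3))
second = ws.mop_for((0, 120, 3))    # same MOP instance as `first`
```

Because every reference to one OID shares a single MOP, replacing or dropping the cached object only has to touch that one entry.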
3.1.4 Object Management Component
The Object Management Component consists of the modules in Figure 8.

Figure 8. Module Configuration of Object Management Component
- Representation Manager: Converts an object between its disk expression structure and its memory expression
structure. On disk, object data is laid out in a form suited to query execution; in memory, it has a structure that is
convenient for applications to access. The Representation Manager converts between these two formats and
also handles byte ordering during conversion.
- Schema Manager: Defines and changes classes. It creates and modifies columns, methods,
and classes and manages their inheritance.
- Object Access Manager: Creates, deletes, modifies, and checks objects and calls methods. It is closely related to the
Schema Manager.
- Dynamic Loader: Provides dynamic linking for applications that execute methods written in C.
- Trigger Manager: Implements the trigger feature with system objects. It is closely related to the Schema
Manager and the Object Access Manager.
- Authorization Manager: Checks the authority of a database user. It is implemented on top of the API provided by
the Object Access Manager.
- Data Type and Domain: Manipulates the internal data structures (representation formats) for data type and domain
information. It caches information about the domains in use in a connection list and has a domain conversion matrix.
3.1.5 Client-Server Communications
Client-Server Communications consists of the modules in Figure 9.
Figure 9. Module Configuration of Client-Server Communications
- Socket Manager: Manages communications among the client, the server, and the master process (cub_master). It
manages the procedure by which a client or server is connected through the master process.
- Packet Manager: Processes the packets used to exchange information between the client and the server. The packet
types are request, data, close, out-of-band, and error packets. Request
and data packets can be exchanged asynchronously by using queues on the client and server.
- Client-Server Interface: Provides the interface through which the rest of the system uses Client-Server
Communications. It handles exceptions that occur during communication as well as out-of-band events such as user
interrupts.

3.1.6 Thread Management Component
The Thread Management Component manages the threads in the server process and is implemented with pthreads.
It detects requests from clients by using the select() system call and assigns a task to a thread
for each request. A worker thread that processes client requests waits for a task in the Job Queue
and wakes up when a task arrives. After it processes the task, it waits for the next task in the Job Queue.
In addition to the worker threads, there are system threads that handle only specific system tasks:
- Deadlock detection thread: Checks for deadlocks at fixed intervals or when there is a lock request, and resolves
any deadlock it finds.
- Checkpoint thread: Performs checkpoints at fixed intervals, flushing data pages that contain committed changes
but are still cached in the page buffer and not yet reflected on disk. Periodic checkpoints reduce the
recovery time after a failure.
- OOB (out-of-band) thread: Receives OOB signals and passes them to the relevant thread.
- Page-flush thread: Periodically flushes the dirty pages in the page buffer to disk. This improves system performance
by reducing the number of dirty pages that must be flushed during page replacement.
- Log-flush thread: Flushes log pages to the log volume. Group commit and asynchronous commit are
provided through this thread.
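The worker-thread behavior described at the start of this section (wait on the Job Queue, wake up when a task arrives, process it, wait again) can be sketched with Python's standard `queue` and `threading` modules. The function name `run_jobs`, the `None` shutdown marker, and the result list are assumptions for this example; the server itself is implemented with pthreads.

```python
import queue
import threading


def run_jobs(jobs, num_workers=2):
    """Run callables on a small pool of worker threads fed by a job queue."""
    job_queue = queue.Queue()
    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            job = job_queue.get()      # blocks until a task arrives
            if job is None:            # shutdown marker (an assumption)
                break
            value = job()
            with results_lock:
                results.append(value)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    for job in jobs:
        job_queue.put(job)
    for _ in threads:
        job_queue.put(None)            # one marker per worker
    for thread in threads:
        thread.join()
    return results
```

Results arrive in completion order, not submission order, so a caller that needs ordering has to tag each job itself.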
3.1.7 Query Processing
Query Processing consists of the following modules:
- Scanner/Parser: Parses the queries (SQL) from users or applications and creates a parse tree.
- Semantic Checker: Performs node typing, name resolution, semantic checking, view translation, and so on.
- XASL Generator/Optimizer: Creates the XASL (eXtended Access Specification Language) tree, which is the query
execution plan, and performs query optimization by using schema information and database statistics. The XASL tree
includes scan information (heap scan, index scan, list file scan, set scan, and method scan), a value list (the values
required for the query results), and predicates. Query optimization uses both cost-based optimization and rewrite
optimization.
- Query Manager: A server module that executes the XASL tree received from the client. It consists of the Query File
Manager, which stores the query's XASL plan and its results, and the Query Evaluator, which evaluates queries and
creates a result list file. It interfaces with the Transaction Manager and the Recovery Manager to commit or
roll back a transaction.
- Cursor Manager: Fetches data from the list file created as the retrieval result.
3.2 Detailed Description for the Modules
3.2.1 Transaction Management Component
A. Object Locator
The Object Locator is a module that delivers object data between the workspace on a client and the page buffer pool on
the server. It provides concurrent access to and recovery of database objects by using the
Transaction Management Component's locking and recovery algorithms.
The Object Locator is divided into the Object Locator on the client, the Object Locator on the server, and the Object
Locator on the client/server. The Client Object Locator does its work by using the Workspace Manager, the
Representation Manager (Transformation Manager), and the Heap File Manager. The Authorization Manager, Schema
Manager, Object Access Manager, and Query Parser (Scanner/Parser) use the functions of the Client Object Locator.
The Server Object Locator does its work by using the Object Heap Manager, Representation Manager (Transformation
Manager), Lock Manager, Catalog Manager, and B+-tree Manager. The Client Object Locator uses the functions of the
Server Object Locator to fetch and flush objects.
The objects cached in a client's workspace by the Object Locator are kept coherent with the objects on
the server by means of a cache coherency number. If the cache coherency number of an object cached in the
workspace of a client differs from the cache coherency number of the same object in the page buffer (or
disk) of the server, the cached object is invalid (invalidation). The Server Object Locator increases the cache
coherency number of an object whenever the object is flushed from a client and sent to the server.
A cached object is validated when it is first used by a transaction. Because the lock is also
cached (set) when an object is cached, the validation remains effective for the duration of the
transaction. When a transaction requests an object, the Client Object Locator checks whether the object and its lock are
cached. If both are cached, the transaction uses the cached object in workspace memory, which is
much faster. Otherwise, the Client Object Locator sends a request to the Server Object Locator. The Server Object
Locator sets the requested lock on the object by using the Lock Manager. Once the lock is acquired, the cache
coherency number of the object in the workspace is compared with the cache coherency number of the object in the
server's database (page buffer or disk). If the two values differ, the new object data is sent from the
server to the client and replaces the old cached object.
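The validation flow above can be illustrated with a toy client and server that exchange cache coherency numbers (CHNs). Everything here is an assumption made for the sketch: the class names, the dictionary-based storage, and the omission of real locking (the `locked` set only records that a lock is cached for the current transaction).

```python
class Server:
    def __init__(self):
        self.objects = {}                     # oid -> (chn, data)

    def store(self, oid, data):
        # Each flushed update raises the object's cache coherency number.
        chn = self.objects.get(oid, (0, None))[0] + 1
        self.objects[oid] = (chn, data)

    def fetch(self, oid, cached_chn):
        # Lock acquisition through the Lock Manager is omitted in this sketch.
        chn, data = self.objects[oid]
        if cached_chn == chn:
            return None                       # client copy is still valid
        return (chn, data)


class Client:
    def __init__(self, server):
        self.server = server
        self.cache = {}                       # oid -> (chn, data)
        self.locked = set()                   # locks cached in this transaction

    def get(self, oid):
        if oid in self.cache and oid in self.locked:
            return self.cache[oid][1]         # fast path: object and lock cached
        cached_chn = self.cache[oid][0] if oid in self.cache else None
        fresh = self.server.fetch(oid, cached_chn)
        self.locked.add(oid)
        if fresh is not None:
            self.cache[oid] = fresh           # replace the stale copy
        return self.cache[oid][1]

    def end_transaction(self):
        self.locked.clear()                   # cached locks are released
```

In this model the object data crosses the network only when the CHNs differ; a matching CHN lets the client keep using its cached copy after re-acquiring the lock.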
When a transaction is terminated, the cached objects are flushed to the server. When a transaction is rolled back, the
objects are all de-cached. In addition, when a class object is invalidated (e.g., its schema is changed by a transaction of
another client), all the instance objects of the class are flushed/de-cached together. All the objects are also flushed
to the server together with query execution requests, because queries are executed on the server.
To reduce the amount of communication between a client and the server, the Object Locator sends flush data together
with object fetch request packets and, when caching objects, pre-fetches related class objects or other nearby objects.

At the request of the Client Object Locator, the Server Object Locator fetches objects from the database and writes
updates back to it by using the Heap File Manager. In addition, it manages lock setting by using the Lock Manager.
B. Transaction Manager
The Transaction Manager is the module that performs transaction start, commit, rollback, and so on. It
calls the Object Locator to flush the objects used in a transaction, the Lock Manager to release cached
locks, and the Log Manager (Recovery Manager) to commit or roll back the transaction.
The Transaction Manager is divided into a client part and a server part. When an application requests transaction
termination (commit or rollback), the Client Transaction Manager flushes the objects in the workspace that were
changed during the transaction to the page buffer of the server. (For a rollback request, the changed objects are
not flushed to the server. Instead, they are immediately removed from the workspace.) Next, the Client Transaction
Manager requests the commit or rollback from the Server Transaction Manager. For a commit, the Server Transaction
Manager calls the Log Manager (Recovery Manager), which executes the postpone actions on the database in the server
and the loose_end postpone actions in the client. After that, it releases all the acquired locks and closes all the open
cursors. For a rollback, the Log Manager (Recovery Manager) undoes the work performed by the transaction by using the
UNDO log and releases all the acquired locks. When a transaction is committed or rolled back, the locks
cached by the Client Transaction Manager are all released.
The Transaction Manager supports the 2PC (2-phase commit) protocol for global transactions.
C. Lock Manager
The Lock Manager is a module that manages locks according to the 2PL (2-phase locking) protocol and the granularity
locking protocol. The Lock Manager looks up transaction identifiers, calls the Log Manager (Recovery Manager)
to get the lock wait time of a transaction, and calls the Server Transaction Manager to roll back a transaction when
resolving a deadlock. The Server Object Locator uses the Lock Manager to acquire and release the lock on an object, and
the Log Manager uses the Lock Manager to release all locks at once.
When an instance object is accessed, locks must be set on the class object that defines the attributes of the
instance and also on the upper classes from which it inherits. When the schema of a class object is changed,
an exclusive (X) lock must be set on the class and its lower classes.
During query execution, the instances of a class and the instances of its lower classes are all searched. In addition,
because a class object is the domain that defines the corresponding instances, the domain class and its lower classes
are all accessed. Therefore, during query execution a shared lock is set on the class being searched and its lower
classes, and also on the domain class that defines an instance and its lower classes.
Deadlocks are detected with the WFG (Waits-For Graph) method. If a deadlock is found in the WFG, one of the involved
transactions is forcibly terminated by the system.
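In a waits-for graph, an edge from T1 to T2 means that T1 is waiting for a lock held by T2, and a deadlock corresponds to a cycle. The following sketch finds one such cycle with a depth-first search. The function name `find_deadlock` and the dictionary representation of the graph are assumptions; the text does not say how CUBRID stores the graph or selects the victim.

```python
def find_deadlock(waits_for):
    """Return the transactions on one cycle of the graph, or None.

    waits_for maps a transaction to the transactions it is waiting for.
    """
    WHITE, GRAY, BLACK = 0, 1, 2       # unvisited, on current path, finished
    color = {}
    path = []

    def visit(tx):
        color[tx] = GRAY
        path.append(tx)
        for other in waits_for.get(tx, ()):
            state = color.get(other, WHITE)
            if state == GRAY:          # back edge: a cycle closes here
                return path[path.index(other):]
            if state == WHITE:
                cycle = visit(other)
                if cycle:
                    return cycle
        path.pop()
        color[tx] = BLACK
        return None

    for tx in list(waits_for):
        if color.get(tx, WHITE) == WHITE:
            cycle = visit(tx)
            if cycle:
                return cycle
    return None
```

Any transaction on the returned cycle is a candidate victim; rolling one of them back removes its edges and breaks that cycle.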
The Lock Manager manages the Lock Table. The Lock Table is implemented as a hash table keyed by OID, and access to
the table is protected as a critical section to maintain consistency.
D. Recovery Manager
When a transaction, system, or media failure occurs, the Recovery Manager ensures that the effects of all committed
transactions are reflected in the database and that the effects of uncommitted transactions are not. To do this, the
Recovery Manager records a log and restores the database from these failures based on the log. The CUBRID Recovery
Manager uses an UNDO/REDO recovery protocol, which is based on the following rules:

- UNDO Rule: Record a data value before it is changed. This ensures that the last committed value is recorded in the
log before it is overwritten by a value that is not yet committed.
- REDO Rule: The values updated by a transaction must be recorded in the log before the transaction is committed.
That is, the new data values are logged before the commit.
A log is a file to which records of arbitrary length are appended. To implement a log file of unbounded length, recent
log data is recorded in an active log and older log data is archived into an archive log.
The UNDO/REDO logging is designed to maximize efficiency during normal operation rather than to minimize the
recovery time after a database system fault. Thanks to the logging protocol, flushing data pages during commit or
rollback can be avoided as much as possible. A data page is written to disk only when it is replaced by another page.
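The two rules can be illustrated with a toy recovery routine. This is a simplified sketch under stated assumptions, not CUBRID's recovery algorithm: the log record shapes (`("update", txn, key, before, after)` and `("commit", txn)`) are invented, every transaction without a commit record is treated as active at the time of the fault, and strict locking is assumed so that an uncommitted update is always the last update to its key.

```python
def recover(disk, log):
    """Bring `disk` (a dict of key -> value) to a consistent state using the log."""
    committed = {rec[1] for rec in log if rec[0] == "commit"}

    # REDO pass (forward): reapply after-images of committed transactions,
    # because their data pages may never have been flushed.
    for rec in log:
        if rec[0] == "update" and rec[1] in committed:
            _, _txn, key, _before, after = rec
            disk[key] = after

    # UNDO pass (backward): restore before-images of uncommitted transactions,
    # because their dirty pages may have reached disk on page replacement.
    for rec in reversed(log):
        if rec[0] == "update" and rec[1] not in committed:
            _, _txn, key, before, _after = rec
            disk[key] = before

    return disk
```

In a scenario where transaction 1 committed updates to `x` and `y` but only some pages were flushed, and transaction 2 changed `x` without committing, the routine restores `x` to the value written by transaction 1 and reapplies the lost update to `y`.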
3.2.2 Object Management Component
The Object Management Component defines tables, creates or modifies objects, and formats objects on disk or in
memory.
A. Representation Manager
This module converts an object between its disk expression structure and its memory expression structure. On disk,
object data has a structure suited to query execution; in memory, it has a structure that makes it easy for an
application to access. The Representation Manager converts between these two expression formats.
Figure 10. Disk Expression Format of an Object

The disk expression format of an object is shown in Figure 10. The class OID and Representation ID of an object
come first, and these are used to determine which format the object has. The CHN (Cache Coherency Number) that
follows is used to judge the validity of a cached object. In the disk expression format, the columns (attributes) are
divided into fixed length columns, where all values have the same length, such as integers, and variable length columns,
where values have different lengths, such as strings. The fixed length columns are saved at pre-defined
locations, and the location of each column is obtained from the information that is managed by the Catalog Manager.
The location of a variable length column is obtained from the variable length column offset table, which holds the
location of each variable length column. The last entry of the offset table indicates the end of the object. The offset
table is not saved for objects of a table that has no variable length columns.
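The layout described above can be sketched as follows. This is an illustration only: the field widths, the little-endian encoding, the position of the offset table directly after the header, and the restriction to 32-bit integer and UTF-8 string columns are assumptions for the example and are not CUBRID's actual byte layout.

```python
import struct

# Assumed header: class OID as (page, slot, volume), Representation ID, CHN.
HEADER = struct.Struct("<ihhii")


def pack_object(class_oid, repr_id, chn, fixed_ints, var_strings):
    header = HEADER.pack(*class_oid, repr_id, chn)
    fixed = struct.pack(f"<{len(fixed_ints)}i", *fixed_ints)
    blobs = [s.encode("utf-8") for s in var_strings]
    if not blobs:
        return header + fixed             # no offset table without variable columns
    table_size = 4 * (len(blobs) + 1)
    pos = HEADER.size + table_size + len(fixed)
    offsets = []
    for blob in blobs:
        offsets.append(pos)               # start of each variable length column
        pos += len(blob)
    offsets.append(pos)                   # last entry: end of the object
    table = struct.pack(f"<{len(offsets)}I", *offsets)
    return header + table + fixed + b"".join(blobs)


def unpack_object(record, n_fixed, n_var):
    """n_fixed and n_var would come from the catalog in a real system."""
    page, slot, volume, repr_id, chn = HEADER.unpack_from(record, 0)
    pos = HEADER.size
    offsets = ()
    if n_var:
        offsets = struct.unpack_from(f"<{n_var + 1}I", record, pos)
        pos += 4 * (n_var + 1)
    fixed = list(struct.unpack_from(f"<{n_fixed}i", record, pos))
    var = [record[offsets[i]:offsets[i + 1]].decode("utf-8") for i in range(n_var)]
    return (page, slot, volume), repr_id, chn, fixed, var
```

Because each variable length value ends where the next offset begins, the final entry is what lets the reader find the length of the last column.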
When an object is cached in memory, the MOP points to a memory block that holds the columns of the object. The
fixed length column values are saved contiguously in the object block, and the values of variable length columns are
saved in separately allocated memory blocks. The CHN is also included in the memory expression format.
The object locator compares this CHN value with the CHN value stored on disk to judge the validity of an object.
If the two CHN values differ, the object cached in memory is no longer valid. In that case, the object locator
de-caches the object and caches the new content of the object.
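The CHN comparison can be sketched as below. The class name, the dictionary standing in for disk storage, and the `reloads` counter are hypothetical and exist only to make the behavior observable; how the CHN is actually exchanged between client and server is not covered here.

```python
class ObjectLocator:
    """Keeps a workspace cache and revalidates entries by comparing CHNs."""

    def __init__(self, disk):
        self.disk = disk          # oid -> (chn, columns); stands in for disk storage
        self.workspace = {}       # oid -> (chn, columns); the memory cache
        self.reloads = 0          # how many times an object was (re)cached

    def fetch(self, oid):
        disk_chn, disk_columns = self.disk[oid]
        cached = self.workspace.get(oid)
        if cached is not None and cached[0] == disk_chn:
            return cached[1]      # CHNs match: the cached copy is still valid
        # Missing or stale: de-cache and cache the new content.
        self.workspace[oid] = (disk_chn, dict(disk_columns))
        self.reloads += 1
        return self.workspace[oid][1]
```

A second `fetch` of an unchanged object is served from the workspace, while an update that bumps the stored CHN forces the next `fetch` to reload it.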
Figure 11. Memory Expression Format of an Object
The Representation Manager uses the Workspace Manager to obtain storage space for the memory expression of
an object and uses the Schema Manager to determine the size and structure of an object.
When CUBRID changes a schema, it does not change the expression format of the records already stored under that
schema. Therefore, if an object saved in an old expression format is found during conversion between the two
expression formats, it is converted to the most recent expression format, using the schema information for both the
recent and the old expression formats. The conversion process also handles differences in hardware
architecture between the client machine and the server machine, e.g., differences in byte ordering.
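A rough sketch of both conversions follows. Describing a representation as a list of (column name, default) pairs, matching columns by name, and filling added columns with a default are assumptions made for the example; the source does not say how CUBRID fills columns that an old record lacks.

```python
import struct


def convert_to_latest(values, old_repr, new_repr):
    """Re-shape a record's values from an old representation to the newest one.

    old_repr / new_repr: lists of (column_name, default) pairs (assumed format).
    Columns dropped from the schema are discarded; added columns get the default.
    """
    by_name = dict(zip((name for name, _ in old_repr), values))
    return [by_name.get(name, default) for name, default in new_repr]


def int32_from_bytes(raw, source_is_big_endian):
    """Decode a 4-byte integer written by a machine with the given byte order."""
    fmt = ">i" if source_is_big_endian else "<i"
    return struct.unpack(fmt, raw)[0]
```

For example, a record stored with columns (id, name) and read under a newer schema (id, email) keeps its id, loses name, and receives the default for email.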
3.2.3 Query Processing

Figure 12. The Procedures of Query Compile in a Client
A. Scanner/Parser
The parser maintains the data structures used to create a parse tree during parsing, to hold the created parse tree,
and to manage multiple SQL statements, as well as information about the lexer.
B. Semantic Checker
If a parse tree is built without an error, the input query statement is syntactically correct. Semantic
checking then checks whether the semantics of the input statement are valid. It performs the following tasks:
1. Name resolution and parse tree node type checking
Checks whether an existing table or column is used and infers the type of a column.
2. Semantic checking
Checks whether an operation that is not supported between types is used.
3. View translation
Converts the definition statement of a view.

C. XASL Generator/Optimizer
The query statement input by a user goes through parsing and semantic checking, and is then converted into an
augmented parse tree annotated with catalog information. When query optimization is performed based on this
augmented parse tree, the XASL tree, i.e., the action plan, is created as a result. The XASL tree specifies the
optimal access sequence and access method for the tables to be accessed during query execution. It consists of
the action plans that have the lowest access path cost among the possible plans. From a parse tree and catalog
statistics information, an XASL tree is created as follows:
1. Classifying terms to configure search conditions in table units
A term is a search condition on one or more tables. When the term applies to one table, it is a scan term (sarg).
If it applies to two tables, it is a join term (edge). If it applies to three, it is an other term.
The terms specified in the WHERE clause of a parse tree are divided into join terms and scan terms, and the
scan terms are classified according to the table to which each term applies.
2. Determining the optimal access method for each table
For the scan terms that apply to a given table, calculate the selectivity of each scan term and select the search
method of the term whose selectivity is lowest as the table search method. That is, determine whether to use a
sequential scan or an index scan for the table. If an index scan is used, determine which index to use.
3. Calculating selectivity for each table
Calculate the selectivity of each table by using the selectivity of each scan term calculated in step 2.
4. Determining the access sequence among tables
To determine the access sequence among tables, enumerate the possible access sequences and calculate the access
path cost of each. Select the execution sequence whose access cost is lowest as the final execution plan.
5. Creating the XASL tree for the final execution plan
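Steps 1 to 4 can be sketched in a few functions. Everything below is a toy model under stated assumptions, not CUBRID's optimizer: the input formats are invented, any term that is not on exactly one or two tables is treated as an other term, table selectivity is taken as the product of its scan term selectivities (which assumes the terms are independent), and the cost of an access sequence is the sum of intermediate row counts with join terms ignored.

```python
from collections import defaultdict
from itertools import permutations
from math import prod


def classify_terms(terms):
    """Step 1. terms: list of (text, tables referenced by the term)."""
    sargs, edges, others = defaultdict(list), [], []
    for text, tables in terms:
        tables = frozenset(tables)
        if len(tables) == 1:
            (table,) = tables
            sargs[table].append(text)          # scan term, grouped per table
        elif len(tables) == 2:
            edges.append((text, tables))       # join term
        else:
            others.append((text, tables))      # other term (assumed: any other count)
    return dict(sargs), edges, others


def choose_access(scan_terms):
    """Steps 2 and 3. scan_terms: list of (selectivity, index name or None)."""
    if not scan_terms:
        return ("sequential scan", None), 1.0
    best = min(scan_terms, key=lambda term: term[0])   # lowest selectivity wins
    method = ("index scan", best[1]) if best[1] else ("sequential scan", None)
    table_selectivity = prod(sel for sel, _ in scan_terms)
    return method, table_selectivity


def best_access_sequence(row_counts, selectivity):
    """Step 4. Enumerate every table order and keep the cheapest one."""
    def cost(order):
        total, rows = 0.0, 1.0
        for table in order:
            rows *= row_counts[table] * selectivity[table]
            total += rows                      # toy cost: intermediate result sizes
        return total
    return min(permutations(sorted(row_counts)), key=cost)
```

Exhaustive enumeration is only workable for a handful of tables, since the number of sequences grows factorially; it is used here purely because the text describes listing the sequences and comparing their costs.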
D. Query Manager
This is a server module that executes an XASL tree received from a client. During query processing, a client sends
the XASL tree created by the XASL Generator/Optimizer module to a server. The query is executed when the server
receives and executes this XASL tree. Because it is undesirable, in terms of performance, to go through the XASL
Generator/Optimizer every time a query of the same pattern arrives, CUBRID saves the XASL tree in the Query
Plan Cache and reuses it. In addition, when the same query is executed repeatedly, CUBRID saves the query result in
the Query Cache and returns the result without executing the query the next time.
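The two caches can be sketched as below. The class name, the cache keys (query text for plans, query text plus parameter values for results), and the `generate_xasl` / `execute_xasl` callables are hypothetical stand-ins; invalidation of cached results when the underlying data changes is deliberately left out.

```python
class QueryManager:
    """Reuses plans per query pattern and results per (pattern, parameters)."""

    def __init__(self, generate_xasl, execute_xasl):
        self._generate_xasl = generate_xasl   # stand-in for the XASL Generator/Optimizer
        self._execute_xasl = execute_xasl     # stand-in for server-side execution
        self.plan_cache = {}                  # sql text -> plan
        self.result_cache = {}                # (sql text, params) -> result

    def run(self, sql, params=()):
        key = (sql, tuple(params))
        if key in self.result_cache:          # same query again: skip execution
            return self.result_cache[key]
        plan = self.plan_cache.get(sql)
        if plan is None:                      # first query of this pattern
            plan = self.plan_cache[sql] = self._generate_xasl(sql)
        result = self._execute_xasl(plan, params)
        self.result_cache[key] = result
        return result


# Usage with counting stubs, to show which stages are skipped.
calls = {"generate": 0, "execute": 0}

def generate_xasl(sql):
    calls["generate"] += 1
    return ("XASL", sql)

def execute_xasl(plan, params):
    calls["execute"] += 1
    return [("row", params)]

qm = QueryManager(generate_xasl, execute_xasl)
qm.run("SELECT name FROM t WHERE id = ?", (1,))   # generates a plan and executes
qm.run("SELECT name FROM t WHERE id = ?", (2,))   # reuses the plan, executes
qm.run("SELECT name FROM t WHERE id = ?", (1,))   # served from the result cache
```

After these three calls the plan has been generated once and executed twice, which mirrors the two levels of reuse described above.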

Figure 13. Query Execution on the Server
The procedure of query processing through these components is shown in Figure 14.

Figure 14. Query Execution Steps
