9

MySQL HeatWave : Scale Out Management

 3 years ago
source link: https://blogs.oracle.com/mysql/mysql-heatwave-%3a-scale-out-management
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
neoserver,ios ssh client

Introduction

MySQL is the world's most popular open source database because of its reliability, high-performance, and ease of use. MySQL has been designed and optimized for transaction processing and enterprises around the world rely on it. With the introduction of HeatWave in MySQL Database Service, customers now have a single database which is efficient for both transaction processing and analytics. It eliminates the need for ETL to a specialized analytic database and provides support for real-time analytics. HeatWave is built on an innovative, in-memory query engine which is architected for scalability and performance and is optimized for the cloud. MySQL HeatWave service is faster than other database services – Snowflake, Redshift, Aurora, Synapse, Big Query, Clickhouse - at a fraction of the cost.

In order to process data with MySQL HeatWave, data needs to be first inserted into the MySQL database where the data is persisted and then data needs to be read from the database into the memory of the HeatWave cluster. Loading data into the MySQL database typically occurs once, but data may need to be loaded from the database to the HeatWave cluster often for a variety of operations (Figure 1). This includes scheduled operations like database upgrade, stop and resume of a cluster; or it could be an unscheduled event like recovering from a node failure.

Figure 1. Loading data into MySQL database and HeatWave

Scale-out Data Management

MySQL InnoDB database stores data in row-based format, while HeatWave stores data in memory in a hybrid columnar format. Loading data from MySQL to HeatWave involves transforming data into HeatWave columnar format.

To increase service uptime, HeatWave introduced a new storage layer for optimization that is built on OCI Object Storage. The first time data is loaded from the MySQL database to the memory of the HeatWave cluster, a copy of the in-memory representation is made to the HeatWave storage layer. If there is a need to reload data into the HeatWave cluster, which could happen for a variety of reasons like recovering from a node failure, the data is read from the HeatWave storage layer instead of the MySQL database. Any changes which are made to the data in MySQL database are transparently propagated to the HeatWave storage layer.

Figure 2. HeatWave scale-out storage layer

Figure 2 shows the new HeatWave architecture with the scale-out storage layer. In the HeatWave storage layer, persisted data is organized in the same way as that of in-memory data.
Here are the advantages of this architecture:

1.      Since the data in the HeatWave storage is stored in the in-memory format, there is no transformation of data required when loading the data from the HeatWave storage into the memory of the HeatWave cluster. As a result data can be loaded at near object store / network bandwidth.

2.      Data in the HeatWave storage is stored in the OCI object store in multiple volumes. When data is loaded into the HeatWave cluster, it can be read by all the nodes of the HeatWave nodes in parallel. So if the data size is large, more HeatWave nodes can read the data. This scale out design, where the time it takes to load any amount of data is constant and is equal to the amount of time it takes of load one HeatWave node as shown in Figure 3.

Figure 3. Reloading data from HeatWave store is done in parallel

3.      Data can be read from the HeatWave storage in fine grained chunks – i.e. the entire table does not need to be read. For example, if one of the nodes in the HeatWave cluster needs to be loaded while recovering from a node failure, only the data for that node needs to be read from the HeatWave storage (Figure 4). This significantly cuts down on the network traffic and improves performance.

Figure 4. Data from HeatWave storage can be loaded selectively

350x Performance Improvement for Reloading Data

The time it takes to load data from the HeatWave storage is constant and data can be reloaded at near OCI object store bandwidth. In the larger OCI regions, the time to load data is about 4 min. The chart below compares the performance of reloading time from MySQL InnoDB versus when data is reloaded from HeatWave storage. For a HeatWave cluster with 64 nodes with 32TB of data, the time reduces from 24 hours down to 4 min – a 350x improvement. 

Figure 5. Improved performance and availability with HeatWave storage 

This reduction in time to reload data into the HeatWave cluster, improves the availability of the HeatWave cluster.

Auto Change Propagation

Data updated in MySQL is propagated and persisted to the HeatWave data layer as change logs. During data reload, HeatWave first restores data from the base data, then applies the data from the change logs. Over time, the persisted changelog volume increases, which can result in an increased reload time as all the change logs need to be applied to the base data. So, the change logs are consolidated from time-to-time to alleviate increased reload latency as shown in Figure 6. However, determining when to consolidate is not an easy task, which depends on several factors such as transaction rate, system load, failure probability.

Figure 6. Auto change propagation

To minimize consolidation time during reloading from the storage layer, auto change propagation uses a data driven mechanism to determine the best change propagation interval and choice. Auto change propagation analyzes rate of changes, incoming DMLs, object storage resources, and previously seen change activity.  As a result, the changes are propagated at the best time interval, which results in optimized consolidation time for critical system operations.

Conclusion

MySQL HeatWave is the only MySQL based database which provides efficient support for transaction processing, analytics and mixed workloads. As a result, customers are moving larger data sets to the MySQL database service. MySQL HeatWave scale out storage provides an efficient mechanism to reload data into the HeatWave cluster and can reduce the load time by 350x. This reduction in the load time translates to improved availability of the HeatWave cluster. MySQL Autopilot ensures that the data in the HeatWave storage is kept in sync with the MySQL database.

Additional References:

Watch the MySQL HeatWave Scale Out Management webinar

Learn more about MySQL Database Service

Learn more about MySQL HeatWave

Try it free today!


About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK