Jean-Jerome Schmidt: How to Bootstrap MySQL/MariaDB Galera Cluster (26.1.2015, 18:55 UTC)
January 27, 2015
By Severalnines

Unlike a standard MySQL server or MySQL Cluster, the way to start a MySQL/MariaDB Galera Cluster is a bit different. Galera requires you to start one node in the cluster as a reference point before the remaining nodes are able to join and form the cluster. This process is known as cluster bootstrap. Bootstrapping is the initial step that introduces one database node as the Primary Component; the others then use it as the reference point to sync up their data.

 

How does it work?

 

When Galera is started with the bootstrap command on a node, that particular node will reach the Primary state (check the value of wsrep_cluster_status). The remaining nodes just require a normal start command; they will automatically look for the existing Primary Component (PC) in the cluster and join it to form the cluster. Data synchronization then happens through either an incremental state transfer (IST) or a state snapshot transfer (SST) between the joiner and the donor.
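A quick way to verify this is to check the wsrep status counters on each node (credentials are illustrative):

$ mysql -u root -p -e "SHOW STATUS LIKE 'wsrep_cluster_%'"

On the bootstrapped node, wsrep_cluster_status should report Primary, and wsrep_cluster_size should grow as the other nodes join.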

 

So basically, you should only bootstrap the cluster if you want to start a new cluster or when no other node in the cluster is in the Primary state. Care should be taken when choosing the action to take, or else you might end up with a split cluster or loss of data.

 

The following example scenarios illustrate when to bootstrap the cluster:

 

How to start a Galera cluster?

 

The three Galera vendors use different bootstrap commands (based on the latest version of each software). On the first node, run:

  • Codership:
    $ service mysql bootstrap
    
  • Percona XtraDB Cluster:
    $ service mysql bootstrap-pxc
    
  • MariaDB Galera Cluster:
    $ service mysql bootstrap
    

 

The above command is just a wrapper; what it actually does is start the MySQL instance on that node with gcomm:// as the wsrep_cluster_address value. You can also manually define the variables inside my.cnf and run the standard start/restart command. However, do not forget to change wsrep_cluster_address back so it contains the addresses of all nodes after the start.
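As a rough sketch (the IP addresses below are illustrative, not from this post), the relevant my.cnf lines would look like this:

# /etc/my.cnf excerpt while bootstrapping the first node
wsrep_cluster_address=gcomm://

# once the cluster is formed, set it back to the full member list
wsrep_cluster_address=gcomm://192.168.1.11,192.168.1.12,192.168.1.13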

 

read more

Link
Peter Zaitsev: Percona Toolkit 2.2.13 is now available (26.1.2015, 16:25 UTC)

Percona is pleased to announce the availability of Percona Toolkit 2.2.13, released January 26, 2015. Percona Toolkit is a collection of advanced command-line tools to perform a variety of MySQL server and system tasks that are too difficult or complex for DBAs to perform manually. Percona Toolkit, like all Percona software, is free and open source.

This release is the current GA (Generally Available) stable release in the 2.2 series. It includes multiple bug fixes for pt-table-checksum with better support for Percona XtraDB Cluster, various other fixes, as well as continued preparation for MySQL 5.7 compatibility. Full details are below. Downloads are available here and from the Percona Software Repositories.

New Features:

  • pt-kill now supports a new --query-id option. This option can be used to print the query fingerprint hash after killing a query, to enable cross-referencing with pt-query-digest output. It can be used along with the --print option as well.
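For example, a killer that also prints the fingerprint hash of every query it kills might be invoked roughly like this (the threshold and DSN are illustrative):

$ pt-kill --busy-time 60 --kill --print --query-id h=localhost

The printed hash can then be cross-referenced with the query IDs in a pt-query-digest report.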

Bugs Fixed:

  • Fixed bug 1408375: Percona Toolkit was vulnerable to a MITM attack which could allow exfiltration of MySQL configuration information via the --version-check option. This vulnerability was logged as CVE-2015-1027.
  • Fixed bug 1019479: pt-table-checksum now works with the ONLY_FULL_GROUP_BY SQL mode.
  • Fixed bug 1394934: running pt-table-checksum in debug mode would cause an error.
  • Fixed bug 1396868: a regression introduced in Percona Toolkit 2.2.12 caused pt-online-schema-change not to honor the --ask-pass option.
  • Fixed bug 1399789: pt-table-checksum would fail to find Percona XtraDB Cluster nodes when the wsrep_node_incoming_address variable was set to AUTO.
  • Fixed bug 1321297: pt-table-checksum was reporting differences on timestamp columns when replicating from a 5.5 to a 5.6 server, although the data was identical.
  • Fixed bug 1388870: pt-table-checksum was showing differences if the master and slave were in different time zones.
  • Fixed bug 1402668: pt-mysql-summary would exit if Percona XtraDB Cluster was in Donor/Desynced state.
  • Fixed bug 1266869: pt-stalk would fail to start if the $HOME environment variable was not set.

Details of the release can be found in the release notes and the 2.2.13 milestone at Launchpad. Bugs can be reported on the Percona Toolkit launchpad bug tracker.

The post Percona Toolkit 2.2.13 is now available appeared first on MySQL Performance Blog.

Link
Shlomi Noach: Reading RBR binary logs with pt-query-digest (26.1.2015, 15:50 UTC)

For purposes of auditing everything that goes on on our servers, we're looking to parse the binary logs of all servers (masters), as with "Anemomaster". With Row Based Replication this is problematic, since pt-query-digest does not support parsing RBR binary logs (true as of 2.2.12, the latest version at this time).

I've written a simple script that translates RBR logs to SBR-like logs, with a little bit of cheating. My only interest is that pt-query-digest is able to capture and count the queries, nothing else. By doing some minimal text manipulation on the binary log I'm now able to feed it to pt-query-digest, which seems to be happy.

The script of course does not parse the binary log directly; furthermore, it requires the binary log to be extracted via:

mysqlbinlog --verbose --base64-output=DECODE-ROWS your-mysql-binlog-filename.000001

The above adds the interpretation of the RBR entries in the form of (unconventional) statements, commented out, and strips out the cryptic RBR text. All that is left is to do a little manipulation on the entry headers and uncomment the interpreted queries.

The script can be found in my gist repositories. Current version is as follows:

#!/usr/bin/python
#
# Convert a Row-Based-Replication binary log to Statement-Based-Replication format, cheating a little.
# This script exists since Percona Toolkit's pt-query-digest cannot digest RBR format. The script
# generates enough for it to work with.
# Expecting standard input
# Expected input is the output of "mysqlbinlog --verbose --base64-output=DECODE-ROWS <binlog_file_name>"
# For example:
# $ mysqlbinlog --verbose --base64-output=DECODE-ROWS mysql-bin.000006 | python binlog-rbr-to-sbr.py | pt-query-digest --type=binlog --order-by Query_time:cnt --group-by fingerprint
#

import fileinput

def convert_rbr_to_pseudo_sbr():
    inside_rbr_statement = False
    for line in fileinput.input():
        line = line.strip()
        if line.startswith("#") and "end_log_pos" in line:
            for rbr_token in ["Update_rows:", "Write_rows:", "Delete_rows:", "Rows_query:", "Table_map:",]:
                if rbr_token in line:
                    line = "%s%s" % (line.split(rbr_token)[0], "Query\tthread_id=1\texec_time=0\terror_code=0")
        if line.startswith("### "):
            inside_rbr_statement = True
            # The "### " commented rows are the pseudo-statement interpreted by mysqlbinlog's "--verbose",
            # and which we will feed into pt-query-digest
            line = line.split(" ", 1)[1].strip()
        else:
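            # we just left a block of "### " pseudo-statement lines: terminate it the
            # same way mysqlbinlog terminates statements, so pt-query-digest sees a complete query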
            if inside_rbr_statement:
                print("/*!*/;")
            inside_rbr_statement = False
        print(line) 

convert_rbr_to_pseudo_sbr()


Link
Peter Zaitsev: MySQL benchmarks on eXFlash DIMMs (26.1.2015, 13:00 UTC)

In this blog post, we will discuss MySQL performance on eXFlash DIMMs. Earlier we measured the IO performance of these storage devices with sysbench fileio.

Environment

The benchmarking environment was the same as the one we used for the sysbench fileio tests.

CPU: 2x Intel Xeon E5-2690 (hyper threading enabled)
FusionIO driver version: 3.2.6 build 1212
Operating system: CentOS 6.5
Kernel version: 2.6.32-431.el6.x86_64

In this case, we used a separate client machine, connected to this server over 10 Gigabit Ethernet, to run sysbench. The client was not the bottleneck in this case. The environment is described in greater detail at the end of the blog post.

Sysbench OLTP write workload

[Graph: sysbench OLTP throughput on eXFlash DIMMs]

The graph shows throughput for sysbench OLTP. We will examine properties only for the dark areas of this graph, which represent the read/write case at high concurrency.
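For reference, a sysbench OLTP read/write run of this kind is typically started roughly like this; the parameters below are illustrative and not necessarily the exact ones used in this benchmark:

$ sysbench --test=oltp --mysql-host=10.0.0.10 --mysql-user=sbtest --mysql-password=sbtest \
    --mysql-db=sbtest --oltp-table-size=10000000 --oltp-read-only=off \
    --num-threads=128 --max-time=300 --max-requests=0 run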

Each table in the following sections has the following columns:

column explanation
storage The device that was used for the measurement.
threads The number of sysbench client threads used in the benchmark.
ro_rw Read-only or read-write. In the whitepaper you can find detailed information about read-only data as well.
sd The standard deviation of the metric in question.
mean The mean of the metric in question.
95thpct The 95th percentile of the metric in question (the maximum without the highest 5 percent of the samples).
max The maximum of the metric in question.

Sysbench OLTP throughput

storage threads ro_rw sd mean 95thpct max
eXFlash DIMM_4 128 rw 714.09605 5996.5105 7172.0725 7674.87
eXFlash DIMM_4 256 rw 470.95410 6162.4271 6673.0205 7467.99
eXFlash DIMM_8 128 rw 195.57857 7140.5038 7493.4780 7723.13
eXFlash DIMM_8 256 rw 173.51373 6498.1460 6736.1710 7490.95
fio 128 rw 588.14282 1855.4304 2280.2780 7179.95
fio 256 rw 599.88510 2187.5271 2584.1995 7467.13

Going from 4 to 8 eXFlash DIMMs mostly means more consistent throughput. The mean throughput is significantly higher when 8 DIMMs are used, but the 95th percentile and maximum values are not much different (the difference in standard deviation also shows this). The reason they are not much different is that these benchmarks are CPU-bound (check the CPU idle time table later in this post or the graphs in the whitepaper). The PCI-E flash drive, on the other hand, can do less than half the throughput of the eXFlash DIMMs (the most relevant comparison is the 95th percentile value).

Sysbench OLTP response time

storage threads ro_rw sd mean 95thpct max
eXFlash DIMM_4 128 rw 4.4187784 37.931489 44.2600 64.54
eXFlash DIMM_4 256 rw 9.6642741 90.789317 109.0450 176.45
eXFlash DIMM_8 128 rw 2.1004085 28.796017 32.1600 67.10
eXFlash DIMM_8 256 rw 5.5932572 94.060628 101.6300 121.92
fio 128 rw 51.2343587 138.052150 203.1160 766.11
fio 256 rw 72.9901355 304.851844 392.7660 862.00

Truncated by Planet PHP, read more at the original (another 1999 bytes)

Link
Oli Sennhauser: MySQL table Point-in-Time-Recovery from mysqldump backup (25.1.2015, 18:42 UTC)

Sometimes we face the situation where we have a full MySQL database backup done with mysqldump and then have to restore and recover just one single table out of our huge mysqldump file.
Furthermore, our mysqldump backup was taken hours ago, so we want to recover all the changes on that table from the time the backup was taken up to a certain point in time.

In this blog article we cover all the steps needed to achieve this goal for MySQL and MariaDB.

Recommendation: It is recommended to do these steps on a testing system and then dump and restore your table back to the production system. If you do it directly on your production system you have to know exactly what you are doing...
Furthermore, this process should be tested carefully and regularly so that you are familiar with it and can be sure your backup/restore/recovery procedure works properly.

The table we want to recover is called test.test and comes from our backup full_dump.sql.gz. As a first step we have to restore it to our test database with the following command:

shell> zcat full_dump.sql.gz | extract_table.py --database=test --table=test | mysql -u root

The script extract_table.py is part of the FromDual Recovery Manager and extracts one single table from a mysqldump backup.

As a next step we have to extract from our dump the binary log file and the position where the recovery should start:

shell> zcat full_dump.sql.gz | head -n 25 | grep CHANGE
CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000216', MASTER_LOG_POS=1300976;

Then we have to find out where we want to stop our Point-in-Time-Recovery. The need for recovery is possibly due to a TRUNCATE TABLE command or a similar operation executed on the wrong system, or it is just a point in time somebody has asked us to recover to. The position to stop at can be found as follows:

shell> mysqlbinlog -v mysql-bin.000216 | grep -B5 TRUNCATE --color
#150123 19:53:14 server id 35622  end_log_pos 1302950 CRC32 0x24471494  Xid = 3803
COMMIT/*!*/;
# at 1302950
#150123 19:53:14 server id 35622  end_log_pos 1303036 CRC32 0xf9ac63a6  Query   thread_id=54    exec_time=0       error_code=0
SET TIMESTAMP=1422039194/*!*/;
TRUNCATE TABLE test

And as a last step we have to apply all the changes from the binary log to our testing database:

shell> mysqlbinlog --disable-log-bin --database=test --start-position=1300976 --stop-position=1302950 mysql-bin.000216 | mysql -u root --force

Now the table test.test is recovered to the desired point in time and we can dump it and restore it to its final location back on the production database.

shell> mysqldump --user=root --host=testing test test | mysql --user=root --host=production test

This process has been tested on MySQL 5.1.73, 5.5.38, 5.6.22 and 5.7.5 and also on MariaDB 10.0.10 and 10.1.0.

Link
Peter Zaitsev: Using Percona Cloud Tools to solve real-world MySQL problems (23.1.2015, 19:49 UTC)

For months when speaking with customers I have been positioning Percona Cloud Tools (PCT) as a valuable tool for the DBA/Developer/SysAdmin but only recently have I truly been able to harness the data and make a technical recommendation to a customer that I feel would have been very difficult to accomplish otherwise.

Let me provide some background: I was tasked with performing a Performance Audit for one of our customers (Performance Audits are extremely popular as they allow you to have a MySQL Expert confirm or reveal challenges within your MySQL environment and make your database run faster!) and as part of our conversation we discussed and agreed to install Percona Cloud Tools. We let the site run for a few days, and then I started my audit. What I noticed was that at regular intervals there was often a CPU spike, along with a corresponding drop in Queries Per Second (QPS), but that lasted only for a few seconds. We decided that further investigation was warranted as the customer was concerned the spikes impacted their users’ experience with the application.

Here are the tasks that Percona Cloud Tools made easy while I worked to identify the source of the CPU spike and QPS drop:

  1. The per-second granularity data capture of PCT allowed me to identify how significant the spike and QPS drop actually were. If I had been looking at 1-minute or higher average values (such as Cacti would provide) I probably wouldn’t have been able to detect the spike or stall as clearly in the first place; it would have been lost in the average. In the case of PCT the current graphs group at the 1-minute range, but you have the ability to view the min and max values within this 1-minute range, since they are the true highest and lowest observed 1-second intervals during the 1-minute group.
  2. The ability of all graphs to maintain the same time resolution allowed me to zero in on the problematic time period and then quickly look across all graphs for corresponding deflections. This analysis led me to discover a significant spike in InnoDB disk reads.
  3. The ability to use the Query Analytics functionality to zoom in again on the problematic query. Adjusting Query Analytics to an appropriate time period narrowed down the range of unique queries that could be considered the cause. This task in my opinion is the best part of using PCT.
  4. Query Analytics allowed me to view the Rows Examined in Total for each query based on just this shortened interval. I then tagged those that had higher than 10k Rows Examined (arbitrary but most queries for this customer seemed to fall below this) so that I could then review in real-time with the customer before making a decision on what to do next. We can only view this sort of information by leveraging the slow query log – this data is not available via Performance_Schema or via network sniffing.

Once we were able to identify the problematic queries, the rest was routine query optimization: 10 minutes of work using Percona Cloud Tools for what might have been an hour using traditional methods!

For those of you wondering how else this can be done: assuming you detected the CPU spike / QPS drop (perhaps you are using Graphite or another tool that can deliver per-second resolution), you’d also need to be capturing the slow query log at a good enough resolution level (I prefer long_query_time=0 to just get it all), and then be adept at leveraging pt-query-digest with the --since and --until options to narrow down your range of queries. The significant drawback to this approach is that each time you want to tweak your time range you probably need to stream through a fairly large slow log file multiple times, which can be both CPU and disk intensive, so it can take some time (probably minutes, maybe hours) depending on the size of your log file. Certainly a workable approach, but nowhere near as quick as reloading a page in your browser :)
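For example, a sketch of that approach (the log path and times are illustrative):

$ pt-query-digest --since '2015-01-12 21:40:00' --until '2015-01-12 21:41:00' \
    /var/lib/mysql/slow.log > spike-digest.txt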

So what are you waiting for? Start using Percona Cloud Tools today, it’s free! Register for the free beta here.

The post Using Percona Cloud Tools to solve real-world MySQL problems appeared first on MySQL Performance Blog.

Link
Chris Calender: One More Possible Reason for ERROR 1959 (OP000) Invalid role specification (21.1.2015, 21:25 UTC)

I ran into this error today while working with roles, which bundle privileges for a group of users (a nice addition in MariaDB 10.0):

ERROR 1959 (OP000) Invalid role specification

The manual suggests that this error occurs “if a role that doesn’t exist, or to which the user has not been assigned, is specified”.

However, I ran into a case where the role was created and assigned properly (verify with SHOW GRANTS for the user in question).
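"Created and assigned properly" means roughly that a sequence like the following succeeded (the role, database and user names here are illustrative):

$ mysql -u root -p -e "CREATE ROLE app_read"
$ mysql -u root -p -e "GRANT SELECT ON app.* TO app_read"
$ mysql -u root -p -e "GRANT app_read TO 'bob'@'%'"
$ mysql -u bob -p -e "SET ROLE app_read; SELECT CURRENT_ROLE()"

SET ROLE is a typical statement that raises ERROR 1959 when the role specification is considered invalid.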

In this case, the problem was just due to a bug that affected remote connections on older versions.

Thus if you encounter this error, check if you’re using a remote connection, and if so, this bug is your problem (assuming you created and assigned the role properly).

If interested in reading more about roles, please refer to:

https://mariadb.com/kb/en/mariadb/documentation/managing-mariadb/user-account-management/roles/

Hope this helps.

 

Link
Peter Zaitsev: Importing big tables with large indexes with Myloader MySQL tool (21.1.2015, 14:00 UTC)

Mydumper is known as the faster (much faster) mysqldump alternative. So, if you take a logical backup you will choose Mydumper instead of mysqldump. But what about the restore? Well, who needs to restore a logical backup? It takes ages! Even with Myloader. But this could change just a bit if we are able to take advantage of Fast Index Creation.

As you probably know, Mydumper and mysqldump export the structure of a table, with all the indexes and the constraints, and of course, the data. Then, Myloader and MySQL import the structure of the table and import the data. The most important difference is that you can configure Myloader to import the data using a certain number of threads. The import steps are:

  1. Create the complete structure of the table
  2. Import the data

When you execute Myloader, internally it first creates the tables by executing the "-schema.sql" files and then takes all the filenames without "-schema.sql" and puts them in a task queue. Every thread takes a filename from the queue, which actually is a chunk of the table, and executes it. When it finishes, it takes another chunk from the queue, but if the queue is empty it just ends.
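As a quick illustration (the path and credentials are illustrative), a typical multi-threaded restore with Myloader looks like this:

$ myloader --directory=/backups/mydumper-20150120 --threads=8 \
    --user=root --password=secret --database=mydb --overwrite-tables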

This import procedure works fast for small tables, but with big tables with large indexes the inserts get slower and slower because of the overhead of inserting the new values into the secondary indexes. Another way to import the data is:

  1. Split the table structure into table creation with the primary key, index creation and constraint creation
  2. Create the tables with the primary key
  3. Per table do:
    1. Load the data
    2. Create the indexes
  4. Create the constraints

This import procedure is implemented in a branch of Myloader that can be downloaded from here, or obtained directly by executing bzr against the repository:

bzr branch lp:~david-ducos/mydumper/mydumper

The tool reads the schema files and splits them into three separate statements: one that creates the table with the primary key, one for the indexes and one for the constraints. The primary key is kept in the table creation statement in order to avoid recreating the table when a primary key is added later; the "KEY" and "CONSTRAINT" lines are removed from it and added to the index and constraint statements, respectively.
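To make the split concrete, here is a rough sketch of what it amounts to for a single (illustrative) table; the branch derives this automatically from the -schema.sql files:

$ # 1. table creation keeps only the primary key
$ mysql mydb -e "CREATE TABLE t (id INT NOT NULL, a INT, b INT, PRIMARY KEY (id))"
$ # 2. the worker threads load the data chunks
$ # 3. indexes are added afterwards, benefiting from fast index creation
$ mysql mydb -e "ALTER TABLE t ADD KEY idx_a (a)"
$ # 4. constraints are added last
$ mysql mydb -e "ALTER TABLE t ADD CONSTRAINT fk_b FOREIGN KEY (b) REFERENCES parent (id)"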

It processes tables according to their size, starting with the largest, because creating the indexes of a big table could take hours and is single-threaded. While we cannot parallelize a single index build, we are potentially able to create other tables with the remaining threads.

It has a new thread (monitor_process) that decides which chunk of data will be put in the task queue and a communication queue which is used by the task processes to tell the monitor_process which chunk has been completed.

I ran multiple imports of one table on an AWS m1.xlarge machine, comparing Myloader and this branch, and I found that with large indexes the times were:

[Graph: import time by number of rows, Myloader vs. the new branch]

As you can see, when you have less than 150M rows, importing the data and then creating the indexes takes longer than importing the table with the indexes all at once. But everything changes after 150M rows: importing 200M rows takes 64 minutes more for Myloader, but just 24 minutes more for the new branch.

On a table of 200M rows with an integer primary key and 9 integer columns, you can see how the time increases as the indexes get larger:

[Graph: import time by index size on the 200M-row table]

Where:

2-2-0: two 1-column and two 2-column indexes
2-2-1: two 1-column, two 2-column and one 3-column index
2-3-1: two 1-column, three 2-column and one 3-column index
2-3-2: two 1-column, three 2-column and two 3-column indexes

Conclusion

This branch can only import all the tables with this same strategy, but with this new logic in Myloader, a future version could import each table with the best strategy for it, reducing the time of the restore considerably.

The post Importing big tables with large indexes with Myloader MySQL tool appeared first on MySQL Performance Blog.

Link
Peter Zaitsev: Identifying useful info from MySQL row-based binary logs (20.1.2015, 15:34 UTC)

As a MySQL DBA/consultant, it is part of my job to decode the MySQL binary logs – and there are a number of reasons for doing that. In this post, I’ll explain how you can get the important information about your write workload using MySQL row-based binary logs and a simple awk script.

First, it is important to understand that row-based binary logs contain the actual changes done by a query. For example, if I run a delete query against a table, the binary log will contain the rows that were deleted. MySQL provides the mysqlbinlog utility to decode the events stored in MySQL binary logs. You can read more about mysqlbinlog in detail in the reference manual here.

The following example illustrates how mysqlbinlog displays row events that specify data modifications. These correspond to events with the WRITE_ROWS_EVENT, UPDATE_ROWS_EVENT, and DELETE_ROWS_EVENT type codes.

We will use the following options of mysqlbinlog:
--base64-output=decode-rows
--verbose, -v
--start-datetime="datetime"
--stop-datetime="datetime"

We have a server running with row based binary logging.

mysql> show variables like '%binlog_format%';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| binlog_format | ROW   |
+---------------+-------+
1 row in set (0.00 sec)

We created a test table and executed the following sequence of statements under a transaction.

use test;
CREATE TABLE t
(
  id   INT NOT NULL,
  name VARCHAR(20) NOT NULL,
  date DATE NULL
) ENGINE = InnoDB;
START TRANSACTION;
INSERT INTO t VALUES(1, 'apple', NULL);
UPDATE t SET name = 'pear', date = '2009-01-01' WHERE id = 1;
DELETE FROM t WHERE id = 1;
COMMIT;

Now let’s see how it is represented in binary logs.

# mysqlbinlog --base64-output=decode-rows -vv  --start-datetime="2015-01-12 21:40:00"  --stop-datetime="2015-01-12 21:45:00"  mysqld-bin.000023
/*!*/;
# at 295
#150112 21:40:14 server id 1  end_log_pos 367 CRC32 0x19ab4f0f  Query   thread_id=108   exec_time=0     error_code=0
SET TIMESTAMP=1421079014/*!*/;
BEGIN
/*!*/;
# at 367
#150112 21:40:14 server id 1  end_log_pos 415 CRC32 0x6b1f2240  Table_map: `test`.`t` mapped to number 251
# at 415
#150112 21:40:14 server id 1  end_log_pos 461 CRC32 0x7725d174  Write_rows: table id 251 flags: STMT_END_F
### INSERT INTO `test`.`t`
### SET
###   @1=1 /* INT meta=0 nullable=0 is_null=0 */
###   @2='apple' /* VARSTRING(20) meta=20 nullable=0 is_null=0 */
###   @3=NULL /* VARSTRING(20) meta=0 nullable=1 is_null=1 */
# at 461
#150112 21:40:14 server id 1  end_log_pos 509 CRC32 0x7e44d741  Table_map: `test`.`t` mapped to number 251
# at 509
#150112 21:40:14 server id 1  end_log_pos 569 CRC32 0x0cd1363a  Update_rows: table id 251 flags: STMT_END_F
### UPDATE `test`.`t`
### WHERE
###   @1=1 /* INT meta=0 nullable=0 is_null=0 */
###   @2='apple' /* VARSTRING(20) meta=20 nullable=0 is_null=0 */
###   @3=NULL /* VARSTRING(20) meta=0 nullable=1 is_null=1 */
### SET
###   @1=1 /* INT meta=0 nullable=0 is_null=0 */
###   @2='pear' /* VARSTRING(20) meta=20 nullable=0 is_null=0 */
###   @3='2009:01:01' /* DATE meta=0 nullable=1 is_null=0 */
# at 569
#150112 21:40:14 server id 1  end_log_pos 617 CRC32 0xf134ad89  Table_map: `test`.`t` mapped to number 251
# at 617
#150112 21:40:14 server id 1  end_log_pos 665 CRC32 0x87047106  Delete_rows: table id 251 flags: STMT_END_F
### DELETE FROM `test`.`t`
### WHERE
###   @1=1 /* INT meta=0 nullable=0 is_null=0 */
###   @2='pear' /* VARSTRING(20) meta=20 nullable=0 is_null=0 */
###   @3='2009:01:01' /* DATE meta=0 nullable=1 is_null=0 */
# at 665
#150112 21:40:15 server id 1  end_log_pos 696 CRC32 0x85ffc9ff  Xid = 465
COMMIT/*!*/;
DELIMITER ;
# End of log file
ROLLBACK /* added by mysqlbinlog */;
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;
  • The row with "Table_map: test.t" defines the table name used by the query.
  • Rows with "Write_rows/Update_rows/Delete_rows" define the event type.
  • Lines that start with "###" show the actual rows that were changed.
  • Columns are represented as @1, @2 and so on.
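As a rough sketch (this is not the author's script below, just an illustration of the idea), the row events in this decoded output can be counted per table with a few awk rules:

$ mysqlbinlog --base64-output=decode-rows -vv mysqld-bin.000023 | awk '
    /^### INSERT INTO/ { ins[$4]++ }    # table name is the 4th field
    /^### UPDATE/      { upd[$3]++ }    # table name is the 3rd field
    /^### DELETE FROM/ { del[$4]++ }
    END {
      printf "%-20s %8s %8s %8s\n", "table", "insert", "update", "delete"
      for (t in ins) seen[t]=1; for (t in upd) seen[t]=1; for (t in del) seen[t]=1
      for (t in seen) printf "%-20s %8d %8d %8d\n", t, ins[t], upd[t], del[t]
    }'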

Now have a look at our simple awk script that will use the mysqlbinlog output and print a beautiful summary for INSERT/UPDATE/DELETE events from row-based binary logs. Just replace "mysqld-bin.000023" with your binary log file. The string "#15" in the third line is for year 2015. If you are decoding a binary log file from 2014, just change it to "#14". It is also recommended to use the "--start-datetime" and "--stop-datetime" options to decode

Truncated by Planet PHP, read more at the original (another 10997 bytes)

Link
Jean-Jerome Schmidt: How to Deploy High Availability CloudStack with MariaDB Galera Cluster (20.1.2015, 13:24 UTC)
January 20, 2015
By Severalnines

Apache CloudStack is a turnkey orchestration platform to deliver IaaS clouds. As a cloud infrastructure scales to hundreds or thousands of servers, with production environments supporting multiple applications and services, high availability becomes a key requirement. Just as for OpenStack, it is possible to deploy the CloudStack management server in a multi-node installation. Since the management server uses a MySQL database to store the state of all its objects, the database could become a single point of failure. The CloudStack manual recommends MySQL replication with manual failover in the event of database loss. We believe there is a better way.

In this blog post, we’ll show you how to deploy redundant CloudStack management servers with MariaDB Galera Cluster on CentOS 6.5 64bit. We will have two load balancer nodes fronting the management servers and the database servers. Since CloudStack relies on MySQL’s GET_LOCK and RELEASE_LOCK, which are not supported by Galera, we will redirect all database requests to only one MariaDB node and automatically fail over to another node in case the former goes down. So, we’re effectively getting the HA benefits of Galera clustering (auto-failover, full consistency between DB nodes, no slave lag), while avoiding the Galera limitations as we’re not concurrently accessing all the nodes. We will deploy a two-node Galera Cluster (plus an arbitrator on a separate ClusterControl node).
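A minimal sketch of that single-writer HAProxy setup (the IP addresses, port and names below are illustrative, not the exact configuration from the full post) sends all MySQL traffic to the first MariaDB node and only uses the second when the first fails its health check:

# /etc/haproxy/haproxy.cfg excerpt
listen mariadb_single_writer
    bind *:33306
    mode tcp
    option tcpka
    timeout client 10800s
    timeout server 10800s
    balance leastconn
    server mgm1 192.168.1.101:3306 check
    server mgm2 192.168.1.102:3306 check backup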

 

Our setup will look like this:

Note that this blog post does not cover the installation of hypervisor and storage hosts. Our setup consists of 4 servers:

  • lb1: HAproxy + keepalived (master)
  • lb2: HAproxy + keepalived (backup) + ClusterControl + garbd
  • mgm1: CloudStack Management + database server
  • mgm2: CloudStack Management + database server

 

read more

Link