Peter ZaitsevPrevent MySQL downtime: Set max_user_connections (29.7.2014, 12:05 UTC)

One of the common causes of downtime with MySQL is running out of connections. Have you ever seen this error? “ERROR 1040 (00000): Too many connections.” If you’re working with MySQL long enough you surely have. This is quite a nasty error as it might cause complete downtime… transient errors with successful transactions mixed with failing ones as well as only some processes stopping to run properly causing various kinds of effects if not monitored properly.

There are number of causes for running out of connections, the most common ones involving when the Web/App server is creating unexpectedly large numbers of connections due to a miss-configuration or some script/application leaking connections or creating too many connections in error.

The solution I see some people employ is just to increase max_connections to some very high number so MySQL “never” runs out of connections. This however can cause resource utilization problems – if a large number of connections become truly active it may use a lot of memory and cause the MySQL server to swap or be killed by OOM killer process, or cause very poor performance due to high contention.

There is a better solution: use different user accounts for different scripts and applications and implement resource limiting for them. Specifically set max_user_connections:

mysql> GRANT USAGE ON *.* TO 'batchjob1'@'localhost'

This approach (available since MySQL 5.0) has multiple benefits:

Security – different user accounts with only required permissions make your system safer from development errors and more secure from intruders
Preventing Running out of Connections – if there is a bug or miss-configuration the application/script will run out of connections of course but it will be the only part of the system affected and all other applications will be able to use the database normally.
Overload Protection – Additional numbers of connections limits how much queries you can run concurrently. Too much concurrency is often the cause of downtime and limiting it can reduce the impact of unexpected heavy queries running concurrently by the application.

In addition to configuring max_user_connections for given accounts you can set it globally in my.cnf as “max_user_connections=20.” This is too coarse though in my opinion – you’re most likely going to need a different number for different applications/scripts. Where max_user_connections is most helpful is in multi-tenant environments with many equivalent users sharing the system.

The post Prevent MySQL downtime: Set max_user_connections appeared first on MySQL Performance Blog.

Shlomi NoachOrchestrator 1.0.4 released (29.7.2014, 11:42 UTC)

Outbrain's orchestrator Version 1.0.4 is released.

Quick links: Orchestrator Manual, FAQ, Downloads

What's new?


orchestrator now does a much better visualization of Master-Master replication:




The work on making the visualization more appealing also resulted in making a better distinction between the visual tree topology and the replication topology. This in turn also fixes the ruleset for moving slaves in a co-master topology, and lays the ground for future work on co-masters (i.e. Galera; unscheduled).


Had a few reports on orchestrator not being able to connect to some topology instances. It seems like the problem is with name resolving. To prove/disprove this, the resolve command or resolve API call now allows checking for instance connectivity. Orchestrator will first test whether the CNAME at all resolves, then try to dial the TCP address (host + port) to see whether it can make a connection.

A sample API call would be:

A command line invocation would be:

orchestrator -c resolve -i myhost.mydomain:3306

chef cookbook

Silvia Botros of SendGrid has published a generic cookbook for deploying orchestrator via chef:

Thanks, Silvia!

Topology Annonymizer Cheatsheet

If you want to share your topology graph, but do not wish to expose your host names, open your cluster page and execute the following in your browser's JavaScript console (jQuery is already included by orchestrator):

var _=function() {
  var counter = 0;  
  var port = 3306;
  jQuery("h3.popover-title").each(function() {

This results in a modified topology such as the one presented above ("instance-11", "instance-12" and so on).

Other enhancements since 1.0:

  • Proper error log entry when backend database is not found; also terminates execution.
  • Added --stack command line. Combined with --debug this prints the stack trace upon error.
  • When a known instance is not found (gone from the radar), now showing time since last seen instead of irrelevant last known slave lag.
  • Various UI enhancements/fixes

Orchestrator is released as open source under the Apache 2.0 license and is available at:

Erkan YanarGalera Cluster using GTID: MySQL vs. MariaDB (28.7.2014, 22:02 UTC)

Using GTID to attach an asynchronous Slave sounds promising. Lets have a look at the two existing GTID implementations and their integration with Galera.


There is one GTID used by the cluster and every node increments the common seqno by itself. This works well as long all transactions are replicated by Galera (simplified InnoDB). Because Galera takes care of the Commit Order of the transactions on all nodes. So having identical GTID/seqno from the start there are no problems.

  node1> show global variables like 'gtid_binlog_pos';
  | Variable_name   | Value   |
  | gtid_binlog_pos | 0-1-504 |

  node2> show global variables like 'gtid_binlog_pos';
  | Variable_name   | Value   |
  | gtid_binlog_pos | 0-1-504 |

  node3> show global variables like 'gtid_binlog_pos';
  | Variable_name   | Value   |
  | gtid_binlog_pos | 0-1-504 |

But think about having a DML not replicated by Galera. Lets assume we write into a MyISAM/MEMORY table on node1. Then only the seqno of node1 is increased:

  node1> show global variables like 'gtid_binlog_pos';
  | Variable_name   | Value   |
  | gtid_binlog_pos | 0-1-505 |

  node2> show global variables like 'gtid_binlog_pos';
  | Variable_name   | Value   |
  | gtid_binlog_pos | 0-1-504 |

  node3> show global variables like 'gtid_binlog_pos';
  | Variable_name   | Value   |
  | gtid_binlog_pos | 0-1-504 |

Galera does not care about the different seqno on the hosts. The next transaction replicated by Galera increases the seqno of all nodes by 1:

  node1> show global variables like 'gtid_binlog_pos';
  | Variable_name   | Value   |
  | gtid_binlog_pos | 0-1-506 |

  node2> show global variables like 'gtid_binlog_pos';
  | Variable_name   | Value   |
  | gtid_binlog_pos | 0-1-505 |

  node3> show global variables like 'gtid_binlog_pos';
  | Variable_name   | Value   |
  | gtid_binlog_pos | 0-1-505 |

So we have different mapping between GTID@MariaDB and GTID@Galera on the different hosts. This is far from optimal.

Having that situation think about a slave switching the Master. You will do one of two outcomes:

  • Loosing transactions
  • Reapplying transactions

If you are lucky replication fails. As the data is inconsistent.

So it is up to you to make sure you have only DML’s on the cluster, being replicated by Galera only.


MySQL-Galera integration looks different:

GTID@MySQL uses server_uuid combined with the seqno to build its GTID. Galera makes a little trick in using a separate servre_uuid for transactions written/replicated by Galera. All other transactions use the original server_uuid of the server.

Let’s have a look on one node:

 node2> show global variables like 'gtid_executed';
 | Variable_name | Value                                           |
 | gtid_executed | 6d75ac01-ed37-ee1b-6048-592af289b902:1-10,
 933c5612-12c8-11e4-82d2-00163e014ea9:1-6 |

6d75ac01-ed37-ee1b-6048-592af289b902 ist die server_uuid für Galera. 933c5612-12c8-11e4-82d2-00163e014ea9 ist die server_uuid für alle anderen Transaktionen.

So lets write into an InnoDB table:

node2> node2> show global variables like 'gtid_executed';
| Variable_name | Value                                            |
| gtid_executed | 6d75ac01-ed37-ee1b-6048-592af289b902:1-11,
933c5612-12c8-11e4-82d2-00163e014ea9:1-6 |

Truncated by Planet PHP, read more at the original (another 1740 bytes)

Erkan YanarMariadB Galera: Attaching an asynchronous Slave using GTID (28.7.2014, 12:17 UTC)

Galera the synchronous Master-Master replication is quite popular. It is used by Percona XtraDB Cluster, MariaDB Galera Cluster and even patched MySQL binaries exist. Quite often you want to add a slave to a Galera Cluster. This is going to work quite well. All you need is at least configure log_bin, log_slave_updates and server_id on the designated Masters and attach your Slave.


Even you can use traditional (non GTID) replication. Using non GTID replication is a hassle. As you need to search for the right offset on the new Master to attach your Slave on.

Using GTID promises to be easier. As you simply switch to the new Master and the replication framework finds the new position based on the GTiD automatically.

As a fact we have two GTID implementations

  • GTID@MySQL/Percona
  • GTID@MariaDB

There are already blogpost about attaching a Slave to a Galera Cluster not using GTID.

And even using GTID@MySQL

We are going to provide a post using GITD@MariaDB :)

We assume there is already a running Galera Cluster. Building one is already explained:

Both are quite similar :D

In opposite to the blog please use wsrep_sst_method=xtrabackup-v2. The current MariaDB release 10.0.12-MariaDB-1~trusty-wsrep-log has a bug preventing you to use wsrep_sst_method=rsync

Additional Configuration on the Galera nodes

server_id        = 1

log_bin activates the binlog, while log_slave_updates make sure to write all transactions replicated via Galera into that binlog. On didactic purpose we set the server_id on the same value on all Galera nodes.

Configuring the Slave

binlog_format      = ROW
server_id          = 2

GTID@Mariadb still lacks good operational integration. So building a slave is done not using GTID.

There is a fix for mysqldump and a patch for xtrabackup exists.

Attaching a Slave

MariaDB replicates - in opposite to MySQL - always the GTID. So we first attach the slave using mysqldump (master-data) and are going to have a running replication:

MariaDB [(none)]> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_User: replication
                  Master_Port: 3306
                Connect_Retry: 10
              Master_Log_File: mysqld-bin.000002
          Read_Master_Log_Pos: 537
               Relay_Log_File: mysqld-relay-bin.000002
                Relay_Log_Pos: 536
        Relay_Master_Log_File: mysqld-bin.000002
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
          Exec_Master_Log_Pos: 537
              Relay_Log_Space: 834
              Until_Condition: None
                Until_Log_Pos: 0
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
             Master_Server_Id: 1
                   Using_Gtid: No

Switching to replication using GTID is quite simple:

slave> stop slave;
slave> change master to master_use_gtid=slave_pos;
slave> start slave;
slave> show slave status\G

*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_User: replication
                  Master_Port: 3306

Truncated by Planet PHP, read more at the original (another 2231 bytes)

Peter ZaitsevWhat I learned while migrating a customer MySQL installation to Amazon RDS (28.7.2014, 11:00 UTC)

Hi, I recently had the experience of assisting with a migration of a customer MySQL installation to Amazon RDS (Relational Database Service). Amazon RDS is a great platform for hosting your MySQL installation and offers the following list of pros and cons:

  • You can scale your CPU, IOPS, and storage space separately by using Amazon RDS. Otherwise you need to take downtime and upgrade physical components of a rack-mounted server.
  • Backups, software version patching, failure detection, and (some) recovery is automated with Amazon RDS.
  • You lose shell access to your DB instance
  • You lose SUPER privilege for regular users. Many SUPER-type statements and commands are provided for as a Stored Procedure.
  • It is easy to set up multiple read replicas (slaves in READ_ONLY=1 mode).
  • You can set up a secondary sychronous instance for failover in the event your primary instance fails.

While this article is written to be Amazon RDS-specific it also has implications for any sort of migration.

  1. The only way to interface with RDS is through mysql client, which means loading data must be done using SQL. This means you need to use mysqldump or mydumper, which can be a large endeavour should your dataset be north of 500GB — this is a lot of single threaded activity!  Think about not only how long dumping and loading will take, but also factor in how much time it will take for replication to catch up on the hours/days/weeks your dumping and loading procedure took.  You might need to allocate more disk space and Provisioned IOPS to your RDS node in order to improve disk throughput, along with a change to innodb_flush_log_at_trx_commit, and sync_binlog.
  2. RDS is set to UTC (system_time_zone=UTC) and this cannot be changed as in Parameter Groups you will see that default_time_zone is set as Modifiable=false. This can bite you if you are planning to use RDS as a slave for a short while before failing the application over to Amazon RDS.  If you have configured binlog_format=STATEMENT on your master and you have TIMESTAMP columns, this will lead to differences in RDS data set for absolute values ’2014-07-24 10:15:00′ vs NOW(). It is also a concern for the Developer who may not be explicitly declaring their MySQL connections to set an appropriate time zone. Often the best piece of advice can be to leave all database data in UTC no matter where the server is physically located, and deal with localization at the presentation layer.
  3. Amazon RDS by default has max_allowed_packet=1MB. This is pretty low as most other configs are 16MB so if you’re using extended-insert (by default, you are), the size of each insert statement will be close to 16MB and thus can lead to errors related to “packet too big” on Amazon RDS side, thus failing out an import.
  4. Amazon RDS does not support the SUPER privilege for regular users. For example, this becomes quite a challenge as many tools (Percona Toolkit) are authored to assume you have SUPER-level access on all nodes — simple tasks become vastly more complicated as you need to think of clever workarounds (I’m looking at you pt-table-sync!).
  5. Triggers and views thus cannot be applied using the default mysqldump syntax which includes SQL DEFINER entries — these lines are there so that a user with SUPER can “grant” another user ability to execute the trigger/view. Your load will fail if you forget this.
  6. Consider running your load with –force to the mysql client, and log to disk stderr/stdout so you can review errors later. It is painful to spend 4 days loading a 500GB database only to have it fail partially through because you forgot about SQL DEFINER issue..
  7. Consider splitting the mysqldump into two phases: –no-data so you dump schema only, and then –data-only so you get just the rows. This way you can isolate faults and solve them along the way.
  8. Skipping replication events is SLOW. You don’t have ability to do sql_slave_skip_counter (since this requires SUPER), instead you need to use an Amazon RDS function of mysql.rds_skip_repl_error. Sadly this Stored Procedure takes no argument and thus it only skips one event at a time. It takes about 2-3 seconds for each execution, so if you have a lot of events to skip, that’s a problem. Having to skip ANYTHING is indication that something went wrong in the process, so if you find yourself in the unenviable position of skipping events, know that pt-table-checksum should be able to give you an idea how widespread is the data divergence issue.
  9. pt-table-sync doesn’t work against Amazon RDS as it is written t

Truncated by Planet PHP, read more at the original (another 3251 bytes)

Shlomi NoachSome MySQL security tips (28.7.2014, 09:13 UTC)

This is a brief list of security tips for MySQL. It is by no means complete.

  • Follow the sudo example. Don't let all you DBAs and Ops have the password for the root account. Have each and every one of them have their own personal super-duper account, with their own personal and private password. This makes it so easy when someone leaves the company. No need to change passwords, just to remove the employee's account.
  • Block root. Either remove it completely or forbid it from logging in. Yes, there's a way hack in MySQL to have a valid account blocked from logging in. One way of making this happen is via common_schema's sql_accounts. Here's how to block root account using common_schema:
mysql> CALL common_schema.eval("SELECT sql_block_account FROM sql_accounts WHERE USER = 'root'");
  • Make lots of small users. Give nagios its own user. Give collectd its own user. Give orchestrator its own user. Give innotop its own user. Give whatever its own user. Yes, it's more users to create, but you get to have each user as limited in privileges as possible, and you don't get to wonder why your heartbeat script has SUPER, LOCK and SHUTDOWN privileges.
  • Verify: set @@old_passwords=0; before setting a new password. Make sure your configuration does not specify old_passwords = 1. There is no reason to use "old passwords". In fact, a 5.6 client will refuse connecting with an "old password".
  • Give no access to mysql.*. No one should be tampering directly with the mysql system tables.
  • Run oak-security-audit or, if you have common_schema installed (you mean you don't?), just CALL security_audit(); I can (almost) guarantee you'd be surprised and thankful for the security breakdown. Users without passwords, users sharing same passwords, users with unreasonable privileges, and more... You'll see them all.
  • If you have web interfaces to your database or some of its aspects (e.g. Anemometer, Propagator, Orchestrator, monitoring, ...), protect it via LDAP group or similar. Not everyone who has access to your network needs to see you database. Neither does every employee.
Peter ZaitsevMonitoring MySQL flow control in Percona XtraDB Cluster 5.6 (25.7.2014, 14:41 UTC)

Monitoring flow control in a Galera cluster is very important. If you do not, you will not understand why writes may sometimes be stalled. Percona XtraDB Cluster 5.6 provides 2 status variables for such monitoring: wsrep_flow_control_paused and wsrep_flow_control_paused_ns. Which one should you use?

What is flow control?

Flow control does not exist with regular MySQL replication, but only with Galera replication. It is simply the mechanism nodes are using when they are not able to keep up with the write load: to keep replication synchronous, the node that is starting to lag instructs the other nodes that writes should be paused for some time so it does not get too far behind.

If you are not familiar with this notion, you should read this blogpost.

Triggering flow control and graphing it

For this test, we’ll use a 3-node Percona XtraDB Cluster 5.6 cluster. On node 3, we will adjust gcs.fc_limit so that flow control is triggered very quickly and then we will lock the node:

pxc3> set global wsrep_provider_options="gcs.fc_limit=1";
pxc3> flush tables with read lock;

Now we will use sysbench to insert rows on node 1:

$ sysbench --test=oltp --oltp-table-size=50000 --mysql-user=root --mysql-socket=/tmp/pxc1.sock prepare

Because of flow control, writes will be stalled and sysbench will hang. So after some time, we will release the lock on node 3:

pxc3> unlock tables;

During the whole process, wsrep_flow_control_paused and wsrep_flow_control_paused_ns are recorded every second with mysqladmin ext -i1. We can then build a graph of the evolution of both variables:


While we can clearly see when flow control was triggered on both graphs, it is much easier to know when flow control was stopped with wsrep_flow_control_paused_ns. It would be even more obvious if we have had several timeframes when flow control is in effect.


Monitoring a server is obviously necessary if you want to be able to catch issues. But you need to look at the right metrics. So don’t be scared if you are seeing that wsrep_flow_control_paused is not 0: it simply means that flow control has been triggered at some point since the server started up. If you want to know what is happening right now, prefer wsrep_flow_control_paused_ns.

The post Monitoring MySQL flow control in Percona XtraDB Cluster 5.6 appeared first on MySQL Performance Blog.

Peter ZaitsevPutting MySQL Fabric to Use: July 30 webinar (24.7.2014, 22:13 UTC)

Percona MySQL webinarsMartin and I have recently been blogging together about MySQL Fabric (in case you’ve missed this, you can find the first post of the series here), and on July 30th, we’re going to be presenting a webinar on this topic titled “Putting MySQL Fabric to Use.”

The focus of the webinar is to help you get started quickly on this technology, so we’ll include very few slides (mostly just a diagram or two) and then jump straight into shared screen mode, with lots of live console and source code examples.

In order to make the best use of time, we won’t show you how to install and configure MySQL Fabric. However, we can point you to a few resources to help you get ready and even follow our examples as we go:

  • The official manual is an obvious starting point
  • Our second post in the series includes configuration instructions
  • This git repo contains the test environment we’ll use to run our demos. Specifically, we’ll use the sharding branch, so if you intend to follow our examples as we go, we recommend checking that one out.

If you’re interested, you can register for this webinar here, and if there’s something specific you’d like to see (we had a request for PHP examples in the comments to our last post) feel free to post that as a comment. We can’t promise we’ll be able to cover all requests during the webinar, but we’ll incorporate examples to the repo as time allows.

Hope to see you then!

The post Putting MySQL Fabric to Use: July 30 webinar appeared first on MySQL Performance Blog.

Federico RazzoliHow MariaDB makes Stored Procedures usable (24.7.2014, 11:14 UTC)
Peter ZaitsevDBaaS, OpenStack and Trove 101: Introduction to the basics (24.7.2014, 07:00 UTC)

We’ll be publishing a series of posts on OpenStack and Trove over the next few weeks, diving into their usage and purpose. For readers who are already familiar with these technologies, there should be no doubt as to why we are incredibly excited about them, but for those who aren’t, consider this a small introduction to the basics and concepts.

What is Database as a Service (DBaaS)?
In a nutshell, DBaaS – as it is frequently referred to – is a loose moniker to the concept of providing a managed cloud-based database environment accessible by users, applications or developers. Its aim is to provide a full-fledged database environment, while minimizing the administrative turmoil and pains of managing the surrounding infrastructure.

Real life example: Imagine you are working on a new application that has to be accessible from multiple regions. Building and maintaining a large multiregion setup can be very expensive. Furthermore, it introduces additional complexity and strain on your system engineers once timezones start to come into play. The challenge of having to manage machines in multiple datacenters won’t simplify your release cycle, nor increase your engineers’ happiness.

Let’s take a look at some of the questions DBaaS could answer in a situation like this:

- How do I need to size my machines, and where should I locate them?
Small environments require less computing power and can be a good starting point, although this also means they may not be as well-prepared for future growth. Buying larger-scale and more expensive hardware and hosting can be very expensive and can be a big stumbling block for a brand new development project. Hosting machines in multiple DC’s could also introduce administrative difficulties, like having different SLA’s and potential issues setting up WAN or VPN communications. DBaaS introduces an abstraction layer, so these consideration aren’t yours, but those of the company offering it, while you get to reap all the rewards.

- Who will manage my environment from an operational standpoint?
Staffing considerations and taking on the required knowledge to properly maintain a production database are often either temporarily sweeped under the rug or, when the situation turns out badly, a cause for the untimely demise of quite a few young projects. Rather than think about how long ago you should have applied that security patch, wouldn’t it be nice to just focus on managing the data itself, and be otherwise confident that the layers beyond it are managed responsibly?

- Have a sudden need to scale out?
Once you’re up and running, enjoying the success of a growing use base, your environment will need to scale accordingly. Rather than think long and hard on the many options available, as well as the logistics attached to those changes, your DBaaS provider could handle this transparently.

Popular public options: Here are a few names of public services you may have come across already that fall under the DBaaS moniker:

- Amazon RDS
- Rackspace cloud databases
- Microsoft SQLAzure
- Heroku
- Clustrix DBaaS

What differentiates these services from a standard remote database is the abstraction layer that fully automates their backend, while still offering an environment that is familiar to what your development team is used to (be it MySQL, MongoDB, Microsoft SQLServer, or otherwise). A big tradeoff to using these services is that you are effectively trusting an external company with all of your data, which might make your legal team a bit nervous.

Private cloud options?
What if you could offer your team the best of both worlds? Or even provide a similar type of service to your own customers? Over the years, a lot of platforms have been popping up to allow effective management and automation of virtual environments such as these, allowing you to effectively “roll your own” DBaaS. To get there, there are two important layers to consider:

  • Infrastructure Management, also referred to as Infrastructure-as-a-Service (IaaS), focusing on the logistics of spinning up virtual machines and keeping their required software packages running.
  • Database Management, previously referred to DBaaS, transparently coordinating multiple database instances to work together and present themselves as a single, coherent data repository.

Examples of IaaS products:
- OpenStack
- OpenQRM

Ecample of DBaaS:
- Trove

Main Advantages of DBaaS
For reference, the main reasons why you might want to consider using an existing DBaaS are as follows:

- Reduced Database management costs

DBaaS rem

Truncated by Planet PHP, read more at the original (another 4022 bytes)

LinksRSS 0.92   RDF 1.
Atom Feed