Planet MariaDB

November 21, 2017

Jean-Jerome Schmidt

ClusterControl 1.5 - Automatic Backup Verification, Build Slave from Backup and Cloud Integration

Automation is at the core of ClusterControl, and so is ensuring that your data is securely backed up and ready for restoration whenever something goes wrong. Having an effective backup strategy and disaster recovery plan is key to the success of any application or environment.

In our latest release, ClusterControl 1.5, we have introduced a number of enhancements for backing up MySQL and MariaDB-based systems.

One of the key improvements is the ability to back up from ClusterControl to the cloud provider of your choice. Cloud providers like Google Cloud Services and Amazon S3 offer virtually unlimited storage, so you can retain your backup files for as long as you would like without worrying about local disk space.

Let’s explore all the exciting new backup features for ClusterControl 1.5...

Backup/Restore Wizard Redesign

First of all, you will notice that the backup and restore wizards have been revamped to improve the user experience. They now load as a side menu on the right of the screen:

The backup list has also received a minor tweak: backup details are now displayed when you click on a particular backup:

You will be able to view the backup location and which databases are inside the backup. There are also options to restore the backup or upload it to the cloud.

PITR Compatible Backup

ClusterControl performs the standard mysqldump backup with separate schema and data dumps. This makes it easy to restore partial backups. However, it breaks the consistency of the backup (schema and data are dumped in two separate sessions), so it cannot be used to provision a slave or for point-in-time recovery.

A mysqldump PITR-compatible backup consists of one single dump file with GTID info, binlog file and position. Thus, only database nodes that produce a binary log will have the "PITR compatible" option available, as highlighted in the screenshot below:

When the PITR compatible option is toggled, the database and table fields are greyed out, since ClusterControl will always perform the backup against all databases, events, triggers and routines of the target MySQL server.

The following lines will appear in the first ~50 lines of the completed dump file:

$ head -50 mysqldump_2017-11-07_072250_complete.sql
...
-- GTID state at the beginning of the backup
--
SET @@GLOBAL.GTID_PURGED='20dc5247-4a98-ee18-73af-5c79373388ee:1-1681';

--
-- Position to start replication or point-in-time recovery from
--
CHANGE MASTER TO MASTER_LOG_FILE='binlog.000001', MASTER_LOG_POS=2457790;
...

This information can be used to build slaves from backup, or to perform point-in-time recovery together with the binary logs: you start the recovery from the MASTER_LOG_FILE and MASTER_LOG_POS reported in the dump file using the mysqlbinlog utility. Note that binary logs are not backed up by ClusterControl.
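A point-in-time recovery based on these coordinates could then look roughly like the following sketch (the binary log file names, the stop time and the assumption that the binary logs were archived separately are purely illustrative):

# 1) Restore the PITR-compatible dump
$ mysql -uroot -p < mysqldump_2017-11-07_072250_complete.sql
# 2) Replay the binary logs from the reported position up to the desired point in time
$ mysqlbinlog --start-position=2457790 --stop-datetime="2017-11-07 09:00:00" \
    binlog.000001 binlog.000002 | mysql -uroot -p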


Build Slaves from Backup

Another new feature is the ability to build a slave directly from a PITR-compatible backup, instead of streaming it from a chosen master. This is a huge advantage as it offloads the master server. The option can be used with MySQL Replication or Galera Cluster. An existing backup can be used to rebuild an existing replication slave, or to add a new replication slave during the staging phase, as shown in the following screenshot:

Once the staging completes, the slave will connect to the chosen master and start catching up. Previously, ClusterControl performed a streaming backup directly from the chosen master using Percona Xtrabackup. This could impact the performance of the master when scaling out a large dataset, despite the operation being non-blocking on the master. With the new option, if the backup is stored on ClusterControl, only these hosts (ClusterControl and the slave) will be busy when staging the data on the slave.

Backup to Cloud

Backups can now be automatically uploaded to the cloud. This requires two additional ClusterControl modules, clustercontrol-cloud (the cloud integration module) and clustercontrol-clud (the cloud download/upload CLI), which are available in v1.5 and later. The packages are covered by the upgrade instructions and require no extra configuration. At the moment, the supported cloud platforms are Amazon Web Services and Google Cloud Platform. Cloud credentials are configured under ClusterControl -> Settings -> Integrations -> Cloud Providers.

When creating or scheduling a backup, you should see the following additional options when "Upload Backup to the cloud" is toggled:

The feature allows a one-time upload, or backups can be scheduled to be uploaded automatically after completion (to Amazon S3 or Google Cloud Storage). You can then download and restore the backups as required.

Custom Compression for mysqldump

This feature was in fact first introduced in ClusterControl v1.4.2, shortly after its release. We added a backup compression level based on gzip. Previously, ClusterControl used the default compression level (level 6) if the backup destination was on the controller node, and the lowest compression (level 1 - fastest, least compression) if the backup destination was on the database host itself, to ensure minimal impact on the database during the compression operation.

In this version, we have polished the compression aspect and you can now customize the compression level, regardless of the backup destination. When upgrading your ClusterControl instance, all the scheduled backups will be automatically converted to use level 6, unless you explicitly edit them in v1.5.

Backup compression is vital when your dataset is large and combined with a long backup retention policy, while storage space is limited. mysqldump output, being text-based, compresses well, with savings of up to 60% of the original file size. On some occasions the highest compression ratio is the best option, although it comes at the price of longer decompression when restoring.
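As a rough illustration of this trade-off (plain gzip applied to a dump, not what ClusterControl runs internally):

$ mysqldump --all-databases | gzip -1 > backup_fast.sql.gz    # level 1: fastest, least compression
$ mysqldump --all-databases | gzip -6 > backup_default.sql.gz # level 6: the gzip default
$ mysqldump --all-databases | gzip -9 > backup_small.sql.gz   # level 9: smallest file, slowest to create and to decompress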

Bonus Feature: Automatic Backup Verification

As old sysadmins say: a backup is not a backup if it's not restorable. Backup verification is something that is usually neglected by many. Some sysadmins have developed in-house routines for this, usually more manual than automated. Automating it is hard, mainly due to the complexity of the operation as a whole - host provisioning, MySQL installation and preparation, backup file transfer, decompression, the restore operation, verification procedures and finally cleaning up the system afterwards. All these hassles make people neglect such an important aspect of a reliable backup. In general, a backup restore test should be done at least once a month, or after significant changes in data size or database structure. Find a schedule that works for you and formalize it with a scheduled event.

ClusterControl can automate backup verification by performing the restoration on a fresh host, without compromising any of the verification procedures mentioned above. This can be done after some delay, or right after the backup has completed. It will report the backup status based on the exit code of the restore operation, perform an automatic shutdown if the backup is verified, or simply leave the restored host running so you can perform additional manual verification of the data.

When creating or scheduling a backup, you will have additional options if "Verify Backup" is toggled:

If "Install Database Software" is enabled, ClusterControl will remove any existing MySQL installation on the target host and reinstall the database software with the same version as the existing MySQL server. Otherwise, if you have a specific setup for the restored host, you can skip this option. The rest of the options are self-explanatory.

Bonus Feature: Don’t Forget PostgreSQL

In addition to all this great functionality for MySQL and MariaDB, ClusterControl 1.5 also provides PostgreSQL with an additional backup method (pg_basebackup) that can be used for online binary backups. Backups taken with pg_basebackup can be used later for point-in-time recovery and as the starting point for log shipping or streaming replication standby servers.
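For reference, a standalone pg_basebackup invocation for such an online binary backup looks roughly like this (the hostname and replication user are placeholders; --wal-method is the PostgreSQL 10 spelling, older versions use --xlog-method):

$ pg_basebackup -h pg-primary.example.com -U replicator -D /var/lib/pgsql/backup \
    --wal-method=stream --checkpoint=fast -P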


That’s it for now. Do give ClusterControl v1.5 a try, play around with the new features and let us know what you think.

by ashraf at November 21, 2017 10:59 AM

November 20, 2017

Valeriy Kravchuk

How lsof Utility May Help MySQL DBAs

While working in Support, I noticed that probably at least once a week I have to use or mention the lsof utility in some context. This week, for example, we had a customer trying to find out if his running mysqld process is linked with the tcmalloc library. He had started it in different ways, using LD_PRELOAD directly, the --malloc-lib option of the mysqld_safe script, etc., but wanted to verify that his attempts really worked as expected. My immediate comment in the internal chat was: "Just let them run lsof -p `pidof mysqld` | grep mall and check!" My MariaDB 10.2 instance uses jemalloc and this can be checked exactly the same way:
openxs@ao756:~/dbs/maria10.2$ ps aux | grep mysqld
openxs    4619  0.0  0.0   4452   804 pts/2    S    17:02   0:00 /bin/sh bin/mysqld_safe --no-defaults --port=3308 --malloc-lib=/usr/lib/x86_64-linux-gnu/libjemalloc.so
openxs    4734  0.5  2.9 876368 115156 pts/2   Sl   17:02   0:00 /home/openxs/dbs/maria10.2/bin/mysqld --no-defaults --basedir=/home/openxs/dbs/maria10.2 --datadir=/home/openxs/dbs/maria10.2/data --plugin-dir=/home/openxs/dbs/maria10.2/lib/plugin --log-error=/home/openxs/dbs/maria10.2/data/ao756.err --pid-file=ao756.pid --port=3308
openxs    5391  0.0  0.0  14652   964 pts/2    S+   17:05   0:00 grep --color=auto mysqld
openxs@ao756:~/dbs/maria10.2$ lsof -p 4734 | grep mall
mysqld  4734 openxs  mem    REG              252,2    219776 12058822 /usr/lib/x86_64-linux-gnu/libjemalloc.so.1
openxs@ao756:~/dbs/maria10.2$
I think it's time to summarize the most important use cases of the lsof utility for MySQL DBAs. I am going to show different cases where it can be useful, based on public MySQL bug reports.

As one can read in the manual, lsof "lists on its standard output file information about files opened by processes". In one of the simplest possible calls, presented above, we just pass the PID of the process after the -p option and get the list of open files for this process. This includes the shared libraries the process uses. By default, the following output format is used:
openxs@ao756:~/dbs/maria10.2$ lsof -p 4734 | more
COMMAND  PID   USER   FD   TYPE             DEVICE  SIZE/OFF     NODE NAME
mysqld  4734 openxs  cwd    DIR              252,2      4096 29638597 /home/openxs/dbs/maria10.2/data
mysqld  4734 openxs  rtd    DIR              252,2      4096        2 /
mysqld  4734 openxs  txt    REG              252,2 147257671 29514843 /home/openxs/dbs/maria10.2/bin/mysqld
mysqld  4734 openxs  mem    REG              252,2     31792  1311130 /lib/x86_64-linux-gnu/librt-2.19.so
mysqld  4734 openxs  mem    REG              252,2 101270905 29241175 /home/openxs/dbs/maria10.2/lib/plugin/ha_rocksdb.so

...
mysqld  4734 openxs  DEL    REG               0,11             443265 /[aio]
mysqld  4734 openxs    0r   CHR                1,3       0t0     1050 /dev/null
mysqld  4734 openxs    1w   REG              252,2     52623  5255961 /home/openxs/dbs/maria10.2/data/.rocksdb/LOG
mysqld  4734 openxs    2w   REG              252,2    458880 29647192 /home/openxs/dbs/maria10.2/data/ao756.err
mysqld  4734 openxs    3r   DIR              252,2      4096  5255249 /home/openxs/dbs/maria10.2/data/.rocksdb
...
mysqld  4734 openxs  451u  IPv6             443558       0t0      TCP *:3308 (LISTEN)
mysqld  4734 openxs  452u  unix 0x0000000000000000       0t0   443559 /tmp/mysql.sock
...
mysqld  4734 openxs  470u   REG              252,2         0 29756970 /home/openxs/dbs/maria10.2/data/mysql/event.MYD
mysqld  4734 openxs  471u   REG              252,2         0 29647195 /home/openxs/dbs/maria10.2/data/multi-master.info
Most columns have an obvious meaning, so let me concentrate on a few of them. FD should be a numeric file descriptor, and it is for normal files. In that case it is also followed by a letter describing the mode under which the file is open (r for read, w for write and u for update), and there may be one more letter describing the type of lock applied to the file. But we can also see values without any digit in the output above, so obviously some special values can be present there, like cwd for the current working directory, rtd for the root directory, txt for program text (code and data) or mem for a memory-mapped file, etc.

The TYPE column is also interesting and may have plenty of values (as there are many types of files in Linux). REG means a regular file and DIR is, obviously, a directory. Note also unix for a socket and IPv6 for the TCP port the mysqld process listens on.

In the SIZE/OFF column for normal files we usually see their size in bytes. Offset values are prefixed with 0t if the value is decimal, or 0x if it is hexadecimal. NAME is obviously the fully specified file name (with symbolic links resolved). Some more details about the output format are discussed in the following examples.

Another common way to use lsof is to pass a file name and get details about the processes that have it open, like this:
openxs@ao756:~/dbs/maria10.2$ lsof /tmp/mysql.sock
COMMAND  PID   USER   FD   TYPE             DEVICE SIZE/OFF   NODE NAME
mysqld  4734 openxs  452u  unix 0x0000000000000000      0t0 443559 /tmp/mysql.sock
openxs@ao756:~/dbs/maria10.2$ lsof /home/openxs/dbs/maria10.2
COMMAND     PID   USER   FD   TYPE DEVICE SIZE/OFF     NODE NAME
mysqld_sa  4619 openxs  cwd    DIR  252,2     4096 29235594 /home/openxs/dbs/maria10.2
lsof      14354 openxs  cwd    DIR  252,2     4096 29235594 /home/openxs/dbs/maria10.2
lsof      14355 openxs  cwd    DIR  252,2     4096 29235594 /home/openxs/dbs/maria10.2
bash      29244 openxs  cwd    DIR  252,2     4096 29235594 /home/openxs/dbs/maria10.2
In this case we see that /home/openxs/dbs/maria10.2 is used as a current working directory by 4 processes. Usually this kind of check is used when we cannot unmount some directory, but it may also be useful in the context of MySQL when you get error messages saying that some file is already in use by another process. In the first example above I was checking which process could be using the /tmp/mysql.sock file.
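Two more invocations will come up in the bug reports below: -i shows what listens on a given port (and whether over IPv4 or IPv6), and +L1 lists open files that have already been unlinked, such as the ML* binlog cache files mentioned later (the port 3306 and the -a flag to AND the selections are my additions here):

$ lsof -i :3306
$ lsof -a +L1 -p `pidof mysqld`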

Now, with the above details on basic usage in mind, let's check several recent enough MySQL bug reports that demonstrate typical and more advanced usage of lsof:
  • Bug #66237 - "Temporary files created by binary log cache are not purged after transaction commit". My former colleague and mentor from Percona, Miguel Angel Nieto (who recently joined the dark side and became a MongoDB employee), used lsof to show numerous files with names ML* created and left (until the connection is closed) by the mysqld process in the /tmp directory (tmpdir, to be precise) of a server with binary logging enabled, when the transaction size was larger than the binlog cache size. The bug is fixed in 5.6.17+ and 5.7.2+. It shows us the usual way temporary files are created by the MySQL server:
    # lsof -p 6112|grep ML
    mysqld 6112 root 38u REG 7,0 106594304 18 /tmp/MLjw4ecJ (deleted)
    mysqld 6112 root 39u REG 7,0 237314310 17 /tmp/MLwdWDGW (deleted)
    Notice (deleted) above. This is a result of the immediate call to unlink() when these temporary files are created. Check this in the source code, as well as the my_delete() implementation.
  • Bug #82870 - "mysqld opens too many descriptors for slow query log". This bug (still "Verified") was opened by my former colleague Sveta Smirnova (now in Percona). Basically, mysqld opens too many descriptors for the slow query log (and the general query log) if it is turned ON and OFF while concurrent sessions are running. lsof made it possible to see multiple descriptors created for the same file, until eventually open_files_limit is hit.
  • Bug #83434 - "Select statement with partition selection against MyISAM table opens all partitions". This bug (later declared a duplicate of an older one and, eventually, a documented, even if unexpected, behavior by design) was opened by my colleague from MariaDB, Geoff Montee. The lsof utility helped to show that all partitions are actually opened by the mysqld process in this case.
  • Bug #74145 - "FLUSH LOGS improperly disables the logging if the log file cannot be accessed". This bug (still "Verified") was reported by Jean Weisbuch. Here we can see how lsof was used to find out whether the slow log is open after FLUSH. The logging has to be disabled, but MySQL continues to claim that it is enabled. I remember many cases when lsof also helped to find out where the error log (the file with descriptor 2w) is really located/redirected to.
  • Bug #77752 - "bind-address wrongly prefers IPv4 over IPv6". This was not a bug (more like a configuration issue), but see how lsof -i is used by Daniël van Eeden to find out what process listens on a specific port, and whether it listens on an IPv4 or IPv6 address.
  • Bug #87589 - "Documentation incorrectly states that LOAD DATA LOCAL INFILE does not use tmpdir". In this "Verified" bug report Geoff Montee used lsof to show that temporary files are really created in tmpdir, not in /tmp (the OS temporary directory). This is how you can find out when the MySQL manual lies...
  • Bug #77519 - "Reported location of Innodb Merge Temp File is wrong". One more bug from Daniël van Eeden, this time "Verified". By calling lsof +L1 during an online ALTER TABLE, he demonstrated that two temp files are created in tmpdir instead of in the datadir (as described by the manual), while the events_waits_history_long table in performance_schema seems to claim it waited on a temporary file in the datadir. Note that in another of his bug reports, Bug #76225, fixed since 5.7.9 and 5.8.0, he had also shown ML* binlog cache files being created that were not instrumented by performance_schema.
  • Bug #75706 - "alter table import tablespace creates a temporary table". This bug report by BJ Quinn is formally still "Verified", but according to my former colleague Przemyslaw Malkowski from Percona, in recent 5.6.x and 5.7.x versions lsof does NOT show a temporary table being created. Maybe it is time to re-verify this bug, once a decision is made on how to implement this?
  • Bug #83717 - "Manual does not explain when ddl_log.log file is deleted and how large it can be". My own bug report, where lsof was used to show that the ddl_log.log file remains open even after the online ALTER completes. The manual is clear about this now.
To summarize, lsof may help a MySQL DBA to find out:
  • what dynamic libraries are really used by the mysqld process
  • where the error log and other logs are really located
  • what other process may have a file, port or socket opened that is needed by the current MySQL instance
  • why you may hit open_files_limit or use all free space in some filesystem unexpectedly
  • where all kinds of temporary files are created during specific operations
  • how MySQL really works with files, ports and sockets
It also helps to find MySQL bugs and clarify missing details in the MySQL manual.

by Valeriy Kravchuk (noreply@blogger.com) at November 20, 2017 11:00 AM

November 17, 2017

Jean-Jerome Schmidt

Several Ways to Intentionally Fail or Crash your MySQL Instances for Testing

You can take down a MySQL database in multiple ways. Some obvious ways are to shut down the host, pull out the power cable, or hard kill the mysqld process with SIGKILL to simulate unclean MySQL shutdown behaviour. But there are also more subtle ways to deliberately crash your MySQL server, and then see what kind of chain reaction it triggers. Why would you want to do this? Failure and recovery can have many corner cases, and understanding them can help reduce the element of surprise when things happen in production. Ideally, you would want to simulate failures in a controlled environment, and then design and test database failover procedures.

There are several areas in MySQL that we can tackle, depending on how you want it to fail or crash. You can corrupt the tablespace, overflow the MySQL buffers and caches, limit the resources to starve the server, or mess around with permissions. In this blog post, we are going to show you some examples of how to crash a MySQL server in a Linux environment. Some of them are suitable even for, e.g., Amazon RDS instances, where you have no access to the underlying host.

Kill, Kill, Kill, Die, Die, Die

The easiest way to fail a MySQL server is to simply kill the process or host, and not give MySQL a chance to do a graceful shutdown. To simulate a mysqld crash, just send signal 4, 6, 7, 8 or 11 to the process:

$ kill -11 $(pidof mysqld)

When looking at the MySQL error log, you can see the following lines:

11:06:09 UTC - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
Attempting to collect some information that could help diagnose the problem.
As this is a crash and something is definitely wrong, the information
collection process might fail.
..
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...

You can also use kill -9 (SIGKILL) to kill the process immediately. More details on Linux signals can be found here. Alternatively, you can use a meaner way on the hardware side, like pulling out the power cable, pressing the hard reset button or using a fencing device to STONITH.

Triggering OOM

Popular MySQL cloud offerings like Amazon RDS and Google Cloud SQL have no straightforward way to crash them: firstly, because you won't get any OS-level access to the database instance, and secondly, because the provider uses a proprietary, patched MySQL server. One way is to overflow some buffers and let the out-of-memory (OOM) killer kick out the MySQL process.

You can increase the sort buffer size to something bigger than what the RAM can handle, and shoot a number of sort queries against the MySQL server. Let's create a 10 million row table using sysbench on our Amazon RDS instance, so we can build a huge sort:

$ sysbench \
--db-driver=mysql \
--oltp-table-size=10000000 \
--oltp-tables-count=1 \
--threads=1 \
--mysql-host=dbtest.cdw9q2wnb00s.ap-tokyo-1.rds.amazonaws.com \
--mysql-port=3306 \
--mysql-user=rdsroot \
--mysql-password=password \
/usr/share/sysbench/tests/include/oltp_legacy/parallel_prepare.lua \
run

Change the sort_buffer_size to 5G (our test instance is db.t2.micro - 1GB, 1vCPU) by going to Amazon RDS Dashboard -> Parameter Groups -> Create Parameter Group -> specify the group name -> Edit Parameters -> choose "sort_buffer_size" and specify the value as 5368709120.

Apply the parameter group changes by going to Instances -> Instance Action -> Modify -> Database Options -> Database Parameter Group -> and choose our newly created parameter group. Then, reboot the RDS instance to apply the changes.

Once up, verify the new value of sort_buffer_size:

MySQL [(none)]> select @@sort_buffer_size;
+--------------------+
| @@sort_buffer_size |
+--------------------+
|         5368709120 |
+--------------------+

Then fire 48 simple queries that require sorting from a client:

$ for i in {1..48}; do (mysql -urdsroot -ppassword -hdbtest.cdw9q2wnb00s.ap-tokyo-1.rds.amazonaws.com -e 'SELECT * FROM sbtest.sbtest1 ORDER BY c DESC' >/dev/null &); done

If you run the above on a standard host, you will notice the MySQL server will be terminated and you can see the following lines appear in the OS's syslog or dmesg:

[164199.868060] Out of memory: Kill process 47060 (mysqld) score 847 or sacrifice child
[164199.868109] Killed process 47060 (mysqld) total-vm:265264964kB, anon-rss:3257400kB, file-rss:0kB

With systemd, MySQL or MariaDB will be restarted automatically, as will Amazon RDS. You can see that the uptime for our RDS instance is reset back to 0 (under mysqladmin status), and the 'Latest restore time' value (under the RDS Dashboard) is updated to the moment it went down.

Corrupting the Data

InnoDB has its own system tablespace, a file named ibdata1, to store the data dictionary, buffers and rollback segments. It also stores the table data of the shared tablespace if you do not configure innodb_file_per_table (which is enabled by default in MySQL 5.6.6+). We can just zero this file, send a write operation and flush tables to crash mysqld:

# empty ibdata1
$ cat /dev/null > /var/lib/mysql/ibdata1
# send a write
$ mysql -uroot -p -e 'CREATE TABLE sbtest.test (id INT)'
# flush tables
$ mysql -uroot -p -e 'FLUSH TABLES WITH READ LOCK; UNLOCK TABLES'

After you send a write, in the error log, you will notice:

2017-11-15T06:01:59.345316Z 0 [ERROR] InnoDB: Tried to read 16384 bytes at offset 98304, but was only able to read 0
2017-11-15T06:01:59.345332Z 0 [ERROR] InnoDB: File (unknown): 'read' returned OS error 0. Cannot continue operation
2017-11-15T06:01:59.345343Z 0 [ERROR] InnoDB: Cannot continue operation.

At this point, mysql will hang because it cannot perform any operation, and after the flushing you will get "mysqld got signal 11" lines and mysqld will shut down. To clean up, you have to remove the corrupted ibdata1, as well as the ib_logfile* files, because the redo log files cannot be used with the new system tablespace that will be generated by mysqld on the next restart. Data loss is expected.
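The cleanup could look like this (a sketch; adjust the paths to your datadir, and restore from a backup afterwards):

$ systemctl stop mysqld
$ rm -f /var/lib/mysql/ibdata1 /var/lib/mysql/ib_logfile*
$ systemctl start mysqld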

For MyISAM tables, we can mess around with the .MYD (MyISAM data) and .MYI (MyISAM index) files under the MySQL datadir. For instance, the following command replaces any occurrence of the string "F" with "9" inside a file:

$ replace F 9 -- /var/lib/mysql/sbtest/sbtest1.MYD

Then, send some writes (e.g., using sysbench) to the target table and perform the flushing:

mysql> FLUSH TABLE sbtest.sbtest1;

The following should appear in the MySQL error log:

2017-11-15T06:56:15.021564Z 448 [ERROR] /usr/sbin/mysqld: Incorrect key file for table './sbtest/sbtest1.MYI'; try to repair it
2017-11-15T06:56:15.021572Z 448 [ERROR] Got an error from thread_id=448, /export/home/pb2/build/sb_0-24964902-1505318733.42/rpm/BUILD/mysql-5.7.20/mysql-5.7.20/storage/myisam/mi_update.c:227

The MyISAM table will be marked as crashed and running a REPAIR TABLE statement is necessary to make it accessible again.
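For example, for the table corrupted above:

mysql> REPAIR TABLE sbtest.sbtest1;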

Limiting the Resources

We can also apply operating system resource limits to our mysqld process, for example the number of open file descriptors. The open_files_limit variable (default is 5000) makes mysqld reserve file descriptors using the setrlimit() call. You can set this variable relatively small (just enough for mysqld to start up) and then send multiple queries to the MySQL server until it hits the limit.
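To double-check which limit is actually in effect, you can query the server variable and look at the process limits (a quick sanity check; pidof is assumed to return a single mysqld PID):

mysql> SELECT @@open_files_limit;
$ grep 'open files' /proc/$(pidof mysqld)/limits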

If mysqld is running under systemd, we can set the limit in the systemd unit file located at /usr/lib/systemd/system/mysqld.service, and change the following value to something lower (the systemd default is 6000):

# Sets open_files_limit
LimitNOFILE = 30

Apply the changes to systemd and restart MySQL server:

$ systemctl daemon-reload
$ systemctl restart mysqld

Then, start sending new connections/queries that touch different databases and tables so mysqld has to open multiple files. You will notice the following error:

2017-11-16T04:43:26.179295Z 4 [ERROR] InnoDB: Operating system error number 24 in a file operation.
2017-11-16T04:43:26.179342Z 4 [ERROR] InnoDB: Error number 24 means 'Too many open files'
2017-11-16T04:43:26.179354Z 4 [Note] InnoDB: Some operating system error numbers are described at http://dev.mysql.com/doc/refman/5.7/en/operating-system-error-codes.html
2017-11-16T04:43:26.179363Z 4 [ERROR] InnoDB: File ./sbtest/sbtest9.ibd: 'open' returned OS error 124. Cannot continue operation
2017-11-16T04:43:26.179371Z 4 [ERROR] InnoDB: Cannot continue operation.
2017-11-16T04:43:26.372605Z 0 [Note] InnoDB: FTS optimize thread exiting.
2017-11-16T04:45:06.816056Z 4 [Warning] InnoDB: 3 threads created by InnoDB had not exited at shutdown!

At this point, when the limit is reached, MySQL will freeze and will not be able to perform any operation. When trying to connect, you will see the following after a while:

$ mysql -uroot -p
ERROR 2013 (HY000): Lost connection to MySQL server at 'reading initial communication packet', system error: 104

Messing up with Permissions

The mysqld process runs as the "mysql" user, which means all the files and directories that it needs to access are owned by the mysql user/group. By messing with the permissions and ownership, we can make the MySQL server useless:

$ chown root:root /var/lib/mysql
$ chmod 600 /var/lib/mysql

Generate some load on the server, then connect to the MySQL server and flush all tables to disk:

mysql> FLUSH TABLES WITH READ LOCK; UNLOCK TABLES;

At this moment, mysqld is still running but it's kind of useless. You can access it via a mysql client but you can't perform any operation:

mysql> SHOW DATABASES;
ERROR 1018 (HY000): Can't read dir of '.' (errno: 13 - Permission denied)

To clean up the mess, set the correct permissions:

$ chown mysql:mysql /var/lib/mysql
$ chmod 750 /var/lib/mysql
$ systemctl restart mysqld

Lock it Down

FLUSH TABLES WITH READ LOCK (FTWRL) can be destructive under a number of conditions. For example, in a Galera cluster, where all nodes are able to process writes, you can use this statement to lock down the cluster from within one of the nodes. The statement simply stops other queries from being processed by mysqld during the flushing until the lock is released, which is very handy for backup processes (MyISAM tables) and file system snapshots.

Although this action won't crash or bring down your database server during the locking, the consequences can be huge if the session that holds the lock does not release it. To try this, simply:

mysql> FLUSH TABLES WITH READ LOCK;
mysql> exit

Then send a bunch of new queries to mysqld until it reaches the max_connections value. Obviously, you cannot get back the same session as the previous one once you are out. So the lock will be held indefinitely, and the only way to release it is to kill the query from another session by a user with the SUPER privilege, kill the mysqld process itself, or perform a hard reboot.
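For example, from another session with a SUPER-privileged user (the thread id below is hypothetical - pick the one to kill from the processlist output):

mysql> SHOW FULL PROCESSLIST;
mysql> KILL 123;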

Disclaimer

This blog post was written to give sysadmins and DBAs alternative ways to simulate failure scenarios with MySQL. Do not try these on your production server :-)

by ashraf at November 17, 2017 03:12 PM

November 16, 2017

Shlomi Noach

orchestrator 3.0.3: auto provisioning raft nodes, native Consul support and more

orchestrator 3.0.3 is released! There's been a lot going on since 3.0.2:

orchestrator/raft: auto-provisioning nodes via lightweight snapshots

In an orchestrator/raft setup, we have n hosts forming a raft cluster. In a 3-node setup, for example, one node can go down, and still the remaining two will form a consensus, keeping the service operational. What happens when the failed node returns?

With 3.0.3 the failed node can go down for as long as it wants. Once it comes back, it attempts to join the raft cluster. A node keeps its own snapshots and its raft log outside the relational backend DB. If it has recent-enough data, it just needs to catch up with the raft replication log, which it acquires from one of the active nodes.

If its data is very stale, it will request a snapshot from an active node, which it will import, and will just resume from that point.

If its data is gone, that's not a problem. It gets a snapshot from an active node, imports it, and keeps running from that point.

If it's a newly provisioned box, that's not a problem. It gets a snapshot from an active node, ... etc.

  • SQLite backed setups can just bootstrap new nodes. No need to dump+load or import any data.
    • Side effect: you may actually use :memory:, where SQLite does not persist any data to disk. Remember that the raft snapshots and replication log will cover you. The cheat is that the raft replication log itself is managed and persisted by an independent SQLite database.
  • MySQL backed setups will still need to make sure orchestrator has the privileges to deploy itself.

More info in the docs.

This plays very nicely into the hands of kubernetes, which is on orchestrator's roadmap.

Key Value, native Consul support (Zk TODO)

orchestrator now supports Key-Value stores built-in, and Consul in particular.

At this time the purpose of orchestrator KV is to support master discovery. orchestrator will write the identity of the master of each cluster to KV store. The user will use that information to apply changes to their infrastructure.

For example, the user will rely on Consul KV entries, written by orchestrator, to generate proxy config files via consul-template, such that traffic is directed via the proxy onto the correct master.
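For illustration, a consul-template snippet along these lines could render an HAProxy backend from such a KV entry (a sketch; the key name follows the example further below, and any prefix depends on how your orchestrator KV integration is configured):

# haproxy.cfg.ctmpl (consul-template input)
listen mysql-writer
    bind *:3307
    server current-master {{ key "main_cluster" }} check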

orchestrator supports:

  • Manually writing identity of cluster's master to KV store
    • e.g. `orchestrator-client -c submit-masters-to-kv-stores -alias mycluster`
  • Automatically updating the master's identity upon failover

Key-value pairs are in the form of <cluster-alias> - <master>. For example:

  • Key is `main_cluster`
  • Value is my-db-0123.my.company.com:3306

Web UI improvements

Using the web UI, you can now:

  • Promote a new master

    graceful takeover via ui

    Dragging onto the left part of the master's box implies promoting a new server. Dragging onto the right side of a master's box implies relocating a server below the master.

  • "reverse" replication (take local master)

    take master via UI

    Dragging onto the left part of a server's local master implies taking over the master. Dragging onto the right part of a server's local master implies relocating a server below that local master.

  • Work in quiet mode: click `mute` icon on the left sidebar to avoid being prompted when relocating replicas. You'll still be prompted for risky operations such as master promotion.

Other noteworthy changes

  • Raft advertise addresses: a contribution by Sami Ahlroos allows orchestrator/raft to work over NAT, and `kubernetes` in particular.
  • Sparser histories: especially for the `orchestrator/raft` setup, but true in general, we wish to keep the `orchestrator` backend database lightweight. orchestrator will now keep less history than it used to.
    • Detection/recovery history is kept for 7 days
    • Encouraging general audit to go to log file instead of `audit` table.
  • Building via go1.9 which will soon become a requirement for developers wishing to build `orchestrator` on their own.

Roadmap

We're looking to provision orchestrator on kubernetes, and will publish as much of that work as possible.

There are many incoming feature requests from the community and we'll try to address them where it makes sense and time allows. We greatly appreciate all input from the community!

Download

orchestrator is free and open source, released under the Apache 2 license.

Source & binary releases are available from the GitHub repository:

Packages are also available in package cloud.

by shlomi at November 16, 2017 09:38 AM

November 15, 2017

MariaDB Foundation

Microsoft joins the MariaDB Foundation as a Platinum level sponsor

MariaDB Foundation today announced that Microsoft has become a platinum sponsor. The sponsorship will help the Foundation in its goals to support continuity and open collaboration in the MariaDB ecosystem, and to drive adoption, serving an ever growing community of users and developers. “Joining the MariaDB Foundation as a Platinum member is a natural next […]

The post Microsoft joins the MariaDB Foundation as a Platinum level sponsor appeared first on MariaDB.org.

by Otto Kekäläinen at November 15, 2017 03:10 PM

Jean-Jerome Schmidt

The Galera Cluster & Severalnines Teams Present: How to Manage Galera Cluster with ClusterControl - The Replay

Watch the replay of this joint webinar in which we combine forces with the Codership Galera Cluster Team to talk about how to manage Galera Cluster using ClusterControl!

Galera Cluster has become one of the most popular high availability solutions for MySQL and MariaDB; and ClusterControl is the de facto automation and management system for Galera Cluster.

In this webinar we’re joined by Seppo Jaakola, CEO of Codership - Galera Cluster, and together, we demonstrate what it is that makes Galera Cluster such a popular high availability solution for MySQL and MariaDB and how to best manage it with ClusterControl.

We discuss the latest features of Galera Cluster with Seppo, one of its creators. And we also demo how to automate it all from deployment, monitoring, backups, failover, recovery, rolling upgrades and scaling using the new ClusterControl CLI and of course the ClusterControl GUI.

Watch the replay

Agenda

  • Introduction
    • About Codership, the makers of Galera Cluster
    • About Severalnines, the makers of ClusterControl
  • What’s new with Galera Cluster
    • Core feature set overview
    • The latest features
    • What’s coming up
  • ClusterControl for Galera Cluster
    • Deployment
    • Monitoring
    • Management
    • Scaling
  • Live Demo
  • Q&A

Speakers

Seppo Jaakola, Founder of Codership, has over 20 years of experience in software engineering. He started his professional career at Digisoft and Novo Group Oy, working as a software engineer on various technical projects. He then worked for 10 years at Stonesoft Oy as a Project Manager on projects dealing with DBMS development, data security and firewall clustering. In 2003, Seppo Jaakola joined Continuent Oy, where he worked as team leader for the MySQL clustering product. This position linked together his earlier experience in DBMS research and distributed computing. Now he's applying his years of experience and administrative skills to steer Codership on the right course. Seppo Jaakola has an MSc degree in Software Engineering from Helsinki University of Technology.

Krzysztof Książek, Senior Support Engineer at Severalnines, is a MySQL DBA with experience managing complex database environments for companies like Zendesk, Chegg, Pinterest and Flipboard.

by jj at November 15, 2017 08:33 AM

MariaDB Foundation

Presentations from the 2017 MariaDB Developers Unconference in Shenzhen

The following sessions were held on the two presentation days of the MariaDB Developers Unconference in Shenzhen. (Slides will be added as they become available, so check back later if you don’t yet find the presentations you are looking for). Day 1 MariaDB in 2017 (Otto Kekäläinen) What’s in the pipeline for 10.3 and beyond […]

The post Presentations from the 2017 MariaDB Developers Unconference in Shenzhen appeared first on MariaDB.org.

by Ian Gilfillan at November 15, 2017 08:17 AM

MariaDB AB

MariaDB Server 10.1.29 & MariaDB Galera Cluster 10.0.33 now available


The MariaDB project is pleased to announce the immediate availability of MariaDB Server 10.1.29 and MariaDB Galera Cluster 10.0.33. See the release notes and changelogs for details and visit mariadb.com/downloads to download.

Download MariaDB Server 10.1.29

Release Notes Changelog About MariaDB Server 10.1


Download MariaDB Galera Cluster 10.0.33

Release Notes Changelog About MariaDB Galera Cluster



by dbart at November 15, 2017 03:28 AM

MariaDB Foundation

MariaDB 10.1.29, MariaDB Galera Cluster 10.0.33 and MariaDB Connector/J Releases now available

The MariaDB project is pleased to announce the availability of MariaDB 10.1.29, MariaDB Galera Cluster 10.0.33 and MariaDB Connector/J 2.2.0. See the release notes and changelogs for details. Download MariaDB 10.1.29 Release Notes Changelog What is MariaDB 10.1? MariaDB APT and YUM Repository Configuration Generator Download MariaDB Galera Cluster 10.0.33 Release Notes Changelog What is […]

The post MariaDB 10.1.29, MariaDB Galera Cluster 10.0.33 and MariaDB Connector/J Releases now available appeared first on MariaDB.org.

by Ian Gilfillan at November 15, 2017 01:29 AM

November 14, 2017

MariaDB AB

What’s New in MariaDB Connector/J 2.2 and 1.7


We are pleased to announce the general availability (GA) of MariaDB Connector/J 2.2 and 1.7, the newest versions of MariaDB Connector/J. 

As both new versions are fully compatible with their corresponding predecessors (1.7 supports Java 6/7 and 2.2 supports Java 8+), versions 2.1.2 and 1.6.5 are the last maintenance releases for 2.1 and 1.6.

New enhancements include:

Pool Datasource

There are now two different DataSource implementations:

  • MariaDbDataSource: The existing basic implementation. A new connection is created each time the getConnection() method is called.
  • MariaDbPoolDataSource: The connection pooling implementation. The MariaDB driver keeps a pool of connections and lends one out each time a connection is requested.

Good frameworks that can accomplish this job already exist, such as Apache DBCP2, HikariCP and C3P0, so why have another implementation? Here are some of the reasons:

  • Reliability: When reusing a connection from the pool, the connection must behave like a freshly created connection. Depending on the connection state, frameworks may end up executing multiple commands to reset that state (some frameworks even choose to skip some of those resets to avoid the performance impact). MariaDB has a dedicated command to refresh the connection state, permitting a real reset (rollback of any remaining transaction, reset of the transaction isolation level, reset of session variables, deletion of user variables, removal of all PREPARE statements, ...) in one command.
  • Performance: The pool can save some information from the first connection, allowing faster creation of subsequent connections.
  • Easy configuration: It solves some frequent issues, such as the server closing sockets that have not been used for some time (wait_timeout defaults to 8h). The pool implementation avoids keeping connections in such a bad state.

The pool is implemented at the connection level, which allows using a pool for a particular use case simply by enabling it in the connection string: "jdbc:mariadb://host/db?pool=true".
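For example, with plain JDBC and no framework at all, the following minimal sketch is enough to exercise the pool (maxPoolSize is one of the optional pool parameters):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class PoolExample {
    public static void main(String[] args) throws Exception {
        // pooling is enabled purely via the connection string
        String url = "jdbc:mariadb://host/db?pool=true&maxPoolSize=10";
        try (Connection conn = DriverManager.getConnection(url, "user", "password");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT 1")) {
            rs.next();
        }
    }
}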

Configuration example using Spring:


import java.sql.SQLException;

import javax.sql.DataSource;

import org.mariadb.jdbc.MariaDbPoolDataSource;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.transaction.annotation.EnableTransactionManagement;

@Configuration
@EnableTransactionManagement
public class DatabaseConfig {

    @Value("${db.password}")
    private String DB_PASSWORD;

    @Value("${db.url}")
    private String DB_URL;

    @Value("${db.username}")
    private String DB_USERNAME;

    @Bean
    public DataSource dataSource() throws SQLException {
        MariaDbPoolDataSource dataSource = new MariaDbPoolDataSource();
        dataSource.setUrl(DB_URL);
        dataSource.setUser(DB_USERNAME);
        dataSource.setPassword(DB_PASSWORD);
        return dataSource;
    }
    
}

 

Download the MariaDB Connector now and learn about the newest evolution of MariaDB Connector/J.

 

Download Release Notes Knowledge Base

 



by diego Dupin at November 14, 2017 10:56 AM

November 11, 2017

Valeriy Kravchuk

Fun with Bugs #57 - On MySQL Bug Reports I am Subscribed to, Part I

I've decided to stop reviewing MySQL Release Notes in this series, but that does not mean that I am not interested in MySQL bugs any more. At the moment I am subscribed to 91 active MySQL bugs reported by other MySQL users, and in this blog post I am going to present 15 of them, the most recently reported ones. I'd really like to see them fixed, or at least properly processed, as soon as possible.

In some cases I am going to add my speculations on how the bug might be better handled, or maybe highlight some important details about it. It is not my job any more to process/"verify" any community bug reports for any kind of MySQL, but I did that for many years and I've spent more than 5 years "on the other side" as a member of the Community, so in some cases I allow myself to share a strong opinion on what could be done differently on the Oracle side.

As a side note, I started to subscribe to MySQL bugs mostly after I left Oracle, as before that I got an email notification about each and every change in every MySQL bug report ever created...

Here is the list, starting from the most recent ones:
  • Bug #88422 - "MySQL 5.7 innodb purge thread get oldest readview could block other transaction". It is one of those bug reports without a test case from the reporter. It is tempting to set it to "Verified" just "based on code review", as the code in 5.7 quite obviously shows both the holding of the trx_sys->mutex and the linear complexity of the search depending on the number of read views in the worst case (when most of them are closed):
    /**
    Get the oldest (active) view in the system.
    @return oldest view if found or NULL */

    ReadView*
    MVCC::get_oldest_view() const
    {
            ReadView*       view;

            ut_ad(mutex_own(&trx_sys->mutex));

            for (view = UT_LIST_GET_LAST(m_views);
                 view != NULL;
                 view = UT_LIST_GET_PREV(m_view_list, view)) {

                    if (!view->is_closed()) {
                            break;
                    }
            }

            return(view);
    }
    But probably the current Oracle bug verification rules do not allow just marking it as verified. After all, somebody will have to create a test case... So, my dear old friend Sinisa Milivojevic decided to try to get the bug reporter to provide a test case instead of spending some time trying to create one himself. I am not going to blame him for that - why not try the easy way :) But I consider this statement of his, in the comment dated [10 Nov 16:21]:
    "... 5.7 methods holds no mutex what so ever..."
    a bit wrong, as we can see that the mutex is acquired when the get_oldest_view() method is called:
    void
    MVCC::clone_oldest_view(ReadView* view)
    {
            mutex_enter(&trx_sys->mutex);

            ReadView*       oldest_view = get_oldest_view();

            if (oldest_view == NULL) {
    ...
  • Bug #88381 - "Predicate cannot be pushed down "past" window function". Here the bug reporter provided enough hints for a test case. One can probably just check the 'Handler%' status variables before and after query execution to come to a conclusion. Moreover, it seems an Oracle developer, Dag Wanvik, accepted this as a known limitation, but the bug still remains "Open" and nobody knows if it was copied to the internal bugs database, got prioritized, or if any work on it is planned any time soon. We shall see. You may also want to monitor MDEV-10855.
  • Bug #88373 - "Renaming a column breaks replication from 5.7 to 8.0 because of impl. collation". This bug was quickly verified by Umesh Shastry. I expect a lot of "fun" for users upgrading to MySQL 8.0 when it becomes GA, especially in replication setups.
  • Bug #88328 - "Performance degradation with the slave_parallel_workers increase". There is no test case, just some general description and ideas about the root cause when semi-sync replication is used. I expect this bug to stay "Open" for a long time, as it is a topic for good research and blog posts like this one, that is, work for a real expert!
  • Bug #88223 - "Replication with no tmpdir space and InnoDB as tmp_storage_engine can break". Here we have a clear and simple test case from Sveta Smirnova (no wonder, she also worked in the bugs verification team at MySQL, Sun and Oracle). I hope Umesh will verify it soon. As a side note, it is explained (in the comments) elsewhere that InnoDB as internal_tmp_disk_storage_engine may not be the best possible option. We do not have this variable and do not plan to support InnoDB for internal temporary tables in MariaDB 10.2+.
  • Bug #88220 - "compressing and uncompressing InnoDB tables seems to be inconsistent". See also other, older bug reports mentioned there that are duplicates/closely related, but were not getting proper attention.
  • Bug #88150 - "'Undo log record is too big.' error occurring in very narrow range of str length". It was reported by my colleague Geoff Montee and is already fixed in recent versions of MariaDB (see MDEV-14051 for the details and some nice examples of gdb usage by a developer)!
  • Bug #88127 - "Index not used for 'order by' query with utf8mb4 character set". Here I am just curious when bugs like this will start to be caught by Oracle QA before any public releases.
  • Bug #88071 - "An arresting Performance degradation when set sort_buffer_size=32M". Here the test case is clear - just run the sysbench oltp test at high concurrency with different values of sort_buffer_size. Still, Sinisa Milivojevic decided to explain when the RAM limit may play a role instead of just showing how it works great (if it does) on any server with enough RAM... Let's see how this attempt to get the bug reporter to work/explain more may end up...
  • Bug #87947 - "Optimizer chooses ref over range when access when range access is faster". A nice example of a case where the optimizer trace may be really useful. Øystein Grøvlen kindly explained that "range access and ref access are not comparable costs". I wish we get a better cost model for such cases in MySQL one day.
  • Bug #87837 - "MySQL 8 does not start after upgrade to 8.03". This is actually expected, and even somewhat documented in the release notes, as MySQL 8.0.3 is not compatible with any older version. So, it is more of a MySQL Installer bug (a tool I do not care much about), but I still subscribed to it as yet another source of potential fun during further upgrade attempts.
  • Bug #87716 - "SELECT FOR UPDATE with BETWEEN AND gets row lock excessively". I think I already studied once why rows are locked differently by InnoDB with IN() compared to BETWEEN selecting the same rows. But I'd like to know what Oracle's take on this is, and I'd like to study this specific test case in detail one day as well.
  • Bug #87670 - "Force index for group by is not always honored". Clear and simple test case, so no wonder it was immediately verified.
  • Bug #87621 - "Huge InnoDB slowdown when selecting strings without indexes". I'd like to check with perf one day where the time is mostly spent during this test. For now I think this is a result of the way "long" data is stored on separate pages in InnoDB. What's also interesting here is the test case, where R is used to generate the data set.
  • Bug #87589 - "Documentation incorrectly states that LOAD DATA LOCAL INFILE does not use tmpdir". This was yet another report from my colleague Geoff Montee. lsof is your friend; maybe I have to talk about it one day at FOSDEM (the call for papers is still open :) I like to find and follow bugs and missing details in the MySQL manual, maybe because I would never be able to contribute to it as a writer directly...

So, this list shows my typical recent interests related to MySQL bugs - mostly InnoDB, optimizer and replication problems, the fine manual, and just some fun details like the way some Oracle engineers try to avoid working extra hard while processing bugs... I am also happy to know that in some cases MariaDB is able to deliver fixes faster.

by Valeriy Kravchuk (noreply@blogger.com) at November 11, 2017 03:54 PM

Oli Sennhauser

MariaDB master/master GTID based replication with keepalived VIP

Some of our customers still want to have old-style MariaDB master/master replication clusters. Time goes by, new technologies appear but some old stuff still remains.

The main problem in a master/master replication set-up is to make the service highly available for the application (applications typically cannot deal with more than one point of contact). This can be achieved with a load balancer (HAProxy, Galera Load Balancer (GLB), ProxySQL or MaxScale) in front of the MariaDB master/master replication cluster. But the load balancer itself should also be highly available, and this is typically achieved with a virtual IP (VIP) in front of one of the load balancers. To make operating the VIP easier, the VIP is controlled by a service like keepalived or corosync.

M/M with LB and keepalived

Because I like simple solutions (I am a strong believer in the KISS principle), I thought about avoiding the load balancer in the middle, attaching the VIP directly to the master/master replication servers and letting it be controlled by keepalived as well.

M/M with keepalived

Important: A master/master replication set-up is vulnerable to split-brain situations. Neither keepalived nor the master/master replication helps you to avoid conflicts or to prevent this situation in any way. If you are sensitive to split-brain situations you should look at Galera Cluster. Keepalived is made for stateless services like load balancers, etc., but not for databases.

Set-up a MariaDB master/master replication cluster

Because most Linux distributions ship rather old versions of this software, we use the MariaDB 10.2 repository from the MariaDB website:

#
# /etc/yum.repos.d/MariaDB-10.2.repo 
#
# MariaDB 10.2 CentOS repository list - created 2017-11-08 20:32 UTC
# http://downloads.mariadb.org/mariadb/repositories/
#
[mariadb]
name = MariaDB
baseurl = http://yum.mariadb.org/10.2/centos7-amd64
gpgkey=https://yum.mariadb.org/RPM-GPG-KEY-MariaDB
gpgcheck=1

Then we install the MariaDB server and start it:

shell> yum makecache
shell> yum install MariaDB-server MariaDB-client
shell> systemctl start mariadb
shell> systemctl enable mariadb

For the MariaDB master/master replication set-up configuration we use the following parameters:

#
# /etc/my.cnf
#

[mysqld]

server_id                = 1           # 2 on the other node
log_bin                  = binlog-m1   # binlog-m2 on the other node
log_slave_updates        = 1

gtid_domain_id           = 1           # 2 on the other node
gtid_strict_mode         = On

auto_increment_increment = 2
auto_increment_offset    = 1           # 2 on the other node

read_only                = On          # super_read_only for MySQL 5.7 and newer

Then we close the master/master replication ring according to "Starting with empty server":

mariadb> SET GLOBAL gtid_slave_pos = "";
mariadb> CHANGE MASTER TO master_host="192.168.56.101", master_user="replication"
                        , master_use_gtid=current_pos;
mariadb> START SLAVE;
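Note that the CHANGE MASTER statement above assumes a replication user already exists on both nodes; if it does not, create it first along these lines (the host pattern and password are placeholders - add the matching master_password to CHANGE MASTER as needed):

mariadb> CREATE USER 'replication'@'192.168.56.%' IDENTIFIED BY 'Secr3t!';
mariadb> GRANT REPLICATION SLAVE ON *.* TO 'replication'@'192.168.56.%';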

Installing keepalived


The next step is to install and configure keepalived. This can be done as follows:

shell> yum install keepalived
shell> systemctl enable keepalived

Important: In my tests I got crashes and core dumps with keepalived, which disappeared after a full upgrade of CentOS 7.

Configuring keepalived

The most important part is the keepalived configuration file:

#
# /etc/keepalived/keepalived.conf
#

global_defs {

  notification_email {

    root@localhost
    dba@example.com
  }

  notification_email_from root@master1   # master2 on the other node
  smtp_server localhost 25

  router_id MARIADB_MM
  enable_script_security
}


# Health checks

vrrp_script chk_mysql {

  script "/usr/sbin/pidof mysqld"
  weight 2     # Is relevant for the diff in priority
  interval 1   # every ... seconds
  timeout 3    # script considered failed after ... seconds
  fall 3       # number of failures for K.O.
  rise 1       # number of success for OK
}

vrrp_script chk_failover {

  script "/etc/keepalived/chk_failover.sh"
  weight -4    # Is relevant for the diff in priority
  interval 1   # every ... seconds
  timeout 1    # script considered failed after ... seconds
  fall 1       # number of failures for K.O.
  rise 1       # number of success for OK
}


# Main configuration

vrrp_instance VI_MM_VIP {

  state MASTER           # BACKUP on the other side

  interface enp0s9       # private heartbeat interface

  priority 100           # Higher means: elected first (BACKUP: 99)
  virtual_router_id 42   # ID for all nodes of Cluster group

  debug 0                # 0 .. 4, seems not to work?

  unicast_src_ip 192.168.56.101   # Our private IP address
  unicast_peer {
    192.168.56.102       # Peers private IP address
  }


  # For keepalived communication

  authentication {
    auth_type PASS
    auth_pass Secr3t!
  }


  # VIP to move around

  virtual_ipaddress {

    192.168.1.99/24 dev enp0s8   # public interface for VIP
  }


  # Check health of local system. See vrrp_script above.

  track_script {
    chk_mysql
    # If File /etc/keepalived/failover is touched failover is triggered
    # Similar can be reached when priority is lowered followed by a reload
    chk_failover
  }

  # When node becomes MASTER this script is triggered
  notify_master "/etc/keepalived/keepalived_master.sh --user=root --password= --wait=yes --variable=read_only"
  # When node becomes SLAVE this script is triggered
  notify_backup "/etc/keepalived/keepalived_backup.sh --user=root --password= --kill=yes --variable=read_only"
  # Possibly fault and stop should also call keepalived_backup.sh to be on the safe side...
  notify_fault "/etc/keepalived/keepalived_fault.sh arg1 arg2"
  notify_stop "/etc/keepalived/keepalived_stop.sh arg1 arg2"
  # ANY state transit is triggered
  notify /etc/keepalived/keepalived_notify.sh

  smtp_alert   # send notification during state transit
}

With the command:

shell> systemctl restart keepalived

the service is started and/or the configuration is reloaded.
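
To quickly see which node currently holds the VIP, the public interface can be checked on both nodes (interface name and address taken from the configuration above):

shell> ip addr show dev enp0s8 | grep 192.168.1.99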

The scripts we used in the configuration file are the following:

chk_failover.sh
keepalived_backup.sh
keepalived_fault.sh
keepalived_master.sh
keepalived_notify.sh
keepalived_stop.sh

#!/bin/bash
#
# /etc/keepalived/keepalived_notify.sh
#

TYPE=${1}
NAME=${2}
STATE=${3}
PRIORITY=${4}

TS=$(date '+%Y-%m-%d_%H:%M:%S')
LOG=/etc/keepalived/keepalived_notify.log

echo $TS $0 $@ >>${LOG}

#!/bin/bash
#
# /etc/keepalived/chk_failover.sh
#

/usr/bin/stat /etc/keepalived/failover 2>/dev/null 1>&2
if [ ${?} -eq 0 ] ; then
  exit 1
else
  exit 0
fi

To make the MariaDB master/master replication more robust against replication problems, we took the following (configurable) actions on the database side (a minimal sketch of such scripts follows the list below):

Getting the MASTER role:

  • Waiting for catch-up replication
  • Make the MariaDB instance read/write

Getting the BACKUP role:

  • Make the MariaDB instance read-only
  • Kill all open connections
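
The real scripts are the ones listed above and are not reproduced in full here. The following is only a minimal, hypothetical sketch of what keepalived_master.sh and keepalived_backup.sh might do (user handling, passwords, logging and error checking omitted; it also assumes the node is already running as a slave):

#!/bin/bash
#
# Hypothetical sketch of /etc/keepalived/keepalived_master.sh
#

# 1. Wait for catch-up replication before accepting writes
while true ; do
  LAG=$(mysql --user=root -e "SHOW SLAVE STATUS\G" | awk '/Seconds_Behind_Master/ { print $2 }')
  [ "${LAG}" = "0" ] && break
  sleep 1
done

# 2. Make the MariaDB instance read/write
mysql --user=root -e "SET GLOBAL read_only = 0;"

#!/bin/bash
#
# Hypothetical sketch of /etc/keepalived/keepalived_backup.sh
#

# 1. Make the MariaDB instance read-only
mysql --user=root -e "SET GLOBAL read_only = 1;"

# 2. Kill all open client connections (system and replication threads excluded)
mysql --user=root -NB -e "SELECT id FROM information_schema.processlist
                          WHERE user NOT IN ('system user', 'root')
                            AND id <> CONNECTION_ID();" \
  | while read ID ; do
      mysql --user=root -e "KILL CONNECTION ${ID};"
    done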

Testing scenarios

The following scenarios were tested under load (insert_test.sh; a possible sketch of this script follows the list):

  • Intentional fail-over for maintenance:
    shell> touch /etc/keepalived/failover
    shell> rm -f /etc/keepalived/failover
    
  • Stopping keepalived:
    shell> systemctl stop keepalived
    shell> systemctl start keepalived
    
  • Stopping MariaDB node:
    shell> systemctl stop mariadb
    shell> systemctl start mariadb
    
  • Reboot server:
    shell> reboot
    
  • Simulation of split-brain:
    shell> ip link set enp0s9 down
    shell> ip link set enp0s9 up
    
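The insert_test.sh load generator itself is not shown in this article; a simple, hypothetical equivalent that writes continuously through the VIP could look like this (database, table, user and password are made up for the example):

#!/bin/bash
#
# Hypothetical sketch of insert_test.sh: continuous writes through the VIP
#

VIP=192.168.1.99

mysql --host=${VIP} --user=app --password=secret \
      --execute="CREATE TABLE IF NOT EXISTS test.insert_test (
                   id   INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
                   ts   TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
                   node VARCHAR(64));"

while true ; do
  mysql --host=${VIP} --user=app --password=secret \
        --execute="INSERT INTO test.insert_test (node) VALUES (@@hostname);" \
    || echo "$(date '+%Y-%m-%d_%H:%M:%S') INSERT failed"
  sleep 0.1
done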

Problems

Problems we faced during set-up and testing were:

  • SELinux/AppArmor
  • Firewall

Keepalived controlling 2 virtual IPs

A second scenario we wanted to build is a MariaDB master/master GTID-based replication cluster with 2 VIP addresses. The goal is to have either a read-only VIP and a read/write VIP, or to put half of the load on one master and half on the other:

M/M with keepalived and 2 VIPs

For this scenario we used the same scripts but a slightly different keepalived configuration:

#
# /etc/keepalived/keepalived.conf
#

global_defs {

  notification_email {

    root@localhost
    dba@example.com
  }

  notification_email_from root@master1   # master2 on the other node
  smtp_server localhost 25

  router_id MARIADB_MM
  enable_script_security
}


# Health checks

vrrp_script chk_mysql {

  script "/usr/sbin/pidof mysqld"
  weight 2     # Is relevant for the diff in priority
  interval 1   # every ... seconds
  timeout 3    # script considered failed after ... seconds
  fall 3       # number of failures for K.O.
  rise 1       # number of success for OK
}

vrrp_script chk_failover {

  script "/etc/keepalived/chk_failover.sh"
  weight -4    # Is relevant for the diff in priority
  interval 1   # every ... seconds
  timeout 1    # script considered failed after ... seconds
  fall 1       # number of failures for K.O.
  rise 1       # number of success for OK
}


# Main configuration

vrrp_instance VI_MM_VIP1 {

  state MASTER           # BACKUP on the other side

  interface enp0s9       # private heartbeat interface

  priority 100           # Higher means: elected first (BACKUP: 99)
  virtual_router_id 42   # ID for all nodes of Cluster group

  unicast_src_ip 192.168.56.101   # Our private IP address
  unicast_peer {
    192.168.56.102       # Peers private IP address
  }


  # For keepalived communication

  authentication {
    auth_type PASS
    auth_pass Secr3t!
  }


  # VIP to move around

  virtual_ipaddress {

    192.168.1.99/24 dev enp0s8   # public interface for VIP
  }


  # Check health of local system. See vrrp_script above.

  track_script {
    chk_mysql
    chk_failover
  }

  # ANY state transit is triggered
  notify /etc/keepalived/keepalived_notify.sh

  smtp_alert   # send notification during state transit
}

vrrp_instance VI_MM_VIP2 {

  state BACKUP           # MASTER on the other side

  interface enp0s9       # private heartbeat interface

  priority 99            # Higher means: elected first (MASTER: 100)
  virtual_router_id 43   # ID for all nodes of Cluster group

  unicast_src_ip 192.168.56.101   # Our private IP address
  unicast_peer {
    192.168.56.102       # Peers private IP address
  }


  # For keepalived communication

  authentication {
    auth_type PASS
    auth_pass Secr3t!
  }


  # VIP to move around

  virtual_ipaddress {

    192.168.1.98/24 dev enp0s8   # public interface for VIP
  }


  # Check health of local system. See vrrp_script above.

  track_script {
    chk_mysql
    chk_failover
  }

  # ANY state transit is triggered
  notify /etc/keepalived/keepalived_notify.sh

  smtp_alert   # send notification during state transit
}

by Shinguz at November 11, 2017 10:29 AM

November 10, 2017

Jean-Jerome Schmidt

MySQL & MariaDB Database Backup Resources

Most organizations do not realize they have a problem with database backups until they need to restore the data and find it’s not there or not in the form that they were expecting.

The designated administrator managing the database environments must be prepared for situations where any failure may cause an impact to the availability, integrity, or usability of a database or application. Reacting to these failures is a key component of the administrator’s responsibilities and their ability to react correctly depends on whether they have a well-planned strategy for database backups and recovery.

Pixar’s “Toy Story 2” famously almost never happened: a mistyped command deleted the movie, and the backup strategy in place turned out to be ineffective. The film went on to take in nearly $500 million worldwide at the box office… money that, had one team member not kept their own personal backup, might never have been made.

ClusterControl provides you with sophisticated backup and failover features using a point-and-click interface to easily restore your data if something goes wrong and can be your DBA-sidekick when it comes to building an effective backup strategy. There are many aspects to consider though when building such a strategy.

Here at Severalnines we have database experts who have written much about the topic and in this blog we will collect the top resources to help you build your own database backup strategy for MySQL and MariaDB databases more specifically.

If you are running a MySQL or MariaDB environment our best resource for you is the free whitepaper “The DevOps Guide to Database Backups for MySQL and MariaDB.” The guide covers the two most popular backup utilities available for MySQL and MariaDB, namely mysqldump and Percona XtraBackup. It further covers topics such as how database features like binary logging and replication can be leveraged in backup strategies and provides best practices that can be applied to high availability topologies in order to make database backups reliable, secure and consistent.

In addition to the whitepaper there are two webinars focused on backups that you can watch on-demand. “MySQL Tutorial - Backup Tips for MySQL, MariaDB & Galera Cluster” and “Become a MySQL DBA - Deciding on a Relevant Backup Solution.” Each of these webinars offer tips and best practices on building a backup plan and summarize much of the content that is available throughout our website.

Here are our most popular and relevant blogs on the topic...

Overview of Backup and Restores

In the blog “Become a MySQL DBA - Backup and Restore” we provide a high-level overview of backups and restores when managing a MySQL environment. Included in the blog is an overview of different backup methodologies, overview of logical and physical backups, and some best practices and guidelines you can follow.

The Impact of MySQL Storage Engines on Backups

In the blog “The Choice of MySQL Storage Engine and its Impact on Backup Procedures” we discuss how the selection of different types of storage engines (like MyISAM, InnoDB, etc) can have an impact on your backup strategy.

Building a Backup Strategy and Plan

In our blog “mysqldump or Percona XtraBackup? Backup Strategies for MySQL Galera Cluster” we discuss the different options available to you when making your backup and restore plan with special focus on doing it in a way that does not affect performance.

Making Sure You Perform a Good Backup

In our blog “How to Perform Efficient Backups for MySQL and MariaDB” we discuss a number of ways to backup MySQL and MariaDB, each of which comes with pros and cons.

Using ClusterControl for Backups

In the blog “ClusterControl Tips & Tricks - Best Practices for Database Backups” we show how to effectively manage your backup plan using ClusterControl. With ClusterControl you can schedule logical or physical backups with failover handling and easily restore backups to bootstrap nodes or systems.


Additional Blogs

There are several more blogs that have been written over the years that can also aid you in ensuring your backups are performed successfully and efficiently. Here’s a list of them...

Full Restore of a MySQL or MariaDB Galera Cluster from Backup

Performing regular backups of your database cluster is imperative for high availability and disaster recovery. This blog post provides a series of best practices on how to fully restore a MySQL or MariaDB Galera Cluster from backup.

Read the Blog

What’s New in ClusterControl 1.4 - Backup Management

This blog post covers the new backup features available in ClusterControl version 1.4.

Read the Blog

ClusterControl Tips & Tricks: Customizing your Database Backups

ClusterControl follows some best practices to perform backups using mysqldump or Percona xtrabackup. Although these work for the majority of database workloads, you might still want to customize your backups. This blog shows you how.

Read the Blog

Architecting for Failure - Disaster Recovery of MySQL/MariaDB Galera Cluster

Whether you use unbreakable private data centers or public cloud platforms, Disaster Recovery (DR) is indeed a key issue. This is not about copying your data to a backup site and being able to restore it, this is about business continuity and how fast you can recover services when disaster strikes.

Read the Blog

Using BitTorrent Sync to Transfer Database Backups Offsite

BitTorrent Sync is a simple replication application providing encrypted bidirectional file transfers that can run behind NAT and is specifically designed to handle large files. By leveraging the simplicity of BitTorrent Sync, we can transfer backup files away from our cluster, enhancing the backups' availability and reducing the cost of a broken backup, since you can regularly verify your backups off-site.

Read the Blog

How to Clone Your Database

If you are managing a production database, chances are high that you’ve had to clone your database to a different server than the production server. The basic method of creating a clone is to restore a database from a recent backup onto a different database server. Other methods include replicating from a source database while it is up, in which case it is important the original database be unaffected by any cloning procedure.

Read the Blog

Not Using MySQL? Here are some resources we have to help with other database technologies…

Become a MongoDB DBA: MongoDB Backups

This is our fifth post in the “Become a MongoDB DBA” blog series - how do you make a good backup strategy for MongoDB, what tools are available and what you should watch out for.

Read the Blog

Become a MongoDB DBA: Recovering your Data

This is our sixth post in the “Become a MongoDB DBA” blog series - how do you recover MongoDB using a backup.

Read the Blog

Become a PostgreSQL DBA - Logical & Physical PostgreSQL Backups

Taking backups is one of the most important tasks of a DBA - it is crucial to the availability and integrity of the data. Part of our Become a PostgreSQL DBA series, this blog post covers some of the backup methods you can use with PostgreSQL.

Read the Blog

by Severalnines at November 10, 2017 10:18 AM

November 09, 2017

Jean-Jerome Schmidt

HAProxy: All the Severalnines Resources

Load balancers are an essential component in MySQL and MariaDB database high availability; especially when making topology changes transparent to applications and implementing read-write split functionality.

HAProxy is free, open source software that provides a high availability load balancer and proxy server for TCP and HTTP-based applications that spreads requests across multiple servers.

ClusterControl provides support for deployment, configuration and optimization of HAProxy as well as for other popular load balancing and caching technologies for MySQL and MariaDB databases.

Here are our top resources for HAProxy to get you started with this widely used technology.

Tutorials

MySQL Load Balancing with HAProxy - Tutorial

We have recently updated our tutorial on MySQL Load Balancing with HAProxy. Read about deployment and configuration, monitoring, ongoing maintenance, health check methods, read-write splitting, redundancy with VIP and Keepalived and more.

Read More

On-Demand Webinars

How to deploy and manage HAProxy, MaxScale or ProxySQL with ClusterControl

In this webinar we talk about support for proxies for MySQL HA setups in ClusterControl: how they differ and what their pros and cons are. And we show you how you can easily deploy and manage HAProxy, MaxScale and ProxySQL from ClusterControl during a live demo.

Watch the replay

How To Set Up SQL Load Balancing with HAProxy

In this webinar, we cover the concepts around the popular open-source HAProxy load balancer and show you how to use it with your SQL-based database clusters.

Watch the replay

Performance Tuning of HAProxy for Database Load Balancing

This webinar discusses the performance tuning basics for HAProxy and explains how to take advantage of some of the new features in 1.5, which was released in June 2014 after 4 years of development work.

Watch the replay

Introducing the Severalnines MySQL© Replication Blueprint

The Severalnines Blueprint for MySQL Replication includes all aspects of a MySQL Replication topology with the ins and outs of deployment, setting up replication, monitoring, upgrades, performing backups and managing high availability using proxies such as ProxySQL, MaxScale and HAProxy. This webinar provides an in-depth walk-through of this blueprint and explains how to make best use of it.

Watch the replay

Top Blogs

HAProxy Connections vs MySQL Connections - What You Should Know

Max connections determines the maximum number of connections to the database server. This can be set on both the database server, or the proxy in front of it. In this blog post, we’ll dive into HAProxy and MySQL maximum connections variables, and see how to get the best of both worlds.

Read More

SQL Load Balancing Benchmark - Comparing Performance of MaxScale vs HAProxy

In a previous post, we gave you a quick overview of the MaxScale load balancer and walked through installation and configuration. We did some quick benchmarks using sysbench, a system performance benchmark that supports testing CPU, memory, IO, mutex and also MySQL performance. We will be sharing the results in this blog post.

Read More

Load balanced MySQL Galera setup - Manual Deployment vs ClusterControl

Deploying a MySQL Galera Cluster with redundant load balancing takes a bit of time. This blog looks at how long it would take to do it manually vs using ClusterControl to perform the task.

Read More

Read-Write Splitting for Java Apps using Connector/J, MySQL Replication and HAProxy

In this blog post, we will play around with Java and MySQL Replication to perform read-write splitting for Java Apps using Connector/J.

Read More


High availability read-write splitting with php-mysqlnd, MySQL Replication and HAProxy

In this blog post, we explore the use of php-mysqlnd_ms with a PHP application (Wordpress) on a standard MySQL Replication backend.

Read More

Become a ClusterControl DBA: Making your DB components HA via Load Balancers

There are various ways to retain high availability with databases. You can use Virtual IPs (VRRP) to manage host availability, you can use resource managers like Zookeeper and Etcd to (re)configure your applications or use load balancers/proxies to distribute the workload over all available hosts.

Read More

Wordpress Application Clustering using Kubernetes with HAProxy and Keepalived

In this blog post, we’re going to play with Kubernetes application clustering and pods. We’ll use Wordpress as the application, with a single MySQL server. We will also have HAProxy and Keepalived to provide simple packet forwarding (for external network) with high availability capability.

Read More

How Galera Cluster Enables High Availability for High Traffic Websites

This post gives an insight into how Galera can help to build HA websites.

Read More

by Severalnines at November 09, 2017 10:58 AM

November 08, 2017

MariaDB AB

MariaDB Server 10.2 Now Available on Qualcomm Centriq™ 2400 Server Processor


MariaDB Corporation is pleased to announce support for the 64-bit ARM Qualcomm Centriq™ 2400 server processor. The Centriq 2400 brings Qualcomm's ARM expertise to the data center server world, offering a high core count of 48 per physical CPU chip.

MariaDB Server 10.2 for CentOS 7 and Ubuntu 16 is available now.

MariaDB's architectural support for a thread per connection helps MariaDB scale extremely well on the Qualcomm Centriq 2400 server processor, providing a nearly consistent increase in throughput up to the core count.

MariaDB uses a custom fork of the widely available sysbench benchmarking utility for performance testing and evaluation. For the benchmarking test we use a data set with 1.2 million rows, either in a single table or distributed across 24 tables. Each benchmark thread executes transactions of 1000 SELECT statements, each fetching a single row based on the PRIMARY KEY. Database caches are tuned to hold the data set in memory, so the benchmark does not depend on disk speed but mostly tests CPU, memory bandwidth and software scalability (the amount of serialized code in the path). My colleague Axel Schwenke ran the benchmark and found some very interesting results. Here we show the results at a system level; the blue bars indicate the test run where data is spread across 24 tables and the red bars the run with 1 table:

[Figure: system-level benchmark results (qualcomm_system_benchmark.png)]
What you see is a near doubling in throughput as more query threads run all the way up to the core count (46 in our pre-production server) and then relatively flat throughput past that as threads increase.
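
The exact sysbench fork and options used for these runs are not published in this post; as a rough, hypothetical approximation with stock sysbench 1.0, a point-select run against 24 tables of 50,000 rows each (1.2 million rows in total) might look like this (host, credentials, thread count and runtime are placeholders):

shell> sysbench oltp_point_select --tables=24 --table-size=50000 \
           --mysql-host=127.0.0.1 --mysql-user=sbtest --mysql-password=sbtest \
           prepare
shell> sysbench oltp_point_select --tables=24 --table-size=50000 \
           --mysql-host=127.0.0.1 --mysql-user=sbtest --mysql-password=sbtest \
           --threads=48 --time=300 --report-interval=10 run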

To show the near-perfect scaling effect, we introduce a throughput-per-active-core metric, defined as system throughput divided by min(thread count, core count). When there are fewer threads than cores, some cores are idle and do not contribute to the system throughput. If there are more threads than cores, some threads must be serialized by the system scheduler and we won't get any more throughput, because all cores are busy anyway. Here are the results:

[Figure: throughput per active core (qualcomm_per_core_benchmark.png)]
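
The metric itself is easy to compute from raw results; a hypothetical helper, assuming a results.txt with one "thread-count throughput" pair per line and the 46 cores of the pre-production server:

shell> awk -v cores=46 '{ active = ($1 < cores) ? $1 : cores;
                          printf "%6d threads: %10.1f qps per active core\n", $1, $2 / active }' results.txt
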
We are very excited to see a CPU that scales out so well, combined with very low power and heat consumption, making it very well suited for dense deployment in the data center. We look forward to continuing to optimize MariaDB Server for the Qualcomm Centriq ARM64 architecture.



by david_thompson_g at November 08, 2017 05:35 AM

November 07, 2017

Jean-Jerome Schmidt

HAProxy Connections vs MySQL Connections - What You Should Know

Having a load balancer or reverse proxy in front of your MySQL or MariaDB server does add a little bit of complexity to your database setup, which might lead to some things behaving differently. Theoretically, a load balancer which sits in front of MySQL servers (for example HAProxy in front of a Galera Cluster) should just act like a connection manager and distribute the connections to the backend servers according to some balancing algorithm. MySQL, on the other hand, has its own way of managing client connections. Ideally, we would need to configure these two components together so as to avoid unexpected behaviours, and narrow down the troubleshooting surface when debugging issues.

If you have such setup, it is important to understand these components as they can impact the overall performance of your database service. In this blog post, we will dive into MySQL's max_connections and HAProxy maxconn options respectively. Note that timeout is another important parameter that we should know, but we are going to cover that in a separate post.

MySQL's Max Connections

The number of connections permitted to a MySQL server is controlled by the max_connections system variable. The default value is 151 (MySQL 5.7).

To determine a good number for max_connections, the basic formula (the same one used in the calculation statement further down) is:

max_connections = (available RAM - global buffers) / per-thread buffers

Where the global buffers are roughly:

innodb_buffer_pool_size + innodb_log_buffer_size + query_cache_size + tmp_table_size + key_buffer_size

**Variable innodb_additional_mem_pool_size is removed in MySQL 5.7.4+. If you are running an older version, take this variable into account as well.

And the per-thread buffers are roughly:

read_buffer_size + read_rnd_buffer_size + sort_buffer_size + thread_stack + join_buffer_size + binlog_cache_size

By using the above formulas, we can calculate a suitable max_connections value for this particular MySQL server. To start the process, stop all connections from clients and restart the MySQL server. Ensure you only have the minimum number of processes running at that particular moment. You can use 'mysqladmin' or 'SHOW PROCESSLIST' for this purpose:

$ mysqladmin -uroot -p processlist
+--------+------+-----------+------+---------+------+-------+------------------+----------+
| Id     | User | Host      | db   | Command | Time | State | Info             | Progress |
+--------+------+-----------+------+---------+------+-------+------------------+----------+
| 232172 | root | localhost | NULL | Query   |    0 | NULL  | show processlist |    0.000 |
+--------+------+-----------+------+---------+------+-------+------------------+----------+
1 row in set (0.00 sec)

From the above output, we can tell that only one user is connected to the MySQL server which is root. Then, retrieve the available RAM (in MB) of the host (look under 'available' column):

$ free -m
              total        used        free      shared  buff/cache   available
Mem:           3778        1427         508         148        1842        1928
Swap:          2047           4        2043

For reference, the 'available' column gives an estimate of how much memory is available for starting new applications without swapping (only available in kernel 3.14+).

Then, plug the available memory (1928 MB) into the following statement:

mysql> SELECT ROUND((1928 - (ROUND((@@innodb_buffer_pool_size + @@innodb_log_buffer_size + @@query_cache_size + @@tmp_table_size + @@key_buffer_size) / 1024 / 1024))) / (ROUND(@@read_buffer_size + @@read_rnd_buffer_size + @@sort_buffer_size + @@thread_stack + @@join_buffer_size + @@binlog_cache_size) / 1024 / 1024)) AS 'Possible Max Connections';
+--------------------------+
| Possible Max Connections |
+--------------------------+
|                      265 |
+--------------------------+

**Variable innodb_additional_mem_pool_size is removed in MySQL 5.7.4+. If you are running an older version, take this variable into account as well.

From this example, we can have up to 265 MySQL connections simultaneously according to the available RAM the host has. It doesn't make sense to configure a higher value than that. Then, append the following line inside MySQL configuration file, under the [mysqld] directive:

max_connections = 265

Restart the MySQL service to apply the change. When the total simultaneous connections reaches 265, you would get a "Too many connections" error when trying to connect to the mysqld server. This means that all available connections are in use by other clients. MySQL actually permits max_connections+1 clients to connect. The extra connection is reserved for use by accounts that have the SUPER privilege. So if you face this error, you should try to access the server as a root user (or any other SUPER user) and look at the processlist to start the troubleshooting.
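
Besides editing the configuration file, the value can also be adjusted at runtime (the change does not survive a restart), and current usage can be compared against the limit; for example:

shell> mysql -uroot -p -e "SET GLOBAL max_connections = 265;"
shell> mysql -uroot -p -e "SHOW GLOBAL STATUS WHERE Variable_name IN ('Threads_connected','Max_used_connections');"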

HAProxy's Max Connections

HAProxy has 3 types of max connections (maxconn) - global, defaults/listen and default-server. Assume an HAProxy instance configured with two listeners: one multi-writer listener on port 3307 (connections are distributed to all backend MySQL servers) and one single-writer listener on port 3308 (connections are forwarded to a single MySQL server):

global
    ...
    maxconn 2000 #[a]
    ...
defaults
    ...
    maxconn 3 #[b]
    ...
listen mysql_3307
    ...
    maxconn 8 #[c]
    balance leastconn
    default-server port 9200 maxqueue 10 weight 10 maxconn 4 #[d]
    server db1 192.168.55.171 check
    server db2 192.168.55.172 check
    server db3 192.168.55.173 check
listen mysql_3308
    ...
    default-server port 9200 maxqueue 10 weight 10 maxconn 5 #[e]
    server db1 192.168.55.171 check
    server db2 192.168.55.172 check backup #[f]

Let’s look at the meaning of some of the configuration lines:

global.maxconn [a]

The total number of concurrent connections that are allowed to connect to this HAProxy instance. Usually, this value is the highest value of all. In this case, HAProxy will accept a maximum of 2000 connections at a time and distribute them to all listeners defined in the HAProxy process, or worker (you can run multiple HAProxy processes using nbproc option).

HAProxy will stop accepting connections when this limit is reached. The "ulimit-n" parameter is automatically adjusted to this value. Since sockets are considered equivalent to files from the system perspective, the default file descriptors limit is rather small. You will probably want to raise the default limit by tuning the kernel for file descriptors.

defaults.maxconn [b]

The default maximum connections value for all listeners. It doesn't make sense for this value to be higher than global.maxconn.

If the "maxconn" line is missing under a "listen" stanza (listen.maxconn), the listener will obey this value. In this case, the mysql_3308 listener will get a maximum of 3 connections at a time. To be safe, set this value to global.maxconn divided by the number of listeners. However, if you would like to prioritize certain listeners with more connections, use listen.maxconn instead.

listen.maxconn [c]

The maximum connections allowed for the corresponding listener. The listener takes precedence over defaults.maxconn if specified. It doesn't make sense if this value is higher than global.maxconn.

For a fair distribution of connections to backend servers, as in the case of a multi-writer listener (mysql_3307), set this value to listen.default-server.maxconn multiplied by the number of backend servers. In this example, a better value would be 12 instead of 8 [c]. If we stick with this configuration, db1 and db2 are expected to receive a maximum of 3 connections each, while db3 will receive a maximum of 2 connections (due to leastconn balancing), which amounts to 8 connections in total. It won't hit the limit specified in [d].

For the single-writer listener (mysql_3308), where connections should be allocated to one and only one backend server at a time, set this value to be the same as or higher than listen.default-server.maxconn.

listen.default-server.maxconn [d][e]

This is the maximum number of connections that every backend server can receive at a time. It doesn't make sense for this value to be higher than listen.maxconn or defaults.maxconn. This value should be lower than or equal to MySQL's max_connections variable. Otherwise, you risk exhausting the connections to the backend MySQL server, especially when MySQL's timeout variables are configured lower than HAProxy's timeouts.

In this example, we've set each MySQL server to get a maximum of 4 connections at a time on the multi-writer listener [d], while the single-writer listener will get a maximum of 3 connections at a time, due to the limit that applies from [b]. Since we specified "backup" [f] for the other node, the active node will at once get all 3 connections allocated to this listener.

The above explanation can be illustrated in the following diagram:

To sum up the connections distribution, db1 is expected to get a maximum of 6 connections (3 from 3307 + 3 from 3308). db2 will get a maximum of 3 connections (unless db1 goes down, in which case it will get an additional 3), and db3 will stick to 2 connections regardless of topology changes in the cluster.
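
If HAProxy is configured with a stats socket (for example "stats socket /var/run/haproxy.sock" in the global section; the path here is an assumption, not part of the configuration above), the live per-listener and per-server session counters can be pulled from the command line. The cut fields below follow the standard "show stat" CSV layout (pxname, svname, scur, smax, slim):

shell> echo "show stat" | socat stdio unix-connect:/var/run/haproxy.sock | cut -d',' -f1,2,5-7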

Connection Monitoring with ClusterControl

With ClusterControl, you can monitor MySQL and HAProxy connection usage from the UI. The following screenshot provides a summary of the MySQL connection advisor (ClusterControl -> Performance -> Advisors) where it monitors the current and ever used MySQL connections for every server in the cluster:

For HAProxy, ClusterControl integrates with HAProxy stats page to collect metrics. These are presented under the Nodes tab:

From the above screenshot, we can tell that each backend server on multi-writer listener gets a maximum of 8 connections. 4 concurrent sessions are running. These are highlighted in the top red square, while the single-writer listener is serving 2 connections and forwarding them to a single node respectively.

Conclusion

Configuring the maximum connections for HAProxy and the MySQL server is important to ensure good load distribution to our database servers, and to protect the MySQL servers from being overloaded or exhausting their connections.

by ashraf at November 07, 2017 11:27 AM

November 02, 2017

Peter Zaitsev

MySQL vs. MariaDB: Reality Check

In this blog, we'll provide a comparison of MySQL and MariaDB (including Percona Server for MySQL).

Introduction

The goal of this blog post is to evaluate, at a higher level, MySQL, MariaDB and Percona Server for MySQL side-by-side to better inform the decision making process. It is largely an unofficial response to published comments from the MariaDB Corporation.

It is worth noting that Percona Server for MySQL is a drop-in compatible branch of MySQL, where Percona contributes as much as possible upstream. MariaDB Server, on the other hand, is a fork of MySQL 5.5. They cherry-picked MySQL features, and don’t guarantee drop-in compatibility any longer.

Feature | MySQL | Percona Server for MySQL* | MariaDB Server
Protocols | MySQL protocol over port 3306, X Protocol over port 33060 | MySQL protocol over port 3306, X Protocol over port 33060 | MySQL protocol, MariaDB Server extensions
Community – Source Code | Open Source | Open Source | Open Source
Community – Development | Open Source, contributions via signing the Oracle Contributor Agreement (OCA) | Open Source | Open Source, contributions via the new BSD license or signing the MariaDB Contributor Agreement (MCA)
Community – Collaboration | Mailing list, forums, bugs system | Mailing list, forums, bugs system (Jira, Launchpad) | Mailing list, bugs system (Jira), IRC channel
Core – Replication | MySQL replication with GTID | MySQL replication with GTID | MariaDB Server replication, with own GTID, compatible only if MariaDB Server is a slave to MySQL, not vice versa
Core – Routing | MySQL Router (GPLv2) | ProxySQL (GPLv3) | MariaDB MaxScale (Business Source License)
Core – Partitioning | Standard | Standard | Standard, with extra engines like SPIDER/CONNECT that offer varying levels of support
Tool – Editing | MySQL Workbench for Microsoft Windows, macOS, and Linux | MySQL Workbench for Microsoft Windows, macOS, and Linux | Webyog's SQLYog for Microsoft Windows (MySQL Workbench notes an incompatible server)
Tool – Monitoring | MySQL Enterprise Monitor | Percona Monitoring & Management (PMM) (100% open source) | Webyog's Monyog
Scalability – Client Connections | MySQL Enterprise Threadpool | Open Source Threadpool with support for priority tickets | Open Source Threadpool
Scalability – Clustering | MySQL Group Replication | MySQL Group Replication, Percona XtraDB Cluster (based on a further engineered Galera Cluster) | MariaDB Enterprise Cluster (based on Galera Cluster)
Security – Encryption | Tablespace data-at-rest encryption. Amazon KMS, Oracle Vault Enterprise Edition | Tablespace data-at-rest encryption with Keyring Vault plugin | Tablespace and table data-at-rest encryption. Amazon KMS, binlog/redo/tmp file with Aria tablespace encryption
Security – Data Masking | ProxySQL data masking | ProxySQL data masking | MariaDB MaxScale data masking
Security – Firewall | MySQL Enterprise Firewall | ProxySQL Firewall | MariaDB MaxScale Firewall
Security – Auditing | MySQL Enterprise Audit Plugin | Percona Audit Plugin (OSS) | MariaDB Audit Plugin (OSS)
Analytics | No | ClickHouse | MariaDB ColumnStore
SQL – Common Table Expressions | In-development for MySQL 8.0 (now a release candidate) | In-development for MySQL 8.0 (now a release candidate) | Present in MariaDB Server 10.2
SQL – Window Functions | In-development for MySQL 8.0 (now a release candidate) | In-development for MySQL 8.0 (now a release candidate) | Present in MariaDB Server 10.2
Temporal – Log-based rollback | No | No | In development for MariaDB Server 10.3
Temporal – System versioned tables | No | No | In development for MariaDB Server 10.3
JSON | JSON Data type, 21 functions | JSON Data type, 21 functions | No JSON Data Type, 26 functions
Official client connectors | C (libmysqlclient), Java, ODBC, .NET, Node.js, Python, C++, mysqlnd for PHP | C (libmysqlclient), Java, ODBC, .NET, Node.js, Python, C++, mysqlnd for PHP | C (libmariadbclient), Java, ODBC
Usability – CJK Language support | Gb18030, ngram & MeCab for InnoDB full-text search | Gb18030, ngram & MeCab for InnoDB full-text search | No
Monitoring – PERFORMANCE_SCHEMA | Thorough instrumentation in 5.7, sys schema included | Thorough instrumentation in 5.7, sys schema included | Instrumentation from MySQL 5.6, sys schema not included
Security – Password authentication | sha256_password (with caching_sha2_password in 8.0) | sha256_password (with caching_sha2_password in 8.0) | ed25519 (incompatible with sha256_password)
Security – Secure out of the box | validate_password on by default, to choose a strong password at the start | validate_password on by default, to choose a strong password at the start | No
Usability – Syntax differences | EXPLAIN FOR CONNECTION <thread_id> | EXPLAIN FOR CONNECTION <thread_id> | SHOW EXPLAIN FOR <thread_id>
Optimiser – Optimiser Tracing | Yes | Yes | No
Optimiser – Optimiser Hints | Yes | Yes | No
DBA – Super readonly mode | Yes | Yes | No
Security – Password expiry | Yes | Yes | No
Security – Password last changed? Password lifetime? | Yes | Yes | No
Security – VALIDATE_PASSWORD_STRENGTH() | Yes | Yes | No
Security – ACCOUNT LOCK/UNLOCK | Yes | Yes | No
Usability – Query Rewriting | Yes | Yes | No
GIS – GeoJSON & GeoHash functionality | Yes | Yes | Incomplete
Security – mysql_ssl_rsa_setup | Yes | Yes | No (set up SSL connections manually)
MySQL Utilities | Yes | Yes | No
Backup locks | No (in development for 8.0) | Yes | No
Usability – InnoDB memcached interface | Yes | Yes | No

*Note. Third-party software (such as ProxySQL and ClickHouse) used in conjunction with Percona Server for MySQL is not necessarily covered by Percona Support services.

To get a higher level view of what Percona Server for MySQL offers compared to MySQL, please visit: Percona Server Feature Comparison. Read this for a higher level view of compatibility between MariaDB Server and MySQL written by MariaDB Corporation.

Open Community

MariaDB Server undoubtedly has an open community, with governance mixed between MariaDB Foundation and MariaDB Corporation. There are open developer meetings on average about twice per year, two mailing lists (one for developers and users), an IRC channel and an open JIRA ticket system that logs bugs and feature requests.

Percona Server for MySQL also has an open community. Developer meetings are not open to general contributors, but there is a mailing list, an IRC channel and two systems – Launchpad and JIRA – for logging bugs and feature requests.

MySQL also has an open community where developer meetings are also not open to general contributors. There are many mailing lists, a few IRC channels and the MySQL bugs system. The worklogs are where the design for future releases happens, and these are opened up once their features are fully developed and the source code pushed.

From a source code standpoint, MySQL makes pushes to Github when a release is made; whereas open source development happens for Percona Server for MySQL and MariaDB Server on Github.

Feature development on MySQL continues in leaps and bounds, and Oracle has been an excellent steward of MySQL. Please refer to The Complete List of Features in 5.7, as well as The Unofficial MySQL 8 Optimiser Guide.

Linux distributions have chosen MariaDB Server 5.5, and some have chosen MariaDB Server 10.0/10.1 when there was more backward compatibility to MySQL 5.5/5.6. It is the “default” MySQL in many Linux distributions (such as Red Hat Enterprise Linux, SUSE and Debian). However, Ubuntu still believes that when you ask for MySQL you should get it (and that is what Ubuntu ships).

One of the main reasons Debian switched was due to the way Oracle publishes updates for security issues. They are released as a whole quarterly as Critical Patch Updates, without much detail about individual fixes. This is a policy that is unlikely to change, but has had no adverse effects on distribution.

All projects actively embrace contributions from the open community. MariaDB Server does include contributions like the MyRocks engine developed at Facebook, but so does Percona Server for MySQL. Oracle accepts contributions from a long list of contributors, including Percona. Please see Licensing information for MySQL 5.7 as an example.

A Shared Core Engine

MariaDB Server has differed from MySQL since MySQL 5.5. This is one reason why you don’t get version numbers that follow the MySQL scheme. It is also worth noting that features are cherry-picked at merge time, because the source code has diverged so much since then.

As the table below shows, it took Percona Server for MySQL over four months to get a stable 5.5 release based on MySQL 5.5, while it took MariaDB Server one year and four months to get a stable 5.5 release based on MySQL 5.5. Percona Server for MySQL 5.6 and 5.7 are based on their respective MySQL versions.

Date | Release
3 December 2010 | MySQL 5.5.8 GA
28 April 2011 | Percona Server for MySQL 5.5.11-20.2 GA
11 April 2012 | MariaDB Server 5.5.23 GA
5 February 2013 | MySQL 5.6.10 GA
7 October 2013 | Percona Server for MySQL 5.6.13-61.0 GA
31 March 2014 | MariaDB Server 10.0.10 GA
17 October 2015 | MariaDB Server 10.1.8 GA
21 October 2015 | MySQL 5.7.9 GA
23 February 2016 | Percona Server for MySQL 5.7.10-3 GA
23 May 2017 | MariaDB Server 10.2.6 GA

MySQL is currently at 8.0.3 Release Candidate, while MariaDB Server is at 10.3.2 Alpha as of this writing.

MariaDB Server is by no means a drop-in replacement for MySQL. The risk of moving to MariaDB Server if you aren’t using newer MySQL features may be minimal, but the risk of moving out of MariaDB Server to MySQL is very prevalent. Linux distributions like Debian already warn you of this.

MySQL vs. MariaDB

The differences go beyond just default configuration options. Some features, like time-delayed replication (present in MySQL since 2013), only made an appearance in MariaDB Server in 2017! (Refer to the MariaDB Server 10.2 Overview for more.) However, it is also worth noting that some features, such as multi-source replication, appeared in MariaDB Server 10.0 first, and only then came to MySQL 5.7.

Extensibility

MySQL and MariaDB Server have a storage engine interface, and this is how you access all engines, including the favored InnoDB/Percona XtraDB. It is worth noting that Percona XtraDB was the default InnoDB replacement in MariaDB Server 5.1, 5.2, 5.3, 5.5, 10.0 and 10.1. But in MariaDB Server 10.2, the InnoDB of choice is upstream MySQL.

Stock MySQL has provided several storage engines beyond just InnoDB (the default) and MyISAM. You can find out more information about 5.7 Supported Engines.

Percona Server for MySQL includes a modified MEMORY storage engine, ships Percona XtraDB as the default InnoDB and also ships TokuDB and MyRocks (currently experimental). MyRocks is based on the RocksDB engine, and both are developed extensively at Facebook.

MariaDB Server includes many storage engines, beyond the default InnoDB. MyISAM is modified with segmented key caches, the default temporary table storage engine is Aria (which is a crash-safe MyISAM), the FederatedX engine is a modified FEDERATED engine, and there are more: CONNECT, Mroonga, OQGRAPH, Sequence, SphinxSE, SPIDER, TokuDB and of course MyRocks.

Storage engines have specific use cases, and have different levels of feature completeness. You should thoroughly evaluate a storage engine before choosing it. We believe that over 90% of installations are fine with just InnoDB or Percona XtraDB. Percona TokuDB is another engine that users who need compression could use. We naturally expect more usage in the MyRocks sphere going forward.

Analytics

MariaDB ColumnStore is the MariaDB solution to analytics and using a column-based store. It is a separate download and product, and not a traditional storage engine (yet). It is based on the now defunct InfiniDB product.

At Percona, we are quite excited by ClickHouse. We also have plenty of content around it. There is no MySQL story around this.

High Availability

High Availability is an exciting topic in the MySQL world, considering the server itself has been around for over 22 years. There are so many solutions out there, and some have had evolution as well.

MySQL provides MySQL Cluster (NDBCLUSTER) (there is no equivalent in the MariaDB world). MySQL also provides group replication (similar to Galera Cluster). Combined with the proxy MySQL Router, and the mysqlsh for administration (part of the X Protocol/X Dev API), you can also get MySQL InnoDB Cluster.

We benefit from the above at Percona, but also put lots of engineering work to make Percona XtraDB Cluster.

MariaDB Server only provides Galera Cluster.

Security

While we don’t want to compare the proprietary MySQL Enterprise Firewall, MariaDB’s recommendation is the proprietary, non-open source MariaDB MaxScale (it uses a Business Source License). We highly recommend the alternative, ProxySQL.

When it comes to encryption, MariaDB Server implements Google patches to provide complete data at rest encryption. This supports InnoDB, XtraDB and Aria temporary tables. The log files can also be encrypted (not present in MySQL, which only allows tablespace encryption and not log file encryption).

When it comes to attack prevention, ProxySQL should offer everything you need.

MySQL Enterprise provides auditing, while MariaDB Server provides an audit plugin as well as an extension to the audit interface for user filtering. Percona Server for MySQL has an audit plugin that sticks to the MySQL API, yet provides user filtering and controls the ability to audit (since auditing is expensive). Streaming to syslog is supported by the audit plugins from Percona and MariaDB.

Supporting Ecosystem and Tools

Upgrading from MySQL to MariaDB Server should be a relatively simple process (as stated above). If you want to upgrade away from MariaDB Server to MySQL, you may face hassles. For tools, see the following table:

Purpose | MySQL | Percona Server for MySQL | MariaDB Server
Monitoring | MySQL Enterprise Monitor | Percona Monitoring & Management (PMM) (100% open source) | Webyog Monyog
Backup | MySQL Enterprise Backup | Percona XtraBackup | MariaDB Backup (fork of Percona XtraBackup)
SQL Management | MySQL Workbench | MySQL Workbench | Webyog SQLyog
Load Balancing & Routing | MySQL Router | ProxySQL | MariaDB MaxScale
Database Firewall | MySQL Enterprise Firewall | ProxySQL | MariaDB MaxScale

Enterprise Database Compatibility

MariaDB Server today has window functions and common table expressions (CTEs). These appeared in MariaDB Server 10.2. MySQL 8 is presently in release candidate status and also has similar functionality.
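
As a quick illustration, a query combining a CTE and a window function like the one below runs on MariaDB Server 10.2 as well as on MySQL 8.0 (the values are made up for the example):

shell> mysql -e "
    WITH t (grp, val) AS (
      SELECT 1, 10 UNION ALL SELECT 1, 20 UNION ALL SELECT 2, 5
    )
    SELECT grp, val, SUM(val) OVER (PARTITION BY grp) AS grp_total
    FROM t;"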

Looking ahead, MariaDB Server 10.3 also includes an Oracle SQL_MODE and a partial PL/SQL parser. This is to aid migration from Oracle to MariaDB Server.

MariaDB Server 10.2 also has “flashback”, developed at Alibaba, to help with log-based rollback using the binary log.

Conclusion

Percona sees healthy competition in the MySQL ecosystem. We support all databases in the ecosystem: MySQL, MariaDB Server and Percona Server for MySQL. Our focus is to provide alternatives to proprietary parts of open source software. Percona has a strong operations focus on compatibility, application scalability, high availability, security and observability. We also support many additional tools within the ecosystem, and love integrating and contributing to open source code.

For example, Percona Monitoring and Management (PMM) includes many open source tools like Prometheus, Consul, Grafana, Orchestrator and more. We have made the de facto open source hot backup solution for MySQL, MariaDB Server and Percona Server for MySQL (called Percona XtraBackup). We continue to maintain and extend useful tools for database engineers and administrators in Percona Toolkit. We make Percona XtraDB Cluster safe for deployment out of the box. We have invested in a write-optimized storage engine, TokuDB, and now continue to work with making MyRocks better.

We look forward to supporting your deployments of MySQL or MariaDB Server, whichever option is right for you! If you need assistance on migrations between servers, or further information, don’t hesitate to contact your friendly Percona sales associate.

by Colin Charles at November 02, 2017 05:55 PM

November 01, 2017

MariaDB AB

MariaDB Server 10.2.10 now available


The MariaDB project is pleased to announce the immediate availability of MariaDB Server 10.2.10. See the release notes and changelog for details and visit mariadb.com/downloads to download.

Download MariaDB Server 10.2.10

Release Notes | Changelog | What is MariaDB 10.2?



by dbart at November 01, 2017 03:04 PM

Peter Zaitsev

Percona Server for MongoDB 3.2.17-3.8 Is Now Available

Percona announces the release of Percona Server for MongoDB 3.2.17-3.8 on October 31, 2017. Download the latest version from the Percona web site or the Percona Software Repositories.

Percona Server for MongoDB is an enhanced, open-source, fully compatible, highly scalable, zero-maintenance downtime database that supports the MongoDB v3.2 protocol and drivers. It extends MongoDB with the MongoRocks, Percona Memory Engine and PerconaFT storage engines, as well as enterprise-grade features like External Authentication, Audit Logging, Profiling Rate Limiting, and Hot Backup at no extra cost. The software requires no changes to MongoDB applications or code.

NOTE: The PerconaFT storage engine is deprecated as of 3.2. It is no longer supported and isn’t available in higher version releases.

This release is based on MongoDB 3.2.17 and does not include any additional changes.

The Percona Server for MongoDB 3.2.17-3.8 release notes are available in the official documentation.

by Alexey Zhebel at November 01, 2017 12:37 PM

MariaDB Foundation

MariaDB 10.2.10 and MariaDB 10.0.33 now available

The MariaDB project is pleased to announce the availability of MariaDB 10.2.10 and MariaDB 10.0.33. See the release notes and changelogs for details. Download MariaDB 10.2.10 Release Notes Changelog What is MariaDB 10.2? MariaDB APT and YUM Repository Configuration Generator Download MariaDB 10.0.33 Release Notes Changelog What is MariaDB 10.0? MariaDB APT and YUM Repository […]

The post MariaDB 10.2.10 and MariaDB 10.0.33 now available appeared first on MariaDB.org.

by Ian Gilfillan at November 01, 2017 06:01 AM

October 31, 2017

Peter Zaitsev

MySQL Dashboard Improvements in Percona Monitoring and Management 1.4.0

In this blog post, I’ll walk through some of the improvements to the Percona Monitoring and Management (PMM) MySQL dashboard in release 1.4.0.

As the part of Percona Monitoring and Management development, we’re constantly looking for better ways to visualize information and help you to spot and resolve problems faster. We’ve made some updates to the MySQL dashboard in the 1.4.0 release. You can see those improvements in action in our Percona Monitoring and Management Demo Site: check out the MySQL Overview and MySQL InnoDB Metrics dashboards.

MySQL Client Thread Activity

[Screenshot: MySQL Client Thread Activity graph]

One of the best ways to characterize a MySQL workload is to look at the number of MySQL server-client connections (Threads Connected). You should compare this number to how many of those threads are actually doing something on the server side (Threads Running), rather than just sitting idle waiting for a client to send the next request.

MySQL can handle thousands of connected threads quite well. However, hundreds of threads running concurrently often increase query latency. Increased internal contention can make the situation much worse.

The problem with those metrics is that they are extremely volatile – one second you might have a lot of threads connected and running, and then none. This is especially true when some stalls on the MySQL level (or higher) causes pile-ups.

To provide better insight, we now show Peak Threads Connected and Peak Threads Running to help easily spot such potential pile-ups, as well as Avg Threads Running. These stats allow you to look at a high number of threads connected and running and see whether there are just minor spikes (which tend to happen in many systems on a regular basis), or something more prolonged that warrants deeper investigation.

To simplify it even further: Threads Running spiking for a few seconds is OK, but spikes persisting for 5-10 seconds or more are often signs of problems that are impacting users (or problems about to happen).
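
Outside of the PMM graphs, the raw counters behind these panels can be checked directly on the server at any time; for example:

shell> mysql -e "SHOW GLOBAL STATUS WHERE Variable_name IN ('Threads_connected','Threads_running');"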

InnoDB Logging Performance

[Screenshot: InnoDB Logging Performance graph]

Since I wrote a blog post about Choosing MySQL InnoDB Log File Size, I thought it would be great to check out how long the log file space would last (instead of just looking at how much log space is written per hour). Knowing how long the innodb_log_buffer_size lasts is also helpful for tuning this variable, in general.

This graph shows you how much data is written to the InnoDB Log Files, which helps to understand your disk bandwidth consumption. It also tells you how long it will take to go through your combined Redo Log Space and InnoDB Log Buffer Size (at this rate).

As I wrote in the blog post, there are a lot of considerations for choosing the InnoDB log file size, but having enough log space to accommodate all the changes for an hour is a good rule of thumb. As we can see, this system is close to full at around 50 minutes.
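
The same write rate can be sampled manually on any server (a small sketch; the 60-second window is arbitrary):

shell> B1=$(mysql -NB -e "SHOW GLOBAL STATUS LIKE 'Innodb_os_log_written';" | awk '{print $2}')
shell> sleep 60
shell> B2=$(mysql -NB -e "SHOW GLOBAL STATUS LIKE 'Innodb_os_log_written';" | awk '{print $2}')
shell> echo "$(( (B2 - B1) / 1024 / 1024 )) MB of redo log written in the last 60 seconds"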

When it comes to innodb_log_buffer_size, even if InnoDB is not configured to flush the log at every transaction commit, it is going to be flushed every second by default. This means 10-15 seconds is usually good enough to accommodate the spikes. This system has it set at about 40 seconds (which is more than enough).

InnoDB Read-Ahead

[Screenshot: InnoDB Read-Ahead graph]

This graph helps you understand how InnoDB Read-Ahead is working out, and is a pretty advanced graph.

In general, Innodb Read-Ahead is not very well understood. I think in most cases it is hard to tell if it is helping or hurting the current workload in its current configuration.

The idea behind Read-Ahead in any system (not just InnoDB) is to pre-fetch data before it is really needed (in order to reduce latency and improve performance). The risk, however, is pre-fetching data that isn't needed. This is wasteful.

InnoDB has two Read-Ahead options: Linear Read-Ahead (designed to speed up workloads that have physically sequential data access) and Random Read-Ahead (designed to help workloads that tend to access the data in the same vicinity but not in a linear order).

Due to potential overhead, only Linear Read-Ahead is enabled by default. You need to enable Random Read-Ahead separately if you want to determine its impact on your workload.

Back to the graph in question: we show a number of pages pre-fetched by Linear and Random Read-Aheads to confirm if these are even in use with your workload. We show Number of Pages Fetched but Never Accessed (evicted without access) – shown as both the number of pages and as a percent of pages. If Fetched but Never Accessed is more than 30% or so, Read-Ahead might be producing more waste instead of helping your workload. It might need tuning.
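
The raw server counters these percentages are derived from can also be queried directly:

shell> mysql -e "SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read_ahead%';"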

We also show the portion of IO requests that InnoDB Read-Ahead served, which can help you understand the share of resources spent on InnoDB Read-Ahead.

Due to the timing of how InnoDB increments counters, the percentages of IO used for Read-Ahead and pages evicted without access shows up better on larger scale graphs.

Conclusion

I hope you find these graphs helpful. We’ll continue making Percona Monitoring and Management more helpful for troubleshooting database systems and getting better performance!

by Peter Zaitsev at October 31, 2017 08:08 PM

Henrik Ingo

impress.js HowTo: Slides over a background image

A common and, IMO, cool way to create impress.js presentations is to use a large background image for the entire presentation and then lay out each slide over it. One of my first impress.js presentations was Selling Open Source 101 for OSCON 101. The presentation is inside a picture of a woman selling all kinds of stuff in a bazaar.

Next week I will present something about EC2 at HighLoad++ conference, and my presentation is flying over some clouds, of course.

read more

by hingo at October 31, 2017 01:46 PM

Jean-Jerome Schmidt

How to Stop or Throttle SST Operation on a Galera Cluster

State Snapshot Transfer (SST) is one of the two ways Galera performs initial syncing when a node is joining a cluster, until the node is declared synced and part of the "primary component". Depending on the dataset size and workload, SST can be lightning fast, or an expensive operation that brings your database service to its knees.

SST can be performed using 3 different methods:

  • mysqldump
  • rsync (or rsync_wan)
  • xtrabackup (or xtrabackup-v2, mariabackup)

Most of the time, xtrabackup-v2 and mariabackup are the preferred options. We rarely see people running on rsync or mysqldump in production clusters.

The Problem

When SST is initiated, there are several processes triggered on the joiner node, which are executed by the "mysql" user:

$ ps -fu mysql
UID         PID   PPID  C STIME TTY          TIME CMD
mysql    117814 129515  0 13:06 ?        00:00:00 /bin/bash -ue /usr//bin/wsrep_sst_xtrabackup-v2 --role donor --address 192.168.55.173:4444/xtrabackup_sst//1 --socket /var/lib/mysql/mysql.sock --datadir
mysql    120036 117814 15 13:06 ?        00:00:06 innobackupex --no-version-check --tmpdir=/tmp/tmp.pMmzIlZJwa --user=backupuser --password=x xxxxxxxxxxxxxx --socket=/var/lib/mysql/mysql.sock --galera-inf
mysql    120037 117814 19 13:06 ?        00:00:07 socat -u stdio TCP:192.168.55.173:4444
mysql    129515      1  1 Oct27 ?        01:11:46 /usr/sbin/mysqld --wsrep_start_position=7ce0e31f-aa46-11e7-abda-56d6a5318485:4949331

While on the donor node:

mysql     43733      1 14 Oct16 ?        03:28:47 /usr/sbin/mysqld --wsrep-new-cluster --wsrep_start_position=7ce0e31f-aa46-11e7-abda-56d6a5318485:272891
mysql     87092  43733  0 14:53 ?        00:00:00 /bin/bash -ue /usr//bin/wsrep_sst_xtrabackup-v2 --role donor --address 192.168.55.172:4444/xtrabackup_sst//1 --socket /var/lib/mysql/mysql.sock --datadir /var/lib/mysql/  --gtid 7ce0e31f-aa46-11e7-abda-56d6a5318485:2883115 --gtid-domain-id 0
mysql     88826  87092 30 14:53 ?        00:00:05 innobackupex --no-version-check --tmpdir=/tmp/tmp.LDdWzbHkkW --user=backupuser --password=x xxxxxxxxxxxxxx --socket=/var/lib/mysql/mysql.sock --galera-info --stream=xbstream /tmp/tmp.oXDumYf392
mysql     88827  87092 30 14:53 ?        00:00:05 socat -u stdio TCP:192.168.55.172:4444

SST against a large dataset (hundreds of GBytes) is no fun. Depending on the hardware, network and workload, it may take hours to complete, and server resources may be saturated during the operation. Although throttling is supported in SST (only for xtrabackup and mariabackup) via the --rlimit and --use-memory options, you are still exposed to a degraded cluster when you are running short of active nodes for a majority (for example, if you are unlucky enough to find yourself with only one out of three nodes running). Therefore, you are advised to perform SST during quiet hours. You can, however, avoid SST by taking some manual steps, as described in this blog post.

Stopping an SST

Stopping an SST needs to be done on both the donor and the joiner nodes. The joiner triggers SST after determining how big the gap is between its local Galera seqno and the cluster's seqno, and then executes the wsrep_sst_{method} script. The request is picked up by the chosen donor, which starts streaming data out to the joiner. A donor node cannot refuse to serve the snapshot transfer once it has been selected, whether by Galera group communication or by the value defined in the wsrep_sst_donor variable. Once the syncing has started and you want to revert the decision, there is no single command to stop the operation.

The basic principle when stopping an SST is to:

  • Make the joiner look dead from a Galera group communication point-of-view (shutdown, fence, block, reset, unplug cable, blacklist, etc)
  • Kill the SST processes on the donor

One would think that killing the innobackupex process (kill -9 {innobackupex PID}) on the donor would be enough, but that is not the case. If you kill the SST processes on the donor (or joiner) without fencing off the joiner, Galera can still see the joiner as active and will mark the SST as incomplete, thus respawning a new set of processes to continue or start over. You will be back to square one. This is the expected behaviour of the /usr/bin/wsrep_sst_{method} script: it safeguards the SST operation, which is vulnerable to timeouts (e.g., when it is long-running and resource intensive).

Let's look at an example. We have a crashed joiner node that we would like to rejoin the cluster. We would start by running the following command on the joiner:

$ systemctl start mysql # or service mysql start

A minute later, we found out that the operation was too heavy at that particular moment, and decided to postpone it until low-traffic hours. The most straightforward way to stop an xtrabackup-based SST is to simply shut down the joiner node and kill the SST-related processes on the donor node. Alternatively, you can block the incoming ports on the joiner by running the following iptables commands on the joiner:

$ iptables -A INPUT -p tcp --dport 4444 -j DROP
$ iptables -A INPUT -p tcp --dport 4567:4568 -j DROP

Then on the donor, retrieve the PID of SST processes (list out the processes owned by "mysql" user):

$ ps -u mysql
   PID TTY          TIME CMD
117814 ?        00:00:00 wsrep_sst_xtrab
120036 ?        00:00:06 innobackupex
120037 ?        00:00:07 socat
129515 ?        01:11:47 mysqld

Finally, kill them all except the mysqld process (you must be extremely careful to NOT kill the mysqld process on the donor!):

$ kill -9 117814 120036 120037

Then, in the donor MySQL error log, you should notice the following lines appearing after ~100 seconds:

2017-10-30 13:24:08 139722424837888 [Warning] WSREP: Could not find peer: 42b85e82-bd32-11e7-87ae-eff2b8dd2ea0
2017-10-30 13:24:08 139722424837888 [Warning] WSREP: 1.0 (192.168.55.172): State transfer to -1.-1 (left the group) failed: -32 (Broken pipe)

At this point, the donor should return to the "synced" state as reported by wsrep_local_state_comment and the SST process is completely stopped. The donor is back to its operational state and is able to serve clients at full capacity.

For the cleanup process on the joiner, you can simply flush the iptables chain:

$ iptables -F

Or simply remove the rules with the -D flag:

$ iptables -D INPUT -p tcp --dport 4444 -j DROP
$ iptables -D INPUT -p tcp --dport 4567:4568 -j DROP

A similar approach can be used with other SST methods like rsync, mariabackup and mysqldump.

Throttling an SST (xtrabackup method only)

Depending on how busy the donor is, it's a good idea to throttle the SST process so it won't impact the donor significantly. We've seen a number of cases where, during catastrophic failures, users were desperate to bring back a failed cluster as a single bootstrapped node, and let the rest of the members catch up later. This approach reduces the downtime on the application side; however, it creates an additional burden on this “one-node cluster” while the remaining members are still down or recovering.

Xtrabackup can be throttled with --throttle=<rate of IO/sec> to limit the number of IO operations if you are afraid that it will saturate your disks, but this option only applies when running xtrabackup as a backup process, not as an SST operator. A similar effect can be achieved with rlimit (rate limit), which can be combined with --use-memory to limit the RAM usage. By setting these values under the [sst] directive inside the MySQL configuration file, we can ensure that the SST operation won't put too much load on the donor, even though it can take longer to complete. On the donor node, set the following:

[sst]
rlimit=128k
inno-apply-opts="--use-memory=200M"

More details are available on the Percona XtraBackup SST documentation page.

However, there is a catch. The process could be so slow that it will never catch up with the transaction logs that InnoDB is writing, so SST might never complete. Generally, this situation is very uncommon, unless you have a very write-intensive workload or you allocate very limited resources to SST.

Conclusions

SST is critical but heavy, and could potentially be a long-running operation depending on the dataset size and network throughput between the nodes. Regardless of the consequences, there are still ways to stop the operation so that it can be performed with a better recovery plan at a better time.

by ashraf at October 31, 2017 01:08 PM

October 30, 2017

MariaDB AB

MariaDB Server 10.0.33 now available

MariaDB Server 10.0.33 now available dbart Mon, 10/30/2017 - 14:12

The MariaDB project is pleased to announce the immediate availability of MariaDB Server 10.0.33. See the release notes and changelog for details and visit mariadb.com/downloads to download.

Download MariaDB Server 10.0.33

Release Notes Changelog What is MariaDB 10.0?



by dbart at October 30, 2017 06:12 PM

Peter Zaitsev

Percona XtraDB Cluster 5.7.19-29.22-3 is now available


Percona announces the release of Percona XtraDB Cluster 5.7.19-29.22-3 on October 27, 2017. Binaries are available from the downloads section or our software repositories.

NOTE: You can also run Docker containers from the images in the Docker Hub repository.

Percona XtraDB Cluster 5.7.19-29.22-3 is now the current release, based on the following:

All Percona software is open-source and free.

Fixed Bugs

  • Added access checks for DDL commands to make sure they do not get replicated if they failed without proper permissions. Previously, when a user tried to perform certain DDL actions that failed locally due to lack of privileges, the command could still be replicated to other nodes, because access checks were performed after replication. This vulnerability is identified as CVE-2017-15365.

Help us improve our software quality by reporting any bugs you encounter using our bug tracking system. As always, thanks for your continued support of Percona!

by Alexey Zhebel at October 30, 2017 12:29 PM

Percona XtraDB Cluster 5.6.37-26.21-3 is Now Available


Percona announces the release of Percona XtraDB Cluster 5.6.37-26.21-3 on October 27, 2017. Binaries are available from the downloads section or our software repositories.

Percona XtraDB Cluster 5.6.37-26.21-3 is now the current release, based on the following:

All Percona software is open-source and free.

Fixed Bugs

  • Added access checks for DDL commands to make sure they do not get replicated if they failed without proper permissions. Previously, when a user tried to perform certain DDL actions that failed locally due to lack of privileges, the command could still be replicated to other nodes, because access checks were performed after replication. This vulnerability is identified as CVE-2017-15365.

Help us improve our software quality by reporting any bugs you encounter using our bug tracking system. As always, thanks for your continued support of Percona!

by Alexey Zhebel at October 30, 2017 12:27 PM

October 27, 2017

Peter Zaitsev

This Week in Data with Colin Charles 12: Open Source Summit Europe and Open Source Entrepreneur Network


Join Percona Chief Evangelist Colin Charles as he covers happenings, gives pointers and provides musings on the open source database community.

This week was exciting from a Percona exposure standpoint. We were at Open Source Summit Europe. I gave two talks and participated in a panel, as the co-located event for the Open Source Entrepreneur Network happened on the last day as well. We had a booth, and it was great to hang out and talk with my colleagues Dorothée Wuest and Dimitri Vanoverbeke as well as all the attendees that popped by.

Releases

Link List

Feedback

I look forward to feedback/tips via e-mail at colin.charles@percona.com or on Twitter @bytebot.

by Colin Charles at October 27, 2017 08:23 PM

October 26, 2017

Peter Zaitsev

Percona Server for MongoDB 3.4.9-2.9 is Now Available


Percona announces the release of Percona Server for MongoDB 3.4.9-2.9 on October 26, 2017. Download the latest version from the Percona web site or the Percona Software Repositories.

Percona Server for MongoDB is an enhanced, open source, fully compatible, highly-scalable, zero-maintenance downtime database supporting the MongoDB v3.4 protocol and drivers. It extends MongoDB with Percona Memory Engine and MongoRocks storage engine, as well as several enterprise-grade features:

Percona Server for MongoDB requires no changes to MongoDB applications or code.

This release is based on MongoDB 3.4.9 and includes the following additional change:

  • Updated RocksDB to version 5.7.3

by Alexey Zhebel at October 26, 2017 05:17 PM

MariaDB Foundation

MariaDB Galera Cluster 5.5.58 now available

The MariaDB project is pleased to announce the availability of MariaDB Galera Cluster 5.5.58. See the release notes and changelogs for details.

Download MariaDB Galera Cluster 5.5.58

Release Notes Changelog What is MariaDB Galera Cluster? MariaDB APT and YUM Repository Configuration Generator

Thanks, and enjoy MariaDB!

The post MariaDB Galera Cluster 5.5.58 now available appeared first on MariaDB.org.

by Ian Gilfillan at October 26, 2017 07:22 AM

October 25, 2017

MariaDB AB

20 Tips: Prepare Your Database for Black Friday & Cyber Monday

20 Tips: Prepare Your Database for Black Friday & Cyber Monday jmclaurin Wed, 10/25/2017 - 16:51

The biggest online-shopping days of the year are just around the corner. Is your database ready? By tuning 20 key MariaDB system variables, you'll bolster your database's performance, scalability and availability, ensuring every potential customer has a smooth user experience. The following system variables come up repeatedly in configuring an optimal MariaDB server environment. Implement our recommendations for the most tuned values, and make this year's Black Friday–Cyber Monday period your best ever.

A couple of important notes:

  • Do not accept these suggestions blindly. Each MariaDB environment is unique and requires additional thought before making any changes.  You will most likely need to adjust these settings for your specific use case and environment.

  • MariaDB configuration file is located in /etc/my.cnf. Every time you modify this file you will need to restart the MySQL service so the new changes can take effect.

20 Black Friday and Cyber Monday Tuning Recommendations

1. InnoDB Buffer Pool Size

The InnoDB buffer pool size is the #1 setting to look at for any installation using InnoDB. The buffer pool is where data and indexes are cached: having it as large as possible will ensure you use memory and not disks for most read operations.
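As an illustrative sketch only (the right value depends on available RAM and on what else runs on the host; on a dedicated database server a common starting point is roughly 70-80% of memory), the setting goes into the [mysqld] section of the configuration file:

[mysqld]
# Example for a dedicated server with 32G of RAM; adjust for your environment
innodb_buffer_pool_size = 24G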

2. InnoDB Log File Size

innodb_log_file_size is the size of the redo log files, which are used to make sure writes are fast and durable. There are two general suggestions for InnoDB log file sizing:

  • Set combined total size of InnoDB log files greater than 25–50% of the InnoDB buffer pool size

or

  • Set the combined InnoDB log file size equal to one hour’s worth of log entries during peak load

Larger log files can lead to slower recovery in the event of a server crash. However, they also reduce the number of checkpoints needed and reduce disk I/O.

Evaluate the size of one hour’s worth of binary logs under operational load, then decide whether to increase the size of the InnoDB log files. 

Getting the InnoDB log file size right is important for good system performance. MariaDB’s InnoDB storage engine uses a fixed-size (circular) redo log space. The size is controlled by innodb_log_file_size and innodb_log_files_in_group (default 2). Multiply those values to get the redo log space that is available for use. While technically it shouldn’t matter whether you use the innodb_log_file_size or the innodb_log_files_in_group variable to control the redo space size, most people just work with innodb_log_file_size and leave innodb_log_files_in_group alone.

InnoDB’s redo space size is one of the most important configuration options for write-intensive workloads. However, it comes with trade-offs. The more redo space configured, the better InnoDB can optimize write I/O. However, increasing the redo space also means longer recovery times when the system loses power or crashes for other reasons. 
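One hedged way to measure how much redo log space your workload actually needs is to sample the standard Innodb_os_log_written status counter over a known interval during peak load and extrapolate to an hour, for example:

mysql> SHOW GLOBAL STATUS LIKE 'Innodb_os_log_written';
mysql> SELECT SLEEP(60);
mysql> SHOW GLOBAL STATUS LIKE 'Innodb_os_log_written';

Subtract the first value from the second to get bytes of redo written per minute, multiply by 60 for an hourly figure, and compare that with innodb_log_file_size * innodb_log_files_in_group.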

3. InnoDB Log Buffer Size

A larger InnoDB log buffer size means less disk I/O for larger transactions. It is suggested to set this to 64M on all servers.

4. InnoDB Log Flush Interval

The innodb_flush_log_at_trx_commit variable controls when flushing of the log buffer to disk occurs. innodb_flush_log_at_trx_commit = 1 (default) flushes the log buffer to disk at each transaction commit. This is the safest but also the least performant option.  

innodb_flush_log_at_trx_commit = 0 flushes the log buffer to disk every second, but nothing on transaction commit. Up to one second (possibly more due to process scheduling) could be lost. If there’s any crash, MySQL or the server can lose data. This is the fastest but least safe option.

innodb_flush_log_at_trx_commit = 2 writes the log buffer out to file on each commit but flushes to disk every second. If the disk cache has a battery backup (for instance a battery backed cache raid controller) this is generally the best balance of performance and safety. A crash of MySQL should not lose data. A server crash or power outage could lose up to a second (possibly more due to process scheduling). A battery-backed cache reduces this possibility.

We suggest using the first option for safety.

5. InnoDB IO Capacity

innodb_io_capacity should be set to approximately the maximum number of IOPS the underlying storage can handle.

By default this is set to 1000. We recommend benchmarking the storage to determine whether you can increase this value further.

6. Thread Cache Size

Monitor the value of Threads_created. If it continues increasing at more than a few threads per minute, increase the value of thread_cache_size.

The thread cache size is set to 200 in the current default configuration.
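A quick, hedged way to check this at runtime (the names below are standard status and system variables; the value is only an example) is:

mysql> SHOW GLOBAL STATUS LIKE 'Threads_created';
mysql> SHOW GLOBAL STATUS LIKE 'Connections';
mysql> SET GLOBAL thread_cache_size = 200;

thread_cache_size is dynamic, so no restart is needed; sample Threads_created a few minutes apart to see how fast it grows relative to Connections.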

7. Table Cache and Table Definition Cache

The table_open_cache and table_definition_cache variables control the number of tables and definitions to keep open for all threads.

Monitor Open_tables, Open_table_definitions, Opened_tables, and Opened_table_definitions to determine the best value. The general suggestion is to set table_open_cache (and subsequently table_definition_cache) only high enough to reduce the rate of increase of the Opened_tables (and Opened_table_definitions) status values.

Both table_open_cache and table_definition_cache are set to 2048 in the default configuration.

8. Query Cache

The query cache is a well-known bottleneck that can be seen even when concurrency is moderate. The best option is to disable it from day one by setting query_cache_size = 0 (the default in MariaDB 10) and to use other ways to speed up read queries: having good indexing, adding replicas to spread the read load or using an external cache (memcache or redis, for instance). If you have already built your MariaDB application with the query cache enabled and have never noticed any problem, the query cache may be beneficial for you. In that case, be cautious if you decide to disable it.
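If you do decide to disable it, a minimal my.cnf sketch (standard variable names) looks like this:

[mysqld]
# Disable the query cache entirely
query_cache_type = 0
query_cache_size = 0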

9. Temporary Tables, tmp_table_size, and max_heap_table_size

MySQL uses the lower of max_heap_table_size and tmp_table_size to limit the size of in-memory temporary tables. These are per-client variables. While a large value can help reduce the number of temporary tables created on disk, it also raises the risk of exhausting the server's memory, since the limit applies per client. Generally, 32M to 64M is the suggested starting value for both variables; tune as needed.

Temporary tables are often used for GROUP BY, ORDER BY, DISTINCT, UNION, subqueries, etc. Ideally, MySQL should create these in memory, with as few on disk as possible.

It is important to note that queries not using joins appropriately and creating large temporary tables can be one reason for a higher number of temporary tables on disk. Another reason is the memory storage engine uses fixed length columns and assumes the worst-case scenario. If columns are not sized correctly (for example, a VARCHAR(255) for a short string), this influences the size of the table in memory and can cause it to go to disk earlier than it should. Also, temporary tables with blob and text columns will immediately go to disk, as the memory storage engine does not support them.

Both are currently set to 64M by default.
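To see how often temporary tables actually spill to disk, you can watch the standard Created_tmp_tables and Created_tmp_disk_tables counters, for example:

mysql> SHOW GLOBAL STATUS LIKE 'Created_tmp%tables';

If the ratio of Created_tmp_disk_tables to Created_tmp_tables is high, review the queries and column definitions first rather than simply raising tmp_table_size and max_heap_table_size.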

10. Warning Log Level

We recommend setting the warning log level to log_warnings = 2. Doing so logs information about aborted connections and access-denied errors.

11. Max Connections

If you are often facing the "Too many connections" error, max_connections is too low. Frequently, because the application does not close connections to the database correctly, you need much more than the default 151 connections. The main drawback of high values for max_connections (say, 1,000 or more) is that the server will become unresponsive if it must run that many active transactions. Using a connection pool at the application level or a thread pool at the MariaDB level can help here.
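A hedged sanity check against actual usage (standard variable and status names; the new value is just an example) could be:

mysql> SHOW GLOBAL VARIABLES LIKE 'max_connections';
mysql> SHOW GLOBAL STATUS LIKE 'Max_used_connections';
mysql> SET GLOBAL max_connections = 500;

max_connections is dynamic, so it can be raised without a restart while you investigate why connections are not being closed properly.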

12. Transaction Isolation

Investigate the available transaction isolation levels, and determine the best transaction isolation for your server’s use case.

13. Binary Log Format

We recommend using the ROW binary log format for master-master replication.

14. Auto-Increment Offsets

To help reduce the chance of collisions between two masters being written to simultaneously, the auto_increment_increment and auto_increment_offset values need to be adjusted accordingly.

15. Sync Binlog

By default, the OS handles flushing the binlog to disk. In the event of a server crash, it is possible to lose transactions from the binary log, leading to replication being out of sync. Setting sync_binlog = 1 causes the binlog file to be flushed on every commit.

This is slower, but the safest option.

16. Crash Safe(r) Slaves

To help avoid replication errors after a slave crash, enable relay log recovery and syncing of the relay log and relay log info files to disk.

17. Log Slave Updates

To have chained replication (master -> slave -> slave), log_slave_updates needs to be enabled. This tells a slave to write replicated transactions to its own binary log so that they can then be replicated to slaves off of it.

18. Read-Only Slaves

Slaves should be read-only to avoid data accidentally being written to them.

Note: Users with super privileges can still write when the server is read-only.

19. Slave Net Timeout

The slave_net_timeout variable is the number of seconds the slave will wait for a packet from the master before trying to reconnect. The default is 3600 (1 hour). This means if the link goes down and isn’t detected, it could be up to an hour before the slave reconnects. This could lead to the slave suddenly being up to an hour behind the master.

We recommend setting slave_net_timeout to a more reasonable value, such as 30 or 60.
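Pulling tips 15 through 19 together, a minimal my.cnf sketch for a replication slave might look like the following; the variable names are standard MySQL/MariaDB ones and the values are illustrative, so adapt them to your environment:

[mysqld]
# Tip 15: flush the binary log on every commit (slower, but safest)
sync_binlog = 1
# Tip 16: crash-safe(r) slave: recover and sync relay logs
relay_log_recovery = 1
sync_relay_log = 1
sync_relay_log_info = 1
# Tip 17: write replicated transactions to this slave's own binary log
log_slave_updates = 1
# Tip 18: prevent accidental writes (users with SUPER can still write)
read_only = 1
# Tip 19: detect a broken master link faster than the one-hour default
slave_net_timeout = 60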

20. Watch Our Webinar on Preparing for Black Friday & Cyber Monday

Watch our on-demand webinar – Preparing for Black Friday & Cyber Monday – to learn the four vital principles of database preparedness: security measures to protect your database from malicious attacks, performance tuning to ensure you deliver a smooth user experience, high availability strategies to ensure you don’t miss a single sale and scalability to prepare for both anticipated growth and unexpected spikes.

With all the excitement leading up to the holidays, it's easy to overlook how to prep your database environment for the biggest online shopping days of the year! Each year, more and more shoppers opt for online holiday deals instead of the more traditional mall experience, which means that retailers must prepare for multiple days of high online traffic to their e-commerce sites. This year you’ll be prepared, as I’m passing along 20 tips to tune your database environment for some of the biggest online holiday shopping days – Black Friday and Cyber Monday!

Simon Mudd

Thu, 10/26/2017 - 01:34

Changes to /etc/my.cnf and restarting MySQL/MariaDB

A blanket statement like "MariaDB configuration file is located in /etc/my.cnf. Every time you modify this file you will need to restart the MySQL service so the new changes can take effect." is very misleading. Restarting MySQL/MariaDB often means taking downtime, especially for a server running in a "24x7 web environment", so suggesting changes and then suggesting that the server be restarted does not really reflect the real world, and may have the user do something that is not necessary.

I think you should point out that:
* the /etc/my.cnf settings have an "on-server global variable equivalent value", at least in almost every case.
* of these global settings within MySQL/MariaDB most of them are dynamic. It's true that some are not and if you need to change these settings then you will need to restart the server to pick them up.
* for the dynamic settings you can do something as simple as SET GLOBAL sync_binlog = 1 to mention one of the settings referenced in the post. This would change the running value from 0 if that were set before.
* Given your list of settings that need to be looked at above I would make a clear point of indicating which of these settings are not dynamic, as this allows people to modify the dynamic ones as I indicated and then consider restarting the server to change those static settings if that's required.

Given your target audience addressing these points I mention would probably make the suggestions you make more useful to real world users.

In the environment I work in I push out changes to /etc/my.cnf and a process which monitors the change checks the running server's configuration and if it sees that a setting needs to be changed and it is dynamic the change will be made dynamically avoiding any server restart. This has proved to be very convenient and has avoided a large number of unnecessary restarts.

It is true that there are static settings which mean the server does need to be restarted, and I would always encourage the MariaDB developers to look for ways to make these settings dynamic too. That would allow higher uptime, which is in the end what we care about, while at the same time allowing us to reconfigure the running server to best suit its workload.



by jmclaurin at October 25, 2017 08:51 PM

MariaDB Galera Cluster 5.5.58 now available

MariaDB Galera Cluster 5.5.58 now available dbart Wed, 10/25/2017 - 14:22

The MariaDB project is pleased to announce the immediate availability of MariaDB Galera Cluster 5.5.58. See the release notes and changelog for details.

Download MariaDB Galera Cluster 5.5.58

Release Notes Changelog What is MariaDB Galera Cluster?



by dbart at October 25, 2017 06:22 PM

Peter Zaitsev

JSON Output of the pmm-admin list Command

In this blog post, we’ll look at Percona Monitoring and Management’s pmm-admin list command.

The pmm-admin list command shows all monitoring services you have added using the pmm-admin add command. Starting with version 1.4.0, Percona Monitoring and Management (PMM) also lists external monitoring services when you run pmm-admin list, i.e., those services that monitor the backends not supported out of the box (such as PostgreSQL databases).

In the output, the external monitoring services appear at the bottom:

The tabular output of the pmm-admin list command

JSON Output for Automatic Verification

But there is also another feature of pmm-admin list. If you run this command with the --json parameter, it gives you a JSON document as output. Thanks to the strict JSON syntax rules, this makes it possible for other programs to inspect the monitoring services. JSON has become a de-facto standard for exchanging data between tools, and the JSON output provided by the pmm-admin list command can be consumed by configuration management tools such as Ansible or Chef.

The output is captured as keys and values. The general information about the computer where this pmm-client is installed is given as top-level elements:

  • Version
  • ServerAddress
  • ServerSecurity
  • ClientName
  • ClientAddress
  • ClientBindAddress
  • Platform

You can quickly determine if there are any errors in built-in monitoring services by inspecting the Err top level element in the JSON output. Similarly, the ExternalErr element reports errors on external services:

The JSON parsing friendly version produced by the pmm-admin list command
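As a small sketch of how a script might consume this output (assuming the jq utility is installed; Err and ExternalErr are the top-level keys described above):

$ pmm-admin list --json | jq '.Err'
$ pmm-admin list --json | jq '.ExternalErr'

An empty or null result suggests no errors were reported for that group of monitoring services.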

Representing Monitoring Services

Two elements contain lists as their values. The Services top-level element contains a list of documents that represent enabled monitoring services. The ExternalServices element contains a list of documents that represent enabled external monitoring services. Each attribute in the Services and ExternalServices elements provides the same information as a column in the tabular output.

Hope this brief post provides some valuable information regarding new Percona Monitoring and Management 1.4.0 functionality. Let me know about any questions in the comments.

by Borys Belinsky at October 25, 2017 05:08 PM

October 24, 2017

MariaDB Foundation

M|18 MariaDB Users Conference and the MariaDB Developers Unconference in New York, February 2018

February 2018 sees New York City again play host to the MariaDB Users Conference, M|18. The submission deadline for the Call for Papers is October 31, so there’s still time to get your proposal in. For those attending, Early Bird pricing is applicable until the end of November. Details at https://m18.mariadb.com/. M|18, on 26 and […]

The post M|18 MariaDB Users Conference and the MariaDB Developers Unconference in New York, February 2018 appeared first on MariaDB.org.

by Ian Gilfillan at October 24, 2017 03:28 PM

Peter Zaitsev

Community Matters


Building on community

Percona is very committed to open source database software. We think of ourselves as unbiased champions of open source database solutions. With that, we also carry a responsibility to the open source database community – whether MySQL®, MongoDB®, ProxySQL or other open source database technology. We’ve seen that, and taken action by hiring a Community Manager.

That’s me. Which is great… For me!

And my job, in a nutshell, is to help to make our community great for you. By building on the good stuff that’s been done in the past and finding ways to do more.

The common thread tying the community together is the sharing of information, experience, and knowledge. Hundreds of you have taken part in Percona Live or Percona Live Europe — thank you for that! Props if you’ve done both. If you’ve proposed a paper (selected or not), presented a session, given a tutorial, staffed a booth or sponsored the event – kudos!

Maybe you’ve benefited from or run sessions at a Percona University (the next one is in Kiev in November and it’s FREE). Or caught up with Percona staff at one of the many tech conferences we attend during the year.

You might have used our code, added to our code, spotted and logged bugs, given feedback or requested new features. Helped out other users in forums, written to question-and-answer sites like Stack Overflow. Maybe you’ve blogged about using Percona software on your own blog, or looked for help on the Percona Database Performance Blog. You might have recommended our software to your company, or a colleague, or a client or a friend. Or even a stranger. Mentioned us in passing in conversation. Read our e-books, watched our webinars, shared a link or reached out to Percona via social media.

All excellent, valuable and much-appreciated contributions to the community.

Ways you can join in

Have a think about these opportunities to shine, share and make the Percona community best-in-class.

  • Take part in our forum: we really try to keep up, but there are always more questions than we can address. It’s easy to think of the forums as a support queue but honestly, we are MORE than delighted when we have help from you.
  • You have a passion for a particular subject, or maybe an interesting project to share. How about proposing a webinar or blog post? Contact me if you are interested.
  • If you haven’t yet done it, make 2018 the year you attend Percona Live. If you’ve done it before, do it again – network with old friends and make some new ones. Get a new t-Shirt. Enjoy the company. The warmth of the welcome and the generosity of the knowledge shared made a big impression on me in Dublin, I’m convinced you’ll find the same.
  • In-depth knowledge or hardcore learning on-the-job? Don’t forget that the call for papers for Percona Live is opening soon and that speakers get free attendance at the conference. It’s a competitive call, but you’re up for that right? Right! 
  • Don’t want to “do stuff” on the Percona site? Maybe contributing to code or working on the question-and-answer sites is more for you. Or maybe you have a blog already and write about our software and how to use it. If so – thanks again, and please let me have the link!
  • If you haven’t already, don’t forget to subscribe to our newsletters to get early warning of upcoming webinars, and the latest tech and community news

Have you thought about joining Percona? We’re hiring! Don’t forget, too, that all the contributions you make to online communities – Percona or not – really pay off when you want to demonstrate your knowledge and commitment to future employers or clients. A link is worth a thousand words.

What do you think?

Interested? Ideas or comments? Things you think we should do better? Things that you think are great? Things we used to do that were great and you miss? Things that others do and you wished we did? Things that … well, you get the idea!

Get in touch, or just get stuck in. You might find it rewarding*…

Feel free to email me or message me on Skype.

*I have keys to the swag box … 😉

by Lorraine Pocklington at October 24, 2017 02:36 PM

Upcoming Joint Webinar Tuesday October 24, 2017: How to automate and manage MongoDB & Percona Server for MongoDB


Join me on October 24th as the Percona team joins forces with Severalnines for a different perspective on how to automate and manage your MongoDB or Percona Server for MongoDB databases. Severalnines' Ruairí Newman will join me as we present How to automate and manage MongoDB & Percona Server for MongoDB on Tuesday, October 24, 2017, at 12:00 pm ET and 4:00 pm ET.

During this webinar, we’ll walk you through the key features of Percona Server for MongoDB as compared to MongoDB itself as well as how to manage both “flavors” of MongoDB. This joint webinar by Severalnines and Percona provides attendees with a clear understanding of the differences between the MongoDB Ops Manager and ClusterControl, and how they help automate and manage MongoDB operations. It also provides an introduction to Percona Server for MongoDB and some of its key functionalities as compared to MongoDB itself.

There are many generic tools available, both commercial and open source, to aid with the automation of operational tasks. Some of these tools are even deployed in the database world.

However, there are a small number of specialist domain-specific automation tools available, and we are going to compare the MongoDB-relevant functionality of two of these products: MongoDB’s Ops Manager, and ClusterControl from Severalnines. Attendees should take away a clear understanding of the differences between these tools, and how they help automate and manage MongoDB operations.

Agenda

  • Introduction to Percona Server for MongoDB
  • How to automate and manage MongoDB
    • Installation and maintenance
    • Complexity of architecture
    • Options for redundancy
    • Comparative functionality
    • Monitoring, dashboard, alerting
    • Backing up your deployments
    • Automated deployment of advanced configurations
    • Upgrading existing deployments

Tyler Duzan
Product Manager, Services at Percona
Before joining Percona as a Product Manager, Tyler spent almost 13 years as an operations and security engineer in a variety of different industries. Deciding to take his analytical mindset and strategic focus into new territory, Tyler is applying his knowledge to solving business problems for Percona customers with inventive solutions combining technology and services.

Ruairí Newman
Senior Support Engineer at Severalnines
Ruairí Newman is passionate about all things cloud and automation and has worked for MongoDB, VMware and Amazon Web Services among others. He has a background in Operational Support Systems and Professional Services. Before joining Severalnines, Ruairí worked for Huawei Ireland as Senior Cloud Solutions Architect on their Web Services project, where he advised on commodity cloud architecture and Monitoring technologies, and deployed and administered a Research & Development OpenStack lab.

by Tyler Duzan at October 24, 2017 04:59 AM

October 23, 2017

Peter Zaitsev

MySQL Point in Time Recovery the Right Way


In this blog, I’ll look at how to do MySQL point in time recovery (PITR) correctly.

Sometimes we need to restore from a backup, and then replay the transactions that happened after the backup was taken. This is a common procedure in most disaster recovery plans, when for example you accidentally drop a table/database or run an update/delete without the “where” clause and lose data.

The usual way is to get a copy of your binlogs and use mysqlbinlog to replay those transactions. But this approach has many pitfalls that can make the whole PITR process a nightmare. Some examples:

  • You need to make sure to run a single mysqlbinlog command with all related binlogs, and pipe them to mysql at once. Otherwise, if binlog.000001 creates a temporary table, and binlog.000002 requires that temporary table, it will not be present. Each execution of MySQL creates a new connection:

shell> mysqlbinlog binlog.000001 | mysql -u root -p # Creates tmp table X
shell> mysqlbinlog binlog.000002 | mysql -u root -p # Uses tmp table X

  • We can say that it has to be an atomic operation. If it fails halfway through, it will be very difficult to know where it failed and even more difficult to resume from that point forward. There are many reasons for it to fail: an InnoDB lock wait timeout / deadlock with some concurrent transaction, the server and client having different max_allowed_packet values (resulting in a "Lost connection to MySQL server during query" error), and so on.

So how can we overcome those limitations and have a reliable way to do Point In Time Recovery?

We can restore the backup on the desired server, build a second server with just the minimal MySQL data required, and move all the binary logs to this “fake” server's datadir. Then we need to configure the server where we want the PITR to happen as a slave of the fake server. From this point forward, it’s going to be pure MySQL replication happening.

To illustrate it, I create a Docker container on the machine. I have Percona Server for MySQL running on the box listening on 3306, and have already restored the backup on it. There is a tarball there with all binlogs required. The saved positions for PITR are as follows:

[root@localhost ~]# cat /var/lib/mysql/xtrabackup_binlog_info
master-bin.000007	1518932

I create a folder to store the Docker MySQL datadir:

mkdir /tmp/pitr
chown -R 1001 /tmp/pitr

I start the Docker container. As we can see from xtrabackup_binlog_info, my binlogs are named master-bin and I’ll be setting the same server-id as the original master:

docker run --name ps_pitr -v /tmp/pitr:/var/lib/mysql \
-p 3307:3306 -e MYSQL_ROOT_PASSWORD=secret \
-d percona/percona-server:5.7.18 \
--log_bin=master-bin --server-id=10

In case you want to make use of GTID, append --gtid-mode=ON --enforce_gtid_consistency=ON to the end of the Docker command.

The command above starts a MySQL instance, invokes mysqld --initialize, sets the root password to secret, and maps its port 3306 back to my local port 3307. Now I’ll stop it, remove the binlogs that it created, uncompress and move all required binlogs to its mapped datadir folder, and start it again:

docker stop ps_pitr
rm /tmp/pitr/master-bin.*
tar -zxf binlogs.tgz -C /tmp/pitr
chown -R 1001 /tmp/pitr/master-bin.*
docker start ps_pitr

If it all worked correctly, at this point we can see the full list of binary logs on the Docker container by connecting on port 3307:

mysql -u root -psecret -P 3307 --protocol=TCP -e "SHOW BINARY LOGS"
mysql: [Warning] Using a password on the command line interface can be insecure.
+-------------------+-----------+
| Log_name          | File_size |
+-------------------+-----------+
| master-bin.000005 |  26216208 |
| master-bin.000006 |  26214614 |
| master-bin.000007 |  26214902 |
. . .
| master-bin.000074 |       154 |
+-------------------+-----------+

Now, all we need to do is connect to our server, which has the backup restored, and configure it as a slave from 3307:

mysql -u root -p
Enter password:
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 6
Server version: 5.7.18-16 Percona Server (GPL), Release 16, Revision d7301f8
Copyright (c) 2009-2017 Percona LLC and/or its affiliates
Copyright (c) 2000, 2017, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> CHANGE MASTER TO MASTER_HOST='127.0.0.1', MASTER_PORT=3307, MASTER_USER='root', MASTER_PASSWORD='secret', MASTER_LOG_FILE='master-bin.000007', MASTER_LOG_POS=1518932;
Query OK, 0 rows affected, 2 warnings (0.01 sec)
mysql> START SLAVE;
Query OK, 0 rows affected (0.01 sec)
mysql> SHOW SLAVE STATUS\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 127.0.0.1
                  Master_User: root
                  Master_Port: 3307
                Connect_Retry: 60
              Master_Log_File: master-bin.000008
          Read_Master_Log_Pos: 449696
               Relay_Log_File: localhost-relay-bin.000002
                Relay_Log_Pos: 28957
        Relay_Master_Log_File: master-bin.000007
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 15217950
              Relay_Log_Space: 11476311
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 4382
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 10
                  Master_UUID: 80b9fe26-a945-11e7-aa1d-0242ac110002
             Master_Info_File: /var/lib/mysql/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Opening tables
           Master_Retry_Count: 86400
                  Master_Bind:
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp:
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set:
            Executed_Gtid_Set:
                Auto_Position: 0
         Replicate_Rewrite_DB:
                 Channel_Name:
           Master_TLS_Version:
1 row in set (0.17 sec)
. . .
mysql> SHOW SLAVE STATUS\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 127.0.0.1
                  Master_User: root
                  Master_Port: 3307
                Connect_Retry: 60
              Master_Log_File: master-bin.000074
          Read_Master_Log_Pos: 154
               Relay_Log_File: localhost-relay-bin.000133
                Relay_Log_Pos: 381
        Relay_Master_Log_File: master-bin.000074
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 154
              Relay_Log_Space: 819
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 10
                  Master_UUID: 80b9fe26-a945-11e7-aa1d-0242ac110002
             Master_Info_File: /var/lib/mysql/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Master_Retry_Count: 86400
                  Master_Bind:
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp:
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set:
            Executed_Gtid_Set:
                Auto_Position: 0
         Replicate_Rewrite_DB:
                 Channel_Name:
           Master_TLS_Version:
1 row in set (0.01 sec)

If you want to apply logs up to a particular time, you can use mysqlbinlog to verify the last position / GTID it should apply, and then use START SLAVE UNTIL MASTER_LOG_FILE = 'log_name', MASTER_LOG_POS = log_pos or START SLAVE SQL_THREAD UNTIL SQL_AFTER_GTIDS = 3E11FA47-71CA-11E1-9E33-C80AA9429562:11-56.

Special thanks to Marcos Albe, who originally showed me this MySQL point in time recovery approach.

by Marcelo Altmann at October 23, 2017 05:08 PM

October 20, 2017

Peter Zaitsev

Percona Monitoring and Management 1.4.0 Is Now Available

Percona announces the release of Percona Monitoring and Management 1.4.0.

This release introduces the support of external Prometheus exporters so that you can create dashboards in the Metrics monitor even for the monitoring services other than those provided with PMM client packages. To attach an existing external Prometheus exporter, run pmm-admin add external:metrics NAME_OF_EXPORTER URL:PORT.

The list of attached monitoring services is now available not only in the tabular format but also as a JSON file to enable automatic verification of your configuration. To view the list of monitoring services in the JSON format run pmm-admin list --json.

In this release, Prometheus and Grafana have been upgraded. Prometheus version 1.7.2, shipped with this release, offers a number of bug fixes that will contribute to its smooth operation inside PMM. For more information, see the Prometheus change log.

Version 4.5.2 of Grafana, included in this release of PMM, offers a number of new tools that will facilitate data analysis in PMM:

  • New query editor for Prometheus expressions features syntax highlighting and autocompletion for metrics, functions and range vectors.
  • Query inspector provides detailed information about the query. The primary goal of graph inspector is to enable analyzing a graph which does not display data as expected.

The complete list of new features in Grafana 4.5.0 is available from What’s New in Grafana v4.5.

For install and upgrade instructions, see Deploying Percona Monitoring and Management.

New features

  • PMM-1520: Prometheus upgraded to version 1.7.2.
  • PMM-1521: Grafana upgraded to version 4.5.2.
  • PMM-1091: The pmm-admin list produces a JSON document as output if the --json option is supplied.
  • PMM-507: External exporters are supported with pmm-admin.
  • PMM-1622: Docker images of PMM Server are available for download as tar packages.

Improvements

  • PMM-1553: Consul upgraded to the 0.8 release.

Bug fixes

  • PMM-1172: In some cases, the TABLES section of a query in QAN could contain no data and display the List of tables is empty error. The Query and Explain sections had the relevant values.
  • PMM-1519: A Prometheus instance could be forced to shut down if it contained too many targets (more than 50). When started the next time, Prometheus initiated a time-consuming crash recovery routine, which took a long time on large installations.

by Borys Belinsky at October 20, 2017 10:02 PM

Open Query Pty Ltd

Your E-Commerce site and Credit Cards

Sites that deal with credit cards can have some sloppy practices.  Not through malicious intent, but it’s sloppy nevertheless so it should be addressed.  There are potential fraud and identity theft issues at stake, and any self-respecting site will want to be seen to be respecting their clients!

First, a real-world story. Read Using expired credit cards

The key lesson from there is that simply abiding by what payment gateways, banks and other credit card providers require does not make your payment system good.  While it is hoped that those organisations also clean up their processes a bit, you can meanwhile make sure that you do the right thing by your clients regardless of that.

First of all, ensure that all pages and all page-items (CSS, images, scripts, form submit destinations, etc) as well as payment gateway communications go over HTTPS.  Having some aspects of payment/checkout/profile pages not over HTTPS will show up in browsers, and it looks very sloppy indeed. Overall, you are encouraged to just make your entire site run over HTTPS.  But if you use any external sources for scripts, images or other content, that too needs to be checked as it can cause potential leaks in your site security on the browser end.

For the credit card processing, here are a few tips for what you can do from your end:

  • DO NOT store credit card details.  Good payment gateways work with a token system, so you can handle recurring payments and clients can choose to have their card kept on file, but you don’t have it.  After all, data you don’t have, cannot be leaked or stolen.
  • DO NOT check credit card number validity before submitting to the payment gateway, i.e. don’t apply the Luhn check.  We wrote about this over a decade ago, but it’s still relevant: Luhn algorithm (credit card number check).  In a nutshell, if you do pre-checks, the payment gateway gets less data and might miss fraud attempts.
  • Check that your payment gateway requires the CVV field, and checks it.  If it doesn’t do this, the gateway will be bad at fraud prevention: have them fix it, or move to another provider.
  • Check that your payment gateway does not allow use of expired cards, not even for recurring payments using cards-on-file.  This is a bit more difficult to check (since you don’t want to be storing credit card details locally) and you may only find out over time, but try to make this effort.  It is again an issue that can otherwise harm your clients.

If you have positive confirmation that your payment gateway does the right thing, please let us know!  It will help others.  Thanks.

by Arjen Lentz at October 20, 2017 02:33 AM

October 19, 2017

Peter Zaitsev

Percona Blog Poll: How Do You Currently Host Applications and Databases?


Percona's latest blog poll asks how you currently host applications and databases. Select an option below, or leave a comment to clarify your deployment!

With the increased need for environments that respond more quickly to changing business demands, many enterprises are moving to the cloud and hosted deployments for applications and software in order to offload development and maintenance overhead to a third party. The database is no exception. Businesses are turning to using database as a service (DBaaS) to handle their data needs.

DBaaS provides some obvious benefits:

  • Offload physical infrastructure to another vendor. It is the responsibility of whoever is providing the DBaaS service to maintain the physical environment – including hardware, software and best practices.
  • Scalability. You can add or subtract capacity as needed by just contacting your vendor. Have a big event on the horizon? Order more servers!
  • Expense. Since you no longer have to shell out for operational costs or infrastructure upgrades (all handled by the vendor now), you can reduce capital and operational expenses – or at least reasonably plan what they are going to be.

There are some potential disadvantages to a DBaaS as well:

  • Network performance issues. If your database is located off-premises, then it can be subject to network issues (or outages) that are beyond your control. These can translate into performance problems that impact the customer experience.
  • Loss of visibility. It’s harder (though not impossible) to always know what is happening with your data. Decisions around provisioning, storage and architecture are now in the hands of a third party.
  • Security and compliance. You are no longer totally in control of how secure or compliant your data is when using a DBaaS. This can be crucial if your business requires certain standards to operate in your market (healthcare, for example).

How are you hosting your database? On-premises? In the cloud? Which cloud? Is it co-located? Please answer using the poll below. Choose up to three answers. If you don’t see your solutions, use the comments to explain.

Note: There is a poll embedded within this post, please visit the site to participate in this post's poll.

Thanks in advance for your responses – they will help the open source community determine how databases are being hosted.

by Tyler Duzan at October 19, 2017 02:17 PM

Valeriy Kravchuk

Fun with Bugs #56 - On Some Public Bugs Fixed in MySQL 5.7.20

While MySQL 8.0.x hardly has much impact on my regular work, the recent MySQL 5.7.20 release is something to check carefully. MySQL 5.7 is widely used in production, as a base for Percona Server 5.7, some features may be merged into MariaDB 10.x, etc. So, here is my review of some community-reported bugs that were fixed in the recently released MySQL 5.7.20, based on the release notes.

Usually I start with InnoDB bug fixes, but in 5.7.20 several related fixes were made only to bugs reported internally. So, this time I have to start with partitioning:
  • Bug #86255 - First one to write about, and the bug report is private... Second one (Bug #76418) is also private. All we have is this:
    "Partitioning: In certain cases when fetching heap records a partition ID could be set to zero. (Bug #86255, Bug #26034430)"

    "Partitioning: Queries involving NULL were not always handled correctly on tables that were partitioned by LIST. (Bug #76418, Bug #20748521)"
That's all, folks. I fail to understand why bugs with such a description can stay private after they are fixed. I have to admit: I do not get it. Moreover, I am not going even to try any longer. Lists with one item look stupid, but hiding such bugs is not much better, IMHO.

Lucky I am, there were several bug fixes related to replication:
  • Bug #85352 - "Replication regression with RBR and partitioned tables". This regression bug (comparing to 5.5.x) was reported by Juan Arruti and immediately verified by Umesh Shastry. I do not know why it is listed as InnoDB in the release notes, as it hardly can be about native InnoDB partitioning, based on versions affected.
  • Bug #86224 - "mysqlrplsync errors out because slave is using --super-read-only option". It was reported by Richard Morris and verified by Bogdan Kecman.
  • Bug #84731 - "Group Replication: mysql client connections hang during group replication start", was reported by Kenny Gryp and verified by Umesh Shastry. Another bug reported by Kenny was also fixed in 5.7.20, Bug #84798 - "Group Replication can use some verbosity in the error log".
  • Bug #86250 - "thd->ha_data[ht_arg->slot].ha_ptr_backup == __null || (thd->is_binlog_applier()". This debug assertion was reported by Roel Van de Paar and verified by Umesh Shastry.
  • Bug #85639 - "XA transactions are 'unsafe' for RPL using SBR". It was reported by João Gramacho.
  • Bug #86288 - "mysqlbinlog read-from-remote-server not honoring rewrite_db filtering", was reported by Julie Hergert.
Other bug fixes not to miss are:
  • Bug #85622 - "5.6: wrong results with materialization, views". It was reported by Shane Bester. Even though it was initially stated that only 5.6 is affected, release notes say there was a fix for 5.7.20 and even 8.0.3.
  • Bug #82992 - "Some warnings appear in dump from mysqldump". This funny bug was found by Nong LO and verified by Sinisa Milivojevic.
  • Bug #81714 - "mysqldump get_view_structure does not free MYSQL_RES in one error path". This bug was reported by Laurynas Biveinis, but it was noticed and patched by Yura Sorokin, also from Percona.
  • Bug #83950 - "LOAD DATA INFILE fails with an escape character followed by a multi-byte one", was reported by yours truly and verified by Umesh Shastry. Unfortunately, the bug report itself does NOT say explicitly which versions received the fix.
  • Bug #79596 - "If client killed after ROLLBACK TO SAVEPOINT previous stmts committed". This regression and potential data corruption bug was reported by Sveta Smirnova, verified by Umesh Shastry and studied at the source code level by Zhai Weixiang. Nice to see it fixed!
That's all, a few build- and packaging-related bugs aside.

This was my very last attempt to do a detailed review of bug reports from the MySQL Community based on the official Release Notes. With private bugs and very few fixes for things coming from the public bugs database in general, it seems to make zero sense now to continue these. Authors of contributed patches are properly mentioned by Oracle, and we all know who verifies community bug reports... One Twitter message would be enough to fit everything I have to say, and any real details are better checked in git commits.

It's time to get back to the roots of this series and maybe write about bugs just opened, bugs not properly handled, or just funny ones. I think it helped a lot back in 2013 to make MySQL 5.6 a release that was commonly accepted as a "good one". It's not fun any more, and not very useful, to report Oracle's achievements in public bug fixing, so I'd better switch to their problems.

by Valeriy Kravchuk (noreply@blogger.com) at October 19, 2017 07:27 AM

October 18, 2017

Peter Zaitsev

How to Choose the MySQL innodb_log_file_size


In this blog post, I’ll provide some guidance on how to choose the MySQL innodb_log_file_size.

Like many database management systems, MySQL uses logs to achieve data durability (when using the default InnoDB storage engine). This ensures that when a transaction is committed, data is not lost in the event of crash or power loss.

MySQL’s InnoDB storage engine uses a fixed-size (circular) Redo log space. The size is controlled by innodb_log_file_size and innodb_log_files_in_group (default 2). You multiply those values to get the Redo log space that is available to use. While technically it shouldn’t matter which of the two variables you change to control the Redo space size, most people just work with innodb_log_file_size and leave innodb_log_files_in_group alone.
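
As a quick illustration, the total Redo space on a running server can be computed directly from these two variables (a minimal sketch; the column alias is just for readability):

SELECT @@innodb_log_file_size * @@innodb_log_files_in_group
       AS total_redo_space_bytes;
-- e.g., two 1GB log files give 2GB of total Redo space, the amount used
-- in the log space usage example later in this post.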

Configuring InnoDB’s Redo space size is one of the most important configuration options for write-intensive workloads. However, it comes with trade-offs. The more Redo space you have configured, the better InnoDB can optimize write IO. On the other hand, increasing the Redo space also means longer recovery times when the system loses power or crashes for other reasons.

It is not easy or straightforward to predict how much time crash recovery takes for a specific innodb_log_file_size value – it depends on the hardware, MySQL version and workload, and can vary widely (a 10x difference or more, depending on the circumstances). However, around five minutes per 1GB of innodb_log_file_size is a decent ballpark number. If this is really important for your environment, I would recommend testing it by simulating a system crash under full load (after the database has completely warmed up).

While recovery time can be a guideline for the limit of the InnoDB Log File size, there are a couple of other ways you can look at this number – especially if you have Percona Monitoring and Management installed.

Check Percona Monitoring and Management’s “MySQL InnoDB Metrics” Dashboard. If you see a graph like this:

[Graph: Uncheckpointed Bytes approaching Max Checkpoint Age]

where Uncheckpointed Bytes is pushing very close to the Max Checkpoint Age, you can almost be sure your current innodb_log_file_size is limiting your system’s performance. Increasing it can provide substantial performance improvements.

If you see something like this instead:

[Graph: Uncheckpointed Bytes well below Max Checkpoint Age]

where the number of Uncheckpointed Bytes is well below the Max Checkpoint Age, then increasing the log file size won’t give you a significant improvement.

Note: many MySQL settings are interconnected. While a specific log file size might be good enough for smaller innodb_buffer_pool_size, larger InnoDB Buffer Pool values might warrant larger log files for optimal performance.

Another thing to keep in mind: the recovery time we spoke about earlier really depends on the Uncheckpointed Bytes rather than the total log file size. If you do not see recovery time increasing with a larger innodb_log_file_size, check out the InnoDB Checkpoint Age graph – it might be that you just can’t fully utilize large log files with your workload and configuration.

Another way to look at the log file size is in context of log space usage:

[Graph: InnoDB log file usage – data written per hour vs. total log file size]

This graph shows the amount of Data Written to the InnoDB log files per hour, as well as the total size of the InnoDB log files. In the graph above, we have 2GB of log space and some 12GB written to the Log files per hour. This means we cycle through logs every ten minutes.

InnoDB has to flush every dirty page in the buffer pool at least once per log file cycle time.

InnoDB gets better performance when it does that less frequently, and there is less wear and tear on SSD devices. I like to see this number at no less than 15 minutes. One hour is even better.  
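
If you don’t have these graphs handy, a rough estimate of the log cycle time can be derived from the server’s own counters. The following is a minimal, hedged sketch: it samples the standard Innodb_os_log_written status counter twice, 60 seconds apart, and the arithmetic in the comments is only illustrative.

SHOW GLOBAL STATUS LIKE 'Innodb_os_log_written';
SELECT SLEEP(60);
SHOW GLOBAL STATUS LIKE 'Innodb_os_log_written';
-- The difference between the two samples is the number of redo bytes
-- written per minute. Divide the total Redo space
-- (innodb_log_file_size * innodb_log_files_in_group) by that rate to
-- estimate the log cycle time in minutes; per the guideline above, aim
-- for at least 15 minutes.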

Summary

Getting the innodb_log_file_size right is important to achieve a balance between reasonably fast crash recovery time and good system performance. Remember, meeting your recovery time objective is not as trivial as you might imagine. I hope the techniques described in this post help you to find the optimal value for your situation!

by Peter Zaitsev at October 18, 2017 10:15 PM

Webinar Thursday, October 19, 2017: What You Need to Get the Most Out of Indexes – Part 2


Join Percona’s Senior Architect, Matthew Boehm, as he presents the What You Need to Get the Most Out of Indexes – Part 2 webinar on Thursday, October 19, 2017, at 11:00 am PDT / 2:00 pm EDT (UTC-7).

Proper indexing is key to database performance. Finely tune your query writing and database performance with tips from the experts. MySQL offers a few different types of indexes and uses them in a variety of ways.

In this session you’ll learn:

  • How to use composite indexes
  • Other index usages besides lookup
  • How to find unoptimized queries
  • What is there beyond EXPLAIN?

Register for the webinar.

Matthew Boehm, Architect

Matthew joined Percona in the fall of 2012 as a MySQL consultant. His areas of knowledge include the traditional Linux/Apache/MySQL/PHP stack, memcached, MySQL Cluster, massive sharding topologies, PHP development and a bit of MySQL-C-API development. Previously, Matthew DBAed for the fifth largest MySQL installation at eBay/PayPal. He also hails from managed hosting environments. During his off-hours, Matthew is a nationally ranked, competitive West Coast Swing dancer and travels to competitions around the US. He enjoys working out, camping, biking and playing MMOs with his son.

by Matthew Boehm at October 18, 2017 06:46 PM

MariaDB Foundation

MariaDB 5.5.58 and MariaDB Connector/ODBC 3.0.2 now available

The MariaDB project is pleased to announce the immediate availability of MariaDB 5.5.58, as well as the recent release of MariaDB Connector/ODBC 3.0.2. These are both stable (GA) releases. See the release notes and changelog for details. Download MariaDB 5.5.58 Release Notes Changelog What is MariaDB 5.5? MariaDB APT and YUM Repository Configuration Generator Download […]

The post MariaDB 5.5.58 and MariaDB Connector/ODBC 3.0.2 now available appeared first on MariaDB.org.

by Ian Gilfillan at October 18, 2017 07:11 AM

MariaDB AB

MariaDB 5.5.58 now available


The MariaDB project is pleased to announce the immediate availability of MariaDB 5.5.58. See the release notes and changelog for details.

Download MariaDB 5.5.58

  • Release Notes
  • Changelog
  • What is MariaDB 5.5?



by dbart at October 18, 2017 07:06 AM

October 17, 2017

Peter Zaitsev

Webinar Wednesday, October 18, 2017: How to Scale with MongoDB


Join Percona’s Senior Technical Services Engineer Adamo Tonete as he presents How To Scale with MongoDB on Wednesday, October 18, 2017, at 11:00 am PDT / 2:00 pm EDT (UTC-7).

In this webinar, we will talk about how to scale with MongoDB, up to thousands of writes and reads per second. What are the common issues when you scale with MongoDB? Is it better to shard or to add further secondaries?

We will walk through many common scaling situations, and through the steps needed to deploy a sharded cluster: from a single instance to a sharded environment. We will also talk about common mistakes/pitfalls a company can make when scaling its database – and how to avoid such situations.

Register for the webinar.

Adamo Tonete, Senior Technical Services Engineer

Adamo joined Percona in 2015, after working as a MongoDB/MySQL database administrator for three years. As the main database member of a startup, he was responsible for suggesting the best architecture and data flows for a worldwide company in a 24×7 environment. Before that, he worked as a Microsoft SQL Server DBA for a large e-commerce company, mainly on performance tuning and automation. Adamo has almost eight years of experience working as a DBA, and in the past three has moved to NoSQL technologies without giving up relational databases. He likes to play video games and study everything that is related to engines. Adamo lives with his wife in São Paulo, Brazil.

by Adamo Tonete at October 17, 2017 03:40 PM

October 16, 2017

Peter Zaitsev

When Should I Enable MongoDB Sharding?


In this blog post, we will talk about MongoDB sharding and walk through the main reasons why you should start a cluster (independent of the approach you have chosen).

Note: I will cover this subject in my webinar How To Scale with MongoDB on Wednesday, October 18, 2017, at 11:00 am PDT / 2:00 pm EDT (UTC-7).

Sharding is the most complex architecture you can deploy using MongoDB, and there are two main approaches as to when to shard or not. The first is to configure the cluster as soon as possible – when you predict high throughput and fast data growth.

The second says you should use a cluster as the best alternative when the application demands more resources than the replica set can offer (such as low memory, an overloaded disk or high processor load). This approach is more corrective than preventative, but we will discuss that in the future.

1) Disaster recovery plan

Disaster recovery (DR) is a very delicate topic: how long would you tolerate an outage? If necessary, how long would it take you to restore the entire database? Depending on the database size and on disk speed, a backup/restore process might take hours or even days!
There is no hard number in gigabytes to justify a cluster. But in general, once the database grows beyond 200GB, the backup and restore processes might take a while to finish.
Let’s consider the case where we have a replica set with a 300GB database. The full restore process might last around four hours, whereas if the database has two shards, it will take about two hours – and depending on the number of shards, we can improve that time further. Simple math: with two shards, the restore process takes half the time compared to a single replica set.

2) Hardware limitations

Disk and memory are inexpensive nowadays. However, this is not true when companies need to scale out to high numbers (such as TBs of RAM). Suppose your cloud provider can only offer you up to 5,000 IOPS in the disk subsystem, but the application needs more than that to work correctly. To work around this performance limitation, it is better to start a cluster and divide the writes among instances. That said, if there are two shards, the application will have 10,000 IOPS available for writes and reads in the disk subsystem.

3) Storage engine limitations

There are a few storage engine limitations that can be a bottleneck in your use case. MMAPv1 has a lock per collection, while WiredTiger has tickets that limit the number of writes and reads happening concurrently. Although we can tweak the number of tickets available in WiredTiger, there is a virtual limit – which means that changing the available tickets might generate processor overload instead of increasing performance. If one of these situations becomes a bottleneck in your system, you should start a cluster. Once you shard the collection, you distribute the load/lock among the different instances.

4) Hot data vs. cold data

Several databases only work with a small percentage of the data being stored. This is called hot data or working set. Cold data or historical data is rarely read, and demands considerable system resources when it is. So why spend money on expensive machines that only store cold data or low-value data? With a cluster deployment we can choose where the cold data is stored, and use cheap devices and disks to do so. The same is true for hot data – we can use better machines to have better performance. This methodology also speeds up writes and reads on the hot data, as the indexes are smaller and add less overhead to the system.

5) Geo-distributed data

It doesn’t matter whether this need comes from application design or legal compliance. If the data must stay within continent or country borders, a cluster helps make that happen. It is possible to limit data localization so that it is stored solely in a specific “part of the world.” The number of shards and their geographic positions is not essential for the application, as it only views the database. This is commonly used in worldwide companies for better performance, or simply to comply with the local law.

6) Infrastructure limitations

Infrastructure and hardware limitations are very similar. When thinking about infrastructure, however, we focus on specific cases when the instances should be small. An example is running MongoDB on Mesos. Some providers only offer a few cores and a limited amount of RAM. Even if you are willing to pay more for that, it is not possible to purchase more than they offer as their products. A cluster provides the option to split a small amount of data among a lot of shards, reaching the same performance a big and expensive machine provides.

7) Failure isolation

Consider that a replica set or a single instance holds all the data. If for any reason this instance/replica set goes down, the whole application goes down. In a cluster, if we lose one of the five shards, 80% of the data is still available. Running a few shards helps to isolate failures. Obviously, running a bunch of instances makes the cluster prone to have a failed instance, but as each shard must have at least three instances the probability of the entire shard being down is minimal. For providers that offer different zones, it is good practice to have different members of the shard in different availability zones (or even different regions).

8) Speed up queries

Queries can take too long, depending on the number of reads they perform. In a clustered deployment, queries can run in parallel and speed up the query response time. If a query runs in ten seconds in a replica set, it is very likely that the same query will run in five to six seconds if the cluster has two shards, and so on.

I hope this helps with MongoDB sharding. Having a cluster solves several other problems as well, and we have listed only a few of them. Don’t miss our webinar regarding scaling out MongoDB next Wednesday, October 18, 2017!

by Adamo Tonete at October 16, 2017 06:27 PM

October 13, 2017

Peter Zaitsev

This Week in Data with Colin Charles 10: MariaDB and Upcoming Appearances


Join Percona Chief Evangelist Colin Charles as he covers happenings, gives pointers and provides musings on the open source database community.

Beyond spending time getting ready for Velocity and Open Source Summit Europe, there was also some feature testing this week that compared MySQL and MariaDB. Naturally, a long report/blog post is coming soon. Stay tuned.

Releases

I reckon a lot of folks are swamped after Percona Live Europe Dublin and Oracle OpenWorld, so the releases in the MySQL universe are a bit quieter.

Link List

Upcoming Appearances

Percona’s website keeps track of community events, so check out where to see and listen to a Perconian speak. My upcoming appearances are:

Feedback

I was asked why there weren’t many talks from MariaDB Foundation / MariaDB Corporation at Percona Live Europe 2017. Simple answer: there were hardly any submissions. We had two talk submissions from one speaker from the MariaDB Foundation (we accepted the one on MariaDB 10.3). There was another talk submission from a speaker from MariaDB Corporation (again, accepted). We announced the second talk in the sneak preview, but the talk was canceled as the speaker was unable to attend. We did, however, have a broad range of talks about MariaDB, with many that discussed high availability, security, proxies and the like.

I look forward to feedback/tips via e-mail at colin.charles@percona.com or on Twitter @bytebot.

by Colin Charles at October 13, 2017 09:40 PM

MariaDB AB

What’s New in MariaDB Connector/ODBC 3.0


We are pleased to announce the general availability (GA) of MariaDB Connector/ODBC 3.0. MariaDB Connector/ODBC 3.0.2 is the newest version of MariaDB Connector/ODBC. This release is compatible with MariaDB Connector/ODBC 2.0 – no code changes necessary to upgrade.

MariaDB Connector/ODBC 3.0 is based on the new MariaDB Connector/C 3.0. It therefore takes advantage of the new security enhancements and plugins provided by MariaDB Connector/C.

Direct dependencies on functions of MariaDB Connector/C have been removed; only the functions provided by the documented API are used. This allows dynamic linking where preferred. Our own binaries still use static linking.

New features

  • Support of a new bulk load implementation (needs MariaDB Server 10.2)
  • SQLExecDirect: prepares and executes a statement directly (needs MariaDB Server 10.2)
  • New SSL options based on the implementation in MariaDB Connector/C
    • GnuTLS
    • Windows SChannel: removes dependencies on external libraries
    • Windows SChannel: becomes the default for SSL on Windows 
    • TLSv1.1 and TLSv1.2 support
  • SSL option can now be defined in the setup dialog

Download the MariaDB Connector now and learn about the newest evolution of MariaDB Connector/ODBC 3.0.



by RalfGebhardt at October 13, 2017 01:16 PM

Jean-Jerome Schmidt

How to Automate Galera Cluster Using the ClusterControl CLI

As sysadmins and developers, we spend a lot of our time in a terminal. So we brought ClusterControl to the terminal with our command line interface tool called s9s. s9s provides an easy interface to the ClusterControl RPC v2 API. You will find it very useful when working with large scale deployments, as the CLI allows you to design more complex features and workflows.

This blog post showcases how to use s9s to automate the management of Galera Cluster for MySQL or MariaDB, as well as a simple master-slave replication setup.

Setup

You can find installation instructions for your particular OS in the documentation. What’s important to note is that if you happen to use the latest s9s-tools, from GitHub, there’s a slight change in the way you create a user. The following command will work fine:

s9s user --create --generate-key --controller="https://localhost:9501" dba

In general, there are two steps required if you want to configure CLI locally on the ClusterControl host. First, you need to create a user and then make some changes in the configuration file - all the steps are included in the documentation.

Deployment

Once the CLI has been configured correctly and has SSH access to your target database hosts, you can start the deployment process. At the time of writing, you can use the CLI to deploy MySQL, MariaDB and PostgreSQL clusters. Let’s start with an example of how to deploy Percona XtraDB Cluster 5.7. A single command is required to do that.

s9s cluster --create --cluster-type=galera --nodes="10.0.0.226;10.0.0.227;10.0.0.228"  --vendor=percona --provider-version=5.7 --db-admin-passwd="pass" --os-user=root --cluster-name="PXC_Cluster_57" --wait

The last option, “--wait”, means that the command will wait until the job completes, showing its progress. You can skip it if you want - in that case, the s9s command will return to the shell immediately after it registers a new job in cmon. This is perfectly fine, as cmon is the process which handles the job itself. You can always check the progress of a job separately, using:

root@vagrant:~# s9s job --list -l
--------------------------------------------------------------------------------------
Create Galera Cluster
Installing MySQL on 10.0.0.226                                           [██▊       ]
                                                                                                                                                                                                         26.09%
Created   : 2017-10-05 11:23:00    ID   : 1          Status : RUNNING
Started   : 2017-10-05 11:23:02    User : dba        Host   :
Ended     :                        Group: users
--------------------------------------------------------------------------------------
Total: 1

Let’s take a look at another example. This time we’ll create a new MySQL replication cluster: a simple master-slave pair. Again, a single command is enough:

root@vagrant:~# s9s cluster --create --nodes="10.0.0.229?master;10.0.0.230?slave" --vendor=percona --cluster-type=mysqlreplication --provider-version=5.7 --os-user=root --wait
Create MySQL Replication Cluster
/ Job  6 FINISHED   [██████████] 100% Cluster created

We can now verify that both clusters are up and running:

root@vagrant:~# s9s cluster --list --long
ID STATE   TYPE        OWNER GROUP NAME           COMMENT
 1 STARTED galera      dba   users PXC_Cluster_57 All nodes are operational.
 2 STARTED replication dba   users cluster_2      All nodes are operational.
Total: 2

Of course, all of this is also visible via the GUI:

Now, let’s add a ProxySQL loadbalancer:

root@vagrant:~# s9s cluster --add-node --nodes="proxysql://10.0.0.226" --cluster-id=1
WARNING: admin/admin
WARNING: proxy-monitor/proxy-monitor
Job with ID 7 registered.

This time we didn’t use the ‘--wait’ option, so if we want to check the progress, we have to do it on our own. Please note that we are using the job ID which was returned by the previous command, so we’ll obtain information on this particular job only:

root@vagrant:~# s9s job --list --long --job-id=7
--------------------------------------------------------------------------------------
Add ProxySQL to Cluster
Waiting for ProxySQL                                                     [██████▋   ]
                                                                            65.00%
Created   : 2017-10-06 14:09:11    ID   : 7          Status : RUNNING
Started   : 2017-10-06 14:09:12    User : dba        Host   :
Ended     :                        Group: users
--------------------------------------------------------------------------------------
Total: 7

Scaling out

Nodes can be added to our Galera cluster via a single command:

s9s cluster --add-node --nodes 10.0.0.229 --cluster-id 1
Job with ID 8 registered.
root@vagrant:~# s9s job --list --job-id=8
ID CID STATE  OWNER GROUP CREATED  RDY  TITLE
 8   1 FAILED dba   users 14:15:52   0% Add Node to Cluster
Total: 8

Something went wrong. We can check what exactly happened:

root@vagrant:~# s9s job --log --job-id=8
addNode: Verifying job parameters.
10.0.0.229:3306: Adding host to cluster.
10.0.0.229:3306: Testing SSH to host.
10.0.0.229:3306: Installing node.
10.0.0.229:3306: Setup new node (installSoftware = true).
10.0.0.229:3306: Detected a running mysqld server. It must be uninstalled first, or you can also add it to ClusterControl.

Right, that IP is already used for our replication server. We should have used another, free IP. Let’s try that:

root@vagrant:~# s9s cluster --add-node --nodes 10.0.0.231 --cluster-id 1
Job with ID 9 registered.
root@vagrant:~# s9s job --list --job-id=9
ID CID STATE    OWNER GROUP CREATED  RDY  TITLE
 9   1 FINISHED dba   users 14:20:08 100% Add Node to Cluster
Total: 9

Managing

Let’s say we want to take a backup of our replication master. We can do that from the GUI, but sometimes we may need to integrate it with external scripts. The ClusterControl CLI makes a perfect fit for such a case. Let’s check what clusters we have:

root@vagrant:~# s9s cluster --list --long
ID STATE   TYPE        OWNER GROUP NAME           COMMENT
 1 STARTED galera      dba   users PXC_Cluster_57 All nodes are operational.
 2 STARTED replication dba   users cluster_2      All nodes are operational.
Total: 2

Then, let’s check the hosts in our replication cluster, with cluster ID 2:

root@vagrant:~# s9s nodes --list --long --cluster-id=2
STAT VERSION       CID CLUSTER   HOST       PORT COMMENT
soM- 5.7.19-17-log   2 cluster_2 10.0.0.229 3306 Up and running
soS- 5.7.19-17-log   2 cluster_2 10.0.0.230 3306 Up and running
coC- 1.4.3.2145      2 cluster_2 10.0.2.15  9500 Up and running

As we can see, there are three hosts that ClusterControl knows about - two of them are MySQL hosts (10.0.0.229 and 10.0.0.230), the third one is the ClusterControl instance itself. Let’s print only the relevant MySQL hosts:

root@vagrant:~# s9s nodes --list --long --cluster-id=2 10.0.0.2*
STAT VERSION       CID CLUSTER   HOST       PORT COMMENT
soM- 5.7.19-17-log   2 cluster_2 10.0.0.229 3306 Up and running
soS- 5.7.19-17-log   2 cluster_2 10.0.0.230 3306 Up and running
Total: 3

In the “STAT” column you can see some characters. For more information, we’d suggest looking into the manual page for s9s-nodes (man s9s-nodes). Here we’ll just summarize the most important bits. The first character tells us about the type of the node: “s” means it’s a regular MySQL node, “c” - the ClusterControl controller. The second character describes the state of the node: “o” tells us it’s online. The third character is the role of the node: “M” describes a master, “S” - a slave, while “C” stands for controller. The final, fourth character tells us if the node is in maintenance mode: “-” means there’s no maintenance scheduled; otherwise we’d see “M” here. So, from this data we can see that our master is the host with IP 10.0.0.229. Let’s take a backup of it and store it on the controller.

root@vagrant:~# s9s backup --create --nodes=10.0.0.229 --cluster-id=2 --backup-method=xtrabackupfull --wait
Create Backup
| Job 12 FINISHED   [██████████] 100% Command ok

We can then verify if it indeed completed ok. Please note the “--backup-format” option which allows you to define which information should be printed:

root@vagrant:~# s9s backup --list --full --backup-format="Started: %B Completed: %E Method: %M Stored on: %S Size: %s %F\n" --cluster-id=2
Started: 15:29:11 Completed: 15:29:19 Method: xtrabackupfull Stored on: 10.0.0.229 Size: 543382 backup-full-2017-10-06_152911.xbstream.gz
Total 1

Monitoring

All databases have to be monitored. ClusterControl uses advisors to watch some of the metrics on both MySQL and the operating system. When a condition is met, a notification is sent. ClusterControl also provides an extensive set of graphs, both real-time and historical, for post-mortem analysis or capacity planning. Sometimes it would be great to have access to some of those metrics without having to go through the GUI. The ClusterControl CLI makes it possible through the s9s-node command. Information on how to do that can be found in the manual page of s9s-node. We’ll show some examples of what you can do with the CLI.

First of all, let’s take a look at the “--node-format” option to “s9s node” command. As you can see, there are plenty of options to print interesting content.

root@vagrant:~# s9s node --list --node-format "%N %T %R %c cores %u%% CPU utilization %fmG of free memory, %tMB/s of net TX+RX, %M\n" "10.0.0.2*"
10.0.0.226 galera none 1 cores 13.823200% CPU utilization 0.503227G of free memory, 0.061036MB/s of net TX+RX, Up and running
10.0.0.227 galera none 1 cores 13.033900% CPU utilization 0.543209G of free memory, 0.053596MB/s of net TX+RX, Up and running
10.0.0.228 galera none 1 cores 12.929100% CPU utilization 0.541988G of free memory, 0.052066MB/s of net TX+RX, Up and running
10.0.0.226 proxysql  1 cores 13.823200% CPU utilization 0.503227G of free memory, 0.061036MB/s of net TX+RX, Process 'proxysql' is running.
10.0.0.231 galera none 1 cores 13.104700% CPU utilization 0.544048G of free memory, 0.045713MB/s of net TX+RX, Up and running
10.0.0.229 mysql master 1 cores 11.107300% CPU utilization 0.575871G of free memory, 0.035830MB/s of net TX+RX, Up and running
10.0.0.230 mysql slave 1 cores 9.861590% CPU utilization 0.580315G of free memory, 0.035451MB/s of net TX+RX, Up and running

With what we’ve shown here, you can probably imagine some cases for automation. For example, you can watch the CPU utilization of the nodes and, if it reaches some threshold, execute another s9s job to spin up a new node in the Galera cluster. You can also, for example, monitor memory utilization and send alerts if it passes some threshold.

The CLI can do more than that. First of all, it is possible to check the graphs from within the command line. Of course, those are not as feature-rich as graphs in the GUI, but sometimes it’s enough just to see a graph to find an unexpected pattern and decide if it is worth further investigation.

root@vagrant:~# s9s node --stat --cluster-id=1 --begin="00:00" --end="14:00" --graph=load 10.0.0.231
root@vagrant:~# s9s node --stat --cluster-id=1 --begin="00:00" --end="14:00" --graph=sqlqueries 10.0.0.231

During emergency situations, you may want to check resource utilization across the cluster. You can create a top-like output that combines data from all of the cluster nodes:

root@vagrant:~# s9s process --top --cluster-id=1
PXC_Cluster_57 - 14:38:01                                                                                                                                                               All nodes are operational.
4 hosts, 7 cores,  2.2 us,  3.1 sy, 94.7 id,  0.0 wa,  0.0 st,
GiB Mem : 2.9 total, 0.2 free, 0.9 used, 0.2 buffers, 1.6 cached
GiB Swap: 3 total, 0 used, 3 free,

PID   USER       HOST       PR  VIRT      RES    S   %CPU   %MEM COMMAND
 8331 root       10.0.2.15  20   743748    40948 S  10.28   5.40 cmon
26479 root       10.0.0.226 20   278532     6448 S   2.49   0.85 accounts-daemon
 5466 root       10.0.0.226 20    95372     7132 R   1.72   0.94 sshd
  651 root       10.0.0.227 20   278416     6184 S   1.37   0.82 accounts-daemon
  716 root       10.0.0.228 20   278304     6052 S   1.35   0.80 accounts-daemon
22447 n/a        10.0.0.226 20  2744444   148820 S   1.20  19.63 mysqld
  975 mysql      10.0.0.228 20  2733624   115212 S   1.18  15.20 mysqld
13691 n/a        10.0.0.227 20  2734104   130568 S   1.11  17.22 mysqld
22994 root       10.0.2.15  20    30400     9312 S   0.93   1.23 s9s
 9115 root       10.0.0.227 20    95368     7192 S   0.68   0.95 sshd
23768 root       10.0.0.228 20    95372     7160 S   0.67   0.94 sshd
15690 mysql      10.0.2.15  20  1102012   209056 S   0.67  27.58 mysqld
11471 root       10.0.0.226 20    95372     7392 S   0.17   0.98 sshd
22086 vagrant    10.0.2.15  20    95372     4960 S   0.17   0.65 sshd
 7282 root       10.0.0.226 20        0        0 S   0.09   0.00 kworker/u4:2
 9003 root       10.0.0.226 20        0        0 S   0.09   0.00 kworker/u4:1
 1195 root       10.0.0.227 20        0        0 S   0.09   0.00 kworker/u4:0
27240 root       10.0.0.227 20        0        0 S   0.09   0.00 kworker/1:1
 9933 root       10.0.0.227 20        0        0 S   0.09   0.00 kworker/u4:2
16181 root       10.0.0.228 20        0        0 S   0.08   0.00 kworker/u4:1
 1744 root       10.0.0.228 20        0        0 S   0.08   0.00 kworker/1:1
28506 root       10.0.0.228 20    95372     7348 S   0.08   0.97 sshd
  691 messagebus 10.0.0.228 20    42896     3872 S   0.08   0.51 dbus-daemon
11892 root       10.0.2.15  20        0        0 S   0.08   0.00 kworker/0:2
15609 root       10.0.2.15  20   403548    12908 S   0.08   1.70 apache2
  256 root       10.0.2.15  20        0        0 S   0.08   0.00 jbd2/dm-0-8
  840 root       10.0.2.15  20   316200     1308 S   0.08   0.17 VBoxService
14694 root       10.0.0.227 20    95368     7200 S   0.00   0.95 sshd
12724 n/a        10.0.0.227 20     4508     1780 S   0.00   0.23 mysqld_safe
10974 root       10.0.0.227 20    95368     7400 S   0.00   0.98 sshd
14712 root       10.0.0.227 20    95368     7384 S   0.00   0.97 sshd
16952 root       10.0.0.227 20    95368     7344 S   0.00   0.97 sshd
17025 root       10.0.0.227 20    95368     7100 S   0.00   0.94 sshd
27075 root       10.0.0.227 20        0        0 S   0.00   0.00 kworker/u4:1
27169 root       10.0.0.227 20        0        0 S   0.00   0.00 kworker/0:0
  881 root       10.0.0.227 20    37976      760 S   0.00   0.10 rpc.mountd
  100 root       10.0.0.227  0        0        0 S   0.00   0.00 deferwq
  102 root       10.0.0.227  0        0        0 S   0.00   0.00 bioset
11876 root       10.0.0.227 20     9588     2572 S   0.00   0.34 bash
11852 root       10.0.0.227 20    95368     7352 S   0.00   0.97 sshd
  104 root       10.0.0.227  0        0        0 S   0.00   0.00 kworker/1:1H

When you take a look at the top, you’ll see CPU and memory statistics aggregated across the whole cluster.

root@vagrant:~# s9s process --top --cluster-id=1
PXC_Cluster_57 - 14:38:01                                                                                                                                                               All nodes are operational.
4 hosts, 7 cores,  2.2 us,  3.1 sy, 94.7 id,  0.0 wa,  0.0 st,
GiB Mem : 2.9 total, 0.2 free, 0.9 used, 0.2 buffers, 1.6 cached
GiB Swap: 3 total, 0 used, 3 free,

Below you can find the list of processes from all of the nodes in the cluster.

PID   USER       HOST       PR  VIRT      RES    S   %CPU   %MEM COMMAND
 8331 root       10.0.2.15  20   743748    40948 S  10.28   5.40 cmon
26479 root       10.0.0.226 20   278532     6448 S   2.49   0.85 accounts-daemon
 5466 root       10.0.0.226 20    95372     7132 R   1.72   0.94 sshd
  651 root       10.0.0.227 20   278416     6184 S   1.37   0.82 accounts-daemon
  716 root       10.0.0.228 20   278304     6052 S   1.35   0.80 accounts-daemon
22447 n/a        10.0.0.226 20  2744444   148820 S   1.20  19.63 mysqld
  975 mysql      10.0.0.228 20  2733624   115212 S   1.18  15.20 mysqld
13691 n/a        10.0.0.227 20  2734104   130568 S   1.11  17.22 mysqld

This can be extremely useful if you need to figure out what’s causing the load and which node is the most affected one.

Hopefully, the CLI tool makes it easier for you to integrate ClusterControl with external scripts and infrastructure orchestration tools. We hope you’ll enjoy using this tool and if you have any feedback on how to improve it, feel free to let us know.

by krzysztof at October 13, 2017 10:37 AM

October 12, 2017

Peter Zaitsev

A Mystery with open_files_limit


In this blog, we’ll look at a mystery around setting the open_files_limit variable in MySQL and Percona Server for MySQL.

MySQL Server needs file descriptors to run. It uses them to open new connections, store tables in the cache, create temporary tables to resolve complicated queries and access persistent ones. If mysqld is not able to open new files when needed, it can stop functioning correctly. A common symptom of this issue is error 24: “Too many open files.”

The number of file descriptors mysqld can open simultaneously is defined by the configuration option open_files_limit. You would expect it to work like any other MySQL Server option: set it in the configuration file, restart mysqld and use more or fewer descriptors. All other configuration variables work this way. But open_files_limit also depends on the operating system (OS) limits. This makes setting the variable more complicated.
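
As a quick, hedged illustration of where a running server stands, you can compare the configured ceiling with the number of files currently open (both are standard variables/counters):

-- Configured ceiling vs. files currently open on this server
select @@open_files_limit;
show global status like 'Open_files';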

mysqld

As a user, when you start any application, it cannot have limits set to be greater than the limits defined by the operating system for the user in question. Therefore, you would intuitively expect mysqld to set open_files_limit to any value that is less than the OS limit. This is not the case, however. No matter what value you set for the open_files_limit variable, the OS limit is used unless it is set to “infinity”.

sveta@Thinkie:~$ ulimit -n
32000
sveta@Thinkie:$ cat /etc/my.cnf
[mysqld]
open-files-limit=16000
...
sveta@Thinkie:$ ./bin/mysqld &
sveta@Thinkie:$ mysql -uroot
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 3
Server version: 5.7.19-17-debug-log Source distribution
Copyright (c) 2009-2017 Percona LLC and/or its affiliates
Copyright (c) 2000, 2017, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> select @@open_files_limit;
+--------------------+
| @@open_files_limit |
+--------------------+
|              32000 |
+--------------------+
1 row in set (0.00 sec)

The reason for this can be found in the code contained in the mysys/my_file.c file:

static uint set_max_open_files(uint max_file_limit)
{
  struct rlimit rlimit;
  uint old_cur;
  DBUG_ENTER("set_max_open_files");
  DBUG_PRINT("enter",("files: %u", max_file_limit));
  if (!getrlimit(RLIMIT_NOFILE,&rlimit))
  {
    old_cur= (uint) rlimit.rlim_cur;
    DBUG_PRINT("info", ("rlim_cur: %u  rlim_max: %u",
            (uint) rlimit.rlim_cur,
            (uint) rlimit.rlim_max));
    if (rlimit.rlim_cur == RLIM_INFINITY)
      rlimit.rlim_cur = max_file_limit;
    if (rlimit.rlim_cur >= max_file_limit)
      DBUG_RETURN(rlimit.rlim_cur);     /* purecov: inspected */
    rlimit.rlim_cur= rlimit.rlim_max= max_file_limit;
    if (setrlimit(RLIMIT_NOFILE, &rlimit))
      max_file_limit= old_cur;          /* Use original value */
    else
    { 
      rlimit.rlim_cur= 0;           /* Safety if next call fails */
      (void) getrlimit(RLIMIT_NOFILE,&rlimit);
      DBUG_PRINT("info", ("rlim_cur: %u", (uint) rlimit.rlim_cur));
      if (rlimit.rlim_cur)          /* If call didn't fail */
    max_file_limit= (uint) rlimit.rlim_cur;
    } 
  }
  DBUG_PRINT("exit",("max_file_limit: %u", max_file_limit));
  DBUG_RETURN(max_file_limit);
}

Particularly these lines:

if (rlimit.rlim_cur >= max_file_limit)
  DBUG_RETURN(rlimit.rlim_cur);    /* purecov: inspected */

This code tells mysqld to take the maximum of what is specified in the open_files_limit variable and the soft system user limit.

I reported this behavior as documentation bug #87681.

mysqld_safe

mysqld_safe has its own open_files_limit option. This option allows you to overwrite the system soft limit any way you want. However, on:

  • Red Hat Enterprise Linux 7
  • Oracle Linux 7
  • CentOS 7
  • SUSE Linux Enterprise Server 12
  • Fedora 25 and 26
  • Debian 8 or higher
  • Ubuntu 16.04 LTS or higher

this option, as specified under the [mysqld_safe] header in the configuration file, is not used when you start mysqld as a service. To explain the reason for this behavior, we need to step back into history.

init.d

For a long time, many Linux operating systems used init.d to start certain commands together with the OS. The init daemon executes scripts (usually located in the directory /etc/init.d) at system startup, depending on the runlevel.

The different implementations of init.d vary, but they have known drawbacks. For example, init.d starts everything sequentially. This means a new process has to wait if another has already started. This makes the startup process on a multi-core machine slow. Another drawback related to our topic is that daemons started by init.d inherit OS limits from the root user. If a program needs to be run by another user, the switch needs to happen in the startup script itself. But the order of option files that such users read can be different, depending on whether they are logged in via the ssh, su or sudo commands.

MySQL Server

MySQL Server’s startup sequence for the service is as follows:

  1. <Perform another job>
  2. Start mysqld_safe as the mysql user:
     su - mysql -s /bin/bash -c "mysqld_safe > /dev/null &"

This behavior has existed at least since version 5.5.

Percona Server for MySQL

Before version 5.7, Percona Server for MySQL had a different startup sequence:

  1. <Perform another job>
  2. Start mysqld_safe as root and pass the option --user=mysql to it:
     "${PERCONA_PREFIX}"/bin/mysqld_safe > /dev/null 2>&1 &

With this sequence, you only need to set a hard limit for the mysql user in the file /etc/security/limits.conf, and mysqld_safe will do the rest.

In version 5.7, Percona Server for MySQL backported the startup sequence from MySQL Server. Since then, setting a hard limit on the number of open files for the mysql user in /etc/security/limits.conf is not enough. You also need to have the row "session required pam_limits.so" in the file /etc/pam.d/common-session. This is needed because the startup sequence for the mysql user changed due to the design of init.d.

SystemD

Linux developers performed several trials to find a better startup solution than init.d. Speaking of MySQL and Percona Server for MySQL startup, the most important innovation is SystemD. SystemD is becoming more and more popular. Therefore, MySQL and Percona Server for MySQL do not use init.d on Red Hat Enterprise Linux 7, Oracle Linux 7, CentOS 7, SUSE Linux Enterprise Server 12, Fedora 25 and 26, Debian 8 or higher, or Ubuntu 16.04 LTS or higher. Instead, they use SystemD.

What does this mean for MySQL users?

Scripts started by SystemD run as the required system user from the start. Therefore, they do not inherit limits from the root user and use their own limits specified in /etc/security/limits.conf. If you need your mysqld process limits to differ from the defaults for the user mysql, you need to set the option LimitNOFILE under the [Service] section in the service configuration file. Again, you cannot then lower this limit using the open_files_limit option, unless you set it to Infinity.
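
For illustration, a minimal override might look like the sketch below. This is an assumption-laden example: the unit name mysql and the limit value are placeholders, so adjust them to your distribution’s packaging and your needs.

# Opened for editing with: sudo systemctl edit mysql
# (hedged sketch; the unit name and the value are assumptions)
[Service]
LimitNOFILE=32000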

Both packages

To make things more complex, Percona Server for MySQL packages for Ubuntu contain both the mysql.server script (used by init.d) and the service description for SystemD. In fact, SystemD is used after install – but you might be confused when looking at only the package files.

Conclusion

You should set the open_files_limit variable together with the operating system limits. You should study how init.d or SystemD works if you see values that you don’t expect.

How to change the open_files_limit variable?

Operating System: Red Hat Enterprise Linux 7, Oracle Linux 7, CentOS 7; SUSE Linux Enterprise Server 12; Fedora 25 and 26; Debian 8+; Ubuntu 16.04 LTS+
Startup daemon: SystemD
Where to put the configuration:
  • /etc/security/limits.conf and /etc/pam.d/common-session
  • Service configuration: sudo systemctl edit mysql
  • [mysqld] section of the configuration file

Operating System: Others
Startup daemon: init.d
Where to put the configuration:
  • /etc/security/limits.conf and /etc/pam.d/common-session
  • [mysqld_safe] section of the configuration file
  • [mysqld] section of the configuration file

 

Which values of the open_files_limit variable make sense?

  • If the soft user limit is Infinity: any value of open_files_limit makes sense.
  • If the soft user limit is a positive number: a value greater than or equal to the soft user limit and smaller than the hard user limit.

 

by Sveta Smirnova at October 12, 2017 07:08 PM

October 11, 2017

Peter Zaitsev

Percona Monitoring and Management 1.3.2 Is Now Available

Percona announces the release of Percona Monitoring and Management 1.3.2. This release only contains bug fixes related to usability.

For install and upgrade instructions, see Deploying Percona Monitoring and Management.

Bug fixes

  • PMM-1529: When the user selected “Today”, “This week”, “This month” or “This year” range in Metrics Monitor and clicked the Query Analytics button, the QAN page opened reporting no data for the selected range even if the data were available.
  • PMM-1528: In some cases, the page not found error could appear instead of the QAN page after upgrading by using the Upgrade button.
  • PMM-1498: In some cases, it was not possible to shut down the virtual machine containing the PMM Server imported as an OVA image.

Other bug fixes in this release: PMM-913, PMM-1215, PMM-1481, PMM-1483, PMM-1507

 

by Borys Belinsky at October 11, 2017 06:07 PM

Webinar Thursday, October 12, 2017: MongoDB Readiness from an SRE and Ops Viewpoint


Join Percona’s MongoDB Practice Manager David Murphy on Thursday, October 12, 2017, at 10:00 am PDT / 1:00 pm EDT (UTC-7) as he discusses MongoDB Readiness from an SRE and Ops Viewpoint.

Operations teams (SRE, PE, DevOps, etc.) are being asked to take a more active role in database provisioning and scaling. Much of the MongoDB material available online is from one, two, three or even five years ago (or more). With all this outdated material around, finding useful content that helps you understand the current state of MongoDB maturity and stability can be challenging – especially when MongoDB is massively different than it was even in the 2.x series.

This webinar will cut through the noise and provide the 2017 state of MongoDB. You can expect to leave knowing more about how it behaves, when to use it and how it handles things like high availability and backup and recovery.

We will also review both the good and bad history of MongoDB, and talk about why you need to know how something works today (not how it worked in 2010) in this fast-paced environment. You will leave knowing MongoDB’s current maturity, a high-level view of how it works today and what your risk/benefit charts should look like when considering using it.

Key ops areas covered:

  • MongoDB architecture
  • High availability
  • Ansible and MongoDB
  • Cloud provisioning
  • Effective monitoring solutions
  • How to make sure you have consistency
  • Top five ops challenges and their solutions
  • How to think about multiple regions with MongoDB

Register for the webinar here.

David Murphy, MongoDB Practice Manager

David is the Practice Manager for MongoDB @ Percona. He joined Percona in October 2015; before that, he was deep in both the MySQL and MongoDB database communities for some time. Other passions include DevOps, tool building and security.

by David Murphy at October 11, 2017 03:24 PM

MariaDB AB

Instant ADD COLUMN for InnoDB


MariaDB Server 10.3.2 alpha was released this week. For InnoDB, the new features coming in MariaDB Server 10.3 include CREATE SEQUENCE which is a logical continuation of the Persistent AUTO_INCREMENT that appeared in MariaDB Server 10.2.

Perhaps one of the most important InnoDB changes coming in MariaDB Server 10.3 is Instant ADD COLUMN for InnoDB tables. The design was brainstormed in April by engineers from MariaDB Corporation, Alibaba and Tencent. A prototype was developed by Vin Chen (陈福荣) from the Tencent Game DBA Team.

What is so special about this? Normally, adding a column to a table requires the full table to be rebuilt. The complexity of the operation is proportional to the size of the table, or O(n·m) where n is the number of rows in the table and m is the number of indexes. Sure, with my online ALTER TABLE for InnoDB tables in MySQL 5.6 and MariaDB Server 10.0, you would be able to modify the table while it is being rebuilt, but it would significantly increase the I/O and memory consumption and cause a replication lag. With instant ADD COLUMN, all that is needed is an O(log n) operation to insert a special hidden record into the table, and an update of the data dictionary. For a large table, instead of taking several hours, the operation would be completed in the blink of an eye. The ADD COLUMN operation is only slightly more expensive than a regular INSERT, due to locking constraints.

In the past, some developers may have implemented a kind of instant add column in the application by encoding multiple columns in a single TEXT or BLOB column. MariaDB Dynamic Columns was an early example of that. A more recent example is JSON and related string manipulation functions.

Adding real columns has the following advantages over encoding columns into a single expandable column:

  • Efficient storage in a native binary format
  • Data type safety
  • Indexes can be built natively
  • Constraints are available: UNIQUE, CHECK, FOREIGN KEY
  • DEFAULT values can be specified
  • Triggers can be written more easily

With instant ADD COLUMN, you can enjoy all the benefits of structured storage without the drawback of having to rebuild the table.

Instant ADD COLUMN is available for both old and new InnoDB tables. Basically you can just upgrade from MySQL 5.x or MariaDB and start adding columns instantly.

Limitations

Instant ADD COLUMN only applies when the added columns appear last in the table. The place specifier LAST is the default. If AFTER col is specified, then col must be the last column, or the operation will require the table to be rebuilt.

If the table contains a hidden FTS_DOC_ID column due to FULLTEXT INDEX, then instant ADD COLUMN will not be possible.

If the table becomes empty (either via TRUNCATE or DELETE), the table will be converted to the old non-instant format.

Naturally, the operation is crash-safe. If the server is killed while instant ADD COLUMN was executing, the operation will be rolled back. If it was the very first instant ADD COLUMN operation, the table will be restored to the old non-instant format.

InnoDB data files after instant ADD COLUMN cannot be imported to older versions of MariaDB or MySQL. Any table-rebuilding operation such as ALTER TABLE…FORCE will convert the table to the old format.
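
For illustration, such a rebuild can be requested explicitly (a minimal sketch; the table name t simply reuses the example below):

-- Request a full table rebuild; per the paragraph above, this also
-- converts an instant-format table back to the old format.
ALTER TABLE t FORCE;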

For technical reasons, instant ADD COLUMN is not available for ROW_FORMAT=COMPRESSED, which is one of the formats that I created before MySQL 5.5.

A simple example of instant ADD COLUMN

CREATE TABLE t(id INT PRIMARY KEY, u INT UNSIGNED NOT NULL UNIQUE)
ENGINE=InnoDB;
INSERT INTO t(id,u) VALUES(1,1),(2,2),(3,3);
ALTER TABLE t ADD COLUMN
(d DATETIME DEFAULT current_timestamp(),
 p POINT NOT NULL DEFAULT ST_GeomFromText('POINT(0 0)'),
 t TEXT CHARSET utf8 DEFAULT 'The quick brown fox jumps over the lazy dog');
UPDATE t SET t=NULL WHERE id=3;
SELECT id,u,d,ST_AsText(p),t FROM t;

SELECT variable_value FROM information_schema.global_status
WHERE variable_name = 'innodb_instant_alter_column';

The above example illustrates that when the added columns are declared NOT NULL, a DEFAULT value must be available, either implied by the data type or set explicitly by the user. The expression need not be constant, but it must not refer to the columns of the table, such as DEFAULT u+1 (a MariaDB extension). The DEFAULT current_timestamp() would be evaluated at the time of the ALTER TABLE and apply to each row, like it does for non-instant ALTER TABLE. If a subsequent ALTER TABLE changes the DEFAULT value for subsequent INSERT, the values of the columns in existing records will naturally be unaffected.

You can download MariaDB Server 10.3.2 here. Note that MariaDB Server 10.3.2 is an alpha release. Please do not use it in production, but feel free to test it.

For now, ALTER TABLE…DROP COLUMN will require the table to be rebuilt. We are proud of the exciting contributions from the robust MariaDB community. Stay tuned for new improvements coming to MariaDB!


Alexey Kopytov


Sat, 10/14/2017 - 00:33

ALGORITHM={COPY,INPLACE} effect

Hi Marko,

I wonder if there is a way to control the instant ADD COLUMN feature. Would ALGORITHM=COPY force the old ADD COLUMN
behavior?

Use the FORCE to avoid instant ADD COLUMN

Hi Alexey,

Sorry, I did not notice your question until now. The FORCE keyword, which was introduced in MySQL 5.6, will request a table to be rebuilt.
It is worth noting that not every table rebuild is eligible for LOCK=NONE or ALGORITHM=INPLACE. If the table contains multiple FULLTEXT INDEXes, ALGORITHM=INPLACE is not allowed at all. If the table contains a FULLTEXT or SPATIAL INDEX, LOCK=NONE will not be allowed. These restrictions were introduced in MySQL 5.6.
Instant ADD is free from these limitations. (I committed a fix for that after the 10.3.2 release.)


by Marko Mäkelä at October 11, 2017 03:02 PM

October 10, 2017

Peter Zaitsev

Webinar Wednesday, October 11, 2017: Percona Monitoring and Management (PMM) Demonstration


Join Percona’s Product Manager Michael Coburn as he presents a Percona Monitoring and Management (PMM) Demonstration on Wednesday, October 11, 2017, at 10:00 am PDT / 1:00 pm EDT (UTC-7).

How can you optimize database performance if you can’t see what’s happening? Percona Monitoring and Management (PMM) is a free, open source database troubleshooting and performance optimization platform for MySQL and MongoDB. PMM uses Metrics Monitor (Grafana + Prometheus) for visualization of data points. It also has Query Analytics (QAN), to help identify and quantify non-performant queries and provide thorough time-based analysis to ensure that your data works as efficiently as possible.

Michael Coburn will provide a brief demo of PMM. He will also cover newly released features in PMM such as QAN for MongoDB, new MyRocks dashboards and tooltips for metrics monitoring.

By the end of the webinar you will have a better understanding of how to:

  • Observe database performance from a system and service metrics perspective
  • See database performance from the queries executing in MySQL and MongoDB
  • Leverage the metrics and queries from PMM to make informed decisions about crucial database resources: scaling your database tier, database resource utilization and management, and having your team focus on the most critical database events

Register for the webinar here.

Michael Coburn, Principal Technical Account Manager

Michael joined Percona as a Consultant in 2012. He progressed through various roles including Managing Consultant, Principal Architect, Technical Account Manager, and Technical Support Engineer. He is now leading Product Management for Percona Monitoring and Management.

 

by Emily Ikuta at October 10, 2017 03:25 PM

Jean-Jerome Schmidt

[Updated] Monitoring Galera Cluster for MySQL or MariaDB - Understanding metrics and their meaning

To operate any database efficiently, you need to have insight into database performance. This might not be obvious when everything is going well, but as soon as something goes wrong, access to information can be instrumental in quickly and correctly diagnosing the problem.

All databases make some of their internal status data available to users. In MySQL, you can get this data mostly by running 'SHOW STATUS' and 'SHOW GLOBAL STATUS', by executing 'SHOW ENGINE INNODB STATUS', checking information_schema tables and, in newer versions, by querying performance_schema tables.
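
For example (a quick sketch; each of these statements only reads status data):

SHOW GLOBAL STATUS LIKE 'Threads%';        -- server-wide counters
SHOW ENGINE INNODB STATUS;                 -- detailed InnoDB diagnostics
SELECT * FROM information_schema.GLOBAL_STATUS
 WHERE VARIABLE_NAME LIKE 'wsrep%';        -- the same counters, queryable as a table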

These methods are far from convenient in day-to-day operations, hence the popularity of different monitoring and trending solutions. Tools like Nagios/Icinga are designed to watch hosts/services, and alert when a service falls outside an acceptable range. Other tools such as Cacti and Munin provide a graphical look at host/service information, and give historical context to performance and usage. ClusterControl combines these two types of monitoring, so we’ll have a look at the information it presents, and how we should interpret it.

If you’re using Galera Cluster (MySQL Galera Cluster by Codership or MariaDB Cluster or Percona XtraDB Cluster), you may have noticed the following section in ClusterControl’s "Overview" tab:

Let’s see, step by step, what kind of data we have here.

The first column contains the list of nodes with their IP addresses - there’s not much else to say about it.

The second column is more interesting - it describes the node status (wsrep_local_state_comment). A node can be in different states:

  • Initialized - The node is up and running, but it’s not a part of a cluster. It can be caused, for example, by network issues;
  • Joining - The node is in the process of joining the cluster and it’s either receiving or requesting a state transfer from one of the other nodes;
  • Donor/Desynced - The node serves as a donor to some other node which is joining the cluster;
  • Joined - The node has joined the cluster but it’s busy catching up on committed write sets;
  • Synced - The node is working normally.

In the same column within the bracket is the cluster status (wsrep_cluster_status status). It can have three distinct states:

  • Primary - The communication between nodes is working and quorum is present (majority of nodes is available)
  • Non-Primary - The node was a part of the cluster but, for some reason, it lost contact with the rest of the cluster. As a result, this node is considered inactive and it won’t accept queries
  • Disconnected - The node could not establish group communication.

"WSREP Cluster Size / Ready" tells us about a cluster size as the node sees it, and whether the node is ready to accept queries. Non-Primary components create a cluster with size of 1 and wsrep readiness is OFF.

Let’s take a look at the screenshot above, and see what it is telling us about Galera. We can see three nodes. Two of them (192.168.55.171 and 192.168.55.173) are perfectly fine: they are both "Synced" and the cluster is in the "Primary" state. The cluster currently consists of two nodes. Node 192.168.55.172 is "Initialized" and forms a "non-Primary" component. It means that this node lost connection with the cluster - most likely some kind of network issue (in fact, we used iptables to block traffic to this node from both 192.168.55.171 and 192.168.55.173).

At this moment we have to stop for a bit and describe how Galera Cluster works internally. We won’t go into too much detail, as it is not within the scope of this blog post, but some knowledge is required to understand the importance of the data presented in the next columns.

Galera is a "virtually" synchronous, multi-master cluster. This means that you should expect data to be transferred across nodes "virtually" at the same time (no more annoying issues with lagging slaves) and that you can write to any node in the cluster (no more annoying issues with promoting a slave to master). To accomplish that, Galera uses writesets - atomic sets of changes that are replicated across the cluster. A writeset can contain several row changes and additional information that is needed, such as data regarding locking.

Once a client issues COMMIT, but before MySQL actually commits anything, a writeset is created and sent to all nodes in the cluster for certification. All nodes check whether it’s possible to commit the changes or not (as changes may interfere with other writes executed, in the meantime, directly on another node). If yes, data is actually committed by MySQL; if not, a rollback is executed.

What’s important to remember is the fact that nodes, similar to slaves in regular replication, may perform differently - some may have better hardware than others, some may be more loaded than others. Yet Galera requires them to process the writesets quickly, in order to maintain "virtual" synchronization. There has to be a mechanism which can throttle the replication and allow slower nodes to keep up with the rest of the cluster.

Let's take a look at the "Local Send Q [now/avg]" and "Local Receive Q [now/avg]" columns. Each node has a local queue for sending and receiving writesets. These queues allow the node to parallelize some of the writes and to buffer data it cannot process immediately when it can't keep up with the traffic. In SHOW GLOBAL STATUS we can find eight counters describing both queues, four counters per queue (the query sketch after the list shows how to read them):

  • wsrep_local_send_queue - current state of the send queue
  • wsrep_local_send_queue_min - minimum since FLUSH STATUS
  • wsrep_local_send_queue_max - maximum since FLUSH STATUS
  • wsrep_local_send_queue_avg - average since FLUSH STATUS
  • wsrep_local_recv_queue - current state of the receive queue
  • wsrep_local_recv_queue_min - minimum since FLUSH STATUS
  • wsrep_local_recv_queue_max - maximum since FLUSH STATUS
  • wsrep_local_recv_queue_avg - average since FLUSH STATUS
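
A quick way to look at these counters directly on a node (a sketch; run it in any MySQL client connected to the Galera node):

SHOW GLOBAL STATUS LIKE 'wsrep_local_%queue%';
FLUSH STATUS;   -- resets the min/max/avg counters, starting a fresh measurement window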

The above metrics are unified across nodes under ClusterControl -> Performance -> DB Status:

ClusterControl displays the "now" and "average" counters, as they are the most meaningful as single numbers (you can also create custom graphs based on the variables describing the current state of the queues). When we see one of the queues rising, it means the node can’t keep up with the replication and the other nodes will have to slow down to allow it to catch up. We’d recommend investigating the workload of that node - check the process list for long running queries, and check OS statistics like CPU utilization and I/O workload. Maybe it’s also possible to redistribute some of the traffic from that node to the rest of the cluster.

"Flow Control Paused" shows the percentage of time a given node had to pause its replication because of too heavy a load. When a node can’t keep up with the workload, it sends Flow Control packets to the other nodes, informing them that they should throttle down on sending writesets. In our screenshot, we have a value of ‘0.30’ for node 192.168.55.172. This means that almost 30% of the time this node had to pause replication because it wasn’t able to keep up with the writeset certification rate required by the other nodes (or, put simply, too many writes hit it!). As we can see, its "Local Receive Q [avg]" also points to this fact.

The next column, "Flow Control Sent", gives us information about how many Flow Control packets a given node sent to the cluster. Again, we see that it’s node 192.168.55.172 which is slowing down the cluster.

What can we do with this information? Mostly, we should investigate what’s going on in the slow node. Check CPU utilization, check I/O performance and network stats. This first step helps to assess what kind of problem we are facing.

In this case, once we switch to CPU Usage tab, it becomes clear that extensive CPU utilization is causing our issues. Next step would be to identify the culprit by looking into PROCESSLIST (Query Monitor -> Running Queries -> filter by 192.168.55.172) to check for offending queries:

Or, check processes on the node from operating system’s side (Nodes -> 192.168.55.172 -> Top) to see if the load is not caused by something outside of Galera/MySQL.

In this case, we executed the mysqld process through cpulimit, to simulate slow CPU specifically for the mysqld process by limiting it to 30% out of the 400% available CPU (the server has 4 cores).

The "Cert Deps Distance" column gives us information about how many writesets, on average, can be applied in parallel. Writesets can sometimes be executed at the same time - Galera takes advantage of this by using multiple wsrep_slave_threads to apply writesets. This column gives you some idea of how many slave threads you could use for your workload. It’s worth noting that there’s no point in setting the wsrep_slave_threads variable to a value higher than what you see in this column or in the wsrep_cert_deps_distance status variable, on which the "Cert Deps Distance" column is based. Another important note - there is no point either in setting wsrep_slave_threads to more than the number of cores your CPU has.

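As a rough sketch (the value 4 below is hypothetical), you could check the status variable and adjust the applier threads accordingly:

SHOW GLOBAL STATUS LIKE 'wsrep_cert_deps_distance';
SET GLOBAL wsrep_slave_threads = 4;   -- keep this at or below the distance above, and the number of CPU cores
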
"Segment ID" - this column will require some more explanation. Segments are a new feature added in Galera 3.0. Before this version, writesets were exchanged between all nodes. Let’s say we have two datacenters:

This kind of chatter works ok on local networks, but WAN is a different story - certification slows down due to increased latency, and additional costs are generated because of the network bandwidth used for transferring writesets between every member of the cluster.

With the introduction of "Segments", things changed. You can assign a node to a segment by modifying the wsrep_provider_options variable and adding "gmcast.segment=x" (0, 1, 2, ...) to it. Nodes with the same segment number are treated as if they are in the same datacenter, connected by a local network. Our graph then becomes different:

The main difference is that it’s no longer everyone-to-everyone communication. Within each segment, yes - it’s still the same mechanism - but the two segments communicate only through a single connection between two chosen nodes. In case of downtime, this connection will fail over automatically. As a result, we get less network chatter and less bandwidth usage between remote datacenters. So, basically, the "Segment ID" column tells us which segment a node is assigned to.

"Last Committed" column gives us information about the sequence number of the writeset that was last executed on a given node. It can be useful in determining which node is the most current one if there’s a need to bootstrap the cluster.

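A minimal way to check this by hand (run on each node and compare) is the underlying status variable:

SHOW GLOBAL STATUS LIKE 'wsrep_last_committed';
-- the node with the highest value is the most up-to-date candidate for bootstrapping
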
The rest of the columns are self-explanatory: server version, node uptime and when the status was last updated.

As you can see, the "Galera Nodes" section of the "Nodes/Hosts Stats" in the "Overview" tab gives you a pretty good understanding of the cluster’s health - whether it forms a "Primary" component, how many nodes are healthy, are there any performance issues with some nodes and if yes, which node is slowing down the cluster.

This set of data comes in very handy when you operate your Galera cluster, so hopefully, no more flying blind :-)

by ashraf at October 10, 2017 08:54 AM

October 09, 2017

Peter Zaitsev

MySQL, Percona Server for MySQL and MariaDB Default Configuration Differences

MySQL and MariaDB Default Configuration

In this blog post, I’ll discuss some of the MySQL and MariaDB default configuration differences, focusing on MySQL 5.7 and MariaDB 10.2. Percona Server for MySQL uses the same defaults as MySQL, so I will not list them separately.

MariaDB Server is a general purpose open source database, created by the founders of MySQL. MariaDB Server (referred to as MariaDB for brevity) has similar roots as Percona Server for MySQL, but is quickly diverging from MySQL compatibility and growing on its own. MariaDB has become the default installation for several operating systems (such as Red Hat Enterprise Linux/CentOS/Fedora). Changes in the default variables can make a large difference in the out-of-box performance of the database, so knowing what is different is important.

As MariaDB grows on its own and doesn’t remain 100% compatible with MySQL, the default configuration settings might not mean the same thing or behave the way they used to. It might use different variable names, or implement the same variables in new ways. You also need to take into account that MariaDB uses its own Aria storage engine, which has many configuration options that do not exist in MySQL.

Note: In this blog, I am looking at variables common to both MySQL and MariaDB that have different defaults, not variables that are specific to either MySQL or MariaDB (except for the different switches inside optimizer_switch).

Binary Logs

Variable | MariaDB Default | MySQL Default
sync_binlog | 0 | 1
binlog_format | Mixed | Row

MySQL has taken a more conservative stance when it comes to the binary log. In the newest versions of MySQL 5.7, they have updated two variables to help ensure all committed data remains intact and identical. Binlog_format was updated to ROW in MySQL in order to prevent non-deterministic statements from having different results on the slave. Row-based replication also helps when performing a lot of smaller updates. MariaDB defaults to the MIXED format. MIXED uses the statement-based format unless certain criteria are met; in that case, it uses the row format. You can see the detailed criteria for when the row format is used here: https://mariadb.com/kb/en/the-mariadb-library/binary-log-formats/.

The other difference that can cause a significant impact on performance is related to sync_binlog. Sync_binlog controls the number of commit groups to collect before synchronizing the binary log to disk. MySQL has changed this to 1, which means that every transaction is flushed to disk before it is committed. This guarantees that there can never be a committed transaction that is not recorded (even during a system failure). This can create a big impact on performance, as shown by Roel Van de Paar in his blog: https://www.percona.com/blog/2016/06/03/binary-logs-make-mysql-5-7-slower-than-5-6/

MariaDB utilizes a value of 0 for sync_binlog, which allows the operating system to determine when the binlog needs to be flushed. This provides better performance, but adds the risk that if MariaDB crashes (or power is lost) that some data may be lost.
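
A quick way to compare these settings on a running server, and to tighten them if your durability requirements call for it (a sketch; pick values that match your own trade-offs):

SELECT @@global.binlog_format, @@global.sync_binlog;
SET GLOBAL binlog_format = 'ROW';   -- match the MySQL 5.7 default
SET GLOBAL sync_binlog = 1;         -- flush the binary log on every commit group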

MyISAM

Variable | MariaDB Default | MySQL Default
myisam_recover_options | BACKUP,QUICK | OFF
key_buffer_size | 134217728 | 8388608

InnoDB replaced MyISAM as the default storage engine some time ago, but MyISAM is still used for many system tables. MySQL has tuned down the MyISAM settings, since it is not heavily used.

When mysqld opens a table, it checks whether the table is marked as crashed, or was not closed properly, and runs a check on it based on the myisam_recover_options settings. MySQL disables this by default, preventing recovery. MariaDB has enabled the BACKUP and QUICK recovery options. BACKUP causes a table_name-datetime.bak file to be created whenever a data file is changed during recovery. QUICK causes mysqld to not check the rows in a table if there are no delete blocks, ensuring recovery can occur faster.

MariaDB 10.2 increased the key_buffer_size. This allows more index blocks to be stored in memory. All threads use this buffer, so a small buffer can cause information to be moved in and out of it more quickly. MariaDB 10.2 uses a buffer 16 times the size of MySQL 5.7’s: 134217728 in MariaDB 10.2 vs. 8388608 in MySQL 5.7.

Innodb

Variable | MariaDB Default | MySQL Default
innodb_max_undo_log_size | 10485760 (10 MiB) | 1073741824 (1024 MiB)

InnoDB variables have remained primarily unchanged between MariaDB 10.2 and MySQL 5.7. MariaDB has reduced the innodb_max_undo_log_size starting in 10.2.6. This was reduced from MySQL’s default of 1073741824(1024 MiB) to 10485760(10 MiB). These sizes reflect the maximum size an undo tablespace can become before it is marked for truncation. The tablespace doesn’t get truncated unless innodb_undo_log_truncate is enabled, and it is disabled in MySQL 5.7 and MariaDB 10.2 by default.

Logging

Variable | MariaDB Default | MySQL Default
log_error | (empty) | /var/log/mysqld.log
log_slow_admin_statements | ON | OFF
log_slow_slave_statements | ON | OFF
lc_messages_dir | (empty) | /usr/share/mysql

Logs are extremely important for troubleshooting any issues so the different choices in logging for MySQL 5.7 and MariaDB 10.2 are very interesting.

The log_error variable allows you to control where errors get logged. MariaDB 10.2 leaves this variable blank, writing all errors to stderr. MySQL 5.7 uses an explicitly created file at: /var/log/mysqld.log.

MariaDB 10.2 has also enabled additional slow statement logging. Log_slow_admin_statements creates a record for any administrative statements that are not typically written to the binlog. Log_slow_slave_statements logs the replicated statements sent from the master, if they are slow to complete. MySQL 5.7 does not enable logging of these statements by default.

Lc_messages_dir is the directory that contains the error message files for various languages. The variable defaults might be a little misleading in MariaDB 10.2. Lc_messages_dir is left empty by default, although it still uses the same path as MySQL 5.7. The files are located in /usr/share/mysql by default for both databases.

Performance Schema

Variable | MariaDB Default | MySQL Default
performance_schema | OFF | ON
performance_schema_setup_actors_size | 100 | -1 (auto adjusted)
performance_schema_setup_objects_size | 100 | -1 (auto adjusted)

The performance schema is an instrumentation tool that is designed to help troubleshoot various performance concerns. MySQL 5.7 enables the performance schema, and many of its instruments, by default. MySQL even goes so far as to detect the appropriate value for many Performance Schema variables instead of setting a static default. The Performance Schema does come with some overhead, and there are many blogs regarding how much this can impact performance. I think Sveta Smirnova said it best in her blog Performance Schema Benchmarks OLTP RW: “…test on your system! No generic benchmark can exactly repeat a workload on your site.”

MariaDB has disabled the Performance Schema by default, as well as adjusted a couple of the dynamic variables. Note that if you wish to disable or enable the Performance Schema, it requires a restart of the server since these variables are not dynamic. Performance_schema_setup_actors_size and performance_schema_setup_objects_size have both been set to a static 100, instead of the dynamic -1 used in MySQL 5.7. These both limit the number of rows that can be stored in relative tables. This creates a hard limit to the size these tables can grow to, helping to reduce their data footprint.
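
To check how your own server is configured (a small sketch; these statements only read the current values):

SHOW GLOBAL VARIABLES LIKE 'performance_schema';
SHOW GLOBAL VARIABLES LIKE 'performance_schema_setup_%_size';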

SSL/TLS

Variable | MariaDB Default | MySQL Default
ssl_ca | (empty) | ca.pem
ssl_cert | (empty) | server-cert.pem
ssl_key | (empty) | server-key.pem

Secure Sockets Layer (SSL) and Transport Layer Security (TLS) are cryptographic protocols that allow for secure communication. SSL is actually the predecessor of TLS, although both are often referred to as SSL. MySQL 5.7 and MariaDB 10.2 support both yaSSL and OpenSSL. The default configurations for SSL/TLS differ only slightly between MySQL 5.7 and MariaDB 10.2. MySQL 5.7 sets a specific file name for ssl_ca, ssl_cert, and ssl_key. These files are created in the base directory, identified by the variable basedir. Each of these variables is left blank in MariaDB 10.2, so you need to set them before using secure connections. These variables are not dynamic, so be sure to set the values before starting your database.

Query Optimizer

MariaDB 10.2 | MySQL 5.7 | Optimization | Meaning | Switch
N/A | OFF | Batched Key Access | Controls use of the BKA join algorithm | batched_key_access
N/A | ON | Block Nested-Loop | Controls use of the BNL join algorithm | block_nested_loop
N/A | ON | Condition Filtering | Controls use of condition filtering | condition_fanout_filter
Deprecated | ON | Engine Condition Pushdown | Controls engine condition pushdown | engine_condition_pushdown
ON | N/A | Engine Condition Pushdown | Controls ability to push conditions down into non-mergeable views and derived tables | condition_pushdown_for_derived
ON | N/A | Exists Subquery | Allows conversion of EXISTS statements to IN statements | exists_to_in
ON | N/A | Exists Subquery | Allows conversion of IN statements to EXISTS statements | in_to_exists
N/A | ON | Index Extensions | Controls use of index extensions | use_index_extensions
OFF | N/A | Index Merge | Allows index_merge for non-equality conditions | index_merge_sort_intersection
ON | N/A | Join Algorithms | Performs index lookups for a batch of records from the join buffer | join_cache_bka
ON | N/A | Join Algorithms | Controls use of the BNLH and BKAH algorithms | join_cache_hashed
ON | N/A | Join Algorithms | Controls use of incremental algorithms | join_cache_incremental
ON | N/A | Join Algorithms | Controls use of block-based algorithms for outer joins | outer_join_with_cache
ON | N/A | Join Algorithms | Controls block-based algorithms for use with semi-join operations | semijoin_with_cache
OFF | N/A | Join Buffer | Creates the join buffer with an estimated size based on the estimated number of rows in the result | optimize_join_buffer_size
ON | N/A | Materialized Temporary Tables | Allows index creation on derived temporary tables | derived_keys
ON | N/A | Materialized Temporary Tables | Controls use of the rowid-merge strategy | partial_match_rowid_merge
ON | N/A | Materialized Temporary Tables | Controls use of the partial_match_table-scan strategy | partial_match_table_scan
OFF | ON | Multi-Range Read | Controls use of the multi-range read strategy | mrr
OFF | ON | Multi-Range Read | Controls use of cost-based MRR, if mrr=on | mrr_cost_based
OFF | N/A | Multi-Range Read | Enables key-ordered scans if mrr=on | mrr_sort_keys
ON | N/A | Order By | Considers multiple equalities when ordering results | orderby_uses_equalities
ON | N/A | Query Plan | Allows the optimizer to use hidden components of InnoDB keys | extended_keys
ON | N/A | Query Plan | Controls the removal of irrelevant tables from the execution plan | table_elimination
ON | N/A | Subquery | Stores subquery results and correlation parameters for reuse | subquery_cache
N/A | ON | Subquery Materialization | Controls use of cost-based materialization | subquery_materialization_cost_based
N/A | ON | Subquery Materialization & Semi-join | Controls the semi-join duplicate weedout strategy | duplicateweedout

The query optimizer has several variances that not only affect query performance but also how you write SQL statements. The query optimizer is substantially different between MariaDB and MySQL, so even with identical configurations you are likely to see varying performance.

The sql_mode puts restrictions on how you can write queries. MySQL 5.7 has several additional restrictions compared to MariaDB 10.2. Only_full_group_by requires that all fields in any SELECT ... GROUP BY statement are either aggregated or inside the GROUP BY clause. The optimizer doesn’t assume anything regarding the grouping, so you must specify it explicitly.
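
For example (a sketch using a hypothetical orders table), the first query is rejected under only_full_group_by while the second is accepted:

-- Rejected: customer_name is neither aggregated nor listed in the GROUP BY clause
SELECT customer_id, customer_name, COUNT(*) FROM orders GROUP BY customer_id;
-- Accepted: every non-aggregated column appears in the GROUP BY clause
SELECT customer_id, customer_name, COUNT(*) FROM orders GROUP BY customer_id, customer_name;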

No_zero_date and no_zero_in_date both affect how the server interprets zeros in dates. When no_zero_date is enabled, values of ‘0000-00-00’ are permitted but produce a warning; with strict mode enabled, the value is not permitted and produces an error. No_zero_in_date is similar, except it applies to any section of the date (month, day or year). With it disabled, dates with zero parts, such as ‘2017-00-16’, are allowed as is. When enabled, the date is changed to ‘0000-00-00’ without a warning. Strict mode prevents the date from being inserted, unless IGNORE is provided as well: “INSERT IGNORE” and “UPDATE IGNORE” insert such dates as ‘0000-00-00’. MySQL 5.7.4 changed this: no_zero_in_date was consolidated with strict mode, and the explicit option is deprecated.
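
To see which of these restrictions are active on a given server, and to set a combination explicitly for the session (the list of modes below is just an illustration, not a recommendation):

SELECT @@sql_mode;
SET SESSION sql_mode = 'STRICT_TRANS_TABLES,NO_ZERO_DATE,NO_ZERO_IN_DATE';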

The query_prealloc_size determines the size of the persistent buffer used for statement parsing and execution. If you regularly use complex queries, it can be useful to increase the size of this buffer, as it does not need to allocate additional memory during the query parsing. MySQL 5.7 has set this buffer to 8192, with a block size of 1024. MariaDB increased this value in 10.1.2 up to 24576.

Query_alloc_block_size dictates the size in bytes of any extra blocks allocated during query parsing. If memory fragmentation is a common problem, you might want to look at increasing this value. MySQL 5.7 uses 8192, while MariaDB 10.2 uses 16384 (twice that). Be careful when adjusting the block sizes: going too high consumes more than the needed amount of memory, and too low causes significant fragmentation.

The optimizer_switch variable contains many different switches that impact how the query optimizer plans and performs different queries. MariaDB 10.2 and MySQL 5.7 have many differences in their enabled options, and even in the available options. You can see a brief breakdown of each of the options in the table at the start of this section. Any option marked N/A is not supported in that server.

Miscellaneous

Variable | MariaDB Default | MySQL Default
default_tmp_storage_engine | NULL | InnoDB
group_concat_max_len | 1048576 (1M) | 1024 (1K)
lock_wait_timeout | 86400 (1 day) | 31536000 (1 year)
max_allowed_packet | 16777216 (16MB) | 4194304 (4MB)
max_write_lock_count | 4294967295 | 18446744073709551615
old_passwords | OFF | 0
open_files_limit | 0 | dependent on OS
pid_file | /var/lib/mysql/ | /var/run/mysqld/
secure_file_priv | (empty) | varies by installation
sort_buffer_size | 2097152 | 262144
table_definition_cache | 400 | autosized
table_open_cache_instances | 8 | 16
thread_cache_size | autosized | autosized
thread_stack | 292KB | 192KB/256KB

There are many variables that do not fit well into a group. I will go over those here.

When creating temporary tables, if you do not specify a storage engine, then a default is used. In MySQL 5.7 this is set to InnoDB, the same as the default_storage_engine. MariaDB 10.2 also uses InnoDB, but it is not explicitly set: MariaDB sets the default_tmp_storage_engine to NULL, which causes it to use the default_storage_engine. This is important to remember if you change your default storage engine, as it would also change the default for temporary tables. An important note: in MariaDB this is only relevant to tables created with “CREATE TEMPORARY TABLE”. Internal in-memory temporary tables use the MEMORY storage engine, and internal on-disk temporary tables use the Aria engine by default.

The GROUP_CONCAT function can produce some very large results if left unchecked. You can restrict the maximum size of results from this function with group_concat_max_len. MySQL 5.7 limits this to 1024 (1K). MariaDB increased the value in 10.2.4 up to 1048576 (1M).
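
If a result is being silently truncated, the limit can be raised per session (the orders table below is hypothetical; the value matches the MariaDB 10.2 default):

SET SESSION group_concat_max_len = 1048576;
SELECT GROUP_CONCAT(customer_id) FROM orders;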

Lock_wait_timeout controls how long a thread waits as it attempts to acquire a metadata lock. Several statements require a metadata lock, including DDL and DML operations, Lock Tables, Flush Tables with Read Lock and Handler statements. MySQL 5.7 defaults to the maximum possible value (one year), while MariaDB 10.2 has toned this down to one day.

Max_allowed_packet sets a limit on the maximum size of a packet, or a generated/intermediate string. This value is intentionally kept small (4MB) on MySQL 5.7 in order to detect larger, intentionally incorrect packets. MariaDB has increased this value to 16MB. If you use any large BLOB fields, you need to adjust this value to the size of the largest BLOB, in multiples of 1024, or you risk running into errors when transferring the results.

Max_write_lock_count controls the number of write locks that can be granted before some read lock requests are processed. Under extremely heavy write loads your reads can pile up while waiting for the writes to complete. Modifying max_write_lock_count allows you to tune how many writes can occur before some reads are allowed against the table. MySQL 5.7 keeps this value at the maximum (18446744073709551615), while MariaDB 10.2 lowered this to 4294967295. One thing to note is that this is still the maximum value on MariaDB 10.2.

Old_passwords controls the hashing method used by the password function, create user and grant statements. This variable has undergone several changes in MySQL 5.7. As of 5.7.4 the valid options were MySQL 4.1 native hashing, Pre-4.1 (old) hashing, and SHA-256 hashing. Version 5.7.5 removed the “old” Pre-4.1 method, and in 5.7.6 the variable has been deprecated with the intent of removing it entirely. MariaDB 10.2 uses a simple boolean value for this variable instead of the enumerated one in MySQL 5.7, though the intent is the same. Both default the old_passwords to OFF, or 0, and allow you to enable the older method if necessary.

Open_files_limit restricts the number of file descriptors mysqld can reserve. If set to 0 (the default in MariaDB 10.2), then mysqld reserves max_connections * 5 or max_connections + table_open_cache * 2, whichever is larger. It should be noted that mysqld cannot use an amount larger than the hard limit imposed by the operating system. MySQL 5.7 is also restricted by the operating system’s hard limit, but the value is set at runtime to the real value permitted by the system (not a calculated value).

The pid_file allows you to control where you store the process id file. This isn’t a file you typically need, but it is good to know where it is located in case some unusual errors occur. On MariaDB you can find this inside /var/lib/mysql/, while on MySQL 5.7 you will find it inside /var/run/mysqld/. You will also notice a difference in the actual name of the file. MariaDB 10.2 uses the hostname as the name of the pid, while MySQL 5.7 simply uses the process name (mysqld.pid).

Secure_file_priv is a security feature that allows you to restrict the location of files used in data import and export operations. When this variable is empty, which was the default in MySQL before 5.7.6, there is no restriction. If the value is set to NULL, import and export operations are not permitted. The only other valid value is the directory path where files can be imported from or exported to. MariaDB 10.2 defaults to empty. As of MySQL 5.7.6, the default will depend on the install_layout CMAKE option.

INSTALL_LAYOUT | DEFAULT VALUE
STANDALONE, WIN | NULL (>= MySQL 5.7.16), empty (< MySQL 5.7.16)
DEB, RPM, SLES, SVR4 | /var/lib/mysql-files
Other | mysql-files under the CMAKE_INSTALL_PREFIX value

Mysqld uses a sort buffer regardless of storage engine. Every session that must perform a sort allocates a buffer equal to the value of sort_buffer_size. This buffer should at minimum be large enough to contain 15 tuples. In MySQL 5.7, this defaults to 262144, while MariaDB 10.2 uses the larger value 2097152.

The table_definition_cache restricts the number of table definitions that can be cached. If you have a large number of tables, mysqld may have to read the .frm file to get this information. MySQL 5.7 auto detects the appropriate size to use, while MariaDB 10.2 defaults this value to 400. On my small test VM, MySQL 5.7 chose a value of 1400.

The table_open_cache_instances variable varies in implementation between MySQL and MariaDB. MySQL 5.7 creates multiple instances of the table_open_cache, each holding a portion of the tables. This helps reduce contention, as a session needs to lock only one instance of the cache for DML statements. In MySQL 5.7.7 the default was a single instance, but this was changed in MySQL 5.7.8 (increased to 16). MariaDB has a more dynamic approach to the table_open_cache. Initially there is only a single instance of the cache, and the table_open_cache_instances variable is the maximum number of instances that can be created. If contention is detected on the single cache, another instance is created and an error logged. MariaDB 10.2 expects that the maximum of eight instances it sets by default should support up to 100 CPU cores.

The thread_cache_size controls when a new thread is created. When a client disconnects, the thread is stored in the cache, as long as the maximum number of cached threads has not been reached. Although this is not typically noticeable, if your server sees hundreds of connections per second you should increase this value so that new connections can use the cache. Thread_cache_size is an automatically sized variable in both MySQL 5.7 and MariaDB 10.2, but their methods for calculating the default vary significantly. MySQL uses a formula, capped at 100: 8 + (max_connections / 100). MariaDB 10.2 uses the smaller of 256 and max_connections.
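
As a worked example of the two formulas (assuming max_connections = 500, a hypothetical value), the resulting defaults can be computed directly:

SELECT LEAST(100, 8 + 500 DIV 100) AS mysql_57_default,    -- 13
       LEAST(256, 500)             AS mariadb_102_default; -- 256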

The thread_stack is the stack size for each thread. If the stack size is too small, it limits the complexity of SQL statements, the recursion depth of stored procedures and other memory-consuming actions. MySQL 5.7 defaults the stack size to 192KB on 32-bit platforms and 256KB on 64-bit systems. MariaDB 10.2 adjusted this value several times. MariaDB 10.2.0 used 290KB, 10.2.1 used 291KB and 10.2.5 used 292KB.

Conclusion

Hopefully, this helps you with the configuration options between MySQL and MariaDB. Use the comments for any questions.

by Bradley Mickel at October 09, 2017 06:01 PM

MariaDB AB

MariaDB Server 10.3.2 Alpha available

MariaDB Server 10.3.2 Alpha available dbart Mon, 10/09/2017 - 13:38

The MariaDB project is pleased to announce the immediate availability of MariaDB Server 10.3.2. See the release notes and changelog for details and visit mariadb.com/downloads to download.

Download MariaDB Server 10.3.2 alpha

Release Notes Changelog What is MariaDB Server 10.3?



by dbart at October 09, 2017 05:38 PM

MariaDB Foundation

MariaDB 10.3.2 now available

The MariaDB project is pleased to announce the availability of MariaDB 10.3.2, the 2nd alpha release in the MariaDB 10.3 series. See the release notes and changelogs for details. Download MariaDB 10.3.2 Release Notes Changelog What is MariaDB 10.3? MariaDB APT and YUM Repository Configuration Generator Thanks, and enjoy MariaDB!

The post MariaDB 10.3.2 now available appeared first on MariaDB.org.

by Ian Gilfillan at October 09, 2017 05:35 PM

October 06, 2017

Peter Zaitsev

This Week in Data with Colin Charles 9: Oracle OpenWorld and Percona Live Europe Post Mortem

Colin Charles

Join Percona Chief Evangelist Colin Charles as he covers happenings, gives pointers and provides musings on the open source database community.

This week: a quick roundup of releases, a summary of my thoughts about Percona Live Europe 2017 Dublin, links to look at and upcoming appearances. Oracle OpenWorld happened in San Francisco this past week, and there were lots of MySQL talks there as well (and a good community reception). I have a bit on that as well (from afar).

Look for these updates on Planet MySQL.

Releases

Percona Live Europe Dublin

I arrived on Sunday and chose to rest for my tutorial on Monday. Ronald Bradford and I delivered a tutorial on MySQL Security, and in the morning we chose to rehearse. Percona Live Europe had a full tutorial schedule this year, albeit with one cancellation: MySQL and Docker by Giuseppe Maxia, whom we missed this conference. Check out his blog for further posts about MySQL, Docker, and SQL Roles in MySQL 8!

We had the welcome reception at Sinott’s Bar. There was a large selection of food on each table, as well as two drinks for each of us. It was lively, and I think we overtook most of the basement. Later that evening, there were drinks around the hotel bar, as people started to stream in for Tuesday’s packed schedule!

Tuesday was the conference kickoff, with Peter Zaitsev doing the opening keynote on the state of the open source database ecosystem. A bonus of this keynote was the short 5-minute talks that helped you get a quick take on the important topics and themes around the conference. I heard good things about this from attendees. While most people attended the talks, I spent most of my day in meetings! Then came the Community Dinner (thank you Oracle for sponsoring), where we held this year’s Lightning Talks (and had plenty more to drink). A summary of the social events is at Percona Live Europe Social.

Wednesday morning we definitely wanted to start a few minutes later, considering people were streaming in more slowly thanks to the poor weather (yes, it rained all day). The State of the Dolphin ensured we found out lots of new things coming to MySQL 8.0 (exciting!), then came the sponsor keynote by Continuent, given by MC Brown, followed by a database reliability engineering panel with the authors of Database Reliability Engineering, Charity Majors and Laine Campbell. Their book signing went quickly too – they have many fans. We also heard from Pepper Media on their happy journey with Percona. Another great day of talks before the evening reception (which had fewer folk, since people were flying off that evening). Feel free to also read Matthias Crauwels’ Percona Live Europe 2017 Review.

Percona Live Europe 2017 Dublin had more than 350 attendees and over 140 speakers – all in a new location! If you have any comments please feel free to shoot me an email.

Oracle OpenWorld from Afar

At this year’s Oracle OpenWorld there was talk about Oracle’s new self-driving, machine-learning based autonomous database. There was a focus on Amazon SLAs.

It’s unclear if this will also be what MySQL gets eventually, but we have in the MySQL world lossless semi-sync replication. Amazon RDS for MySQL is still DRBD based, and Google Cloud SQL does use semisync – but we need to check further if this is lossless semisync or not.

Folks like Panoply.io claim they can do autonomous self-driving databases, and have many platform integrations to boot. Anyone using this?

Nice to see a Percona contribution to remove InnoDB buffer pool mutex get accepted, and apparently it was done the right way. This is sustainable engineering: fix and contribute back upstream!

I was particularly interested in StorageTapper released by Uber to do real-time MySQL change data streaming and transformation. The slide deck is worth a read as well.

Booking.com also gave a talk. My real takeaway from this was about why MySQL is strong: “thousands of instances, a handful of DBAs.” Doug Henschen also talks about a lot of custom automation capabilities, the bonus being that many of them are probably already open source. There are some good talks and slide decks to review.

It wouldn’t be complete without Dimitri Kravtchuk doing some performance smackdowns, and I highly recommend you read MySQL Performance: 2.1M QPS on 8.0-rc.

And for a little bit of fun: there was also an award given to Alexander Rubin for fixing MySQL#2: does not make toast. It’s quite common for open source projects to have such bugs, like the famous Ubuntu bug #1. I’ve seen Alexander demo this before, and if you want to read more check out his blog post from over a year ago: Fixing MySQL Bug#2: now MySQL makes toast! (Yes, it says April 1! but really, it was done!) Most recently it was done at Percona Live Santa Clara 2017.

Link List

Upcoming appearances

Percona’s website keeps track of community events, to see where to listen to a Perconian speak. My upcoming appearances are:

Feedback

I look forward to feedback/tips via e-mail at colin.charles@percona.com or on Twitter @bytebot.

by Colin Charles at October 06, 2017 04:53 PM

October 05, 2017

Peter Zaitsev

Graph Descriptions for Metrics Monitor in Percona Monitoring and Management 1.3.0

PMM 1.3.0

The Metrics Monitor of Percona Monitoring and Management 1.3.0 (PMM) provides graph descriptions to display more information about the monitored data without cluttering the interface.

Percona Monitoring and Management 1.3.0 is a free and open-source platform for managing and monitoring MySQL®, MariaDB® and MongoDB® performance. You can run PMM in your own environment for maximum security and reliability. It provides thorough time-based analysis for MySQL, MariaDB® and MongoDB servers to ensure that your data works as efficiently as possible.

Each dashboard graph in PMM contains a lot of information. Sometimes, it is not easy to understand what the plotted line represents. The metric labels and the plotted data are limited and have to account for the space they can use in dashboards. It is simply not possible to provide additional information which might be helpful when interpreting the monitored metrics.

The new version of the PMM dashboards introduces on-demand descriptions with more details about the metrics in the given graph and about the data.

Percona Monitoring and Management 1.3.0

These on-demand descriptions are available when you hover the mouse pointer over the icon at the top left corner of each graph. The descriptions do not use the valuable space of your dashboard. The graph descriptions appear in small boxes. If more information exists about the monitored metrics, the description contains a link to the associated documentation.

Percona Monitoring and Management 1.3.0

In release 1.3.0 of PMM, Metrics Monitor is only starting to use this convenient tool. In subsequent releases, graph descriptions are going to become a standard attribute of each Metrics Monitor dashboard.

by Borys Belinsky at October 05, 2017 05:58 PM

MariaDB AB

Getting Started with MariaDB MaxScale Database Firewall Filter

Getting Started with MariaDB MaxScale Database Firewall Filter anderskarlsson4 Thu, 10/05/2017 - 09:20

MariaDB Server and MariaDB MaxScale provide a secure, high-performance database platform. Some aspects of security go into MariaDB Server and some into MariaDB MaxScale. This blog post describes one of the security features of MariaDB MaxScale, the Database Firewall filter.

MariaDB MaxScale is a powerful tool mostly used for database load balancing, and as such has many benefits, which we'll cover in our upcoming webinar about advanced database proxies. Another aspect of MariaDB MaxScale though is that there are many additional modules that can be used, in particular a range of filters that can be applied, and in this blog we are looking at the Database Firewall filter.

The Database Firewall filter allows you to specify which SQL statements are allowed to run and which are not, by using what are called whitelists and blacklists respectively. You can combine blacklists and whitelists also. In addition to this, the Database Firewall filter also allows a number of other rules to be applied.

Configuring MariaDB MaxScale and MariaDB Server

Before we start working with configuring the specifics of the Database Firewall filter in MariaDB MaxScale, let us have a look at how to set up a basic configuration of MaxScale and how to configure MariaDB Server to work with MariaDB MaxScale when the latter is used to implement security features.

For MariaDB MaxScale to add a level of security, we have to make sure that we don't have traffic bypassing MariaDB MaxScale and accessing MariaDB Server directly. There are several means of doing this, but in this case I'm going to choose the easy way out: we will run MariaDB MaxScale and MariaDB Server on the same server. Another assumption is that we want to maximize application transparency as much as possible, so applications really should not need any special settings to run with MariaDB MaxScale compared to when connecting directly to MariaDB Server.

For this to work, obviously, we are going to work with an environment where MariaDB Server and MariaDB MaxScale are installed on the same server.

Configuring MariaDB Server

What we are going to do here is make sure that only MaxScale, or any other service with an appropriate username and password that runs on the same server as MariaDB Server, can connect to MariaDB Server. We are going to do this by changing the bind-address of MariaDB Server. So what is this, you ask? A bind address is the network interface on which a program that listens to the network accepts connections. Usually you don't care much about this - there is just one interface in most cases anyway, right?

Well, no: by far the most common number of interfaces is two! So where is that second RJ-45 connector on your box then? The answer is that there is none; this is a virtual interface called the loopback. The bind address is always associated with a network address, which in the case of your normal network interface is the node address of the server on the network. You connect to something using that address, and the traffic is directed there through some kind of magic.

The loopback interface is only available on the box itself and it is always associated with the address 127.0.0.1. This address is not going through any kind of network hardware, it is all in software! I told you this was magic, right!

For a service running on a server, such as MariaDB Server, MaxScale or Apache, the default is to listen or bind on INADDR_ANY, which means the service listens on every interface on a given port, including the loopback interface. Note that two services can listen on the same port on the same box, as long as they bind to different interfaces. What we are going to do first is have MariaDB Server listen for connections only on the loopback interface. Head off to edit your MariaDB Server configuration file, like /etc/my.cnf.d/server.cnf if you are running CentOS / Red Hat, and set up bind-address in the mysqld section like this:

[mysqld]
bind-address=127.0.0.1


Note that we do not have to make the server listen on a different port; the default 3306 is just fine. Even though we are about to set up MariaDB MaxScale to listen on the same port, we will also set MariaDB MaxScale to bind to another interface, i.e. the normal Ethernet interface.

Configuring MariaDB MaxScale to work with MariaDB Server

We now have to force MariaDB MaxScale to listen on the Ethernet interface only, so it doesn't collide with MariaDB Server, and also to listen on port 3306. This is for the listener, and we give it an appropriate name, as in this case we are using MaxScale as a firewall only. So head off to that old-time favorite text editor of yours and edit the MariaDB MaxScale configuration file, which is probably in /etc/maxscale.cnf, and add a section like this:

[Firewall Listener]
type=listener
service=Firewall Service
protocol=MySQLClient
address=192.168.0.170
port=3306


Replace the address with the address of the server you are testing this on, obviously.

Starting up MariaDB MaxScale and MariaDB Server

We are soon ready to start up, but MariaDB MaxScale needs some more work on the configuration. The default configuration that comes with MariaDB MaxScale has several different services in it and a lot of comments; here I will provide a configuration that is the bare minimum for MariaDB MaxScale to work. We need the listener specified above, of course, but also a few other things in our MariaDB MaxScale configuration file /etc/maxscale.cnf.

Server configuration

This section in MariaDB MaxScale defines how MaxScale talks to the different servers. In this case, we will connect to just one server, so that is easy:

[server1]
type=server
address=127.0.0.1
port=3306
protocol=MySQLBackend


The thing to note here is that as we have MariaDB MaxScale talking to MariaDB Server on the loopback interface, we set address to 127.0.0.1, not the address of our host on the network.

Monitor configuration

Then we configure a monitor that checks the status of our server. Again, this is a bare minimum configuration:

[MySQL Monitor]
type=monitor
module=mysqlmon
servers=server1
user=myuser
passwd=mypwd


The servers parameter points to the server defined in the section above. The user and password arguments are used by MaxScale to connect to MariaDB Server to check the status of it. This user is created like this (but you can, and should, use a different username and password than the one used here). From the MariaDB command line on the server we are working with:

$ mysql -u root
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 14
Server version: 10.2.7-MariaDB MariaDB Server

Copyright (c) 2000, 2017, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> CREATE USER 'myuser'@'127.0.0.1' IDENTIFIED BY 'mypwd';
Query OK, 0 rows affected (0.00 sec)
MariaDB [(none)]> GRANT SELECT ON mysql.user TO 'myuser'@'127.0.0.1';
Query OK, 0 rows affected (0.00 sec)
MariaDB [(none)]> GRANT SELECT ON mysql.db TO 'myuser'@'127.0.0.1';
Query OK, 0 rows affected (0.00 sec)
MariaDB [(none)]> GRANT SELECT ON mysql.tables_priv TO 'myuser'@'127.0.0.1';
Query OK, 0 rows affected (0.00 sec)
MariaDB [(none)]> GRANT SHOW DATABASES ON *.* TO 'myuser'@'127.0.0.1';
Query OK, 0 rows affected (0.00 sec)

OK, that's it for the monitor and the listener. What remains, before we start looking at the focal point of this blog post, the filter, is the router or service.

Service configuration

As we are focusing on the Database Firewall filter here, we are using just a single basic router for this, mainly to have something to attach our filter to. We will be using the readconnroute service router here. The configurations looks like this:

[Firewall Service]
type=service
router=readconnroute
servers=server1
user=myuser
passwd=mypwd
router_options=running
filters=Firewall


Most of these parameters are rather obvious, I guess. The username and password are used to extract usernames and passwords from the server to use for authentication. In the Monitor configuration above we showed how to create a database user with appropriate privileges, and this included the necessary privileges both for the Monitor and for the Service. Again, you would want to use a different username and password from my example here. The router_options set to running means that we can connect to any server, as long as it is running. And finally, the filters setting points to the filter we will be using, so let's move on to that.

Firewall filter configuration

The firewall filter is configured in two places, first it is configured in the maxscale configuration just as usual, and then there is a separate file with the firewall rules. Let's start with the maxscale.cnf settings first:

[Firewall]
type=filter
module=dbfwfilter
action=allow
rules=/etc/maxscale.modules.d/rules.txt


There are only two settings here that are interesting. One is the action=allow setting. What this means is that the rules we are about to set up define the SQL statements that are allowed, and any others are disallowed; this is called whitelisting. You can instead define a set of rules for statements that you want to prohibit, and in that case you set action=block, which is then called blacklisting. In some cases you might want to do both blacklisting and whitelisting, and to achieve this you create two filters and then pipe one into the other in the filters setting of the service.

One more parameter is interesting for this filter, and this is the rules setting, which points to the file where the firewall rules are defined, in this case /etc/maxscale.modules.d/rules.txt, which is a file that we will create now.

Firewall rules

There are several means to define the firewall rules, for more information see https://mariadb.com/kb/en/mariadb-enterprise/mariadb-maxscale-21-database-firewall-filter/. In this example, I will set up a few very basic firewall rules and put them in the file /etc/maxscale.modules.d/rules.txt, let's look at it first and then I'll explain them:

rule allow_select deny regex '^(?i)select .*$'
rule allow_show deny regex '^(?i)show .*$'

users %@% match any rules allow_select allow_show


The first two lines define SQL statements that we are allowed to run. That the syntax says "deny" is not relevant here, it is just part of the syntax; it is the "allow" action of the filter instance that is in effect. As you can imagine, the SQL statement is matched using a regular expression, in this case PCRE (Perl Compatible Regular Expressions), see https://en.wikipedia.org/wiki/Perl_Compatible_Regular_Expressions for more details. Let's look closer at the rules we have defined here. The first one says this:

  • ^ - The pattern matching starts at the beginning of the string.
  • (?i) - Perform case insensitive matching.
  • select - The exact word "select", followed by a space.
  • .* - Followed by 0 or more occurrences of any character.
  • $ - Followed by the end of the line.

Given this, the second pattern should be obvious. In short, we allow any command that starts with the keyword SELECT or SHOW.

The "users" statement is used to map rules to users. In this case we are matching all users; you can use any kind of wildcards here. Then we say that a command is allowed if it matches any of the given patterns.

Testing

Having set up the /etc/maxscale.cnf and the rules in /etc/maxscale.modules.d/rules.txt, we are ready to test it. First we restart MaxScale (again this is for CentOS 7):

$ sudo systemctl restart maxscale


Following that, let's connect to MariaDB through MaxScale and see what happens. Note that you have to connect as a non-root user, as root access is blocked by MariaDB MaxScale by default. Also, remember not to connect to the MySQL socket. So:
 

$ mysql -h 192.168.0.170 test
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MySQL connection id is 8266
Server version: 10.0.0 2.1.6-maxscale MariaDB Server

Copyright (c) 2000, 2017, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MySQL [test]> select user();
+------------------+
| user()           |
+------------------+
| anders@localhost |
+------------------+
1 row in set (0.00 sec)

MySQL [test]> select c1 from t1;
Empty set (0.00 sec)

MySQL [test]> insert into t1 values(1);
ERROR 1141 (HY000): Access denied for user 'anders'@'192.168.0.170': Permission denied, query matched regular expression.

Conclusion

In this blog, I have shown how to set up a very basic configuration, just to get started. I will follow up later with a blog that includes more details and more advanced configurations. As we have seen, this isn't really complicated per se; what takes some time is to develop a set of rules that match the SQL that is executed by your application.

If you want to dive deeper into advanced database proxies and MariaDB MaxScale, please join our upcoming webinar on Why Innovative Enterprises Deploy Advanced Database Queries on October 18.

Happy SQL'ing
/Karlsson
 



by anderskarlsson4 at October 05, 2017 01:20 PM

Open Query Pty Ltd

Cleaning up data: OpenRefine

Always a problem: imported data tends to be messy.  So, you want to clean it – and preferably before it gets into your database!

OpenRefine has existed for some years already, and I particularly like that it runs locally (on Linux, Mac, Windows) rather than being a server “elsewhere”. It does use a web interface, but you can run the (Java based) backend on your laptop or another local machine.  Have a look at the videos on the site to see how it works and what different tricks OpenRefine can do for you.

And another thing I like – it’s possible to call it programmatically.  Once you work out that you need to do certain operations on a particular dataset to sanitise it, you should be able to automate the process for when you grab more of the same data later.

by Arjen Lentz at October 05, 2017 04:51 AM

October 04, 2017

Peter Zaitsev

Percona Live Europe Social


One for the road…

The social events at Percona Live Europe provide the community with more time to catch up with old friends and make new contacts. The formal sessions provided lots of opportunities for exchanging notes, experiences and ideas. Lunches and coffee breaks proved to be busy too. Even so, what’s better than chilling out over a beer or two (we were in Dublin after all) and enjoying the city nightlife in good company?

Percona Live Europe made it easy for us to get together each evening.  A welcome reception (after tutorials) at Sinnott’s Pub in the heart of the city hosted a lively crowd. The Community Dinner at the Mercantile Bar, another lively city center hostelry, was a sell-out. Our closing reception was held at the conference venue, which had proven to be an excellent base.

Many delegates took the chance to enjoy the best of Dublin’s hospitality late into the night. It’s a credit to their stamina – and the fantastic conference agenda – that opening keynotes on both Tuesday and Wednesday were very well attended.

In case you think we might have been prioritizing the Guinness, though, there was the little matter of the lightning talks at the Community Dinner. Seven community-minded generous souls gave up some of their valuable socializing time to share insights into matters open source. Thank you again to Renato Losio of Funambol, Anirban Rahut of Facebook, Federico Razzoli of Catawiki, Dana Van Aken of Carnegie Mellon University, Toshaan Bharvani of VanTosh, Balys Kriksciunas of Hostinger International and Vishal Loel of Lazada.

More about the lightning talks can be seen on the Percona Live Europe website.

Many of the conference treats – coffee, cakes, community dinner – are sponsored and thanks are due once more to our sponsors who helped make Percona Live Europe the worthwhile, enjoyable event that it was.

And so Percona Live Europe drew to a close. Delegates from 43 countries headed home armed with new knowledge, new ideas and new friends. I’ve put together this video to give a taste of the Percona Live social meetups. Tempted to join us in 2018?

Sláinte!

by Lorraine Pocklington at October 04, 2017 09:03 PM

ClickHouse MySQL Silicon Valley Meetup Wednesday, October 25 at Uber Engineering with Percona’s CTO Vadim Tkachenko


I will be presenting at the ClickHouse MySQL Silicon Valley Meetup on Wednesday, October 25, 2017, at 6:30 PM.

ClickHouse is a real-time analytical database system. Even though it is only celebrating one year as open source software, it has already proved itself ready for serious workloads. We will talk about ClickHouse in general, some internals and why it is so fast. ClickHouse works in conjunction with MySQL – traditionally weak for analytical workloads – and this presentation demonstrates how to make the two systems work together.

My talk will cover how we can improve the experience with real-time analytics using ClickHouse, and how we can integrate ClickHouse with MySQL.

I want to thank our friends at Uber Engineering who agreed to host this event.

Please join us here: https://www.meetup.com/San-Francisco-Bay-Area-ClickHouse-Meetup/events/243887397/.

Vadim Tkachenko, Percona CTO

Vadim Tkachenko co-founded Percona in 2006 and serves as its Chief Technology Officer. Vadim leads Percona Labs, which focuses on technology research and performance evaluations of Percona’s and third-party products. Percona Labs designs no-gimmick tests of hardware, filesystems, storage engines, and databases that surpass the standard performance and functionality scenario benchmarks.

Vadim’s expertise in LAMP performance and multi-threaded programming helps optimize MySQL and InnoDB internals to take full advantage of modern hardware. Oracle Corporation and its predecessors have incorporated Vadim’s source code patches into the mainstream MySQL and InnoDB products.

He also co-authored the book High Performance MySQL: Optimization, Backups, and Replication 3rd Edition. Previously, he founded a web development company in his native Ukraine and spent two years in the High Performance Group within the official MySQL support team.

by Vadim Tkachenko at October 04, 2017 08:41 PM

October 03, 2017

Peter Zaitsev

MyRocks Metrics Now in PMM 1.3.0


One of the most exciting features shipped in the Percona Monitoring and Management 1.3.0 (PMM) release is support for MyRocks metrics via a new Metrics Monitor dashboard titled MySQL MyRocks Metrics. The support in PMM follows the recent Percona Server for MySQL release 5.7.19 from September 6, where Percona delivered an EXPERIMENTAL version of MyRocks for non-Production usage.

The MyRocks storage engine from Facebook is based on RocksDB, a persistent key-value store for fast storage environments. MyRocks is optimized for fast storage and combines outstanding space and write efficiency with acceptable read performance. As a result, MyRocks has the following advantages compared to other storage engines (if your workload uses fast storage, such as SSD):

  • Requires less storage space
  • Provides more storage endurance
  • Ensures better IO capacity

MyRocks Database Operations

This graph will help you visualize MyRocks database operations of Next and Seek attributes:

MyRocks Cache Activity

We also have a graph to help you visualize the count of Hits and Misses on the MyRocks cache:

MyRocks Cache Data Bytes Read/Write

Finally, another important MyRocks graph will help you understand the volume of data read and written to the MyRocks cache:

Please note that the MyRocks storage engine is not suitable (yet) for production workloads, but if you are testing this technology take a moment to install PMM in order to take advantage of our new MySQL MyRocks Metrics dashboard!

In PMM, you can view the metrics provided by the information schema as well as various data reported by the RocksDB engine’s status used by your MySQL database instance.
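If you want a quick look at the raw data behind those graphs, the checks below are a minimal sketch (assuming the MyRocks/RocksDB plugin is installed and enabled on your Percona Server for MySQL instance):

-- Confirm the storage engine is registered
SELECT ENGINE, SUPPORT FROM INFORMATION_SCHEMA.ENGINES WHERE ENGINE = 'ROCKSDB';
-- The RocksDB counters are exposed as status variables
SHOW GLOBAL STATUS LIKE 'rocksdb%';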

by Borys Belinsky at October 03, 2017 10:26 PM

Webinar October 4, 2017: Databases in the Hosted Cloud


Join Percona’s Chief Evangelist, Colin Charles, as he presents Databases in the Hosted Cloud on Wednesday, October 4, 2017, at 7:00 am PDT / 10:00 am EDT (UTC-7).


Today you can use hosted MySQL/MariaDB/Percona Server for MySQL/PostgreSQL in several “cloud providers” as a database as a service (DBaaS). Learn the differences, the access methods and the level of control you have for the various public databases in the hosted cloud offerings:

  • Amazon RDS including Aurora
  • Google Cloud SQL
  • Rackspace OpenStack DBaaS
  • Oracle Cloud’s MySQL Service

The administration tools and ideologies behind each are completely different, and you are in a “locked-down” environment. Some considerations include:

  • Different backup strategies
  • Planning for multiple data centers for availability
  • Where do you host your application?
  • How do you get the most performance out of the solution?
  • What does this all cost?
  • Monitoring

Growth topics include:

  • How do you move from one DBaaS to another?
  • How do you move from a DBaaS to your own hosted platform?

Register for the webinar here.

Colin Charles, Chief Evangelist

Colin Charles is the Chief Evangelist at Percona. He was previously on the founding team for MariaDB Server in 2009, has worked with MySQL since 2005 and has been a MySQL user since 2000. Before joining MySQL, he worked actively on the Fedora and OpenOffice.org projects. He’s well known within many open source communities and has spoken on the conference circuit.

 

by Emily Ikuta at October 03, 2017 05:33 PM

Big Dataset: All Reddit Comments – Analyzing with ClickHouse


In this blog, I’ll use ClickHouse and Tabix to look at a new very large dataset for research.

It is hard to come across interesting datasets, especially big ones (and by big I mean one billion rows or more). Previously, I’ve used the on-time airline performance data available from the Bureau of Transportation Statistics. Another recent example is the NYC Taxi and Uber Trips data, with over one billion records.

However, today I wanted to mention an interesting dataset I found recently that has been available since 2015. This is Reddit’s comments and submissions dataset, made possible thanks to Reddit’s generous API. The dataset was first mentioned at “I have every publicly available Reddit comment for research,” and currently you can find it at pushshift.io. However, there is no guarantee that pushshift.io will provide this dataset in the future. I think it would be valuable if Amazon or another cloud provider made this dataset available for researchers, just as Amazon provides https://aws.amazon.com/public-datasets/.

The dataset contains 2.86 billion records through the end of 2016 and is 709GB in size, uncompressed. This dataset is valuable for a variety of research scenarios, from simple stats to natural language processing and machine learning.

Now let’s see what simple info we can collect from this dataset using ClickHouse and https://tabix.io/, a GUI tool for ClickHouse. In this first round, we’ll gather some basic stats, like the number of comments per month, the number of authors per month and the number of subreddits. I also added the average number of comments left per post.

Queries to achieve this:

SELECT toYYYYMM(created_date) dt,count(*) comments FROM commententry1 GROUP BY dt ORDER BY dt
;;
SELECT toYYYYMM(created_date) dt,count(DISTINCT author) authors FROM commententry1 GROUP BY dt ORDER BY dt
;;
SELECT toYYYYMM(created_date) dt,count(DISTINCT subreddit) subreddits FROM commententry1 GROUP BY dt ORDER BY dt
;;
SELECT toYYYYMM(created_date) dt,count(*)/count(distinct link_id) comments_per_post FROM commententry1 GROUP BY dt ORDER BY dt

And the graphical result:
It is impressive to see the constant growth in comments (to 70mln per month by the end of 2016) and authors (to 3.5mln for the same time period). There is something interesting happening with subreddits, which jump up and down. It’s interesting to see that the average count of comments per post stays stable, with a slight decline to 13 comments/post by the end of 2016.

Now let’s check most popular subreddits:

SELECT subreddit,count(*) cnt FROM commententry1 GROUP BY subreddit ORDER BY cnt DESC limit 100
DRAW_TREEMAP
{
    path:'subreddit.cnt'
}

and using a treemap (available in Tabix.io):

We can measure subreddits that get the biggest increase in comments in 2016 compared to 2015:

SELECT subreddit,cntnew-cntold diff FROM (SELECT subreddit,count(*) cntnew FROM commententry1 WHERE toYear(created_date)=2016 GROUP BY subreddit) ALL INNER JOIN (SELECT subreddit,count(*) cntold FROM commententry1 WHERE toYear(created_date)=2015 GROUP BY subreddit) USING (subreddit) ORDER BY diff DESC LIMIT 50
 DRAW_TREEMAP
{
    path:'subreddit.diff'
}


Obviously, Reddit was affected by the United States Presidential Election 2016, but not just that. The gaming community saw an increase in Overwatch, PokemonGO and Dark Souls 3.

Now we can try to run our own DB-Ranking, but only based on Reddit comments. This is how I can do this for MySQL, PostgreSQL and MongoDB:

SELECT toStartOfQuarter(created_date) Quarter,
sum(if(positionCaseInsensitive(body,'mysql')>0,1,0)) mysql,
sum(if(positionCaseInsensitive(body,'postgres')>0,1,0)) postgres,
sum(if(positionCaseInsensitive(body,'mongodb')>0,1,0)) mongodb
FROM commententry1
GROUP BY Quarter ORDER BY Quarter;

I would say the result is aligned with https://db-engines.com/en/ranking, where MySQL is the most popular among the three, followed by PostgreSQL and then MongoDB. There is an interesting spike for PostgreSQL in the second quarter in 2015, caused by a bot in “leagueoflegend” tournaments. The bot was actively announcing that it is powered by PostgreSQL in the comments, like this: http://reddit.com/r/leagueoflegends/comments/37cvc3/c/crln2ef.
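If you want to confirm where a spike like that comes from, a quick drill-down is straightforward; the following is just a sketch that reuses the same positionCaseInsensitive() trick, restricted to the quarter in question:

SELECT subreddit, count(*) cnt
FROM commententry1
WHERE positionCaseInsensitive(body, 'postgres') > 0
  AND toStartOfQuarter(created_date) = toDate('2015-04-01')
GROUP BY subreddit
ORDER BY cnt DESC
LIMIT 10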

To highlight more ClickHouse features: along with standard SQL functions, it provides a variety of statistical functions (for example, Quantile calculations). We can try to see the distribution of the number of comments left by authors:

SELECT
    quantileExact(0.1)(cnt),
    quantileExact(0.2)(cnt),
    quantileExact(0.3)(cnt),
    quantileExact(0.4)(cnt),
    quantileExact(0.5)(cnt),
    quantileExact(0.6)(cnt),
    quantileExact(0.7)(cnt),
    quantileExact(0.8)(cnt),
    quantileExact(0.9)(cnt),
    quantileExact(0.99)(cnt)
FROM
(
    SELECT
        author,
        count(*) AS cnt
    FROM commententry1
    WHERE author != '[deleted]'
    GROUP BY author
)

The result is:

quantileExact(0.1)(cnt) - 1
quantileExact(0.2)(cnt) - 1
quantileExact(0.3)(cnt) - 1
quantileExact(0.4)(cnt) - 2
quantileExact(0.5)(cnt) - 4
quantileExact(0.6)(cnt) - 7
quantileExact(0.7)(cnt) - 16
quantileExact(0.8)(cnt) - 42
quantileExact(0.9)(cnt) - 160
quantileExact(0.99)(cnt) - 2271

This means that 30% of authors left only one comment, and 50% of authors left four comments or less.
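The first figure can also be cross-checked directly with countIf(); this is just a sketch against the same table:

SELECT countIf(cnt = 1) / count() AS one_comment_authors_share
FROM
(
    SELECT author, count(*) AS cnt
    FROM commententry1
    WHERE author != '[deleted]'
    GROUP BY author
)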

In general, ClickHouse was a pleasure to use when running analytical queries. However, I should note that the missing support for window functions is a huge limitation. Even MySQL 8.0, which was recently released as an RC, provides support for window functions. I hope ClickHouse will implement this as well.
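For comparison, this is the kind of window function syntax MySQL 8.0 already accepts (shown against a hypothetical comments table, since this particular dataset lives in ClickHouse):

SELECT author, cnt,
       RANK() OVER (ORDER BY cnt DESC) AS author_rank
FROM
(
    SELECT author, COUNT(*) AS cnt
    FROM comments
    WHERE author <> '[deleted]'
    GROUP BY author
) t
LIMIT 10;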

by Vadim Tkachenko at October 03, 2017 12:11 AM

October 02, 2017

MariaDB AB

Getting Started with MariaDB Galera and MariaDB MaxScale on CentOS

Getting Started with MariaDB Galera and MariaDB MaxScale on CentOS anderskarlsson4 Mon, 10/02/2017 - 15:05

Introduction

A good thing with Galera as a High Availability solution for MariaDB is that it is rather easy to set up and use, as far as High Availability can be easy. Also, both MariaDB and Galera are well documented. Using MariaDB MaxScale for failover with a MariaDB Galera Cluster is also rather straightforward and well documented. The issue is that there are three technologies at play here, and although they are all well documented and reasonably easy to use, using them all together turns out to be slightly more difficult.

This blogpost then aims at showing all three of these technologies at play together, starting from scratch and creating a MariaDB Galera Cluster based on MariaDB Galera 10.2.7 and MaxScale 2.1. The operating system we are going to use for this example is CentOS version 7.2. Before we get started I also want to mention that we are looking at a minimal initial installation here; we will not look at many fancy features in either product.

Target system

Also, before we start, let's have a look at what we are aiming for in terms of a completed system. We will build a MariaDB Galera Cluster with 3 nodes and a fourth node for MaxScale. The cluster is built on three separate virtual machines in this case, and the IP addresses of all the machines in this setup are:

IP Address Node use
192.168.0.180 MariaDB MaxScale node
192.168.0.181 MariaDB Galera Cluster node 1
192.168.0.182 MariaDB Galera Cluster node 2
192.168.0.183 MariaDB Galera Cluster node 3


Installing MariaDB Galera Cluster

In this section we will set up the MariaDB Galera Cluster from start to finish. As said above, we will for the most part look at only the basic settings needed for the cluster to get started; no advanced settings will be used, and for a production environment you will want to fine-tune this.

Linux settings

There are just a few things to adjust in the standard Linux installation before we commence: disable SELinux, disable the Linux firewall (which is firewalld in CentOS and Red Hat 7.0 and up, not iptables) and set the hostname.

Disable SELinux

In a production environment, running with SELinux enabled is often a good idea. For the purposes of testing, as we are doing here, we do not want SELinux around at all. To achieve this, make sure that your SELinux configuration, in the file /etc/selinux/config, looks something like this:

# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
#     enforcing - SELinux security policy is enforced.
#     permissive - SELinux prints warnings instead of enforcing.
#     disabled - No SELinux policy is loaded.
SELINUX=disabled
# SELINUXTYPE= can take one of these two values:
#     targeted - Targeted processes are protected,
#     minimum - Modification of targeted policy. Only selected processes are protected.
#     mls - Multi Level Security protection.
SELINUXTYPE=targeted


The change here is the SELINUX setting of course.

Disable firewalld

Firewalld is a standard service that is disabled using the systemctl command:

$ sudo systemctl disable firewalld


Set hostname

This is really simple, and we do it to be able to tell from the MariaDB command prompt which server we are connected to when we use MariaDB MaxScale. On each node run something like this:

$ sudo hostname node181

And be sure to name all the nodes appropriately and differently :-)

Rebooting and checking the new settings

At this point it is best to reboot to ensure that your settings are enabled. So reboot now and then check the status of SELinux and firewalld:

$ sestatus
SELinux status:                 disabled
$ systemctl status firewalld
● firewalld.service - firewalld - dynamic firewall daemon
   Loaded: loaded (/usr/lib/systemd/system/firewalld.service; disabled; vendor preset: enabled)
   Active: inactive (dead)
 
Aug 10 12:24:09 localhost.localdomain systemd[1]: Stopped firewalld - dynamic firewall daemon.


Make sure that you disable SELinux and firewalld on all four machines we are using if you are following this example.

Software installation

Before we install the software we need to set up the MariaDB repository on all 4 servers:

$ curl -sS https://downloads.mariadb.com/MariaDB/mariadb_repo_setup | sudo bash


Having run this on the four servers, let us now go on with installing MariaDB Galera server on the three nodes where this is appropriate, in this case nodes 192.168.0.181, 192.168.0.182 and 192.168.0.183. On these three nodes run this:

$ sudo yum -y install MariaDB-server


When this is completed, we should have MariaDB Server installed. The next thing to do then is to install MariaDB MaxScale on the 192.168.0.180 box:

$ sudo yum -y install maxscale


Only one thing remains to install, and strictly speaking it is optional, but it is used when we test what we are setting up here: the MariaDB client programs on the machine we run MariaDB MaxScale on. So on 192.168.0.180 run:

$ sudo yum -y install MariaDB-client


With that in place we are ready to get to the next step, which is to configure a MariaDB Galera Cluster.

Setting up MariaDB Galera Cluster

Before we start up MariaDB, we need to configure the cluster. This is not complicated, but there are a few settings that need to be in place. Again, note that what we are setting up here is the bare minimum required to get started; in real life there are a bunch more parameters you would want to set. Also, beyond Galera I am more or less leaving MariaDB as it is, again with the exception of a few things that Galera requires. All in all, we are not configuring any InnoDB cache or metadata cache or defining a non-default name of the cluster.

We have to edit the file /etc/my.cnf.d/server.cnf and adjust the Galera-specific settings on the nodes 192.168.0.181, 192.168.0.182 and 192.168.0.183. Edit the [galera] section to look like this on all three nodes:

[galera]
# Mandatory settings
wsrep_on=ON
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
wsrep_cluster_address=gcomm://192.168.0.181,192.168.0.182,192.168.0.183
binlog_format=row
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2


Let's have a look at these settings now and see what they do:

  • wsrep_on - This is a session level flag to indicate whether the operations we are performing will be replicated, much like how the sql_log_bin setting controls if statements are written to the binlog when using MariaDB Replication. The default of this is ON, but we set it anyway, to be safe.
  • wsrep_provider - This points to the location of the Galera library. Although MariaDB is set up to use Galera from scratch, you still have to point to the Galera library. This is installed as part of the MariaDB-Server installation above.
  • wsrep_cluster_address - This is where we define the nodes in the cluster, in general you don't need to list all nodes and new nodes can be added later to a running cluster, but in this case we know what the cluster looks like so we set it up here.
  • binlog_format - Although the binlog, in terms of the actual binlog files, isn't used by Galera, the binlog facility is, and for Galera to work you have to run with row format in the binlog.
  • default_storage_engine - Again, this is the default value, but just to be safe let's set up MariaDB Server to explicitly use the InnoDB Storage Engine, which is the only engine supported by Galera.
  • innodb_autoinc_lock_mode - This setting defines how the InnoDB Storage Engine generates values for auto_increment columns. Using mode 2 here is very important for Galera to work properly. In short, mode 2 causes much less locking during auto_increment generation and hence doesn't interfere with other locking. Values other than 2 can cause deadlocking and other issues with Galera.

With these settings in place, we are ready to start the cluster.
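Once the nodes are up (see the next section), a quick way to verify from the MariaDB command line that these settings actually took effect is something like this:

SHOW GLOBAL VARIABLES
WHERE Variable_name IN ('wsrep_on', 'binlog_format',
                        'default_storage_engine', 'innodb_autoinc_lock_mode');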

Starting MariaDB Cluster

Starting a cluster from scratch works by running a process called a bootstrap. The reason this is a bit different from the usual MariaDB startup is that for HA reasons a node in a cluster attaches to one or more other nodes in the cluster, but for the first node this is not possible. This is not complicated though: there is a script included with MariaDB Server that manages this, but note that this script is only to be used when the first node in a cluster is started with no existing nodes in it. In this case, on 192.168.0.181 run:

$ sudo galera_new_cluster


With this in place, we should have a MariaDB server running; let's have a look:

$ ps -f -u mysql | more
UID        PID  PPID  C STIME TTY          TIME CMD
mysql     3472     1  0 14:42 ?        00:00:00 /usr/sbin/mysqld --wsrep-new-cluster --wsrep_start_position=00000000-0000-0000-0000-000000000000:-1


As you can see, the galera_new_cluster script has started MariaDB, but with some additional parameters, notably --wsrep-new-cluster. Before we continue, let's also look at the status of the cluster from the command line:

$ mysql -u root
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 10
Server version: 10.2.7-MariaDB MariaDB Server
 
Copyright (c) 2000, 2017, Oracle, MariaDB Corporation Ab and others.
 
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
 
MariaDB [(none)]> show global status like 'wsrep_cluster_size';
+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| wsrep_cluster_size | 1     |
+--------------------+-------+
1 row in set (0.00 sec)


This shows that we have a running cluster (if not, the value of wsrep_cluster_size would have been 0), but there is just one node. Let's now start another node, and note that although we are starting a cluster from scratch, only the first node needs bootstrapping, so here, on 192.168.0.182, we start MariaDB in the usual fashion:

$ sudo systemctl start mariadb.service


We should now have 2 nodes running in the cluster; let's check it out from the MariaDB command line on 192.168.0.181:

MariaDB [(none)]> show global status like 'wsrep_cluster_size';
+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| wsrep_cluster_size | 2     |
+--------------------+-------+
1 row in set (0.00 sec)


Yes, we have 2 nodes now, so then on 192.168.0.183, start MariaDB and we have a complete 3-node cluster running.

$ sudo systemctl start mariadb.service


And on 192.168.0.181, let's verify that we are done, so far:
 

MariaDB [(none)]> show global status like 'wsrep_cluster_size';
+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| wsrep_cluster_size | 3     |
+--------------------+-------+
1 row in set (0.00 sec)
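Beyond the cluster size, it can be reassuring to check that every node is fully synced and sees the other members. These are standard Galera status variables, so on any of the nodes something like this should work:

show global status like 'wsrep_local_state_comment';
show global status like 'wsrep_incoming_addresses';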

Getting Started with MariaDB MaxScale

MariaDB MaxScale will in this scenario act as a router to the cluster, ensuring that traffic is directed to the appropriate server. There are a few means of dealing with this, and MaxScale also has a lot of options, but again we will be setting up the bare minimum for the use case we have in mind.
Before we get started though, we need to set up the MariaDB servers to work with MariaDB MaxScale, and there are a few reasons for this. One reason is that MaxScale monitors the cluster out-of-band, which means that the cluster is constantly monitored, even if there are no user connections. Another reason is that when there is a connection from a client through MaxScale to the MariaDB Cluster, it is MaxScale that does the user authentication, and the authentication data is picked up from the MariaDB cluster.

Setting up MariaDB for MariaDB MaxScale

First we need to set up a user that MariaDB MaxScale uses to attach to the cluster to get authentication data. On 192.168.0.181, using the MariaDB command line as the database root user:

$ mysql -u root
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 11
Server version: 10.2.7-MariaDB MariaDB Server
 
Copyright (c) 2000, 2017, Oracle, MariaDB Corporation Ab and others.
 
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
 
MariaDB [(none)]> create user 'myuser'@'192.168.0.180' identified by 'mypwd';
Query OK, 0 rows affected (0.01 sec)
 
MariaDB [(none)]> grant select on mysql.user to 'myuser'@'192.168.0.180';
Query OK, 0 rows affected (0.01 sec)


Following this, we need some extra privileges for table and database level grants:

MariaDB [(none)]> grant select on mysql.db to 'myuser'@'192.168.0.180';
Query OK, 0 rows affected (0.01 sec)
 
MariaDB [(none)]> grant select on mysql.tables_priv to 'myuser'@'192.168.0.180';
Query OK, 0 rows affected (0.00 sec)
 
MariaDB [(none)]> grant show databases on *.* to 'myuser'@'192.168.0.180';
Query OK, 0 rows affected (0.00 sec)


Note that the above commands need only be run on one of the servers in the cluster (say, 192.168.0.181) as these grants are replicated to all servers by virtue of Galera.
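As a quick sanity check, you can run something like this on one of the other nodes (192.168.0.182 or 192.168.0.183) to confirm that the user and grants really were replicated:

SELECT user, host FROM mysql.user WHERE user = 'myuser';
SHOW GRANTS FOR 'myuser'@'192.168.0.180';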
With this in place, we are ready to configure MariaDB MaxScale.

Configuring MariaDB MaxScale

The configuration for MariaDB MaxScale is in the file /etc/maxscale.cnf and in this case we will create a new configuration from scratch instead of amending the existing one. I will explain the important aspects of this, but before that, this is what it looks like, and again, no fancy configuration, just the basics to get going:
 

# Globals
[maxscale]
threads=1
 
# Servers
[server1]
type=server
address=192.168.0.181
port=3306
protocol=MySQLBackend
 
[server2]
type=server
address=192.168.0.182
port=3306
protocol=MySQLBackend
 
[server3]
type=server
address=192.168.0.183
port=3306
protocol=MySQLBackend
 
# Monitoring for the servers
[Galera Monitor]
type=monitor
module=galeramon
servers=server1,server2,server3
user=myuser
passwd=mypwd
monitor_interval=1000
 
# Galera router service
[Galera Service]
type=service
router=readwritesplit
servers=server1,server2,server3
user=myuser
passwd=mypwd
 
# MaxAdmin Service
[MaxAdmin Service]
type=service
router=cli
 
# Galera cluster listener
[Galera Listener]
type=listener
service=Galera Service
protocol=MySQLClient
port=3306
 
# MaxAdmin listener
[MaxAdmin Listener]
type=listener
service=MaxAdmin Service
protocol=maxscaled
socket=default

MariaDB MaxScale configuration file format

The format of the MariaDB MaxScale configuration file is, as you see above, similar to the one used by MariaDB Server. There are a few differences though. One is that MariaDB MaxScale does more or less everything through plugins, and this is reflected in the configuration file. Each instance of a plugin has a separate section, and the name of the section is used when it is referenced from some other plugin, so the section names are not fixed but rather are used to name an instance of a service. This is true except in the case of global settings, which are not related to any particular plugin and are placed in the [maxscale] section.
A related setting is the type setting, which defines what type of plugin this section relates to, and is currently one of filter, listener, monitor, server or service. In addition, all plugins have a setting that defines the name of the plugin to load, i.e. the name of the shared object file that MariaDB MaxScale will load.

Global settings

In this case I have only one global setting, which is the number of MariaDB MaxScale threads we have running; here it is set to 1, which is the default. There might be many reasons to have a higher setting here, but for this simple example 1 thread is enough.

Servers

The settings should be fairly obvious, with the protocol setting defining which protocol plugin is being used here; MySQLBackend is the only option so far.

Monitors

A monitor is the plugin that checks the status of the servers, and the important setting here is module, which is set to galeramon in this case. There are a few different monitors available for different types of backend server setups, but in this case we are using Galera so galeramon is what we want to use.
The user and password settings define how the monitor connects to the backend servers to get the current status, and monitor_interval defines how often, in milliseconds, we connect to the servers and check status.

Services

Here we define two services: one is our main service and the other is an administrative service. The interesting service is the Galera Service one, and the first thing we need to look at is which router we will use, a router here being the actual implementation of the router plugin; there are several to choose from. Another thing we have to define is which servers make up the cluster that we are working with here. Finally we set the user and password that MariaDB MaxScale uses to connect to the servers in the cluster to get authentication data.
We are for now ignoring the management MariaDB MaxScale services.

Listeners

The last type of plugin we define are the listeners, and these are the plugins that implement the actual protocol that listens for client connections. The interesting listener here is Galera Listener, which listens for MariaDB client protocol connections; again we are ignoring the MariaDB MaxScale management listener.
An important aspect of defining a listener is to use the correct service parameter, and this is set to the service which this listener connects to when there is a new connection. In this case we connect to the Galera Service defined above.
Note that we have the listener running on the MariaDB Server default port of 3306. This is deliberate and works fine, as MariaDB MaxScale runs on a separate server from the ones where MariaDB Server runs.

Starting and testing MariaDB MaxScale

With the configuration described above in place, we are ready to start MaxScale on 192.168.0.180. To do this, just run:

$ sudo systemctl start maxscale.service


And before I leave you, let's test a few things. First, let's connect to the cluster through MariaDB MaxScale:

 

$ mysql -h 192.168.0.180 -u myuser -pmypwd
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MySQL connection id is 4668
Server version: 10.0.0 2.1.5-maxscale MariaDB Server
 
Copyright (c) 2000, 2017, Oracle, MariaDB Corporation Ab and others.
 
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
 
MySQL [(none)]>

You can see that we are connected to MariaDB MaxScale now, but which server in the MariaDB Galera Cluster? Let's check!

MySQL [(none)]> show variables like 'hostname';
+---------------+---------+
| Variable_name | Value   |
+---------------+---------+
| hostname      | node181 |
+---------------+---------+
1 row in set (0.00 sec)


It seems to be the 181 server. Let's then try one more thing before we are done for today: let's stop the MariaDB server on 192.168.0.181 and see what happens. On 192.168.0.181 run:

$ sudo systemctl stop mariadb.service


And then we go back to our command prompt on 192.168.0.180 and see what happens when we access MariaDB Server from there.

$ mysql -h 192.168.0.180 -u myuser -pmypwd
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MySQL connection id is 4668
Server version: 10.0.0 2.1.5-maxscale MariaDB Server
 
Copyright (c) 2000, 2017, Oracle, MariaDB Corporation Ab and others.
 
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
 
MySQL [(none)]> show variables like 'hostname';
+---------------+---------+
| Variable_name | Value   |
+---------------+---------+
| hostname      | node182 |
+---------------+---------+
1 row in set (0.00 sec)


With that we are done for now. A follow up blogpost or two are planned to show some more advanced settings.

Happy SQL'ing
/Karlsson

MariaDB Galera Cluster is a great way to implement High Availability with MariaDB, and combining it with MariaDB MaxScale creates a great High Availability cluster with transparent failover. Getting started with any HA solution is often a bit confusing, as there are so many components that work together and each has to be configured to work with the others. This blogpost shows how to get started with MariaDB Galera Cluster with MariaDB MaxScale on CentOS / RHEL 7.


by anderskarlsson4 at October 02, 2017 07:05 PM

Peter Zaitsev

One Million Tables in MySQL 8.0


In my previous blog post, I talked about new general tablespaces in MySQL 8.0. Recently MySQL 8.0.3-rc was released, which includes a new data dictionary. My goal is to create one million tables in MySQL and test the performance.

Background questions

Q: Why a million tables in MySQL? Is it even realistic? How does this happen?

Usually, millions of tables in MySQL is a result of “a schema per customer” Software as a Service (SaaS) approach. For the purposes of customer data isolation (security) and logical data partitioning (performance), each “customer” has a dedicated schema. You can think of a WordPress hosting service (or any CMS based hosting) where each customer has their own dedicated schema. With 10K customers per MySQL server, we could end up with millions of tables.

Q: Should you design an application with >1 million tables?

Having separate tables is one of the easiest designs for a multi-tenant or SaaS application, and makes it easy to shard and re-distribute your workload between servers. In fact, the table-per-customer or schema-per-customer design has the quickest time-to-market, which is why we see it a lot in consulting. In this post, we are not aiming to cover the merits of whether you should do this (if your application has high churn or millions of free users, for example, it might not be a good idea). Instead, we will focus on whether the new data dictionary provides relief to a historical pain point.

Q: Why is one million tables a problem?

The main issue results from the fact that MySQL needs to open (and eventually close) the table structure file (FRM file). With one million tables, we are talking about at least one million files. Originally MySQL addressed this with table_open_cache and table_definition_cache. However, the maximum value for table_open_cache is 524288. In addition, it is split into 16 partitions by default (to reduce contention). So it is not ideal. MySQL 8.0 has removed FRM files for InnoDB, and will now allow you to create general tablespaces. I’ve demonstrated how we can create a tablespace per customer in MySQL 8.0, which is ideal for the “schema-per-customer” approach (we can move/migrate one customer’s data to a new server by importing/exporting the tablespace).
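As a point of reference, it is easy to check what a given server is actually running with; a quick sketch:

show global variables like 'table_open_cache%';
show global variables like 'table_definition_cache';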

One million tables in MySQL 5.7

Recently, I’ve created a test with one million tables. The test creates 10K databases, and each database contains 100 tables. To use a standard benchmark I’ve employed the sysbench table structure.

mysql> select count(*) from information_schema.schemata where schema_name like 'test_sbtest%';
+----------+
| count(*) |
+----------+
| 10000    |
+----------+
1 row in set (0.01 sec)
mysql> select count(*) from information_schema.tables where table_schema like 'test_sbtest%';
+----------+
| count(*) |
+----------+
|  1000000 |
+----------+
1 row in set (4.61 sec)

This also creates a huge overhead: with one million tables we have ~two million files. The .frm and .ibd files together sum up to 175G:

# du -sh /ssd/mysql_57
175G    /ssd/mysql_57

Now I’ve used a sysbench Lua script to insert one row into a randomly chosen table:

pathtest = "/usr/share/sysbench/tests/include/oltp_legacy/"
if pathtest then
   dofile(pathtest .. "common.lua")
else
   require("common")
end
function event()
   local table_name
   local i
   local c_val
   local k_val
   local pad_val
   local oltp_tables_count = 100
   local oltp_db_count = 10000
   -- Pick a random database and a random table inside it
   table_name = "test_sbtest_" .. sb_rand_uniform(1, oltp_db_count) .. ".sbtest".. sb_rand_uniform(1, oltp_tables_count)
   -- Random values for the id, k, c and pad columns
   i = sb_rand(1, oltp_table_size)
   k_val = sb_rand(1, oltp_table_size)
   c_val = sb_rand_str([[
###########-###########-###########-###########-###########-###########-###########-###########-###########-###########]])
   pad_val = sb_rand_str([[
###########-###########-###########-###########-###########]])
   rs = db_query("INSERT INTO " .. table_name ..
                       " (id, k, c, pad) VALUES " ..
                       string.format("(%d, %d, '%s', '%s')", i, k_val, c_val,
                                     pad_val))
end

With:

local oltp_tables_count = 100
   local oltp_db_count = 10000

Sysbench will choose one table randomly out of one million. With oltp_tables_count = 1 and oltp_db_count = 100, it will only choose the first table (sbtest1) out of the first 100 databases (randomly).

As expected, MySQL 5.7 has a huge performance degradation when going across one million tables. When running a script that only inserts data into 100 random tables, we can see ~150K transactions per second. When the data is inserted in one million tables (chosen randomly) performance drops to 2K (!) transactions per second:

Insert into 100 random tables:

SQL statistics:
    queries performed:
        read:                            0
        write:                           16879188
        other:                           0
        total:                           16879188
    transactions:                        16879188 (140611.72 per sec.)
    queries:                             16879188 (140611.72 per sec.)
    ignored errors:                      0      (0.00 per sec.)
    reconnects:                          0      (0.00 per sec.)

Insert into one million random tables:

SQL statistics:
    queries performed:
        read:                            0
        write:                           243533
        other:                           0
        total:                           243533
    transactions:                        243533 (2029.21 per sec.)
    queries:                             243533 (2029.21 per sec.)
    ignored errors:                      0      (0.00 per sec.)
    reconnects:                          0      (0.00 per sec.)

This is expected. Here I’m testing the worst-case scenario, where we can’t keep all table open handlers and table definitions in cache (memory), since table_open_cache and table_definition_cache both have a limit of 524288.

Also, normally we can expect a huge skew between accesses to the tables. There may be only 20% active customers (the 80-20 rule), meaning that we can expect active access to only 2K databases. In addition, there will be old or unused tables, so we can expect around 100K or fewer active tables.

Hardware and config files

The above results are from this server:

Processors   | 64xGenuine Intel(R) CPU @ 2.00GHz
Memory Total | 251.8G
Disk         | Samsung 950 Pro PCIE SSD (nvme)

Sysbench script:

sysbench $conn --report-interval=1 --num-threads=32 --max-requests=0 --max-time=600 --test=/root/drupal_demo/insert_custom.lua run

My.cnf:

innodb_buffer_pool_size = 100G
innodb_io_capacity=20000
innodb_flush_log_at_trx_commit = 0
innodb_log_file_size = 2G
innodb_flush_method=O_DIRECT_NO_FSYNC
skip-log-bin
open_files_limit=1000000
table_open_cache=524288
table_definition_cache=524288

One million tables in MySQL 8.0 + general tablespaces

In MySQL 8.0 it is easy and logical to create one general tablespace per schema (it will host all tables in that schema). In MySQL 5.7, general tablespaces are available – but there are still .frm files.

I’ve used the following script to create tables in one schema, all in one tablespace:

mysql test -e "CREATE TABLESPACE t ADD DATAFILE 't.ibd' engine=InnoDB;"
for i in {1..10000}
do
           mysql test -e "create table ab$i(i int) tablespace t"
done

The new MySQL 8.0.3-rc also uses the new data dictionary, so all MyISAM tables in the mysql schema are removed and all metadata is stored in an additional mysql.ibd file.

Creating one million tables

Creating InnoDB tables fast enough can be a task by itself. Stewart Smith published a blog post a while ago where he focused on optimizing time to create 30K tables in MySQL.

The problem is that after creating an .ibd file, MySQL needs to “fsync” it. However, when creating a table inside the tablespace, there is no fsync. I’ve created a simple script to create tables in parallel, one thread per database:

#!/bin/bash
function do_db {
        mysql -vvv -e "create database $db";
        mysql -vvv $db -e "CREATE TABLESPACE $db ADD DATAFILE '$db.ibd' engine=InnoDB;"
        for i in {1..100}
        do
                table="CREATE TABLE sbtest$i ( id int(10) unsigned NOT NULL AUTO_INCREMENT, k int(10) unsigned NOT NULL DEFAULT '0', c varchar(120) NOT NULL DEFAULT '', pad varchar(60) NOT NULL DEFAULT '', PRIMARY KEY (id), KEY k_1 (k) ) ENGINE=InnoDB DEFAULT CHARSET=latin1 tablespace $db;"
                mysql $db -e "$table"
        done
}
c=0
for m in {1..100}
do
        for i in {1..100}
        do
                let c=$c+1
                echo $c
                db="test_sbtest_$c"
                do_db &
        done
        wait
done

That script works perfectly in MySQL 8.0.1-dmr and creates one million tables in 25 minutes and 28 seconds (1528 seconds). That is ~654 tables per second. That is significantly faster than ~30 tables per second in the original Stewart’s test and 2x faster than a test where all fsyncs were artificially disabled using the libeatmydata library.

Unfortunately, in MySQL 8.0.3-rc some regression was introduced. In MySQL 8.0.3-rc I can see heavy mutex contention, and the table creation speed dropped from 25 minutes to ~280 minutes. I’ve filed a bug report: performance regression: “create table” speed and scalability in 8.0.3.

Size on disk

With general tablespaces and no .frm files, the size on disk decreased:

# du -h -d1 /ssd/
147G    /ssd/mysql_801
119G    /ssd/mysql_803
175G    /ssd/mysql_57

Please note though that in MySQL 8.0.3-rc, with the new native data dictionary, the size on disk increased as it needs to write additional information (Serialized Dictionary Information, SDI) to the tablespace files:

InnoDB: Serialized Dictionary Information (SDI) is now present in all InnoDB tablespace files
except for temporary tablespace and undo tablespace files.
SDI is serialized metadata for schema, table, and tablespace objects.
The presence of SDI data provides metadata redundancy.
...
The inclusion of SDI data in tablespace files increases tablespace file size.
An SDI record requires a single index page, which is 16k in size by default.
However, SDI data is compressed when it is stored to reduce the storage footprint.

The general mysql data dictionary in MySQL 8.0.3 is 6.6Gb:

6.6G /ssd/mysql/mysql.ibd

Benchmarking the insert speed in MySQL 8.0 

I’ve repeated the same test I did for MySQL 5.7 in MySQL 8.0.3-rc (and in 8.0.1-dmr), but using general tablespaces. I created 10K databases (=10K tablespace files), each database has 100 tables and each database resides in its own tablespace.
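To double-check that the layout is what we expect, the tablespaces can be counted from the InnoDB data dictionary view (INNODB_TABLESPACES in MySQL 8.0; MySQL 5.7 calls it INNODB_SYS_TABLESPACES); a sketch:

SELECT count(*) FROM information_schema.INNODB_TABLESPACES WHERE NAME LIKE 'test_sbtest_%';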

There are two new tablespace level caches we can use in MySQL 8.0: tablespace_definition_cache and schema_definition_cache:

tablespace_definition_cache = 15000
schema_definition_cache = 524288

Unfortunately, with one million random table accesses in MySQL 8.0 (both 8.0.1 and 8.0.3), we can still see that it stalls on opening tables (even with no .frm files and general tablespaces):

mysql> select conn_id, current_statement, state, statement_latency, lock_latency from sys.processlist where current_statement is not null and conn_id <> CONNECTION_ID();
+---------+-------------------------------------------------------------------+----------------+-------------------+--------------+
| conn_id | current_statement                                                 | state          | statement_latency | lock_latency |
+---------+-------------------------------------------------------------------+----------------+-------------------+--------------+
|     199 | INSERT INTO test_sbtest_9749.s ... 8079-53209333270-93105555128') | Opening tables | 4.45 ms           | 0 ps         |
|     198 | INSERT INTO test_sbtest_1863.s ... 9574-29782886623-39251573705') | Opening tables | 9.95 ms           | 5.67 ms      |
|     189 | INSERT INTO test_sbtest_3948.s ... 9365-63502117132-66650064067') | Opening tables | 16.29 ms          | 15.38 ms     |
|     190 | INSERT INTO test_sbtest_6885.s ... 8436-41291265610-60894472357') | Opening tables | 13.78 ms          | 9.52 ms      |
|     191 | INSERT INTO test_sbtest_247.sb ... 7467-89459234028-92064334941') | Opening tables | 8.36 ms           | 3.18 ms      |
|     192 | INSERT INTO test_sbtest_9689.s ... 8058-74586985382-00185651578') | Opening tables | 6.89 ms           | 0 ps         |
|     193 | INSERT INTO test_sbtest_8777.s ... 1900-02582963670-01868315060') | Opening tables | 7.09 ms           | 5.70 ms      |
|     194 | INSERT INTO test_sbtest_9972.s ... 9057-89011320723-95018545652') | Opening tables | 9.44 ms           | 9.35 ms      |
|     195 | INSERT INTO test_sbtest_6977.s ... 7902-29158428721-66447528241') | Opening tables | 7.82 ms           | 789.00 us    |
|     196 | INSERT INTO test_sbtest_129.sb ... 2091-86346366083-87657045906') | Opening tables | 13.01 ms          | 7.30 ms      |
|     197 | INSERT INTO test_sbtest_1418.s ... 6581-90894769279-68213053531') | Opening tables | 16.35 ms          | 10.07 ms     |
|     208 | INSERT INTO test_sbtest_4757.s ... 4592-86183240946-83973365617') | Opening tables | 8.66 ms           | 2.84 ms      |
|     207 | INSERT INTO test_sbtest_2152.s ... 5459-55779113235-07063155183') | Opening tables | 11.08 ms          | 3.89 ms      |
|     212 | INSERT INTO test_sbtest_7623.s ... 0354-58204256630-57234862746') | Opening tables | 8.67 ms           | 2.80 ms      |
|     215 | INSERT INTO test_sbtest_5216.s ... 9161-37142478639-26288001648') | Opening tables | 9.72 ms           | 3.92 ms      |
|     210 | INSERT INTO test_sbtest_8007.s ... 2999-90116450579-85010442132') | Opening tables | 1.33 ms           | 0 ps         |
|     203 | INSERT INTO test_sbtest_7173.s ... 2718-12894934801-25331023143') | Opening tables | 358.09 us         | 0 ps         |
|     209 | INSERT INTO test_sbtest_1118.s ... 8361-98642762543-17027080501') | Opening tables | 3.32 ms           | 0 ps         |
|     219 | INSERT INTO test_sbtest_5039.s ... 1740-21004115002-49204432949') | Opening tables | 8.56 ms           | 8.44 ms      |
|     202 | INSERT INTO test_sbtest_8322.s ... 8686-46403563348-31237202393') | Opening tables | 1.19 ms           | 0 ps         |
|     205 | INSERT INTO test_sbtest_1563.s ... 6753-76124087654-01753008993') | Opening tables | 9.62 ms           | 2.76 ms      |
|     213 | INSERT INTO test_sbtest_5817.s ... 2771-82142650177-00423653942') | Opening tables | 17.21 ms          | 16.47 ms     |
|     216 | INSERT INTO test_sbtest_238.sb ... 5343-25703812276-82353892989') | Opening tables | 7.24 ms           | 7.20 ms      |
|     200 | INSERT INTO test_sbtest_2637.s ... 8022-62207583903-44136028229') | Opening tables | 7.52 ms           | 7.39 ms      |
|     204 | INSERT INTO test_sbtest_9289.s ... 2786-22417080232-11687891881') | Opening tables | 10.75 ms          | 9.01 ms      |
|     201 | INSERT INTO test_sbtest_6573.s ... 0106-91679428362-14852851066') | Opening tables | 8.43 ms           | 7.03 ms      |
|     217 | INSERT INTO test_sbtest_1071.s ... 9465-09453525844-02377557541') | Opening tables | 8.42 ms           | 7.49 ms      |
|     206 | INSERT INTO test_sbtest_9588.s ... 8804-20770286377-79085399594') | Opening tables | 8.02 ms           | 7.50 ms      |
|     211 | INSERT INTO test_sbtest_4657.s ... 4758-53442917995-98424096745') | Opening tables | 16.62 ms          | 9.76 ms      |
|     218 | INSERT INTO test_sbtest_9672.s ... 1537-13189199316-54071282928') | Opening tables | 10.01 ms          | 7.41 ms      |
|     214 | INSERT INTO test_sbtest_1391.s ... 9241-84702335152-38653248940') | Opening tables | 21.34 ms          | 15.54 ms     |
|     220 | INSERT INTO test_sbtest_6542.s ... 7778-65788940102-87075246009') | Opening tables | 2.96 ms           | 0 ps         |
+---------+-------------------------------------------------------------------+----------------+-------------------+--------------+
32 rows in set (0.11 sec)

And the transactions per second drops to ~2K.

Here I expected different behavior. With the .frm files gone and with tablespace_definition_cache set to more than 10K (we have only 10K tablespace files), I expected that MySQL would not have to open and close files. It looks like this is not the case.

I can also see the table opening (since the server started):

mysql> show global status like '%open%';
+----------------------------+-----------+
| Variable_name              | Value     |
+----------------------------+-----------+
| Com_ha_open                | 0         |
| Com_show_open_tables       | 0         |
| Innodb_num_open_files      | 10040     |
| Open_files                 | 0         |
| Open_streams               | 0         |
| Open_table_definitions     | 524288    |
| Open_tables                | 499794    |
| Opened_files               | 22        |
| Opened_table_definitions   | 1220904   |
| Opened_tables              | 2254648   |
| Slave_open_temp_tables     | 0         |
| Table_open_cache_hits      | 256866421 |
| Table_open_cache_misses    | 2254643   |
| Table_open_cache_overflows | 1254766   |
+----------------------------+-----------+
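From those counters a rough table cache hit ratio can be computed directly (performance_schema.global_status is available in both 5.7 and 8.0); a sketch:

SELECT
    (SELECT VARIABLE_VALUE FROM performance_schema.global_status
      WHERE VARIABLE_NAME = 'Table_open_cache_hits')
    /
    ((SELECT VARIABLE_VALUE FROM performance_schema.global_status
       WHERE VARIABLE_NAME = 'Table_open_cache_hits')
     + (SELECT VARIABLE_VALUE FROM performance_schema.global_status
        WHERE VARIABLE_NAME = 'Table_open_cache_misses')) AS table_open_cache_hit_ratio;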

This is easier to see on the graphs from PMM. Insert per second for the two runs (both running 16 threads):

  1. The first run is 10K random databases/tablespaces and one table (sysbench is choosing table#1 from a randomly chosen list of 10K databases). This way there is also no contention on the tablespace file.
  2. The second run is a randomly chosen table from a list of one million tables.

As we can see, the first run is doing 50K-100K inserts/second. The second run is limited to only ~2.5K inserts per second:


“Table open cache misses” grows significantly after the start of the second benchmark run:

As we can see, MySQL performs ~1.1K table definition openings per second and has ~2K table cache misses due to the overflow:


When inserting against only 1K random tables (one specific table in a random database, that way we almost guarantee that one thread will always write to a different tablespace file), the table_open_cache got warmed up quickly. After a couple of seconds, the sysbench test starts showing > 100K tps. The processlist looks much better (compare the statement latency and lock latency to the above as well):

mysql> select conn_id, current_statement, state, statement_latency, lock_latency from sys.processlist where current_statement is not null and conn_id <> CONNECTION_ID();
+---------+-------------------------------------------------------------------+----------------+-------------------+--------------+
| conn_id | current_statement                                                 | state          | statement_latency | lock_latency |
+---------+-------------------------------------------------------------------+----------------+-------------------+--------------+
|     253 | INSERT INTO test_sbtest_3293.s ... 2282-95400708146-84684851551') | starting       | 22.72 us          | 0 ps         |
|     254 | INSERT INTO test_sbtest_3802.s ... 4030-35983148190-23616685226') | update         | 62.88 us          | 45.00 us     |
|     255 | INSERT INTO test_sbtest_5290.s ... 2361-58374942527-86207214617') | Opening tables | 36.07 us          | 0 ps         |
|     256 | INSERT INTO test_sbtest_5684.s ... 4717-34992549120-04746631452') | Opening tables | 37.61 us          | 37.00 us     |
|     257 | INSERT INTO test_sbtest_5088.s ... 5637-75275906887-76112520982') | starting       | 22.97 us          | 0 ps         |
|     258 | INSERT INTO test_sbtest_1375.s ... 8592-24036624620-65536442287') | query end      | 98.66 us          | 35.00 us     |
|     259 | INSERT INTO test_sbtest_8764.s ... 8566-02569157908-49891861265') | Opening tables | 47.13 us          | 37.00 us     |
|     260 | INSERT INTO test_sbtest_560.sb ... 2605-08226572929-25889530906') | query end      | 155.64 us         | 38.00 us     |
|     261 | INSERT INTO test_sbtest_7776.s ... 0243-86335905542-37976752368') | System lock    | 46.68 us          | 32.00 us     |
|     262 | INSERT INTO test_sbtest_6551.s ... 5496-19983185638-75401382079') | update         | 74.07 us          | 40.00 us     |
|     263 | INSERT INTO test_sbtest_7765.s ... 5428-29707353898-77023627427') | update         | 71.35 us          | 45.00 us     |
|     265 | INSERT INTO test_sbtest_5771.s ... 7065-03531013976-67381721569') | query end      | 138.42 us         | 39.00 us     |
|     266 | INSERT INTO test_sbtest_8603.s ... 7158-66470411444-47085285977') | update         | 64.00 us          | 36.00 us     |
|     267 | INSERT INTO test_sbtest_3983.s ... 5039-55965227945-22430910215') | update         | 21.04 ms          | 39.00 us     |
|     268 | INSERT INTO test_sbtest_8186.s ... 5418-65389322831-81706268892') | query end      | 113.58 us         | 37.00 us     |
|     269 | INSERT INTO test_sbtest_1373.s ... 1399-08304962595-55155170406') | update         | 131.97 us         | 59.00 us     |
|     270 | INSERT INTO test_sbtest_7624.s ... 0589-64243675321-62971916496') | query end      | 120.47 us         | 38.00 us     |
|     271 | INSERT INTO test_sbtest_8201.s ... 6888-31692084119-80855845726') | query end      | 109.97 us         | 37.00 us     |
|     272 | INSERT INTO test_sbtest_7054.s ... 3674-32329064814-59707699237') | update         | 67.99 us          | 35.00 us     |
|     273 | INSERT INTO test_sbtest_3019.s ... 1740-35410584680-96109859552') | update         | 5.21 ms           | 33.00 us     |
|     275 | INSERT INTO test_sbtest_7657.s ... 4985-72017519764-59842283878') | update         | 88.91 us          | 48.00 us     |
|     274 | INSERT INTO test_sbtest_8606.s ... 0580-38496560423-65038119567') | freeing items  | NULL              | 37.00 us     |
|     276 | INSERT INTO test_sbtest_9349.s ... 0295-94997123247-88008705118') | starting       | 25.74 us          | 0 ps         |
|     277 | INSERT INTO test_sbtest_3552.s ... 2080-59650597118-53885660147') | starting       | 32.23 us          | 0 ps         |
|     278 | INSERT INTO test_sbtest_3832.s ... 1580-27778606266-19414961452') | freeing items  | 194.14 us         | 51.00 us     |
|     279 | INSERT INTO test_sbtest_7685.s ... 0234-22016898044-97277319766') | update         | 62.66 us          | 40.00 us     |
|     280 | INSERT INTO test_sbtest_6026.s ... 2629-36599580811-97852201188') | Opening tables | 49.41 us          | 37.00 us     |
|     281 | INSERT INTO test_sbtest_8273.s ... 7957-39977507737-37560332932') | update         | 92.56 us          | 36.00 us     |
|     283 | INSERT INTO test_sbtest_8584.s ... 7604-24831943860-69537745471') | starting       | 31.20 us          | 0 ps         |
|     284 | INSERT INTO test_sbtest_3787.s ... 1644-40368085836-11529677841') | update         | 100.41 us         | 40.00 us     |
+---------+-------------------------------------------------------------------+----------------+-------------------+--------------+
30 rows in set (0.10 sec)

What about the 100K random tables? Those should fit into the table_open_cache. At the same time, the default 16 table_open_cache_instances split the 500K table_open_cache, so each bucket holds only ~30K entries. To fix that, I set table_open_cache_instances = 4 and was able to average ~50K tps. However, the contention inside the table_open_cache seems to stall the queries:

MySQL 8.0

There is only a very limited number of table openings:

MySQL 8.0
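
For reference, here is a minimal sketch of how the table cache settings discussed above can be inspected and adjusted (values are illustrative, not a recommendation):

-- Check the current table cache configuration
SELECT @@global.table_open_cache, @@global.table_open_cache_instances;

-- table_open_cache is dynamic and can be raised at runtime
SET GLOBAL table_open_cache = 500000;

-- table_open_cache_instances is read-only at runtime, so a value such as
-- table_open_cache_instances = 4 has to go into my.cnf and needs a restart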

 

Conclusion

MySQL 8.0 general tablespaces look very promising. It is finally possible to create one million tables in MySQL without the need to create two million files. In fact, MySQL 8 can handle many tables very well as long as table cache misses are kept to a minimum.
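
As a quick illustration (a minimal sketch with made-up names; the actual test setup for this post may have used different options), a shared general tablespace and a table placed inside it look like this:

CREATE TABLESPACE ts1 ADD DATAFILE 'ts1.ibd' ENGINE=InnoDB;

CREATE TABLE sbtest_demo (
  id INT NOT NULL PRIMARY KEY,
  k  INT NOT NULL
) TABLESPACE ts1 ENGINE=InnoDB;

Many tables can share the same ts1.ibd datafile, which is what avoids the one-file-per-table explosion.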

At the same time, the problem with “Opening tables” (the worst-case scenario test) still persists in MySQL 8.0.3-rc and limits the throughput. I had expected that MySQL would no longer have to open/close the table structure file. I also hope the CREATE TABLE regression bug is fixed in the next MySQL 8.0 version.

I’ve not tested other new features of the new data dictionary in 8.0.3-rc, e.g., atomic DDL (InnoDB now supports atomic DDL, which ensures that DDL operations are either committed in their entirety or rolled back in case of an unplanned server stoppage). That is a topic for the next blog post.

by Alexander Rubin at October 02, 2017 02:26 AM

September 29, 2017

Peter Zaitsev

This Week in Data with Colin Charles 8: Percona Live Europe 2017 Is a Wrap!


Join Percona Chief Evangelist Colin Charles as he covers happenings, gives pointers and provides musings on the open source database community.

Percona Live Europe 2017


We’ve spent a lot of time in the last few months organizing Percona Live Europe Dublin. I want to thank all the speakers, sponsors and attendees for helping us pull off yet another great event. While we’ll provide some perspectives, thoughts and feedback soon, all the early mornings, jam-packed meetings and 4 am bedtimes mean I’ll probably talk about this event in my next column!

In the meantime, save the date for Percona Live Santa Clara, April 23-25, 2018. The call for papers will open in October 2017.

Releases

Link List

Upcoming appearances

Percona’s website keeps track of community events, so check out where to listen to a Perconian speak. My upcoming appearances are:

Feedback

I look forward to feedback/tips via e-mail at colin.charles@percona.com or on Twitter @bytebot.

by Colin Charles at September 29, 2017 06:03 PM

Federico Razzoli

Random thoughts after Percona Live

Percona Live Europe 2017 has ended. As always, it’s been a great opportunity to hear great technical talks, ask questions, get in touch with people, drink beer, and collect nice & useless gadgets. And, on a personal note, to meet former colleagues and former customers, some of whom I had never met in person before. And even to joke with a customer about my sleepy voice when I had to start all my Monday mornings with an 8 am Skype call with him, from home. I told him I had tried everything, including singing, but my voice didn’t improve. Oh well, if he messaged me after I left Percona, I probably did a decent job.

Here are some completely random thoughts in random order.

  • MariaDB wasn’t there. There was a single talk, officially from the Foundation (I can’t really distinguish the two entities, and I’m not alone). They have their own conference; M18 will be its second edition. So most people have to choose whether to attend Percona’s conference or MariaDB’s. The only consequence I care about is that they’re splitting the community and making each part weaker. As I wrote about the BSL some time ago, I can only hope they will change their mind.
  • Tarantool wasn’t there. Too bad; I believe in it.
  • Technologies that are growing fast or being strongly promoted: ClickHouse, ProxySQL, Vitess, the MyRocks storage engine. Wow.
    • But I attended a very nice talk on a real-life TokuDB story. Don’t underestimate it.
  • At Percona Live talks, you often hear questions about performance and quality. At the MariaDB talk, I was the only one asking about these things (and actually my questions were almost a DoS attack, sorry Vicentiu). I’m not saying that MariaDB lacks quality; I’m saying that its users probably have a different approach.
  • Sharding is becoming a hot topic.
  • Open source philosophy is consolidating and expanding. It goes far beyond licenses and marketing claims; it is the attitude and the pleasure of collaborating and doing things that, without a community, wouldn’t be as strong.
  • As remarked by PZ, the first criterion for choosing a technology is usability. There are very good reasons for this, given the current complexity of big or even medium-sized infrastructures. But it’s still a bit dangerous.
  • If you don’t have a reason to be there next year, I’ll give you one: Irish red beer.

Enjoy.
Federico


by Federico at September 29, 2017 05:59 PM

Peter Zaitsev

Percona Live Europe Session Interview: High-Performance JSON – PostgreSQL Vs. MongoDB with Wei Shan Ang and Dominic Dwyer (GlobalSign)


The Percona Live Europe 2017 conference has been going strong for two days, and I’ve been to a bunch of presentations. An excellent one was High-Performance JSON – PostgreSQL Vs. MongoDB with Wei Shan Ang and Dominic Dwyer of GlobalSign.

This talk was very engaging and well attended, and provided some enlightening stats from their experiments and tests. For GlobalSign, applications have to be both super fast and consistent, and achieving that balance requires dedicated and detailed testing and development. For example, one configuration might offer incredibly fast throughput, but if the tradeoff is dropouts from time to time – even for a second or two – it’s not a solution that would meet GlobalSign’s needs. Or as Wei Shan put it, using such a solution might lead to a few discussions with management!

There were lively questions from the floor that carried on outside the room well after the session. Since they presented the talk as PostgreSQL vs. MongoDB, there were advocates for both. These were handled with cool aplomb by both guys.

I caught up with them after the session:

by Lorraine Pocklington at September 29, 2017 05:41 PM

Percona Monitoring and Management 1.3.1 Is Now Available


Percona announces the release of Percona Monitoring and Management 1.3.1. This release only contains bug fixes related to usability.

For install and upgrade instructions, see Deploying Percona Monitoring and Management.

Bug fixes

  • PMM-1271: In QAN, when the user selected a database host with no queries, the query monitor could still show metrics.
  • PMM-1512: When clicking QAN in Grafana, QAN would open the home page. Now, QAN opens and automatically selects the database host and time range active in Grafana.
  • PMM-1523: User-defined Prometheus memory settings were not honored, potentially causing performance issues in high load environments.

Other bug fixes in this release: PMM-1452, PMM-1515.

About Percona Monitoring and Management

Percona Monitoring and Management (PMM) is an open-source platform for managing and monitoring MySQL and MongoDB performance. Percona developed it in collaboration with experts in the field of managed database services, support and consulting.

PMM is a free and open-source solution that you can run in your own environment for maximum security and reliability. It provides thorough time-based analysis for MySQL and MongoDB servers to ensure that your data works as efficiently as possible.

A live demo of PMM is available at pmmdemo.percona.com.

We’re always happy to help! Please provide your feedback and questions on the PMM forum.

If you would like to report a bug or submit a feature request, please use the PMM project in JIRA.

by Borys Belinsky at September 29, 2017 05:23 PM

Chris Calender

How to set up and use MariaDB Connector C

I just wanted to provide a quick tutorial on using MariaDB’s Connector C.

I downloaded the latest Connector C (3.0.2), running MariaDB 10.1.28, and was able to get it to work fine with a simple C program using the following commands:

1. Downloaded the Connector C .msi file (32-bit, since my VS is 32-bit), extracted, and installed, which placed it at:

C:\Program Files (x86)\MariaDB\MariaDB Connector C\lib

2. You need to add the directory that contains libmariadb.dll to your LIB environment variable. In my case, it was:

Control Panel -> System -> Advanced System Settings -> Environment Variables -> Choose "LIB" from under "System variables" -> then add the Connector C lib path, like:
C:\Program Files (x86)\MariaDB\MariaDB Connector C\lib

3. Then just run the following command, where my c/c++ program name is “mysql1.c”:

cl /I "C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include" /I "C:\Program Files (x86)\mariadb-10.1.25\include\mysql" /I "C:\Program Files (x86)\Microsoft SDKs\Windows\v7.0A\Include" /MD "C:\Program Files (x86)\MariaDB\MariaDB Connector C\lib\libmariadb.lib" mysql1.c

Note the first include path is for the normal C headers like stdio.h, the second is for mysql.h, the third is for windows.h, and the last argument is the Connector C .lib.

Here is the actual session output:

C:\chris\mysql1> cl /I "C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include" /I "C:\Program Files (x86)\mariadb-10.1.25\include\mysql" /I "C:\Program Files (x86)\Microsoft SDKs\Windows\v7.0A\Include" /MD "C:\Program Files (x86)\MariaDB\MariaDB Connector C\lib\libmariadb.lib" mysql1.c

Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 16.00.40219.01 for 80x86
Copyright (C) Microsoft Corporation. All rights reserved.

mysql1.c
Microsoft (R) Incremental Linker Version 10.00.40219.01
Copyright (C) Microsoft Corporation. All rights reserved.

/out:mysql1.exe
C:\chris\mysql1\libmariadb.lib
mysql1.obj

4. If all is successful, as above, you can invoke your newly created C/C++ program, mysql1.exe in this case (mine just obtains the version number from the instance):

C:\chris\mysql1>mysql1

MySQL Version = 100128
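
As a quick sanity check on that number (a hypothetical session; your version string will differ), the same information is available from the server itself, since mysql_get_server_version() simply encodes it as major*10000 + minor*100 + patch:

SELECT VERSION();
-- e.g. 10.1.28-MariaDB, which the C API reports as 100128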

For reference, here is my mysql1.c code:

#include <windows.h>
#include <stdio.h>
#include <mysql.h>

MYSQL *conn;
int version = 1;

int main ( int argc, char *argv[] )
{
    /* Connect to the local instance on port 3316 */
    conn = mysql_init ( NULL );
    mysql_real_connect ( conn, "localhost", "root",
            "xxxxx", "test", 3316, NULL, 0 );

    /* Returns the version as an integer, e.g. 100128 for 10.1.28 */
    version = mysql_get_server_version( conn );
    printf("\nMySQL Version = %d\n", version);

    mysql_close ( conn );
    return 0;
}

Previous related posts, if interested:

  1. Creating a basic C/C++ Program to Interact with MySQL and MariaDB
  2. Common Errors and Resolutions for Building your own MySQL or MariaDB C/C++ Program on Windows
  3. Setting Up Connector/C and SkySQL C Connector for MySQL and MariaDB

I hope this helps.

by chris at September 29, 2017 01:58 PM

Jean-Jerome Schmidt

Watch the Replay: MySQL on Docker - Understanding the Basics

Thanks to everyone who joined us this week as we broadcast our MySQL on Docker webinar live from the Percona Live Conference in Dublin!

Our colleague Ashraf Sharif discussed how Docker containers work, through to running a simple MySQL container as well as the ClusterControl Docker image (amongst other things).

If you missed the session or would like to watch it again, it’s now online for viewing.

Watch replay

Here’s the full agenda of the topics that were covered during the webinar. The content is aimed at MySQL users who are Docker beginners and who would like to understand the basics of running a MySQL container on Docker.

  • Docker and its components
  • Concept and terminology
  • How a Docker container works
  • Advantages and disadvantages
  • Stateless vs stateful
  • Docker images for MySQL
  • Running a simple MySQL container
  • The ClusterControl Docker image
  • Severalnines on Docker Hub

Watch replay

And if you’re not following our Docker blog series yet, we encourage you to do so: MySQL on Docker.

by jj at September 29, 2017 01:09 PM

September 28, 2017

Peter Zaitsev

Percona Live Europe Session Interview: MySQL on Docker – Containerizing the Dolphin


One of the widely discussed technologies at Percona Live Europe was the logistics of running MySQL in containers. Containers – particularly Docker – have become a hot topic, so there was a good-sized crowd on day one of the conference for Ashraf Sharif, Senior Support Engineer with Severalnines. He presented his talk “MySQL on Docker: Containerizing the Dolphin”.

During his presentation, Ashraf shared some best-practice recommendations for setting out to containerize MySQL. He sees the trend of moving to containers as a progression from the use of virtual hosts.

After his talk on day one of the Percona Live Europe conference, I caught up with Ashraf and asked about his presentation. I was interested in which concepts are most important for ensuring a smoother implementation.

If you enjoy this brief presentation and would like to find out more, then you might like to subscribe to Ashraf’s blog on the Severalnines website where he regularly posts insights on his special interests of system scalability and high availability.

by Lorraine Pocklington at September 28, 2017 06:04 PM

Percona Monitoring and Management 1.3.0 Query Analytics Support for MongoDB


Percona is pleased to announce the General Availability of Query Analytics (QAN) from Percona Monitoring and Management 1.3.0 (PMM). This new release introduces support for MongoDB.

In general, the purpose of QAN is to help detect the queries that consume the most time inside your database server. It provides detailed real-time analysis of queries so that your application can work with data efficiently. In the Percona Monitoring and Management 1.3.0 release, QAN adds support for MongoDB.

MongoDB is conceptually different from relational database management systems, such as MySQL or MariaDB. Relational database management systems store data in separate tables that represent single entities, and you may need to link records from multiple tables to represent a complex object. MongoDB, on the other hand, allows a more flexible approach to data storage and stores all essential information pertaining to a complex object together.

In QAN, the difference between the monitored systems is transparent, and you can analyze queries in the same way regardless of the technology used in the database engine. QAN presents the monitored data in both visual and numeric form. The performance-related characteristics appear as plotted graphics.

To start working with QAN, click the Query Analytics button on the Percona Monitoring and Management 1.3.0 home page. Select a MongoDB database from the list of available database instances at the top of the page. The list of the top ten queries opens below. These are the queries that take the longest time to run. To view more queries, click the Load next 10 queries button below the list.

You can limit the list of available queries to only those that you are interested in by using the Query Filter field next to the database selection button.

Percona Monitoring and Management 1.3.0

In the Query Filter field, you can enter a query ID or its fingerprint. The ID is a unique signature of a query. A fingerprint is a simplified form of your query: it replaces all specific values with placeholders. You can enter only a fragment of the fingerprint to make the search less restrictive.
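
As a purely hypothetical illustration of the idea (shown here as a relational-style query for readability; the exact form of MongoDB fingerprints in QAN may differ):

-- A query as executed by the application:
SELECT * FROM sbtest1 WHERE id = 4242;

-- Its fingerprint, with the literal value replaced by a placeholder:
SELECT * FROM sbtest1 WHERE id = ?;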

Percona Monitoring and Management 1.3.0

The queries that match your criterion appear below the Query Filter field in a summary table.

The summary table represents each query as a row, with each column referring to an essential attribute of the query. The Load, Count, and Latency columns visualize their values graphically, along with summaries in numeric form.

The load attribute is the percentage of time that the MongoDB server spent executing a specific query. The count attribute indicates how often the given query appeared in the query traffic. The latency attribute is the amount of time that it takes to run the query and return its result.

If you hover the cursor over one of these attributes in a query, you can see a concrete value appear over your cursor. Move the cursor along the plotted line to watch how the value is changing. Click one of the queries to select it. QAN displays detailed information about the query. The detailed information includes the metrics specific to the query type. It also contains details about the database and tables that the query uses.

Hope this helps you explore your MongoDB queries and get better performance from them!

by Borys Belinsky at September 28, 2017 05:27 PM

Jean-Jerome Schmidt

Percona Live Dublin - Event Recap & Our Sessions

Severalnines was pleased to yet again sponsor Percona Live Europe, the Open Source Database Conference which was held this year in Dublin, Ireland.

At the Conference

Severalnines team members flew in from around the world to partner up with our two local Dubliners to demo ClusterControl in the exhibit hall and present three sessions (see below).

On our Twitter feed we live tweeted both of the keynote sessions to help keep those who weren’t able to attend up-to-speed on the latest happenings in the database world.

We were also able to sit down with René Cannaò, creator of ProxySQL, to talk about what’s new with the exciting load balancing technology.

Our Sessions

Members of the Severalnines team presented three technical sessions, all of which were widely attended… some with standing room only!

MySQL Load Balancers - MaxScale, ProxySQL, HAProxy, MySQL Router & nginx - A Close Up Look

Session Details: Load balancing MySQL connections and queries using HAProxy has been popular in the past years. Recently, however, we have seen the arrival of MaxScale, MySQL Router, ProxySQL and now also Nginx as a reverse proxy.

For which use cases do you use them and how well do they integrate in your environment? This session aims to give a solid grounding in load balancer technologies for MySQL and MariaDB.

We review the main open-source options available, from application connectors (php-mysqlnd, jdbc) and TCP reverse proxies (HAProxy, Keepalived, Nginx) to SQL-aware load balancers (MaxScale, ProxySQL, MySQL Router).

We also look into the best practices for backend health checks to ensure load balanced connections are routed to the correct nodes in several MySQL clustering topologies. You'll gain a good understanding of how the different options compare, and enough knowledge to decide which ones to explore further.

MySQL on Docker - Containerizing the Dolphin

Session Details: Docker is becoming more mainstream and adopted by users as a method to package and deploy self-sufficient applications in primarily stateless Linux containers. It’s a great toolset on top of OS-level virtualization (LXC, a.k.a. containers) and plays well in the world of microservices.

However, Docker containers are transient by default. If a container is destroyed, all data created is also lost. For a stateful service like a database, this is a major headache to say the least.

There are a number of ways to provide persistent storage in Docker containers. In this presentation, we will talk about how to set up a persistent data service with Docker that can be torn down and brought up across hosts and containers.

We touch upon orchestration tools, shared volumes, data-only containers, security and configuration management, multi-host networking, service discovery, and the implications for monitoring when we move from host-centric to role-centric services with shorter life cycles.

Automating and Managing MongoDB: An Analysis of Ops Manager vs. ClusterControl

Session Details: In any busy operations environment, there are countless tasks to perform - some monthly or weekly, some daily or more frequently, and some on an ad-hoc basis. And automation is key to performing fast, efficient and consistently repeatable software deployments and recovery.

There are many generic tools available, both commercial and open source, to aid with the automation of operational tasks. Some of these tools are even deployed in the database world. However, there is also a small number of specialist, domain-specific automation tools available, and we are going to compare two of these products: MongoDB’s own Ops Manager, and ClusterControl from Severalnines.

We cover installation and maintenance, complexity of architecture, options for redundancy, comparative functionality, monitoring, dashboards, alerting, backing up and restoring, automated deployment of advanced configurations, and upgrading existing deployments.

Thanks to the Percona Team for organising another great conference and to everyone who participated from near and afar! We hope to see you again soon!

by Severalnines at September 28, 2017 11:53 AM

MariaDB Foundation

MariaDB 10.2.9, MariaDB 10.1.28 and MariaDB Connector/J Releases now available

The MariaDB project is pleased to announce the availability of MariaDB 10.2.9, MariaDB 10.1.28, MariaDB Connector/J 2.1.2 and MariaDB Connector/J 1.6.5. See the release notes and changelogs for details. […]

The post MariaDB 10.2.9, MariaDB 10.1.28 and MariaDB Connector/J Releases now available appeared first on MariaDB.org.

by Ian Gilfillan at September 28, 2017 09:48 AM