Planet MariaDB

November 21, 2018

Jean-Jerome Schmidt

How to Encrypt Your MySQL & MariaDB Backups

We usually take care of things we value, whether it is an expensive smartphone or the company’s servers. Data is one of the most important assets of an organisation, and although we do not see it, it has to be carefully protected. We implement data at rest encryption to encrypt database files or whole volumes which contain our data. We implement data in transit encryption using SSL to make sure no one can sniff and collect data sent across networks. Backups are no different. Whether it is a full or an incremental backup, it will store at least part of your data. As such, backups have to be encrypted too. In this blog post, we will look at some options you have when it comes to encrypting backups. First, let’s look at how you can encrypt your backups and what the use cases for those methods might be.

How to encrypt your backup?

Encrypt local file

First of all, you can easily encrypt existing files. Let’s imagine that you have a backup process storing all your backups on a backup server. At some point you decide it’s high time to implement offsite backup storage for disaster recovery. You can use S3 or similar infrastructure from other cloud providers for that. Of course, you don’t want to upload unencrypted backups anywhere outside of your trusted network, so it is critical that you implement backup encryption one way or another. A very simple method, which does not require any changes to your existing backup scripts, is to create a script which takes a backup file, encrypts it and uploads it to S3. One of the methods you can use to encrypt the data is openssl:

openssl enc -aes-256-cbc -salt -in backup_file.tar.gz -out backup_file.tar.gz.enc -k yoursecretpassword

This will create a new, encrypted file, ‘backup_file.tar.gz.enc’ in the current directory. You can always decrypt it later by running:

openssl aes-256-cbc -d -in backup_file.tar.gz.enc -out backup_file.tar.gz -k yoursecretpassword

This method is very simple, but it has some drawbacks. The biggest one is the disk space requirement. When encrypting as described above, you have to keep both the unencrypted and the encrypted file, so you need disk space twice the size of the backup file. Of course, depending on your requirements, this might be something you want - keeping non-encrypted files locally will improve recovery speed - after all, decrypting the data also takes some time.
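
If you decide to go down this path, a simple wrapper script can tie the steps together. Below is a minimal sketch of the idea, assuming the AWS CLI is installed and configured; the bucket name and passphrase file are illustrative, not existing tooling:

#!/bin/bash
# Hypothetical wrapper: encrypt an existing backup and upload it to S3.
set -euo pipefail

BACKUP_FILE="$1"                      # e.g. backup_file.tar.gz
ENCRYPTED_FILE="${BACKUP_FILE}.enc"
S3_BUCKET="s3://my-offsite-backups"   # illustrative bucket name

# Read the passphrase from a root-only file instead of passing it on the command line
openssl enc -aes-256-cbc -salt -in "$BACKUP_FILE" -out "$ENCRYPTED_FILE" \
    -pass file:/root/backup_passphrase

aws s3 cp "$ENCRYPTED_FILE" "$S3_BUCKET/"

# Remove the encrypted copy once uploaded, to reclaim the extra disk space
rm -f "$ENCRYPTED_FILE"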

Encrypt backup on the fly

To avoid the need to store both encrypted and unencrypted data, you may want to implement the encryption at an earlier stage of the backup process. We can do that through pipes. Pipes are, in short, a way of sending the data from one command to another, which makes it possible to create a chain of commands that processes data. You can generate the data, then compress it, then encrypt it. An example of such a chain might be:

mysqldump --all-databases --single-transaction --triggers --routines | gzip | openssl enc -aes-256-cbc -k mypass > backup.sql.gz.enc

You can also use this method with xtrabackup or mariabackup. In fact, this is the example from the MariaDB documentation:

mariabackup --user=root --backup --stream=xbstream | openssl enc -aes-256-cbc -k mypass > backup.xb.enc

If you want, you can even upload the data as part of the same process:

mysqldump --all-databases --single-transaction --triggers --routines | gzip | openssl enc -aes-256-cbc -k mysecretpassword | tee -a mysqldump.gz.enc | nc 10.0.0.102 9991

As a result, you will see a local file ‘mysqldump.gz.enc’, and a copy of the data will be piped to a program which will do something with it. We used ‘nc’, which streamed the data to another host, on which the following was executed:

nc -l 9991 > backup.gz.enc

In this example we used mysqldump and nc, but you can use xtrabackup or mariabackup together with a tool which uploads the stream to Amazon S3, Google Cloud Storage or another cloud provider. Any program or script which accepts data on stdin can be used instead of ‘nc’.
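
For example, with the AWS CLI (which accepts ‘-’ as the source to read an upload from stdin), the whole chain could look like the sketch below; the bucket and object names are illustrative:

mariabackup --user=root --backup --stream=xbstream | gzip | openssl enc -aes-256-cbc -k mysecretpassword | aws s3 cp - s3://my-offsite-backups/backup.xb.gz.enc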

Use embedded encryption

In some cases, a backup tool has embedded support for encryption. An example here is xtrabackup, which can encrypt files internally. Unfortunately mariabackup, even though it is a fork of xtrabackup, does not support this feature.

Before we can use it, we have to create a key which will be used to encrypt the data. This can be done by running the following command:

root@vagrant:~# openssl rand -base64 24
HnliYiaRo7NUvc1dbtBMvt4rt1Fhunjb

Xtrabackup can accept the key in plain text format (using the --encrypt-key option) or read it from a file (using the --encrypt-key-file option). The latter is safer, as passing passwords and keys as plain text to commands results in them being stored in the bash history. They are also clearly visible in the list of running processes, which is quite bad:

root      1130  0.0  0.6  65508  4988 ?        Ss   00:46   0:00 /usr/sbin/sshd -D
root     13826  0.0  0.8  93100  6648 ?        Ss   01:26   0:00  \_ sshd: root@notty
root     25363  0.0  0.8  92796  6700 ?        Ss   08:54   0:00  \_ sshd: vagrant [priv]
vagrant  25393  0.0  0.6  93072  4936 ?        S    08:54   0:01  |   \_ sshd: vagrant@pts/1
vagrant  25394  0.0  0.4  21196  3488 pts/1    Ss   08:54   0:00  |       \_ -bash
root     25402  0.0  0.4  52700  3568 pts/1    S    08:54   0:00  |           \_ sudo su -
root     25403  0.0  0.4  52284  3264 pts/1    S    08:54   0:00  |               \_ su -
root     25404  0.0  0.4  21196  3536 pts/1    S    08:54   0:00  |                   \_ -su
root     26686  6.0  4.0 570008 30980 pts/1    Sl+  09:48   0:00  |                       \_ innobackupex --encrypt=AES256 --encrypt-key=TzIZ7g+WzLt0PXWf8WDPf/sjIt7UzCKw /backup/

Ideally, you will use a key stored in a file, but then there’s a small gotcha you have to be aware of.

root@vagrant:~# openssl rand -base64 24 > encrypt.key
root@vagrant:~# innobackupex --encrypt=AES256 --encrypt-key-file=/root/encrypt.key /backup/
.
.
.
xtrabackup: using O_DIRECT
InnoDB: Number of pools: 1
encryption: unable to set libgcrypt cipher key - User defined source 1 : Invalid key length
encrypt: failed to create worker threads.
Error: failed to initialize datasink.

You may wonder what the problem is. It becomes clear when we open the encrypt.key file in a hexadecimal editor like hexedit:

00000000   6D 6B 2B 4B  66 69 55 4E  5A 49 48 77  39 42 36 72  68 70 39 79  6A 56 44 72  47 61 79 45  6F 75 6D 70  0A                                     mk+KfiUNZIHw9B6rhp9yjVDrGayEoump.

Notice the last character, encoded as ‘0A’. This is the line feed character, and it is taken into consideration while evaluating the encryption key. Once we remove it, we can finally run the backup.
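
As a side note, you can avoid the trailing newline altogether when generating the key, for example by stripping it with tr:

openssl rand -base64 24 | tr -d '\n' > /root/encrypt.key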

root@vagrant:~# innobackupex --encrypt=AES256 --encrypt-key-file=/root/encrypt.key /backup/
xtrabackup: recognized server arguments: --datadir=/var/lib/mysql --innodb_buffer_pool_size=185M --innodb_flush_log_at_trx_commit=2 --innodb_file_per_table=1 --innodb_data_file_path=ibdata1:100M:autoextend --innodb_read_io_threads=4 --innodb_write_io_threads=4 --innodb_doublewrite=1 --innodb_log_file_size=64M --innodb_log_buffer_size=16M --innodb_log_files_in_group=2 --innodb_flush_method=O_DIRECT --server-id=1
xtrabackup: recognized client arguments: --datadir=/var/lib/mysql --innodb_buffer_pool_size=185M --innodb_flush_log_at_trx_commit=2 --innodb_file_per_table=1 --innodb_data_file_path=ibdata1:100M:autoextend --innodb_read_io_threads=4 --innodb_write_io_threads=4 --innodb_doublewrite=1 --innodb_log_file_size=64M --innodb_log_buffer_size=16M --innodb_log_files_in_group=2 --innodb_flush_method=O_DIRECT --server-id=1 --databases-exclude=lost+found --ssl-mode=DISABLED
encryption: using gcrypt 1.6.5
181116 10:11:25 innobackupex: Starting the backup operation

IMPORTANT: Please check that the backup run completes successfully.
           At the end of a successful backup run innobackupex
           prints "completed OK!".

181116 10:11:25  version_check Connecting to MySQL server with DSN 'dbi:mysql:;mysql_read_default_group=xtrabackup;mysql_socket=/var/lib/mysql/mysql.sock' as 'backupuser'  (using password: YES).
181116 10:11:25  version_check Connected to MySQL server
181116 10:11:25  version_check Executing a version check against the server...
181116 10:11:25  version_check Done.
181116 10:11:25 Connecting to MySQL server host: localhost, user: backupuser, password: set, port: not set, socket: /var/lib/mysql/mysql.sock
Using server version 5.7.23-23-57
innobackupex version 2.4.12 based on MySQL server 5.7.19 Linux (x86_64) (revision id: 170eb8c)
xtrabackup: uses posix_fadvise().
xtrabackup: cd to /var/lib/mysql
xtrabackup: open files limit requested 0, set to 1024
xtrabackup: using the following InnoDB configuration:
xtrabackup:   innodb_data_home_dir = .
xtrabackup:   innodb_data_file_path = ibdata1:100M:autoextend
xtrabackup:   innodb_log_group_home_dir = ./
xtrabackup:   innodb_log_files_in_group = 2
xtrabackup:   innodb_log_file_size = 67108864
xtrabackup: using O_DIRECT
InnoDB: Number of pools: 1
181116 10:11:25 >> log scanned up to (2597648)
xtrabackup: Generating a list of tablespaces
InnoDB: Allocated tablespace ID 19 for mysql/server_cost, old maximum was 0
181116 10:11:25 [01] Encrypting ./ibdata1 to /backup/2018-11-16_10-11-25/ibdata1.xbcrypt
181116 10:11:26 >> log scanned up to (2597648)
181116 10:11:27 >> log scanned up to (2597648)
181116 10:11:28 [01]        ...done
181116 10:11:28 [01] Encrypting ./mysql/server_cost.ibd to /backup/2018-11-16_10-11-25/mysql/server_cost.ibd.xbcrypt
181116 10:11:28 [01]        ...done
181116 10:11:28 [01] Encrypting ./mysql/help_category.ibd to /backup/2018-11-16_10-11-25/mysql/help_category.ibd.xbcrypt
181116 10:11:28 [01]        ...done
181116 10:11:28 [01] Encrypting ./mysql/slave_master_info.ibd to /backup/2018-11-16_10-11-25/mysql/slave_master_info.ibd.xbcrypt
181116 10:11:28 [01]        ...done

As a result we will end up with a backup directory full of encrypted files:

root@vagrant:~# ls -alh /backup/2018-11-16_10-11-25/
total 101M
drwxr-x---  5 root root 4.0K Nov 16 10:11 .
drwxr-xr-x 16 root root 4.0K Nov 16 10:11 ..
-rw-r-----  1 root root  580 Nov 16 10:11 backup-my.cnf.xbcrypt
-rw-r-----  1 root root  515 Nov 16 10:11 ib_buffer_pool.xbcrypt
-rw-r-----  1 root root 101M Nov 16 10:11 ibdata1.xbcrypt
drwxr-x---  2 root root 4.0K Nov 16 10:11 mysql
drwxr-x---  2 root root  12K Nov 16 10:11 performance_schema
drwxr-x---  2 root root  12K Nov 16 10:11 sys
-rw-r-----  1 root root  113 Nov 16 10:11 xtrabackup_checkpoints
-rw-r-----  1 root root  525 Nov 16 10:11 xtrabackup_info.xbcrypt
-rw-r-----  1 root root 2.7K Nov 16 10:11 xtrabackup_logfile.xbcrypt

Xtrabackup has some other options which can be used to tune encryption performance:

  • --encrypt-threads allows for parallelization of the encryption process
  • --encrypt-chunk-size defines the size of the working buffer for the encryption process
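
For instance, to spread the encryption work across four threads with a larger working buffer, you could run something like the sketch below; the values are illustrative and should be tuned to your hardware:

innobackupex --encrypt=AES256 --encrypt-key-file=/root/encrypt.key --encrypt-threads=4 --encrypt-chunk-size=512K /backup/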

Should you need to decrypt the files, you can use innobackupex with the --decrypt option:

root@vagrant:~# innobackupex --decrypt=AES256 --encrypt-key-file=/root/encrypt.key /backup/2018-11-16_10-11-25/

As xtrabackup does not clean up the encrypted files, you may want to remove them using the following one-liner:

for i in `find /backup/2018-11-16_10-11-25/ -iname "*\.xbcrypt"`; do rm $i ; done
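
The same cleanup can also be expressed with find’s -delete action, which additionally handles file names containing spaces:

find /backup/2018-11-16_10-11-25/ -iname "*.xbcrypt" -delete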

Backup encryption in ClusterControl

With ClusterControl, encrypted backups are just one click away. All backup methods (mysqldump, xtrabackup or mariabackup) support encryption. You can either create a backup ad hoc or prepare a schedule for your backups.

In our example we picked a full xtrabackup backup, which we will store on the controller instance.

On the next page we enabled encryption. As stated, ClusterControl will automatically create an encryption key for us. That’s it: when you click the “Create Backup” button, the process starts.

The new backup is visible in the backup list. It is marked as encrypted (the lock icon).

We hope that this blog post gives you some insights into how to make sure your backups are properly encrypted.

by krzysztof at November 21, 2018 10:58 AM

November 20, 2018

Serge Frezefond

Using Terraform to provision a managed MariaDB server in AWS

How do you rapidly provision MariaDB in the cloud? Various options are available. A very effective approach is to provision MariaDB with Terraform. Terraform is a powerful tool to deploy infrastructure as code. Terraform is developed by HashiCorp, the company that started its business with the very successful Vagrant deployment tool. Terraform allows you to describe [...]

by Serge at November 20, 2018 06:00 PM

MariaDB Foundation

MariaDB 10.3.11, and MariaDB Connector/C 3.0.7, Connector/ODBC 3.0.7 and Connector/Node.js 2.0.1 now available

The MariaDB Foundation is pleased to announce the availability of MariaDB 10.3.11, the latest stable release in the MariaDB 10.3 series, as well as MariaDB Connector/C 3.0.7 and MariaDB Connector/ODBC 3.0.7, both stable releases, and MariaDB Connector/Node.js 2.0.1, the first beta release of the new 100% JavaScript non-blocking MariaDB client for Node.js, compatible with Node.js […]

The post MariaDB 10.3.11, and MariaDB Connector/C 3.0.7, Connector/ODBC 3.0.7 and Connector/Node.js 2.0.1 now available appeared first on MariaDB.org.

by Ian Gilfillan at November 20, 2018 02:43 PM

Jean-Jerome Schmidt

New Webinar: How to Manage Replication Failover Processes for MySQL, MariaDB & PostgreSQL

If you’re looking at minimizing downtime and meeting your SLAs through an automated or semi-automated approach, then this webinar is for you:

A detailed overview of what failover processes may look like in MySQL, MariaDB and PostgreSQL replication setups.

Failover is the process of moving to a healthy standby component, during a failure or maintenance event, in order to preserve uptime. The quicker it can be done, the faster you can be back online.

However, failover can be tricky for transactional database systems as we strive to preserve data integrity - especially in asynchronous or semi-synchronous topologies.

There are risks associated: from diverging datasets to loss of data. Failing over due to incorrect reasoning, e.g., failed heartbeats in the case of network partitioning, can also cause significant harm.

In this webinar we’ll cover the dangers related to the failover process, and discuss the tradeoffs between failover speed and data integrity. We’ll find out how to shield applications from database failures with the help of proxies.

And we will finally have a look at how ClusterControl manages the failover process, and how it can be configured for both assisted and automated failover.

Date, Time & Registration

Europe/MEA/APAC

Tuesday, December 11th at 09:00 GMT / 10:00 CET (Germany, France, Sweden)

Register Now

North America/LatAm

Tuesday, December 11th at 09:00 PT (US) / 12:00 ET (US)

Register Now

Agenda

  • An introduction to failover - what, when, how
    • in MySQL / MariaDB
    • in PostgreSQL
  • To automate or not to automate
  • Understanding the failover process
  • Orchestrating failover across the whole HA stack
  • Difficult problems
    • Network partitioning
    • Missed heartbeats
    • Split brain
  • From assisted to fully automated failover with ClusterControl
    • Demo

Speaker

Krzysztof Książek, Senior Support Engineer at Severalnines, is a MySQL DBA with experience managing complex database environments for companies like Zendesk, Chegg, Pinterest and Flipboard.

by jj at November 20, 2018 12:39 PM

November 18, 2018

Valeriy Kravchuk

Fun with Bugs #72 - On MySQL Bug Reports I am Subscribed to, Part IX

I've subscribed to more than 60 new bug reports since my previous post in this series. It means that I'd need 4-5 posts to cover all the new subscriptions and the reasons behind them. I still plan to write about most of the bug reports I was interested in recently, but for this post I decided to pick only MySQL 8.0 regression bugs and pay special attention to those that could have been handled better or faster by Oracle engineers, as well as those handled perfectly.

The initial reason for this special attention was Bug #93085 - "Stall when concurrently execute create/alter user with flush privilege", which caused a lot of interesting Twitter discussions. It took some time, comments (in the bug report and in social media) and pressure from the MySQL Community (including yours truly) to get it accepted as a real (regression!) bug to work on and "Verified". Unfortunately, too often recently I see more time spent on arguing that something is not a bug, can not be reproduced, or is an example of improper usage of some MySQL feature, etc., instead of simply checking how things worked before MySQL 8.0 and how this changed for the worse.

Another example of "interesting" approach to bugs in MySQL 8.0 is Bug #93102 - "I_S queries don't return column names in the same case as requested.". It's indeed a duplicate of old and well known Bug #84456 - "column names in metadata appearing as uppercase when selecting from I_S" reported at early 8.0 development stage by Shane Bester from Oracle and community user (see Bug #85947). Still, it was decided NOT to fix it and tell users to rely on workaround, while this breaks application compatibility and is a regression.

Take a look at Bug #92998 - "After finishing the backup with mysqldump, mysql crashes and restarts" also. It ended up in "Unsupported" status, with statements that "Dumping and restoring data between different 8.0 releases is not supported". This can be classified as a regression by itself. What I miss is a link to the manual saying it's not supported (I was not able to find it in 5 minutes) and any explanation of the crash and restart - supported or not, running mysqldump should NOT cause server restarts in the general case. I think this bug report could have ended up in many statuses, but of them all, "Unsupported" is hardly correct.

This photo of mine is far from ideal and can be criticized from different points of view, but there is no point in arguing with the fact that it shows clouds in the sky. I wish the fact that MySQL 8.0 GA releases still have regression bugs were accepted with less arguing and more attention.

Now let me continue with a list of recently reported regression bugs in MySQL 8.0 that were handled mostly properly:

  • Bug #93215 - "Case statement use in conditional with sub_part index regression in 8.0". MySQL of versions < 8.0 (and MariaDB 10.3.7) work as expected also. The bug was verified fast, but it still misses explicit "regression" tag.
  • Bug #93214 - "LIMIT is applied before HAVING when you have a subquery". The bug was "Verified" quickly, but I still miss the exact 8.0.x version(s) affected and the results of checking with older versions. I strongly suspect it's a regression, as MariaDB 10.3.7 provides expected result:
MariaDB [test]> CREATE TABLE test (id INT PRIMARY KEY, value INT);
Query OK, 0 rows affected (0.510 sec)
MariaDB [test]> INSERT INTO test VALUES (1, 99), (2,98), (3, 97);
Query OK, 3 rows affected (0.057 sec)
Records: 3  Duplicates: 0  Warnings: 0
MariaDB [test]> SELECT t1.id, (SELECT t2.value FROM test t2 WHERE t1.id = t2.id) AS sub_value FROM test t1 HAVING sub_value = 99 ORDER BY value LIMIT 1;
+----+-----------+
| id | sub_value |
+----+-----------+
|  1 |        99 |
+----+-----------+
1 row in set (0.116 sec)
            • Bug #93170 - "undo truncation in 8.0.13 is not crash safe". The bug was quickly verified (after all, it's a failure of existing innodb_undo.truncate_recover MTR test case), but had not got "regression" tag. I am still not sure how it was missed during regular testing and ended up in the MySQL 8.0.13 release.
            • Bug #93147 - "Upgrade to 8.0.13 from 8.0.11 fails". In pre-8.0 releases there was no strict need to update to every intermediate minor version, so it's also a regression of a kind for any production DBA.
            • Bug #92979 - "MySQL 8.0 performance degradation on INSERT with foreign_key_checks=0". This is a verified performance regression comparing to MySQL 5.7, but "regression" tag is still missing. 

To summarize, there are some regressions noted by community users recently in MySQL 8.0 GA releases. Some of them were demonstrated with simple test cases, so it's strange they were not noticed by Oracle's QA. What's worse, it seems some Oracle engineers are not ready to accept the fact that the best ever MySQL 8.0 GA release they worked on may get some things done incorrectly and worse than before, so they seem to waste time on useless discussions that everything is OK, works as expected and nothing can be done differently. I also see some processed and verified bug reports without a detailed check for regressions presented to users, or even with the "regression" tag NOT added when needed.

I hope this is not going to become a new trend. I wish all community bug reports and features of MySQL got as much attention and detailed study from Oracle employees as the (far from perfect) JSON support in MariaDB...

by Valeriy Kravchuk (noreply@blogger.com) at November 18, 2018 05:51 PM

November 17, 2018

MariaDB Foundation

2019 Developers Unconference, New York

February in New York City is again MariaDB time, and the first MariaDB Developers Unconference of 2019 will take place on Saturday 23 and Sunday 24 February, with Hudson River Trading as kind hosts. The event is free to attend and you can join for the entire weekend, or as little time as you wish. […]

The post 2019 Developers Unconference, New York appeared first on MariaDB.org.

by Ian Gilfillan at November 17, 2018 12:35 PM


November 15, 2018

Jean-Jerome Schmidt

Webinar Replay: Backup Management for MySQL, MariaDB, PostgreSQL & MongoDB with ClusterControl

Thanks to everyone who participated in this week’s webinar on ‘Backup Management with ClusterControl’. The replay is now available to watch online, as well as the slide deck.

If you feel frustrated by traditional, labour-intensive backup and archive practices for your MySQL, MariaDB, MongoDB and PostgreSQL databases … then this session is for you!

What if you could have one backup management solution for all your business data? What if you could ensure integrity of all your backups? And what if you could leverage the competitive pricing and almost limitless capacity of cloud-based backup while meeting cost, manageability, and compliance requirements from the business?

ClusterControl’s centralized backup management for open source databases provides you with hot backups of large datasets, point in time recovery in a couple of clicks, at-rest and in-transit data encryption, data integrity via automatic restore verification, cloud backups (AWS, Google and Azure) for disaster recovery, retention policies to ensure compliance, and automated alerts and reporting.

Whether you are looking at rebuilding your existing backup infrastructure or updating it, this webinar provides the necessary insights and details on how to go about that.

Agenda

  • Backup and recovery management of local or remote databases
    • Logical or physical backups
    • Full or incremental backups
    • Position or time-based Point in Time Recovery (for MySQL and PostgreSQL)
    • Upload to the cloud (Amazon S3, Google Cloud Storage, Azure Storage)
    • Encryption of backup data
    • Compression of backup data
  • One centralized backup system for your open source databases (Demo)
    • Schedule, manage and operate backups
    • Define backup policies, retention, history
    • Validation - Automatic restore verification
    • Backup reporting

Speaker

Bartlomiej Oles is a MySQL and Oracle DBA with over 15 years’ experience in managing highly available production systems at IBM, Nordea Bank, Acxiom, Lufthansa, and other Fortune 500 companies. In the past five years, his focus has been on building and applying automation tools to manage multi-datacenter database environments.

by jj at November 15, 2018 03:46 PM

November 14, 2018

Jean-Jerome Schmidt

Percona Live Frankfurt 2018 - Event Recap & Our Sessions

Severalnines was pleased to yet again sponsor Percona Live Europe, which was held this year in Frankfurt, Germany. Thanks to the Percona Team for having us and for the great organisation.

At the Conference

Severalnines team members flew in from around the world to show off the latest edition of ClusterControl in the exhibit hall and present five sessions (see below).

On our Twitter feed we live tweeted both of the keynote sessions to help keep those who weren’t able to attend up-to-speed on the latest happenings in the open source database world.

Our Sessions

Members of the Severalnines team presented five sessions in total at the event, covering MySQL, MariaDB & MongoDB, each of which showcased ClusterControl and how it delivers on those topics.

Disaster Recovery Planning for MySQL & MariaDB

Presented by: Bart Oles - Severalnines AB

Session Details: Organizations need an appropriate disaster recovery plan to mitigate the impact of downtime. But how much should a business invest? Designing a highly available system comes at a cost, and not all businesses and indeed not all applications need five 9's availability. We will explain fundamental disaster recovery concepts and walk you through the relevant options from the MySQL & MariaDB ecosystem to meet different tiers of disaster recovery requirements, and demonstrate how to automate an appropriate disaster recovery plan.

MariaDB Performance Tuning Crash Course

Presented by: Krzysztof Ksiazek - Severalnines AB

Session Details: So, you are a developer or sysadmin and have shown some ability in dealing with database issues. And now, you have been elected to the role of DBA. As you start managing the databases, you wonder:

  • How do I tune them to make best use of the hardware?
  • How do I optimize the Operating System?
  • How do I best configure MySQL or MariaDB for a specific database workload?

If you're asking yourself these questions when it comes to optimally running your MySQL or MariaDB databases, then this talk is for you!

We will discuss some of the settings that are most often tweaked and which can bring you significant improvement in the performance of your MySQL or MariaDB database. We will also cover some of the variables which are frequently modified even though they should not be.

Performance tuning is not easy, especially if you're not an experienced DBA, but you can go a surprisingly long way with a few basic guidelines.

Performance Tuning Cheat Sheet for MongoDB

Presented by: Bart Oles - Severalnines AB

Session Details: Database performance affects organizational performance, and we tend to look for quick fixes when under stress. But how can we better understand our database workload and factors that may cause harm to it? What are the limitations in MongoDB that could potentially impact cluster performance?

In this talk, we will show you how to identify the factors that limit database performance. We will start with the free MongoDB Cloud monitoring tools. Then we will move on to log files and queries. To be able to achieve optimal use of hardware resources, we will take a look into kernel optimization and other crucial OS settings. Finally, we will look into how to examine performance of MongoDB replication.

Advanced MySQL Data-at-Rest Encryption in Percona Server

Presented by: Iwo Panowicz - Percona & Bart Oles - Severalnines AB

Session Details: The purpose of the talk is to present the data-at-rest encryption implementation in Percona Server for MySQL, and the differences between Oracle's MySQL and MariaDB implementations.

  • How is it implemented?
  • What is encrypted:
    • Tablespaces?
    • General tablespace?
    • Double write buffer/parallel double write buffer?
    • Temporary tablespaces? (KEY BLOCKS)
    • Binlogs?
    • Slow/general/error logs?
    • MyISAM? MyRocks? X?
  • Performance overhead.
  • Backups?
  • Transportable tablespaces. Transfer key.
  • Plugins
  • Keyrings in general
  • Key rotation?
  • General-Purpose Keyring Key-Management Functions
  • Keyring_file
    • Is it useful? How to make it profitable?
  • Keyring Vault
    • How does it work?
    • How to make a transition from keyring_file

Polyglot Persistence Utilizing Open Source Databases as a Swiss Pocket Knife

Presented by: Art Van Scheppingen - vidaXL & Bart Oles - Severalnines AB

Session Details: Over the past few years, VidaXL has become a European market leader in the online retail of slow moving consumer goods. When a company has achieved over 50% year-over-year growth for the past 9 years, there is hardly enough time to overhaul existing systems. This means existing systems will be stretched to the maximum of their capabilities, and often additional performance will be gained by utilizing a large variety of datastores. Polyglot persistence reigns in rapidly growing environments and the traditional one-size-fits-all strategy of monoglots is over. VidaXL has a broad landscape of datastores, ranging from traditional SQL data stores like MySQL or PostgreSQL, alongside more recent load balancing technologies such as ProxySQL, to document stores like MongoDB and search engines such as SOLR and Elasticsearch.

by fwlymburner at November 14, 2018 12:45 PM

November 13, 2018

Peter Zaitsev

ProxySQL 1.4.12 and Updated proxysql-admin Tool

ProxySQL 1.4.12, released by ProxySQL, is now available for download in the Percona Repository along with an updated version of Percona’s proxysql-admin tool.

ProxySQL is a high-performance proxy, currently for MySQL and its forks (like Percona Server for MySQL and MariaDB). It acts as an intermediary for client requests seeking resources from the database. René Cannaò created ProxySQL for DBAs as a means of solving complex replication topology issues.

The ProxySQL 1.4.12 source and binary packages available at https://percona.com/downloads/proxysql include ProxySQL Admin, a tool developed by Percona to configure Percona XtraDB Cluster nodes into ProxySQL. Docker images for release 1.4.12 are available as well: https://hub.docker.com/r/percona/proxysql/. You can download the original ProxySQL from https://github.com/sysown/proxysql/releases. GitHub hosts the documentation in the wiki format.

Improvements

  • #68: Scripts are now compatible with Percona XtraDB Cluster (PXC) hosts using IPv6
  • #107: In include-slaves, slaves are not moved into the write hostgroup even if the whole cluster goes down. A new option --use-slave-as-writer specifies whether or not the slave is added to the write hostgroup.

Bugs Fixed

  • #110: In some cases, pattern cluster hostname did not work with proxysql-admin
  • #104: proxysql-admin testsuite bug fixes
  • #113: proxysql_galera_checker assumed that parameters were given in the long format
  • #114: In some cases, ProxySQL could not be started
  • #115: proxysql_node_monitor could fail with more than one command in the scheduler
  • #116: In some cases, the scheduler was reloading servers on every run
  • #117: The --syncusers option did not work when enabling cluster
  • #125: The function check_is_galera_checker_running was not preventing multiple instances of the script from running

Other bugs fixed: #112, #120

ProxySQL is available under the open source license GPLv3.

by Borys Belinsky at November 13, 2018 06:54 PM

MariaDB Foundation

MariaDB 10.2.19 now available

The MariaDB Foundation is pleased to announce the availability of MariaDB 10.2.19, the latest stable release in the MariaDB 10.2 series. See the release notes and changelogs for details. Download MariaDB 10.2.19 Release Notes Changelog What is MariaDB 10.2? MariaDB APT and YUM Repository Configuration Generator Contributors to MariaDB 10.2.19 Alexander Barkov (MariaDB Corporation) Alexey […]

The post MariaDB 10.2.19 now available appeared first on MariaDB.org.

by Ian Gilfillan at November 13, 2018 06:26 PM

November 12, 2018

Peter Zaitsev

Percona Server for MySQL 5.7.23-24 Is Now Available

Percona announces the release of Percona Server for MySQL 5.7.23-24 on November 12, 2018 (downloads are available here and from the Percona Software Repositories). This release merges changes of MySQL 5.7.23, including all the bug fixes in it. Percona Server for MySQL 5.7.23-24 is now the current GA release in the 5.7 series. All of Percona’s software is open-source and free.

This release introduces InnoDB encryption improvements and merges upstream MyRocks changes. Also, we’ve improved the usage of column families in MyRocks. The InnoDB encryption improvements are of Alpha quality, and we don’t recommend using them in production.

New Features

Bugs Fixed

  • #4723: PURGE CHANGED_PAGE_BITMAPS did not work when innodb_data_home_dir was used
  • #4937: rocksdb_update_cf_options was ignored when specified in my.cnf or on the command line
  • #1107: The binlog could be corrupted when tmpdir got full
  • #4834: The encrypted system tablespace could have an empty uuid

Other bugs fixed

  • #4106: “Assertion `log.getting_synced’ failed in rocksdb::DBImpl::MarkLogsSynced(uint64_t, bool, const rocksdb::Status&)”
  • #4930: “main.percona_log_slow_innodb: Result content mismatch”
  • #4811: “5.7 Merge and fixup for old DB-937 introduces possible regression”
  • #4705: “crash on snapshot size check in RocksDB”

Find the release notes for Percona Server for MySQL 5.7.23-24 in our online documentation. Report bugs in the Jira bug tracker.

by Borys Belinsky at November 12, 2018 06:52 PM

MariaDB Foundation

Running MariaDB in a Docker container

Virtualisation has been a very popular technique for both development and production systems for many years. It allows multiple software environments to run on the same physical machine. Containerisation takes this idea even further. It allows you to segment your software environment down to the level of individual software packages. This means you can install […]

The post Running MariaDB in a Docker container appeared first on MariaDB.org.

by Jonathan Oxer at November 12, 2018 10:48 AM

November 09, 2018

MariaDB Foundation

First MariaDB 10.4 alpha release

The MariaDB Foundation is pleased to announce the availability of MariaDB 10.4.0, the first alpha release in the new MariaDB 10.4 series. See the release notes and changelogs for details. Download MariaDB 10.4.0 Release Notes Changelog What is MariaDB 10.4? MariaDB APT and YUM Repository Configuration Generator Contributors to MariaDB 10.4.0 Aleksey Midenkov (Tempesta) Alexander […]

The post First MariaDB 10.4 alpha release appeared first on MariaDB.org.

by Ian Gilfillan at November 09, 2018 05:57 PM

November 08, 2018

Peter Zaitsev

Oracle Recognizes Percona Fixes in MySQL 8.0


An Oracle engineer thanked two Percona engineers by name, along with engineers from Facebook and elsewhere, for their recent MySQL 8.0 source code contributions. Oracle incorporated their work into its latest MySQL production release (8.0.13).

Percona’s Zsolt Parragi authored a patch for a rare replication bug that left locked mutexes in production builds following debug crashes (bug #89421). Yura Sorokin authored a patch to fix wrong file I/O statistics in the MySQL Performance Schema (bug #90264). Percona CTO Vadim Tkachenko cited both patches as examples of Percona’s continuing contributions to the open source community, one of Percona’s core ideals since the company’s founding in 2006.

In the past three years alone, Percona has reported over 600 bugs in the MySQL server. In most of these bug reports, Percona provided Oracle engineers with reproducible test cases, detailed stack traces and other information appropriate for analyzing and fixing the bug. During that same period, Oracle accepted at least 20 patches authored by Percona engineers into its MySQL code base.

Over its 12-year history, Percona engineers have created numerous open source projects that have won widespread community adoption. These include Percona Server for MySQL, an enhanced version of the flagship MySQL database; Percona XtraDB Cluster, a high availability database solution; Percona Server for MongoDB®, an enhanced fork of the MongoDB® database; Percona XtraBackup, a database backup tool; Percona Toolkit, a suite of utilities for database administrators; and the most recent, Percona Monitoring and Management (PMM), a GUI tool providing visibility into database performance.

by Tom Basil at November 08, 2018 09:21 PM

November 07, 2018

Peter Zaitsev

Percona Live Europe 2018: What’s Up for Wednesday

Welcome to Wednesday at Percona Live Europe 2018! Today is the final day! Check out all of the excellent sessions to attend.

Please see the important updates below.

Download the Conference App

If you haven’t already downloaded the app, go to the app store and download the official Percona Live App! You can view the schedule, be alerted for any important updates, create your own personalized schedule, rate the talks and interact with fellow attendees.

For Apple: Download here
For Android: Download here

Rate the talks!

We want to encourage all attendees to rate the talks they attended. Please take a few moments to do so on the Percona Live App.

Registration and Badge Pick Up

Registration is open from 8 am.

AWS Cloud Track

Join the featured cloud track today, where AWS will be presenting A Deep Dive on Amazon Aurora, Zero to Serverless in 60 Seconds, and Top 10 Mistakes When Migrating From Oracle to PostgreSQL, to name a few! These sessions will run in Wallstreet 2!

Keynotes

Keynotes begin promptly at 9:15 am. Please be seated and ready! Arrive early to secure your spot! Keynotes will take place in Dow Jones, next to the expo area.

Expo Opening Hours

Have you visited the expo area yet? The expo will be open from 8:00 am to 4:30 pm today.

Conference Slides

Conference slides and presentations will be available to view after the conference and will be located on the Percona Live Europe website.

Breaks and Lunch

Coffee Breaks: The morning break is at 10:50 am – 11:20 am and the afternoon break from 4:10 pm – 4:30 pm (Conference Floor Foyer)
Lunch: 1:10 pm – 2:10 pm. Lunch will be served on the conference floor and in the Showroom and Gaia restaurant on the lobby level.

With Thanks to Our Sponsors!

We hope you have enjoyed the conference!

Save the Date!

Percona Live 2019 will happen in Austin, Texas. Save the dates in your diary for May 28-30, 2019!

The conference will take place just after Memorial Day at The Hyatt Regency, Austin on the shores of Lady Bird Lake. This is also an ideal central location for those who wish to extend their stay and explore what Austin has to offer! Call for papers, ticket sales and sponsorship opportunities will be announced soon, so stay tuned!

by Bronwyn Campbell at November 07, 2018 05:14 AM

November 06, 2018

MariaDB Foundation

MariaDB Galera Cluster 10.0.37 now available

The MariaDB Foundation is pleased to announce the availability of MariaDB Galera Cluster 10.0.37, the latest stable release in the MariaDB Galera Cluster 10.0 series. See the release notes and changelogs for details. Download MariaDB Galera Cluster 10.0.37 Release Notes Changelog What is MariaDB Galera Cluster? Contributors to MariaDB Galera Cluster 10.0.37 Alexander Barkov (MariaDB […]

The post MariaDB Galera Cluster 10.0.37 now available appeared first on MariaDB.org.

by Ian Gilfillan at November 06, 2018 03:36 PM

Peter Zaitsev

Welcome to Percona Live Europe 2018 Tuesday Keynotes and Sessions!

Hello, open source database enthusiasts at Percona Live Europe 2018! There is a lot to see and do today, and we’ve got some of the highlights listed below.

On Facebook? Go here for some pics that captured the action on Percona Live Europe 2018 Tutorials day (Monday, Nov. 5, 2018).

Download the Conference App

We apologize for the confusion yesterday on the app, but can assure you the schedule and timings have been updated! If you haven’t already downloaded the app, go to the app store and download the official Percona Live App! You can view the schedule, be alerted for any important updates, create your own personalized schedule, rate the talks and interact with fellow attendees.

For Apple: Download here
For Android: Download here

Registration and Badge Pick Up

Registration is open from 8 am. The registration desk is located at the top of the stairs on the first floor of the Radisson Blu Hotel.

Keynotes

Keynotes begin promptly at 9:15 am. Please be seated and ready! Arrive early to secure your spot! Keynotes will take place in Dow Jones, next to the expo area.

Community Networking Reception

Join the Open Source community on Tuesday evening at Chicago Meatpackers (Riverside), Frankfurt!

This is a great opportunity to socialize and network with Percona Live attendees and other open source enthusiasts who’d like to come along too!

This is not a ticketed event or an official event of Percona Live Europe, simply an open invitation with a place to congregate for food and drinks! An à la carte food menu and cash bar will be available.

Expo Opening Hours

The expo will be open from 8:00 am to 4:30 pm today.

Breaks & Lunch

Coffee Breaks: Sponsored by Facebook! The morning break is at 10:50 am – 11:20 am and the afternoon break from 4:10 pm – 4:30 pm (Conference Floor Foyer)
Lunch: 1:10 pm – 2:10 pm. Lunch will be served on the conference floor and in the Showroom and Gaia restaurant on the lobby level.

With thanks to our Sponsors!

Enjoy the conference!

by Bronwyn Campbell at November 06, 2018 08:21 AM

November 05, 2018

Peter Zaitsev

How to Quickly Add a Node to an InnoDB Cluster or Group Replication

In this blog, we’ll look at how to quickly add a node to an InnoDB Cluster or Group Replication using Percona XtraBackup.

Adding nodes to a Group Replication cluster can be easy (documented here), but it only works if the existing nodes have retained all the binary logs since the creation of the cluster. Obviously, this is possible if you create a new cluster from scratch. The nodes rotate old logs after some time, however. Technically, if the gtid_purged set is non-empty, you will need another method to add a new node to the cluster. You also need a different method if data becomes inconsistent across cluster nodes for any reason. For example, you might hit something similar to this bug, or fall prey to human error.
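
A quick way to check this on an existing node is to inspect the gtid_purged variable; if it returns a non-empty set, the binary-log-based join described above will not work:

mysql -e "SELECT @@GLOBAL.gtid_purged\G"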

Hot Backup to the Rescue

The quick and simple method I’ll present here requires the Percona XtraBackup tool to be installed, as well as some additional small tools for convenience. I tested my example on CentOS 7, but it works similarly on other Linux distributions. First of all, you will need the Percona repository installed:

# yum install http://www.percona.com/downloads/percona-release/redhat/0.1-6/percona-release-0.1-6.noarch.rpm -y -q

Then, install Percona XtraBackup and the additional tools. You might need to enable the EPEL repo for the additional tools, and the experimental Percona repo for XtraBackup 8.0, which works with MySQL 8.0. (Note: XtraBackup 8.0 is still not GA at the time of writing this article, and we do NOT recommend or advise that you install XtraBackup 8.0 into a production environment until it is GA). For MySQL 5.7, XtraBackup 2.4 from the regular repo is what you are looking for:

# grep -A3 percona-experimental-\$basearch /etc/yum.repos.d/percona-release.repo
[percona-experimental-$basearch]
name = Percona-Experimental YUM repository - $basearch
baseurl = http://repo.percona.com/experimental/$releasever/RPMS/$basearch
enabled = 1

# yum install pv pigz nmap-ncat percona-xtrabackup-80 -q
==============================================================================================================================================
 Package                             Arch                 Version                             Repository                                 Size
==============================================================================================================================================
Installing:
 nmap-ncat                           x86_64               2:6.40-13.el7                       base                                      205 k
 percona-xtrabackup-80               x86_64               8.0.1-2.alpha2.el7                  percona-experimental-x86_64                13 M
 pigz                                x86_64               2.3.4-1.el7                         epel                                       81 k
 pv                                  x86_64               1.4.6-1.el7                         epel                                       47 k
Installing for dependencies:
 perl-DBD-MySQL                      x86_64               4.023-6.el7                         base                                      140 k
Transaction Summary
==============================================================================================================================================
Install  4 Packages (+1 Dependent package)
Is this ok [y/d/N]: y
#

You need to do this on both the source and destination nodes. Now, my existing cluster node (I will call it the donor), gr01, looks like this:

gr01 > select * from performance_schema.replication_group_members\G
*************************** 1. row ***************************
  CHANNEL_NAME: group_replication_applier
     MEMBER_ID: 76df8268-c95e-11e8-b55d-525400cae48b
   MEMBER_HOST: gr01
   MEMBER_PORT: 3306
  MEMBER_STATE: ONLINE
   MEMBER_ROLE: PRIMARY
MEMBER_VERSION: 8.0.13
1 row in set (0.00 sec)
gr01 > show global variables like 'gtid%';
+----------------------------------+-----------------------------------------------+
| Variable_name                    | Value                                         |
+----------------------------------+-----------------------------------------------+
| gtid_executed                    | aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:1-302662 |
| gtid_executed_compression_period | 1000                                          |
| gtid_mode                        | ON                                            |
| gtid_owned                       |                                               |
| gtid_purged                      | aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:1-295538 |
+----------------------------------+-----------------------------------------------+
5 rows in set (0.01 sec)

The new node candidate (I will call it the joiner), gr02, has no data but has the same MySQL version installed. It also has the required settings in place, like the existing node address in group_replication_group_seeds, etc. The next step is to stop the MySQL service on the joiner (if already running), and wipe out its datadir:

[root@gr02 ~]# rm -fr /var/lib/mysql/*

and start the “listener” process, which waits to receive the data snapshot (remember to open the TCP port if you have a firewall):

[root@gr02 ~]# nc -l -p 4444 |pv| unpigz -c | xbstream -x -C /var/lib/mysql

Then, start the backup job on the donor:

[root@gr01 ~]# xtrabackup --user=root --password=*** --backup --parallel=4 --stream=xbstream --target-dir=./ 2> backup.log |pv|pigz -c --fast| nc -w 2 192.168.56.98 4444
240MiB 0:00:02 [81.4MiB/s] [ <=>

On the joiner side, we will see:

[root@gr02 ~]# nc -l -p 4444 |pv| unpigz -c | xbstream -x -C /var/lib/mysql
21.2MiB 0:03:30 [ 103kiB/s] [ <=> ]
[root@gr02 ~]# du -hs /var/lib/mysql
241M /var/lib/mysql

BTW, if you noticed the difference in transfer rate between the two, please note that on the donor side I put |pv| before the compressor, while on the joiner it is before the decompressor. This way, I can monitor the compression ratio at the same time!

The next step will be to prepare the backup on the joiner:

[root@gr02 ~]# xtrabackup --use-memory=1G --prepare --target-dir=/var/lib/mysql 2>prepare.log
[root@gr02 ~]# tail -1 prepare.log
181019 19:18:56 completed OK!

and fix the file ownership:

[root@gr02 ~]# chown -R mysql:mysql /var/lib/mysql

Now we should verify the GTID position information and restart the joiner (I have group_replication_start_on_boot=off in my.cnf):

[root@gr02 ~]# cat /var/lib/mysql/xtrabackup_binlog_info
binlog.000023 893 aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:1-302662
[root@gr02 ~]# systemctl restart mysqld

Now, let’s check if the position reported by the node is consistent with the above:

gr02 > select @@GLOBAL.gtid_executed;
+-----------------------------------------------+
| @@GLOBAL.gtid_executed                        |
+-----------------------------------------------+
| aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:1-302660 |
+-----------------------------------------------+
1 row in set (0.00 sec)

No, it is not. We have to correct it:

gr02 > reset master; set global gtid_purged="aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:1-302662";
Query OK, 0 rows affected (0.05 sec)
Query OK, 0 rows affected (0.00 sec)

Finally, start the replication:

gr02 > START GROUP_REPLICATION;
Query OK, 0 rows affected (3.91 sec)

Let’s check the cluster status again:

gr01 > select * from performance_schema.replication_group_members\G
*************************** 1. row ***************************
CHANNEL_NAME: group_replication_applier
MEMBER_ID: 76df8268-c95e-11e8-b55d-525400cae48b
MEMBER_HOST: gr01
MEMBER_PORT: 3306
MEMBER_STATE: ONLINE
MEMBER_ROLE: PRIMARY
MEMBER_VERSION: 8.0.13
*************************** 2. row ***************************
CHANNEL_NAME: group_replication_applier
MEMBER_ID: a60a4124-d3d4-11e8-8ef2-525400cae48b
MEMBER_HOST: gr02
MEMBER_PORT: 3306
MEMBER_STATE: ONLINE
MEMBER_ROLE: SECONDARY
MEMBER_VERSION: 8.0.13
2 rows in set (0.00 sec)
gr01 > select * from performance_schema.replication_group_member_stats\G
*************************** 1. row ***************************
                              CHANNEL_NAME: group_replication_applier
                                   VIEW_ID: 15399708149765074:4
                                 MEMBER_ID: 76df8268-c95e-11e8-b55d-525400cae48b
               COUNT_TRANSACTIONS_IN_QUEUE: 0
                COUNT_TRANSACTIONS_CHECKED: 3
                  COUNT_CONFLICTS_DETECTED: 0
        COUNT_TRANSACTIONS_ROWS_VALIDATING: 0
        TRANSACTIONS_COMMITTED_ALL_MEMBERS: aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:1-302666
            LAST_CONFLICT_FREE_TRANSACTION: aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:302665
COUNT_TRANSACTIONS_REMOTE_IN_APPLIER_QUEUE: 0
         COUNT_TRANSACTIONS_REMOTE_APPLIED: 2
         COUNT_TRANSACTIONS_LOCAL_PROPOSED: 3
         COUNT_TRANSACTIONS_LOCAL_ROLLBACK: 0
*************************** 2. row ***************************
                              CHANNEL_NAME: group_replication_applier
                                   VIEW_ID: 15399708149765074:4
                                 MEMBER_ID: a60a4124-d3d4-11e8-8ef2-525400cae48b
               COUNT_TRANSACTIONS_IN_QUEUE: 0
                COUNT_TRANSACTIONS_CHECKED: 0
                  COUNT_CONFLICTS_DETECTED: 0
        COUNT_TRANSACTIONS_ROWS_VALIDATING: 0
        TRANSACTIONS_COMMITTED_ALL_MEMBERS: aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:1-302666
            LAST_CONFLICT_FREE_TRANSACTION:
COUNT_TRANSACTIONS_REMOTE_IN_APPLIER_QUEUE: 0
         COUNT_TRANSACTIONS_REMOTE_APPLIED: 0
         COUNT_TRANSACTIONS_LOCAL_PROPOSED: 0
         COUNT_TRANSACTIONS_LOCAL_ROLLBACK: 0
2 rows in set (0.00 sec)

OK, our cluster is consistent! The new node joined successfully as a secondary. We can proceed to add more nodes!

by Przemysław Malkowski at November 05, 2018 07:46 PM

November 04, 2018

Valeriy Kravchuk

On New Severity Levels for MySQL Bugs

            Four weeks ago while working on a blog post about half baked XA transactions feature of MySQL server I've noted that there are new severity levels added by Oracle for MySQL bug reports. Previously we had 5 levels:

            • S1 (Critical) - mostly for all kinds of crashes, DoS attack vectors, data corruptions etc
            • S2 (Serious) - mostly for wrong results bugs, broken replication etc
            • S3 (Non-critical) - all kinds of minor but annoying bugs, from unexpected results in some corner cases to misleading or wrong error messages, inefficient or unclear code etc
            • S4 (Feature requests) - anything that should work or be implemented based on common sense, but is not documented in the manual and was not required by the original specification or implementation of some feature.
            • S5 (Performance) - everything works as expected and documented, but the resulting performance is bad or less than expected. Something does not scale well, doesn't return results fast enough in some cases, or could be made faster on some specific platform using different code or a library. This severity level was probably added in Oracle times; at least it was not there in 2005 when I started to work on MySQL bugs.

            The informal descriptions above are mine, and may be incorrect or differ from the definitions Oracle engineers currently use. I tried to search for Oracle definitions that apply to MySQL, but was not able to find anything immediately useful (any help with a public URL is appreciated).

            In general, severity is defined as the degree of impact a bug has on the operation or use of the software, so a lower severity implies less impact on common MySQL operations. One may also expect that bugs with higher severity are fixed first (have a higher internal priority). It may not be that simple (and was not during my days in MySQL, when many more inputs were taken into account when setting the priority for a bug fix), but it's a valid assumption for any community member.

            By default, when searching for bugs you get all bugs of severity levels S1, S2, S3 and S5. You have to take specific care to get feature requests included in the search results when using the bugs database search interface.

            If you try to search bugs today, you'll see two more severity levels added, S6 (Debug Builds) and S7 (Test Cases):

            Now we have 7 Severity levels for MySQL bug reports
            The S6 severity level seems to be used for assertion failures and other bugs that affect only debug builds and cannot be reproduced with non-debug binaries. The S7 severity level is probably used for bug reports about failing MTR test cases, where the failure does NOT indicate a regression in the MySQL software, but rather some non-determinism, platform dependency, timing assumption or other defect of the test case itself.

            By default, bug reports with these severity levels are NOT included in search results (they are not considered "Production Bugs"), so one has to take extra care to see them. This, together with the common-sense assumption that lower severity eventually means lower priority for the fix, has caused some concerns. It would be great for somebody from Oracle to explain the intended use of, and reasons for introducing, these severity levels with more than a single tweet, to clear up any FUD people may have. If applied formally, these new severity values may leave quite important problems with a low priority. Most debug assertions are in the code for a really good reason, as many weird things (up to crashes and data corruption) may happen later in non-debug binaries in cases where a debug-only assertion fails.

            I was surprised to find out that at the moment we have 67 active S6 bug reports, and 32 active S7 bug reports. The latter list obviously includes reports that should not be S7, like Bug #92274 - "Question of mysql 8.0 sysbench oltp_point_select test", which is clearly about a performance regression noted in MySQL 8 (vs MySQL 5.7) by the bug reporter.

            Any comments from Oracle colleagues on the reasons to introduce new severity levels, their formal definitions and impact on community bug reports processing are greatly appreciated.

            by Valeriy Kravchuk (noreply@blogger.com) at November 04, 2018 04:05 PM

            November 02, 2018

            MariaDB Foundation

            MariaDB 10.1.37 now available

            The MariaDB Foundation is pleased to announce the availability of MariaDB 10.1.37, the latest stable release in the MariaDB 10.1 series. See the release notes and changelogs for details. Download MariaDB 10.1.37 Release Notes Changelog What is MariaDB 10.1? MariaDB APT and YUM Repository Configuration Generator Contributors to MariaDB 10.1.37 Alexander Barkov (MariaDB Corporation) Alexey […]

            The post MariaDB 10.1.37 now available appeared first on MariaDB.org.

            by Ian Gilfillan at November 02, 2018 04:37 PM

            Peter Zaitsev

            Maintenance Windows in the Cloud


            Recently, I’ve been working with a customer to evaluate the different cloud solutions for MySQL. In this post I am going to focus on maintenance windows and requirements, and what the different cloud platforms offer.

            Why is this important at all?

            Maintenance windows are required so that the cloud provider can do the necessary updates, patches, and changes to our setup. But there are many questions like:

            • Is this going to impact our production traffic?
            • Is this going to cause any downtime?
            • How long does it take?
            • Any way to avoid it?

            Let’s discuss the three most popular cloud providers: AWS, Google, and Microsoft. Each of the three has a MySQL-based database service for which we can compare the maintenance settings.

            AWS

            When you create an instance you can define your maintenance window. It’s a 30-minute block during which AWS can update and restart your instances, but it might take more time; AWS does not guarantee the update will be done within those 30 minutes. There are two different types of updates, Required and Available.

            If you defer a required update, you receive a notice from Amazon RDS indicating when the update will be performed. Other updates are marked as available, and these you can defer indefinitely.

            It is even possible to disable auto upgrade for minor versions, and in that case you can decide when you want to do the maintenance.
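            For instance, both settings can be changed on an existing instance with the AWS CLI (a sketch; the instance identifier and window values are placeholders):

            aws rds modify-db-instance \
                --db-instance-identifier mydb \
                --preferred-maintenance-window Sun:02:00-Sun:02:30 \
                --no-auto-minor-version-upgrade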

            AWS separates OS updates from database engine updates.

            OS Updates

            OS updates require some downtime, but you can minimise it by using Multi-AZ deployments. First, the secondary instance is updated. Then AWS does a failover and updates the primary instance as well. This means a short outage during the failover.

            DB Engine Updates

            For DB engine maintenance, the updates are applied to both instances (primary and secondary) at the same time, which will cause some downtime.

            More information: https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_UpgradeDBInstance.Maintenance.html#USER_UpgradeDBInstance.Maintenance.Multi-AZ

            Google CloudSQL

            With CloudSQL you have to define a one-hour maintenance window, for example 01:00–02:00, and within that hour they can restart the instances at any time. It is not guaranteed the update will be done within that hour. The primary and the secondary have the same maintenance window. The read replicas do not have any maintenance window; they can be stopped at any time.
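            As a sketch, the window can be pinned with the gcloud CLI (the instance name is a placeholder):

            gcloud sql instances patch my-instance \
                --maintenance-window-day=MON \
                --maintenance-window-hour=1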

            CloudSQL does not differentiate between OS and DB engine updates, or between required and available upgrades. Because the failover replica has the same maintenance window, any upgrade might cause a database outage in that time frame.

            More information: https://cloud.google.com/sql/docs/mysql/instance-settings

            Microsoft Azure

            Azure provides a service called Azure Database for MySQL. I read the documentation and did some research trying to find anything regarding a maintenance window, but I did not find anything.

            I spun up an instance in Azure to see if there were any relevant settings, but I did not find any, so at this point I do not know how Azure does OS or DB maintenance, or how that impacts production traffic.

            If someone knows where I can find this information in the documentation, please let me know.

            Conclusion

                                                     AWS        CloudSQL   Azure
            Maintenance Window                       30m        1h         Unknown
            Maintenance Window for Read Replicas     No         No         Unknown
            Separate OS and DB updates               Yes        No         Unknown
            Outage during update                     Possible   Possible   Unknown
            Postpone an update                       Possible   No         Unknown
            Different priority for updates           Yes        No         Unknown

            While I do not intend to favor or promote any of the providers, for this specific question AWS offers the most options and controls for how we want to deal with maintenance.


            Photo by Caitlin Oriel on Unsplash

            by Tibor Korocz at November 02, 2018 01:02 PM

            Jean-Jerome Schmidt

            Effective Monitoring of MySQL With SCUMM Dashboards - Part 3

            We discussed the MySQL-related dashboards in our previous blogs. We highlighted the things a DBA can learn from studying the graphs, especially when performing daily routines from diagnostics and metric reporting to capacity planning. In this blog, we will discuss the InnoDB Metrics and MySQL Performance Schema dashboards, which are very important, especially for monitoring InnoDB transactions, disk/cpu/memory I/O, optimizing your queries, and performance tuning of the server.

            This blog touches upon the deep topic of performance, considering that InnoDB would require extensive coverage if we tackled its internals. The Performance Schema is also extensive, as it covers the kernel and core parts of MySQL and its storage engines.

            Let’s begin walking through the graphs.

            MySQL InnoDB Metrics

            This dashboard is great for any MySQL DBA or ops person, as it offers a very good view into the InnoDB storage engine. There are certain graphs here that the user has to enable explicitly, because the required variables are not always set in the MySQL configuration.

            • Innodb Checkpoint Age

              According to the manual, checkpointing is defined as follows: “As changes are made to data pages that are cached in the buffer pool, those changes are written to the data files sometime later, a process known as flushing. The checkpoint is a record of the latest changes (represented by an LSN value) that have been successfully written to the data files”. This graph is useful when you would like to determine how your server is performing when checkpointing data to your disk. It can be a good reference if your transaction log (redo log or ib_logfile0) is too large. It is also a good indicator of whether you need to adjust variables such as innodb_log_file_size, innodb_log_buffer_size, innodb_max_dirty_pages_pct, or innodb_adaptive_flushing_method. The closer the checkpoint age is to the max checkpoint age, the more filled the logs are, and InnoDB will do more I/O in order to maintain some free space in the logs. The checkpointing mechanism differs in subtle details between Percona XtraDB-based flavours, MariaDB, and Oracle’s version; you can also find differences in its implementation between MySQL versions.

            • InnoDB Transactions

              Whenever there’s a large transaction ongoing on your MySQL server, this graph is a good reference. It counts the transactions that were created at a specific time, and the history length (actually the history list length found in SHOW ENGINE INNODB STATUS) is the number of pages in the undo log. The trends you see here are a good resource for checking whether, for example, purge is delayed due to a very high insert rate when reloading data or due to a long-running transaction, or whether purge simply can’t keep up due to high disk I/O on the volume where your $DATADIR resides.
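              If you want to cross-check the graph against the server itself, the history list length is also exposed through the INNODB_METRICS table. A sketch, assuming the trx_rseg_history_len counter is enabled (it is by default in recent MySQL versions):

              mysql> SELECT NAME, COUNT FROM information_schema.INNODB_METRICS
                  -> WHERE NAME = 'trx_rseg_history_len';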

            • Innodb Row Operations

              For certain DBA tasks, you might want to determine the number of rows deleted, inserted, read, and updated. This graph is what you can use to check that.

            • Innodb Row Lock Time

              This graph is a good resource to look at when you notice that your application is encountering many occurrences of “Lock wait timeout exceeded; try restarting transaction”. It can also help you determine whether your queries handle locks badly. It is likewise a good reference when optimizing queries that involve locking of rows. If the wait time is too high, check the slow query log or run pt-query-digest to see which queries are causing the bloat in the graph.
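              For example, a sketch with pt-query-digest, ranking queries by accumulated lock time (the slow log path is an assumption and depends on your configuration):

              pt-query-digest --order-by Lock_time:sum /var/log/mysql/mysql-slow.log > lock-digest.txt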

            • InnoDB I/O

              Whenever you want to determine the amount of InnoDB data reads, disk flushes, writes, and log writes, this graph has what you need. You can use it to determine whether your InnoDB variables are well tuned for your specific requirements. For example, if you have a battery-backed cache module but are not gaining much of its optimum performance, you can rely on this graph to determine whether your fsyncs() are higher than expected. Then changing the variable innodb_flush_method to O_DSYNC could resolve the issue.
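              The raw counters behind this graph can also be checked directly on the server, for instance:

              mysql> SHOW GLOBAL STATUS LIKE 'Innodb_data_fsyncs';
              mysql> SHOW GLOBAL STATUS LIKE 'Innodb_os_log_fsyncs';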

            • InnoDB Log File Usage Hourly

              This graph shows the number of bytes written to the InnoDB redo log files and the growth of your InnoDB log files, over the 24-hour range of the current date.

            • InnoDB Logging Performance

              This graph is closely related to the InnoDB Log File Usage Hourly graph. Use it whenever you need to determine how large your innodb_log_file_size needs to be. You can determine the number of bytes written to the InnoDB redo log files and how efficiently your MySQL flushes data from memory to disk. Whenever you are running low on redo log space, it indicates that you have to increase innodb_log_file_size, and this graph will tell you so. However, to dig further into how much space you need for your redo log, it may make more sense to check the LSN (Log Sequence Number) in SHOW ENGINE INNODB STATUS. Percona has a good blog related to this which is a good source to look at.
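              A classic way to measure the redo write rate from the LSN is to sample it twice, sixty seconds apart, during peak load (a sketch; the pager trick requires the mysql client on a Unix-like system):

              mysql> pager grep sequence
              mysql> SHOW ENGINE INNODB STATUS\G SELECT SLEEP(60); SHOW ENGINE INNODB STATUS\G

              The difference between the two “Log sequence number” values is the number of bytes written to the redo log in one minute, which you can then scale to your target log-retention time.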

            • InnoDB Deadlocks

              If your application clients often experience deadlocks, or you have to look at how many deadlocks your MySQL server is experiencing, this graph serves the purpose. Frequent deadlocks usually indicate poor SQL or transaction design that leads to transactions racing with each other for the same rows.
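              To capture every deadlock in the error log for later analysis, not just the latest one shown by SHOW ENGINE INNODB STATUS, you can enable:

              mysql> SET GLOBAL innodb_print_all_deadlocks = ON;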

            • Index Condition Pushdown

              A little word of caution when looking at this graph. First, you have to verify that your MySQL global variable innodb_monitor_enable is set to the correct value, which is module_icp (see the snippet below). Otherwise, you’ll experience a “No Data Points” result:
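              Enabling the module is a one-liner; to make it survive a restart, put the same setting in my.cnf:

              mysql> SET GLOBAL innodb_monitor_enable = 'module_icp';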

              If the graph has data points, as in my sample outputs, it provides a DBA with an overview of how much your queries are benefiting from Index Condition Pushdown, or ICP for short. ICP is a great MySQL feature that optimizes query execution. Instead of reading the full rows matching the index range and only then applying the WHERE clause, the server pushes the applicable parts of the WHERE condition down to the storage engine, which evaluates them against the secondary index entries. This adds more granularity and saves time, because it avoids reading the full rows corresponding to index tuples that the pushed-down condition already rules out.

              Let me elaborate a bit on this graph. Let’s say I have a table:

              mysql> show create table a\G
              *************************** 1. row ***************************
                     Table: a
              Create Table: CREATE TABLE `a` (
                `id` int(11) NOT NULL,
                `age` int(11) NOT NULL,
                KEY `id` (`id`)
              ) ENGINE=InnoDB DEFAULT CHARSET=latin1
              1 row in set (0.00 sec)

              And has some small data:

              mysql> select * from a;
              +----+-----+
              | id | age |
              +----+-----+
              |  1 |   1 |
              |  2 |   1 |
              |  3 |   1 |
              |  3 |  41 |
              |  4 |  41 |
              |  5 |   4 |
              |  4 |   4 |
              |  4 |   4 |
              +----+-----+
              8 rows in set (0.00 sec)

              When ICP is enabled, the query plan is more efficient:

              mysql> explain extended select * from a where id>2 and id<4 and age=41;
              +----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+------------------------------------+
              | id | select_type | table | partitions | type  | possible_keys | key  | key_len | ref  | rows | filtered | Extra                              |
              +----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+------------------------------------+
              |  1 | SIMPLE      | a     | NULL       | range | id            | id   | 4       | NULL |    2 |    12.50 | Using index condition; Using where |
              +----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+------------------------------------+
              1 row in set, 2 warnings (0.00 sec)

              than without ICP:

              mysql> set optimizer_switch='index_condition_pushdown=off';
              Query OK, 0 rows affected (0.00 sec)
              
              mysql> explain extended select * from a where id>2 and id<4 and age=41;
              +----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-------------+
              | id | select_type | table | partitions | type  | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
              +----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-------------+
              |  1 | SIMPLE      | a     | NULL       | range | id            | id   | 4       | NULL |    2 |    12.50 | Using where |
              +----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-------------+
              1 row in set, 2 warnings (0.00 sec)

              This is a simple example of ICP, and how this graph can benefit a DBA.

            • InnoDB Buffer Pool Content

              When working with MySQL and the InnoDB engine, the buffer pool settings (innodb_buffer_pool*) are among the most common values you have to tune to optimize MySQL performance. Speaking specifically of the buffer pool content, this graph displays the trend of dirty pages against the total buffer pool content, where the total includes clean pages besides the dirty ones. It serves its purpose in determining how efficiently MySQL is handling the buffer pool.

            • InnoDB Buffer Pool Pages

              This graph is helpful when you want to check how efficiently MySQL is using your InnoDB buffer pool. For instance, if your daily traffic doesn’t fill up the assigned innodb_buffer_pool_size, this could indicate that certain parts of an application aren’t used, or that you set innodb_buffer_pool_size too high, in which case it might be good to lower the value and reclaim memory.
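              The underlying counters can be checked directly with:

              mysql> SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_pages%';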

            • InnoDB Buffer Pool I/O

              Use this graph when you have to check the number of pages created and written on InnoDB tables, or page reads to the InnoDB buffer pool by operations on InnoDB tables.

            • InnoDB Buffer Pool Requests

              When you want to determine how efficiently your queries are accessing the InnoDB buffer pool, this graph serves the purpose. It shows the trends, based on the data points, of how your MySQL server performs when the InnoDB engine has to frequently access the disk (an indication that the buffer pool has not warmed up yet), and how frequently the buffer pool handles read requests and write requests.
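              The classic hit-ratio check compares the logical read requests against the reads that had to go to disk:

              mysql> SHOW GLOBAL STATUS WHERE Variable_name IN
                  -> ('Innodb_buffer_pool_read_requests', 'Innodb_buffer_pool_reads');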

            • InnoDB Read-Ahead

              When you have the variable innodb_random_read_ahead set to ON, add this graph as a valuable trend to watch as part of your DBA routine. It shows how the MySQL InnoDB storage engine manages the buffer pool via the read-ahead background thread, how it handles pages that are subsequently evicted without having been accessed by queries, and how InnoDB initiates random read-ahead when a query scans a large portion of a table in random order.

            • InnoDB Change Buffer

              If you are running Percona Server 5.7, this graph is useful for monitoring how well InnoDB has allocated change buffering. These changes include inserts, updates, and deletes, as specified by the innodb_change_buffering variable. Change buffering helps speed up queries by avoiding the substantial random access I/O that would otherwise be required to read secondary index pages in from disk.

            • InnoDB Change Buffer Activity

              This is related to the InnoDB Change Buffer graph, but dissects the information into more usable data points. These provide more information for monitoring how InnoDB handles change buffering. This is useful in the particular DBA task of determining whether your innodb_change_buffer_max_size is set too high, since the change buffer shares memory with the InnoDB buffer pool, reducing the memory available to cache data pages. You might have to consider disabling change buffering if the working set almost fits in the buffer pool, or if your tables have relatively few secondary indexes. Remember that change buffering does not impose extra overhead, because it only applies to pages that are not in the buffer pool. This graph is also useful for evaluating how useful merges are when you have to benchmark your application for particular scenarios. Let’s say you expect bulk inserts: you can set innodb_change_buffering=inserts (see the sketch below) and determine whether the values set for your buffer pool and innodb_change_buffer_max_size keep disk I/O in check, especially during recovery or slow shutdown (necessary if you want to do a failover with a low downtime requirement). Also, merging of the change buffer may take several hours when there are numerous secondary indexes to update and many affected rows; during this time, disk I/O is increased, which can cause a significant slowdown for disk-bound queries.
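              A sketch for the bulk-insert scenario just described (the max size value is only an example to benchmark against, not a recommendation):

              mysql> SET GLOBAL innodb_change_buffering = 'inserts';
              mysql> SET GLOBAL innodb_change_buffer_max_size = 30;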

            MySQL Performance Schema

            The MySQL Performance Schema is a complicated topic. It’s a long and hard one, but I’m going to discuss only information that is specific to the graphs we have in SCUMM. There are certain variables that you must consider and ensure are set properly. Ensure that you have innodb_monitor_enable = all and userstat = 1 to see data points in your graphs (see the sketch below). As a note, when I use the word “event” here, it does not mean that it is related to the MySQL Event Scheduler. I’m talking about specific events such as MySQL parsing a query, or reading or writing to a relay/binary log file, etc.
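            A minimal my.cnf sketch for this (note that userstat is a Percona Server variable and is not available in upstream MySQL):

            [mysqld]
            innodb_monitor_enable = all
            userstat = 1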

            Let’s proceed with the graphs then.

            • Performance Schema File IO (Events)

              This graph gathers data points for events in MySQL whose instruments may create multiple instances of the instrumented object (e.g. binary log reads or InnoDB data file reads). Each row summarizes events for a given event name. For example, if there is an instrument for a mutex that is created for each connection, there are as many instances of this instrumented event as there are connections, and the summary row for the instrument summarizes all these instances. You can check the Performance Schema Summary Tables section of the MySQL manual for more info.

            • Performance Schema File IO (Load)

              This graph is the same as the “Performance Schema File IO (Events)” graph, except that it’s instrumented based on the load.

            • Performance Schema File IO (Bytes)

              This graph is the same as the “Performance Schema File IO (Events)” graph, except that it’s instrumented based on the size in bytes, for example how much data was involved when MySQL triggered the wait/io/file/innodb/innodb_data_file event.

            • Performance Schema Waits (Events)

              This graph shows the data for all waits spent on each specific event. You can check the Wait Event Summary Tables section in the manual for more info.

            • Performance Schema Waits (Load)

              Same as the “Performance Schema Waits (Events)” graph but this time it shows the trends for the load.

            • Index Access Operations (Load)

              This graph is an aggregation of all the table index I/O wait events grouped by index(es) of a table, as generated by the wait/io/table/sql/handler instrument. You can check the MySQL manual about the Performance Schema table table_io_waits_summary_by_index_usage for more info.

            • Table Access Operations (Load)

              Same as the “Index Access Operations (Load)” graph, this is an aggregation of all table I/O wait events grouped by table, as generated by the wait/io/table/sql/handler instrument. This is very useful to DBAs. For example, you might want to trace how long it takes to access (fetch) or change (insert, delete, update) a specific table. You can check the MySQL manual on the Performance Schema table table_io_waits_summary_by_table for more info.
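              For instance, to list the tables with the most accumulated I/O wait time:

              mysql> SELECT OBJECT_SCHEMA, OBJECT_NAME, COUNT_STAR, SUM_TIMER_WAIT
                  -> FROM performance_schema.table_io_waits_summary_by_table
                  -> ORDER BY SUM_TIMER_WAIT DESC LIMIT 5;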

            • Performance Schema SQL & External Locks (Events)

              This graph is an aggregation (a count of how many times each occurred) of all table lock wait events, as generated by the wait/lock/table/sql/handler instrument, grouped by table. The SQL locks in the graph are the internal locks: read normal, read with shared locks, read high priority, read no insert, write allow write, write concurrent insert, write delayed, write low priority, and write normal. The external locks are read external and write external. In any DBA task, this is very useful if you have to trace and investigate locks on a particular table, regardless of their type. You can check the table table_lock_waits_summary_by_table for more info.

            • Performance Schema SQL and External Locks (Seconds)

              Same as the “Performance Schema SQL & External Locks (Events)” graph, but expressed in seconds. If you want to look at your table locks based on how many seconds the locks were held, this graph is a good resource.

            Conclusion

            The InnoDB Metrics and the MySQL Performance Schema are some of the most in-depth and complicated parts of the MySQL domain, especially when there is no visualization to assist the interpretation. Manual tracing and investigation can take a lot of time and hard work. SCUMM dashboards offer a very efficient and feasible way to handle these tasks and lower the extra load of routine DBA work.

            In this blog, you learnt how to use the dashboards for InnoDB and Performance Schema to improve database performance. These dashboards can make you more efficient at analyzing performance.

            by Paul Namuag at November 02, 2018 08:25 AM

            Peter Zaitsev

            Percona Monitoring and Management (PMM) 1.16.0 Is Now Available

            Percona Monitoring and Management

            PMM (Percona Monitoring and Management) is a free and open-source platform for managing and monitoring MySQL, MongoDB, and PostgreSQL performance. You can run PMM in your own environment for maximum security and reliability. It provides thorough time-based analysis for MySQL® and MongoDB® servers to ensure that your data works as efficiently as possible.


            While much of the team is working on longer-term projects, we were able to provide the following feature:

            • MySQL and PostgreSQL support for all cloud DBaaS providers – Use PMM Server to gather Metrics and Queries from remote instances!
            • Query Analytics + Metric Series – See Database activity alongside queries
            • Collect local metrics using node_exporter + textfile collector

            We addressed 11 new features and improvements, and fixed 21 bugs.

            MySQL and PostgreSQL support for all cloud DBaaS providers

            You’re now able to connect PMM Server to your MySQL and PostgreSQL instances, whether they run in a cloud DBaaS environment or you simply want database metrics without the OS metrics. This can help you get up and running with PMM using minimal configuration and zero client installation. However, be aware there are limitations: there won’t be any host-level dashboards populated for these nodes, since we don’t attempt to connect to the provider’s API, nor are we granted access to the instance in order to deploy an exporter.

            How to use

            Using the PMM Add Instance screen, you can now add instances from any cloud provider (AWS RDS and Aurora, Google Cloud SQL for MySQL, Azure Database for MySQL) and benefit from the same dashboards that you are already accustomed to. You’ll be able to collect Metrics and Queries from MySQL, and Metrics from PostgreSQL.  You can add remote instances by selecting the PMM Add Instance item in a PMM group of the system menu:

            https://github.com/percona/pmm/blob/679471210d476a5e98d26a632318f1680cfd98a2/doc/source/.res/graphics/png/metrics-monitor.menu.pmm1.png?raw=true

            where you will then have the opportunity to add a Remote MySQL or Remote PostgreSQL instance:

            You’ll add the instance by supplying just the Hostname, database Username and Password (and optional Port and Name):

            metrics-monitor.add-remote-mysql-instance.png

            Also new as part of this release is the ability to display the nodes you’ve added on the RDS and Remote Instances screen:

            metrics-monitor.add-rds-or-remote-instance1.png

            Server activity metrics in the PMM Query Analytics dashboard

            The Query Analytics dashboard now shows a summary of the selected host and database activity metrics in addition to the top ten queries listed in a summary table.  This brings a view of System Activity (CPU, Disk, and Network) and Database Server Activity (Connections, Queries per Second, and Threads Running) to help you better pinpoint query pileups and other bottlenecks:

            https://raw.githubusercontent.com/percona/pmm/86e4215a58e788a8ec7cb1ebe679e1593c484078/doc/source/.res/graphics/png/query-analytics.png

            Extending metrics with node_exporter textfile collector

            While PMM provides an excellent solution for system monitoring, sometimes you may need a metric that’s not present in the list of node_exporter metrics out of the box. There is a simple method to extend the list of available metrics without modifying the node_exporter code: the textfile collector. We’ve enabled this collector by default, and it is deployed as part of linux:metrics in PMM Client.

            The default directory for reading text files with the metrics is /usr/local/percona/pmm-client/textfile-collector, and the exporter reads files from it with the .prom extension. By default it contains an example file example.prom which has commented contents and can be used as a template.

            You are responsible for running a cronjob or other regular process to generate the metric series data and write it to this directory.
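            For instance, a generated file might look like this (the metric name is hypothetical; the format is the standard Prometheus exposition format):

            # HELP node_app_jobs_queued Number of jobs waiting in the application queue
            # TYPE node_app_jobs_queued gauge
            node_app_jobs_queued 42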

            Example – collecting docker container information

            This example shows how to collect the number of running and stopped docker containers on a host. It uses crontab tasks, set with the following lines in the cron configuration file (e.g. in /etc/crontab):

            */1 * * * *     root   docker ps -a -q | wc -l | xargs echo node_docker_containers_total > /usr/local/percona/pmm-client/textfile-collector/docker_all.prom
            */1 * * * *     root   docker ps -q | wc -l | xargs echo node_docker_containers_running_total > /usr/local/percona/pmm-client/textfile-collector/docker_running.prom

            The output of these commands is placed into the docker_all.prom and docker_running.prom files and read by the exporter, creating two new metric series named node_docker_containers_total and node_docker_containers_running_total, which we can then plot on a graph:

            pmm 1.16

            New Features and Improvements

            • PMM-3195 Remove the light bulb
            • PMM-3194 Change link for “Where do I get the security credentials for my Amazon RDS DB instance?”
            • PMM-3189 Include Remote MySQL & PostgreSQL instance logs into PMM Server logs.zip system
            • PMM-3166 Convert status integers to strings on ProxySQL Overview Dashboard – Thanks,  Iwo Panowicz for  https://github.com/percona/grafana-dashboards/pull/239
            • PMM-3133 Include Metric Series on Query Analytics Dashboard
            • PMM-3078 Generate warning “how to troubleshoot postgresql:metrics” after failed pmm-admin add postgresql execution
            • PMM-3061 Provide Ability to Monitor Remote MySQL and PostgreSQL Instances
            • PMM-2888 Enable Textfile Collector by Default in node_exporter
            • PMM-2880 Use consistent favicon (Percona logo) across all distribution methods
            • PMM-2306 Configure EBS disk resize utility to run from crontab in PMM Server
            • PMM-1358 Improve Tooltips on Disk Space Dashboard – thanks, Corrado Pandiani for texts

            Fixed Bugs

            • PMM-3202 Cannot add remote PostgreSQL to monitoring without specified dbname
            • PMM-3186 Strange “Quick ranges” tag appears when you hover over documentation links on PMM Add Instance screen
            • PMM-3182 Some sections for MongoDB are collapsed by default
            • PMM-3171 Remote RDS instance cannot be deleted
            • PMM-3159 Problem with enabling RDS instance
            • PMM-3127 “Expand all” button affects JSON in all queries instead of the selected one
            • PMM-3126 Last check displays locale format of the date
            • PMM-3097 Update home dashboard to support PostgreSQL nodes in Environment Overview
            • PMM-3091 postgres_exporter typo
            • PMM-3090 TLS handshake error in PostgreSQL metric
            • PMM-3088 It’s possible to downgrade PMM from Home dashboard
            • PMM-3072 Copy to clipboard is not visible for JSON in case of long queries
            • PMM-3038 Error adding MySQL queries when options for mysqld_exporters are used
            • PMM-3028 Mark points are hidden if an annotation isn’t added in advance
            • PMM-3027 Number of vCPUs for RDS is displayed incorrectly – report and proposal from Janos Ruszo
            • PMM-2762 Page refresh makes Search condition lost and shows all queries
            • PMM-2483 LVM in the PMM Server AMI is poorly configured/documented – reported by Olivier Mignault  and lot of people involved.  Special thanks to  Chris Schneider for checking with fix options
            • PMM-2003 Delete all info related to external exporters on pmm-admin list output

            How to get PMM Server

            PMM is available for installation using three methods: a Docker image, a virtual appliance (OVA), or an Amazon Machine Image (AWS AMI).

            Help us improve our software quality by reporting any Percona Monitoring and Management bugs you encounter using our bug tracking system.

            by Dmitriy Kostiuk at November 02, 2018 01:26 AM

            November 01, 2018

            Peter Zaitsev

            WiredTiger Encryption at Rest with Percona Server for MongoDB


            Encryption has become an important function in the database industry, as most companies are taking extra care to keep their data safe. It is important to keep the data safe on disk as well as when it is moving on the network; this restricts any unauthorized access to the data. These two types of protection are known as encryption at REST for the data in storage, and encryption in TRANSPORT for the data moving on the network.

            In upstream MongoDB, data encryption at rest is available – but only in the Enterprise version. So those who use the Community version and want encryption at rest have to use disk-level or filesystem encryption (like LUKS or dm-crypt) to achieve the same effect. This solves the encryption problem, but comes with the added complexity of implementing and maintaining an extra set of operations. We have seen some customers face trouble after implementing encryption at the storage level, due to bugs in the encryption software.

            Now the good NEWS!

            Percona Server for MongoDB now provides WiredTiger encryption at rest with Percona Server for MongoDB 3.6.8-2.0 in BETA, and it is free to use. This useful feature applies encryption to only the MongoDB data, rather than full storage encryption. More importantly, it requires very minimal steps and is easy to implement when starting the DB. This is available only for the WiredTiger engine now, and can encrypt the data with the local key management via a keyfile. We expect that future releases will support third-party key management and vaults.

            How to implement encryption:

            The example below shows how to implement WiredTiger encryption at rest in Percona Server for MongoDB:

            Add the encryption options below into mongod.conf:

            [root@app ~]# grep security -A2 /etc/mongod.conf
            security:
              enableEncryption: true
              encryptionKeyFile: /data/key/mongodb.key

            By default, Percona Server for MongoDB uses the AES256-CBC cipher mode. If you want to use the AES256-GCM cipher mode, then use the encryptionCipherMode parameter to change it, as shown below. In general, the two cipher modes work differently: CBC is faster, while GCM is safer. I found some interesting discussion and benchmarks here and here.

            security:
              enableEncryption: true
              encryptionKeyFile: /data/key/mongodb.key
              encryptionCipherMode: AES256-GCM

            Create your key with openssl as below:

            [root@app ~]# mkdir /data/key
            [root@app ~]# openssl rand -base64 32 > /data/key/mongodb.key
            [root@app ~]# chmod 600 /data/key/mongodb.key

            Now start Percona Server for MongoDB:

            [root@app ~]# systemctl start mongod
            [root@app ~]#

            How to confirm that you have enabled encryption at rest in Percona Server for MongoDB:

            To check whether you have enabled the encryption successfully in the database, you can use the command below to check:

            > db.serverCmdLineOpts().parsed.security
            { "enableEncryption" : true, "encryptionKeyFile" : "/data/key/mongodb.key" }

            Search for the string “percona_encryption_extension_init” in your log file:

            [root@app ~]# grep -i "percona_encryption_extension_init" /var/log/mongo/mongod.log
            2018-10-30T10:32:40.895+0000 I STORAGE [initandlisten] wiredtiger_open config: create,cache_size=256M,session_max=20000,eviction=(threads_min=4,threads_max=4),config_base=false,statistics=(fast),cache_cursors=false,compatibility=(release="3.0",require_max="3.0"),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),statistics_log=(wait=0),verbose=(recovery_progress),encryption=(name=percona,keyid="/default"),extensions=[local=(entry=percona_encryption_extension_init,early_load=true,config=(cipher=AES256-CBC)),],cache_size=256M

            Hope this helped you encrypt your MongoDB data with the Percona Server for MongoDB 3.6.8-2.0 package. We will let you know when future versions support third-party key management and vaults!


            Photo by Wayne Chan on Unsplash

            by Vinodh Krishnaswamy at November 01, 2018 07:01 PM

            MariaDB Foundation

            MariaDB 10.0.37 now available

            The MariaDB Foundation is pleased to announce the availability of MariaDB 10.0.37, the latest stable release in the MariaDB 10.0 series. See the release notes and changelogs for details. Download MariaDB 10.0.37 Release Notes Changelog What is MariaDB 10.0? MariaDB APT and YUM Repository Configuration Generator Contributors to MariaDB 10.0.37 Alexander Barkov (MariaDB Corporation) Anel […]

            The post MariaDB 10.0.37 now available appeared first on MariaDB.org.

            by Ian Gilfillan at November 01, 2018 03:19 PM

            Peter Zaitsev

            How To Best Use Percona Server Column Compression With Dictionary


            Very often, database performance is affected by the inability to cache all the required data in memory. Disk IO, even when using the fastest devices, takes much more time than a memory access. With MySQL/InnoDB, the main memory cache is the InnoDB buffer pool. There are many strategies we can try to fit as much data as possible in the buffer pool, and one of them is data compression.

            With regular MySQL, to compress InnoDB data you can either use “Barracuda page compression” or “transparent page compression with punch holes”. The use of the ZFS filesystem is another possibility, but it is external to MySQL and doesn’t help with caching. All these solutions are transparent, but often they also have performance and management implications. If you are using Percona Server for MySQL, you have yet another option, “column compression with dictionary“. This feature is certainly not receiving the attention it merits. I think it is really cool—let me show you why.

            We all know what compression means; who has not zipped a file before attaching it to an email? Compression removes redundancy from a file. What about the dictionary? A compression dictionary is a way to seed the compressor with expected patterns, in order to improve the compression ratio. Because you can specify a dictionary, the scope of usefulness of the Percona Server for MySQL column compression feature is greatly increased. In the following sections, we’ll review the impact of a good dictionary, and devise a way to create a good one without any guessing.

            A simple use case

            A compression algorithm needs a minimal amount of data in order to achieve a reasonable compression ratio. Typically, if the object is below a few hundred bytes, there is rarely enough data to have repetitive patterns and when the compression header is added, the compressed data can end up larger than the original.

            mysql> select length('Hi!'), length(compress('Hi!'));
            +---------------+-------------------------+
            | length('Hi!') | length(compress('Hi!')) |
            +---------------+-------------------------+
            |             3 |                      15 |
            +---------------+-------------------------+
            1 row in set (0.02 sec)

            Compressing a string of three bytes results in a binary object of 15 bytes. That’s counterproductive.

            In order to illustrate the potential of the dictionary, I used this dataset:

            http://skeeto.s3.amazonaws.com/share/JEOPARDY_QUESTIONS1.json.gz

            It is a set of 100k Jeopardy questions written in JSON. To load the data in MySQL, I created the following table:

            mysql> show create table TestColCompression\G
            *************************** 1. row ***************************
            Table: TestColCompression
            Create Table: CREATE TABLE `TestColCompression` (
            `id` int(11) NOT NULL AUTO_INCREMENT,
            `question` text NOT NULL,
            PRIMARY KEY (`id`)
            ) ENGINE=InnoDB AUTO_INCREMENT=79977 DEFAULT CHARSET=latin1
            1 row in set (0.00 sec)

            Then, I did some formatting to create insert statements:

            zcat JEOPARDY_QUESTIONS1.json.gz | perl -p -e 's/\[\{/\{/g' | perl -p -e 's/\}, \{/\}\n\{/g' | perl -p -e "s/'/''/g" | \
              (while read line; do echo "insert into TestColCompression (question) values ('$line');"; done )

            And I executed the inserts. About 20% of the rows had some formatting issues but nevertheless, I ended up with close to 80k rows:

            mysql> show table status\G
            *************************** 1. row ***************************
            Name: TestColCompression
            Engine: InnoDB
            Version: 10
            Row_format: Dynamic
            Rows: 78110
            Avg_row_length: 316
            Data_length: 24690688
            Max_data_length: 0
            Index_length: 0
            Data_free: 4194304
            Auto_increment: 79977
            Create_time: 2018-10-26 15:16:41
            Update_time: 2018-10-26 15:40:34
            Check_time: NULL
            Collation: latin1_swedish_ci
            Checksum: NULL
            Create_options:
            Comment:
            1 row in set (0.00 sec)

            The average row length is 316 bytes for a total data size of 23.55MB. The question JSON objects are large enough to matter, but barely large enough for compression. Here are the first five rows:

            mysql> select question from TestColCompression limit 5\G
            *************************** 1. row ***************************
            question: {"category": "HISTORY", "air_date": "2004-12-31", "question": "'For the last 8 years of his life, Galileo was under house arrest for espousing this man's theory'", "value": "$200", "answer": "Copernicus", "round": "Jeopardy!", "show_number": "4680"}
            *************************** 2. row ***************************
            question: {"category": "ESPN's TOP 10 ALL-TIME ATHLETES", "air_date": "2004-12-31", "question": "'No. 2: 1912 Olympian; football star at Carlisle Indian School; 6 MLB seasons with the Reds, Giants & Braves'", "value": "$200", "answer": "Jim Thorpe", "round": "Jeopardy!", "show_number": "4680"}
            *************************** 3. row ***************************
            question: {"category": "EVERYBODY TALKS ABOUT IT...", "air_date": "2004-12-31", "question": "'The city of Yuma in this state has a record average of 4,055 hours of sunshine each year'", "value": "$200", "answer": "Arizona", "round": "Jeopardy!", "show_number": "4680"}
            *************************** 4. row ***************************
            question: {"category": "OLD FOLKS IN THEIR 30s", "air_date": "2009-05-08", "question": "'The district of conservative rep. Patrick McHenry in this state includes Mooresville, a home of NASCAR'", "value": "$800", "answer": "North Carolina", "round": "Jeopardy!", "show_number": "5690"}
            *************************** 5. row ***************************
            question: {"category": "MOVIES & TV", "air_date": "2009-05-08", "question": "'Tim Robbins played a public TV newsman in "Anchorman: The Legend of" him'", "value": "$800", "answer": "Ron Burgundy", "round": "Jeopardy!", "show_number": "5690"}

            Let’s begin with straight column compression, without specifying a dictionary:

            mysql> alter table TestColCompression modify question text COLUMN_FORMAT COMPRESSED;
            Query OK, 79976 rows affected (4.25 sec)
            Records: 79976 Duplicates: 0 Warnings: 0
            mysql> analyze table TestColCompression;
            +----------------------------+---------+----------+----------+
            | Table | Op | Msg_type | Msg_text |
            +----------------------------+---------+----------+----------+
            | colcomp.TestColCompression | analyze | status | OK |
            +----------------------------+---------+----------+----------+
            mysql> show table status\G
            *************************** 1. row ***************************
            Name: TestColCompression
            Engine: InnoDB
            Version: 10
            Row_format: Dynamic
            Rows: 78995
            Avg_row_length: 259
            Data_length: 20496384
            Max_data_length: 0
            Index_length: 0
            Data_free: 4194304
            Auto_increment: 79977
            Create_time: 2018-10-26 15:47:56
            Update_time: 2018-10-26 15:47:56
            Check_time: NULL
            Collation: latin1_swedish_ci
            Checksum: NULL
            Create_options:
            Comment:
            1 row in set (0.00 sec)

            As expected, the data didn’t compress much. The compression ratio is 0.82 or, expressed as a percentage, an 18% reduction. Since the JSON headers are always the same and are present in all questions, we should at a minimum use them for the dictionary. Trying a minimal dictionary made of the headers gives:

            mysql> SET @dictionary_data = 'category' 'air_date' 'question' 'value' 'answer' 'round' 'show_number' ;
            Query OK, 0 rows affected (0.01 sec)
            mysql> CREATE COMPRESSION_DICTIONARY simple_dictionary (@dictionary_data);
            Query OK, 0 rows affected (0.00 sec)
            mysql> alter table TestColCompression modify question text COLUMN_FORMAT COMPRESSED WITH COMPRESSION_DICTIONARY simple_dictionary;
            Query OK, 79976 rows affected (4.72 sec)
            Records: 79976 Duplicates: 0 Warnings: 0
            mysql> analyze table TestColCompression;
            +----------------------------+---------+----------+----------+
            | Table | Op | Msg_type | Msg_text |
            +----------------------------+---------+----------+----------+
            | colcomp.TestColCompression | analyze | status | OK |
            +----------------------------+---------+----------+----------+
            1 row in set (0.00 sec)
            mysql> show table status\G
            *************************** 1. row ***************************
            Name: TestColCompression
            Engine: InnoDB
            Version: 10
            Row_format: Dynamic
            Rows: 78786
            Avg_row_length: 246
            Data_length: 19447808
            Max_data_length: 0
            Index_length: 0
            Data_free: 4194304
            Auto_increment: 79977
            Create_time: 2018-10-26 17:58:17
            Update_time: 2018-10-26 17:58:17
            Check_time: NULL
            Collation: latin1_swedish_ci
            Checksum: NULL
            Create_options:
            Comment:
            1 row in set (0.00 sec)

            There is a little progress: we now have a compression ratio of 0.79. Obviously, we could do more, but without a tool we’d have to guess. A compressor like zlib builds a dictionary as part of its compression effort; could we use that? Yes, but only if we can generate it correctly and access the result. That’s not readily available with the common compressors I know. Fortunately, someone else had the same issue and wrote a compressor able to save its dictionary. Let me introduce femtozip.

            Femtozip to the rescue

            The tool, by itself, has no magic algorithm. It is based on zlib, from what I can understand from the code. Anyway, we won’t compress anything with it; we’ll use it to generate a good dictionary. In order to create a dictionary, the tool looks at a set of files and tries to find patterns between them. The use of a single big file would defeat the purpose. So, I generated one file per question with:

            mkdir questions
            cd questions
            l=1; mysql -u blog -pblog colcomp -e 'select question from TestColCompression' | (while read line; do echo $line > ${l}; let l=l+1; done)

            Then, I used the following command to generate a 1024 bytes dictionary using all the files starting by “1”:

            ../femtozip/cpp/fzip/src/fzip --model ../questions_1s.mod --build --dictonly --maxdict 1024 1*
            Building dictionary...

            In about 10s the job was done. I tried with all the 80k files and… I had to kill the process after thirty minutes. Anyway, there are 11111 files starting with “1”, a very decent sample. Our generated dictionary looks like:

            cat ../questions_1s.mod
            ", "air_date", "round": "Double Jeopardy!", "show_number": " of this for 00", "answer": "the 0", "question": "'e", "round": "Jeopardy!", "show_number": "r", "round": "{"cate gory": "S", "air_date": "1998-s", "round": "Double Jeopardy!", "show_number": " of the ", "air_date": "2008-{"category": "THE {"category": "As", "round": "Jeopardy!", "show_number": "4", "question": "'Jeopardy!", "show_number": "2'", "value": "$1000", "answer": "7", "question": "'The ", "question": "'A'", "value": "$600", "answer": "9", "questi on": "'In ", "question": "'This 3", "question": "'2", "question": "'e'", "value": "$", "round": "Double Jeopardy!", "show_number": "4", "round": "Jeopardy!", "show_number": "4"'", "value": "$S", "air_date": "199", "round": "Double Jeopardy!", "show_number": "5s'", "value": "$", "round": "Double Jeopardy!", "show_number": "3", "round": "Jeopardy !", "show_number": "3", "round": "Jeopardy!", "show_number": "5'", "value": "$200", "answer": "'", "value": "$800", "answer": "'", "value": "$400", "answer": "

            With some formatting, I was able to create a dictionary with the above data:

            mysql> SET @dictionary_data = '", "air_date", "round": "Double Jeopardy!", "show_number": " of this for 00", "answer": "the 0", "question": "''e", "round": "Jeopardy!", "show_number": "r", "round": "{"category": "S", "air_date": "1998-s", "round": "Double Jeopardy!", "show_number": " of the ", "air_date": "2008-{"category": "THE {"category": "As", "round": "Jeopardy!", "show_number": "4", "question": "''Jeopardy!", "show_number": "2''", "value": "$1000", "answer": "7", "question": "''The ", "question": "''A''", "value": "$600", "answer": "9", "question": "''In ", "question": "''This 3", "question": "''2", "question": "''e''", "value": "$", "round": "Double Jeopardy!", "show_number": "4", "round": "Jeopardy!", "show_number": "4"''", "value": "$S", "air_date": "199", "round": "Double Jeopardy!", "show_number": "5s''", "value": "$", "round": "Double Jeopardy!", "show_number": "3", "round": "Jeopardy!", "show_number": "3", "round": "Jeopardy!", "show_number": "5''", "value": "$200", "answer": "''", "value": "$800", "answer": "''", "value": "$400", "answer": "' ;
            Query OK, 0 rows affected (0.00 sec)
            mysql> CREATE COMPRESSION_DICTIONARY femtozip_dictionary (@dictionary_data);
            Query OK, 0 rows affected (0.00 sec)
            And then, I altered the table to use the new dictionary:

            mysql> alter table TestColCompression modify question text COLUMN_FORMAT COMPRESSED WITH COMPRESSION_DICTIONARY femtozip_dictionary;
            Query OK, 79976 rows affected (4.05 sec)
            Records: 79976 Duplicates: 0 Warnings: 0
            mysql> analyze table TestColCompression;
            +----------------------------+---------+----------+----------+
            | Table | Op | Msg_type | Msg_text |
            +----------------------------+---------+----------+----------+
            | colcomp.TestColCompression | analyze | status | OK |
            +----------------------------+---------+----------+----------+
            1 row in set (0.00 sec)
            mysql> show table status\G
            *************************** 1. row ***************************
            Name: TestColCompression
            Engine: InnoDB
            Version: 10
            Row_format: Dynamic
            Rows: 79861
            Avg_row_length: 190
            Data_length: 15220736
            Max_data_length: 0
            Index_length: 0
            Data_free: 4194304
            Auto_increment: 79977
            Create_time: 2018-10-26 17:56:09
            Update_time: 2018-10-26 17:56:09
            Check_time: NULL
            Collation: latin1_swedish_ci
            Checksum: NULL
            Create_options:
            Comment:
            1 row in set (0.00 sec)

            That’s interesting: we are now achieving a ratio of 0.61, a significant improvement. I pushed my luck and tried with a 2048-byte dictionary. It further reduced the ratio to 0.57, but that was about the best I got; larger dictionaries didn’t lower the ratio below 0.57. Zlib supports dictionaries of up to 32KB.
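
            As a side note on how the ratio can be tracked: Data_length is also exposed through information_schema, so a before/after comparison is easy to script. A minimal sketch; divide the value measured after the ALTER by the one recorded before compression:

            mysql> SELECT data_length FROM information_schema.tables
                -> WHERE table_schema = 'colcomp' AND table_name = 'TestColCompression';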

            So, to recap:

            • column compression without dictionary, ratio of 0.82
            • column compression with simple dictionary, ratio of 0.79
            • column compression with a 1k dictionary from femtozip, ratio of 0.61
            • column compression with a 2k dictionary from femtozip, ratio of 0.57

            The above example stores a JSON document in a text column. MySQL 5.7 includes a JSON datatype which behaves a bit differently regarding the dictionary: delimiting characters like ‘{}’ are removed in the on-disk representation of a JSON column. If you have TBs of data in similar tables, you should really consider column compression and a systematic way of determining the dictionary with femtozip. In addition to improving the compression, it is likely to be the least performance-impacting solution. Would it be interesting to generate a dictionary from existing data with a command like this one?

            CREATE COMPRESSION_DICTIONARY_FROM_DATA A_good_dictionary (2048, select question from TestColCompression limit 10000);

            where the dictionary creation process would implicitly include steps similar to the ones I performed with femtozip.

            by Yves Trudeau at November 01, 2018 02:04 PM

            October 31, 2018

            Peter Zaitsev

            Percona Server for MongoDB 3.6.8-2.0 Is Now Available

            Percona announces the release of Percona Server for MongoDB 3.6.8-2.0 on October 31, 2018. Download the latest version from the Percona website or the Percona Software Repositories.

            Percona Server for MongoDB is an enhanced, open source, and highly-scalable database that is a fully-compatible, drop-in replacement for MongoDB 3.6 Community Edition. It supports MongoDB 3.6 protocols and drivers.

            Percona Server for MongoDB extends Community Edition functionality by including the Percona Memory Engine storage engine, as well as several enterprise-grade features. Also, it includes MongoRocks storage engine, which is now deprecated. Percona Server for MongoDB requires no changes to MongoDB applications or code.

            This release introduces data at rest encryption for the WiredTiger storage engine. Data at rest encryption for WiredTiger in Percona Server for MongoDB is compatible with the upstream implementation. In this release of Percona Server for MongoDB, this feature is of BETA quality and should not be used in a production environment.
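
            For the curious, enabling the feature looks roughly like the upstream mongod options. This is a hedged sketch; check the release documentation for the exact option names and supported cipher modes:

            # generate a 32-byte key, then start mongod with WiredTiger encryption (beta)
            $ openssl rand -base64 32 > /data/key/mongodb.key
            $ chmod 600 /data/key/mongodb.key
            $ mongod --enableEncryption --encryptionKeyFile /data/key/mongodb.key \
                --encryptionCipherMode AES256-CBC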

            Note that Percona Server for MongoDB 3.6.8-2.0 is based on MongoDB 3.6.8 which is distributed under the GNU AGPLv3 license. Subsequent releases of Percona Server for MongoDB will change to the SSPL license when we move to the SSPL codebase released by MongoDB. For more information, see Percona Statement on MongoDB Community Server License Change.

            This release also contains a fix for bug PSMDB-238.

            Known Issues

            • PSMDB-233: When starting Percona Server for MongoDB 3.6 with WiredTiger encryption options but using a different storage engine, the server starts normally and produces no warnings that these options are ignored.
            • PSMDB-239: WiredTiger encryption is not disabled when using the Percona Memory Engine storage engine.
            • PSMDB-245: KeyDB’s WiredTiger logs are not properly rotated without restarting the server.

            The Percona Server for MongoDB 3.6.8-2.0 release notes are available in the official documentation.

            by Borys Belinsky at October 31, 2018 05:25 PM

            MariaDB Foundation

            MariaDB Galera Cluster 5.5.62 now available

            The MariaDB Foundation is pleased to announce the availability of MariaDB 5.5.62, the latest stable release in the MariaDB Galera Cluster 5.5 series. See the release notes and changelogs for details. Download MariaDB Galera Cluster 5.5.62 Release Notes Changelog What is MariaDB Galera Cluster? Contributors to MariaDB Galera Cluster 5.5.62 Alexander Barkov (MariaDB Corporation) Daniel […]

            The post MariaDB Galera Cluster 5.5.62 now available appeared first on MariaDB.org.

            by Ian Gilfillan at October 31, 2018 03:06 PM

            Peter Zaitsev

            Percona Server for MySQL 8.0 Delivers Increased Reliability, Performance and Security

            Percona released a Release Candidate (RC) version of Percona Server for MySQL 8.0, the company’s free, enhanced, drop-in replacement for MySQL Community Edition. Percona Server for MySQL 8.0 includes all the features of MySQL Community Edition 8.0, along with enterprise-class features from Percona that make it ideal for enterprise production environments. The latest release offers increased reliability, performance and security.

            Percona Server for MySQL 8.0 General Availability (GA) will be available later this year. You can learn how to install the release candidate software here. Please note this release candidate is neither intended nor recommended for production environments.

            MySQL databases remain a pillar of enterprise data centers. But as the amount of data collected continues to soar, and as the types of databases deployed for specific applications continue to expand, organizations require a powerful and cost-effective MySQL solution for their production environments. Percona meets this need with its mature, proven open source alternative to MySQL Community Edition. Percona also backs MySQL 8.0 with the support and consulting services enterprises need to achieve optimal performance and maximize the value they obtain from the software – all with lower cost and complexity.

            With more than 4,550,000 downloads, Percona Server for MySQL offers self-tuning algorithms and support for extremely high-performance hardware, delivering excellent performance and reliability for production environments. Percona Server for MySQL is trusted by thousands of enterprises to provide better performance. The following capabilities are unique to Percona Server for MySQL 8.0:

            • Greater scalability and availability, enhanced backups, and increased visibility to improve performance, reliability and usability
            • Parallel doublewrite functionality for greatly improved write performance, resulting in increased speed
            • Additional write-optimized storage engine, MyRocks, which takes advantage of modern hardware and database technology to reduce storage requirements and maintenance costs, and to increase ROI in both on-premises and cloud-based applications, delivered with a MySQL-compatible interface.
            • Enhanced encryption functionality – including integration with Hashicorp Vault to simplify the management of encryption keys – for increased security
            • Advanced PAM-based authentication, audit logging, and threadpool scalability – enterprise-grade features available in Percona Server for MySQL without a commercial license

            Percona Server for MySQL 8.0 also contains all the new features introduced in MySQL Community Edition 8.0, including:

            • Greater Reliability – A new transactional data dictionary makes recovery from failure easier, providing users with a higher level of comfort that data will not be lost. The transactional data dictionary is now crash-safe and centralized. In addition, support for atomic Data Definition Language (DDL) statements ensures that all operations associated with a DDL transaction are committed or rejected as a unit.
            • Enhanced Performance – New functions and expressions (along with Percona’s parallel doublewrite buffer) improve overall performance, allowing users to see their data more quickly. Enhanced JSON functionality improves document storage and query capabilities, and the addition of Window functions provides greater flexibility for aggregation queries. Common Table Expressions (CTE) enable improved query syntax for complex queries.
            • Increased Security – The ability to collect a typical series of permission grant statements for a user into a defined role and then apply that role to a user in MySQL makes the database environment more secure from outside attack by allowing administrators to better manage access permissions for authorized users. SQL Roles also enable administrators to more easily and efficiently assign a set of privileges to multiple associated users, saving time and reducing errors (see the sketch after this list).
            • Expanded Queries – Percona Server for MySQL 8.0 provides support for spatial data types and indexes, as well as for Spatial Reference System (SRS), the industry-standard method for geospatial lookup.
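
            To give a flavour of the SQL Roles feature flagged in the list above, here is a minimal sketch with hypothetical names (appdb, app_read, alice):

            mysql> CREATE ROLE app_read;
            mysql> GRANT SELECT ON appdb.* TO app_read;
            mysql> CREATE USER 'alice'@'%' IDENTIFIED BY 'S3cretPass!';
            mysql> GRANT app_read TO 'alice'@'%';
            mysql> SET DEFAULT ROLE app_read TO 'alice'@'%';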

            Learn more about the Percona Server for MySQL 8.0 RC release here.

            by Tyler Duzan at October 31, 2018 07:57 AM

            Percona Server for MongoDB 4.0 Is an Ideal Solution for Even More Production Use Cases

            Percona announces the planned release of Percona Server for MongoDB 4.0, the latest version of the company’s free, enhanced, drop-in replacement for MongoDB Community Edition. Percona Server for MongoDB 4.0 includes all the features of MongoDB Community Edition 4.0, along with enterprise-class features from Percona that make it ideal for enterprise production environments.

            Percona Server for MongoDB 4.0 will be generally available later this year.

            Organizations are increasingly leveraging multiple open source databases to meet their diverse application requirements and improve the customer experience. Percona supports these organizations with a range of enhanced, production-ready open source databases, enterprise-grade support, and consulting services.

            With more than 385,000 downloads, Percona Server for MongoDB provides all the cost and agility benefits of free, proven open source software, along with greater security, reliability and flexibility. With Percona Server for MongoDB, an increasing number of organizations can confidently run the document-based NoSQL MongoDB database to support their product catalogs, online shopping carts, Internet of Things (IoT) applications, mobile/social apps and more. Percona also backs Percona Server for MongoDB 4.0 and MongoDB Community Edition with the support and consulting services enterprises need to achieve optimal performance and maximize the value they obtain from the software – all with lower cost and complexity.

            Percona Kubernetes Operator for MongoDB enables easier deployment of Percona Server for MongoDB environments – including standalone, replica set, or sharded cluster – inside Kubernetes and OpenShift platforms, without the need to move to an Enterprise Server. Management and backup are provided through the Operator working with Percona Monitoring and Management and hot backup.

            Percona Server for MongoDB 4.0 contains all of the features in MongoDB Community Edition 4.0, including important capabilities that greatly expand its use cases:

            • Support for multi-document ACID transactions – ensuring accurate updates to all of the documents involved in a transaction, and moving the complexity of achieving these updates from the application to the database (see the sketch after this list).
            • Support for SCRAM-SHA-256 authentication – making the production environment more secure and less vulnerable to external attack.
            • New type conversion and string operators – providing greater flexibility when performing aggregations.
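
            As a sketch of the multi-document transactions mentioned in the first bullet, here is roughly what the 4.0 mongo shell API looks like. The shop database and its collections are hypothetical, must already exist, and the deployment must be a replica set:

            session = db.getMongo().startSession();
            session.startTransaction();
            try {
                session.getDatabase("shop").orders.insertOne({ sku: "A1", qty: 1 });
                session.getDatabase("shop").stock.updateOne({ sku: "A1" }, { $inc: { qty: -1 } });
                session.commitTransaction();
            } catch (e) {
                session.abortTransaction();
                throw e;
            }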

            Percona Server for MongoDB 4.0 also offers essential enterprise features for free, including:

            • Encrypted WiredTiger storage engine (“data at rest encryption”) with local key management. Integration with key management systems will be available in future releases of Percona Server for MongoDB.
            • SASL Authentication plugin for enabling authentication through OpenLDAP or Active Directory.
            • Open-source auditing for visibility into user and process actions in the database, with the ability to redact sensitive information (such as usernames and IP addresses) from log files.
            • Hot backups to protect against data loss in the event of a crash or disaster – backup activity does not impact performance.
            • Percona Memory Engine, a 100 percent open source in-memory storage engine designed for Percona Server for MongoDB, which is ideal for in-memory computing and other applications demanding very low-latency workloads.
            • Integration with Percona Toolkit and Percona Monitoring and Management for query performance analytics and troubleshooting.
            • Enhanced query profiling.

            by Vadim Tkachenko at October 31, 2018 07:55 AM

            Percona XtraBackup 8.0-3-rc1 Is Available

            Percona is glad to announce the release candidate of Percona XtraBackup 8.0-3-rc1 on October 31, 2018. You can download it from our download site and apt and yum repositories.

            This is a Release Candidate quality release and it is not intended for production. If you want a high quality, Generally Available release, use the current stable version (the most recent stable version at the time of writing is 2.4.12 in the 2.4 series).

            This release supports backing up and restoring MySQL 8.0 and Percona Server for MySQL 8.0.

            Things to Note

            • innobackupex was previously deprecated and has been removed
            • Due to the new MySQL redo log and data dictionary formats, the Percona XtraBackup 8.0.x versions will only be compatible with MySQL 8.0.x and the upcoming Percona Server for MySQL 8.0.x
            • For experimental migrations from earlier database server versions, you will need to back up and restore using XtraBackup 2.4 and then run mysql_upgrade from MySQL 8.0.x

            Installation

            As this is a release candidate, installation is performed by enabling the testing repository and installing the software via your package manager. For Debian based distributions see apt installation instructions, for RPM based distributions see yum installation instructions. Note that in both cases after installing the current percona-release package, you’ll need to enable the testing repository in order to install Percona XtraBackup 8.0.3-rc1.

            Improvements

            • PXB-1655: The --lock-ddl option is supported when backing up MySQL 8 (see the sketch below)
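
            A minimal sketch of a full backup invocation with this option; paths and credentials are placeholders:

            $ xtrabackup --backup --user=root --password=s3cret \
                --lock-ddl --target-dir=/data/backups/full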

            Bugs Fixed

            • PXB-1678:  Incremental backup prepare run with the --apply-log-only option could roll back uncommitted transactions.
            • PXB-1672:  The MTS slave without GTID could be backed up when the --safe-slave-backup option was applied.

            by Borys Belinsky at October 31, 2018 07:55 AM

            Release Candidate for Percona Server 8.0.12-2rc1 Is Available

            Following the alpha release announced earlier, Percona announces the release candidate of Percona Server for MySQL 8.0.12-2rc1 on October 31, 2018. Download the latest version from the Percona website or from the Percona Software Repositories.

            This release is based on MySQL 8.0.12 and includes all the bug fixes in it. It is a Release Candidate quality release and it is not intended for production. If you want a high quality, Generally Available release, use the current Stable version (the most recent stable release at the time of writing in the 5.7 series is 5.7.23-23).

            Percona provides completely open-source and free software.

            Installation

            As this is a release candidate, installation is performed by enabling the testing repository and installing the software via your package manager.  For Debian based distributions see apt installation instructions, for RPM based distributions see yum installation instructions.  Note that in both cases after installing the current percona-release package, you’ll need to enable the testing repository in order to install Percona Server for MySQL 8.0.12-2rc1.  For manual installations you can download from the testing repository directly through our website.

            New Features

            • #4550: Native Partitioning support for MyRocks storage engine
            • #3911: Native Partitioning support for TokuDB storage engine
            • #4946: Add an option to prevent implicit creation of column family in MyRocks
            • #4839: Better default configuration for MyRocks and TokuDB
            • InnoDB changed page tracking has been rewritten to account for redo logging changes in MySQL 8.0.11.  This fixes fast incremental backups for PS 8.0
            • #4434: TokuDB ROW_FORMAT clause has been removed, compression may be set by using the session variable tokudb_row_format instead.

            Improvements

            • Several packaging changes to bring Percona packages more in line with upstream, including split repositories. As you’ll note from our instructions above we now ship a tool with our release packages to help manage this.

            Bugs Fixed

            • #4785: Setting version_suffix to NULL could lead to handle_fatal_signal (sig=11) in Sys_var_version::global_value_ptr
            • #4788: Setting log_slow_verbosity and enabling the slow_query_log could lead to a server crash
            • #4947: Any index comment generated a new column family in MyRocks
            • #1107: Binlog could be corrupted when tmpdir got full
            • #1549: Server side prepared statements lead to a potential off-by-second timestamp on slaves
            • #4937: rocksdb_update_cf_options was useless when specified in my.cnf or on command line.
            • #4705: The server could crash on snapshot size check in RocksDB
            • #4791: SQL injection on slave due to non-quoting in binlogged ROLLBACK TO SAVEPOINT
            • #4953: rocksdb.truncate_table3 was unstable

            Other bugs fixed:

            • #4811: 5.7 Merge and fixup for old DB-937 introduces possible regression
            • #4885: Using ALTER … ROW_FORMAT=TOKUDB_QUICKLZ leads to InnoDB: Assertion failure: ha_innodb.cc:12198:m_form->s->row_type == m_create_info->row_type
            • Numerous testsuite failures/crashes

            by Borys Belinsky at October 31, 2018 07:50 AM

            October 30, 2018

            Peter Zaitsev

            20+ MongoDB Alternatives You Should Know About

            As MongoDB® has changed their license from AGPL to SSPL, many are concerned by this change, and by how sudden it has been. Will SSPL be protective enough for MongoDB, or will the next change be a move to an altogether proprietary license? According to our poll, many are going to explore MongoDB alternatives. This blog post provides a brief outline of technologies to consider.

            Open Source Data Stores

            • PostgreSQL is the darling of the open source database community. Especially if your concern is the license, PostgreSQL’s permissive licence is hard to beat. PostgreSQL has powerful JSON support, and there are many success stories of migrating from MongoDB to PostgreSQL.
            • Citus – While PostgreSQL is a powerful database, and you can store terabytes of data on a single cluster, at a larger scale you will need sharding. If so, consider the Citus PostgreSQL extension, or the DBaaS offering from the same guys.
            • TimescaleDB – If on the other hand you are storing time series data in MongoDB, then TimescaleDB might be a good fit.
            • ToroDB – If you would love to use PostgreSQL but need MongoDB wire protocol compatibility, take a look at ToroDB. While it can’t serve as a full drop-in replacement for MongoDB server just yet, the developer told me that with some work it is possible.
            • CockroachDB – While not based on the PostgreSQL codebase, CockroachDB is PostgreSQL wire protocol compatible and it is natively distributed, so you will not need to do manual sharding.
            • MySQL® is another feasible replacement. MySQL 5.7 and MySQL 8 have great support for JSON, and it continues to get better with every maintenance release. You can also consider MySQL Cluster for medium size sharded environments, as well as MariaDB and Percona Server for MySQL.
            • MySQL DocStore is a CRUD interface for JSON data stored in MySQL, and while it is not the same as MongoDB’s query language, it is much easier to transition to compared to SQL.
            • Vitess – Would you love to use MySQL but can’t stand manual sharding? Vitess is a powerful sharding engine for MySQL which will allow you to grow to great scale while using proven MySQL as a backend.
            • TiDB is another take on MySQL compatible sharding. This NewSQL engine is MySQL wire protocol compatible but underneath is a distributed database designed from the ground up.
            • CouchDB is a document database which speaks JSON natively.
            • CouchBase is another database engine to consider. While being a document based database, CouchBase offers the N1QL language which has an SQL look and feel.
            • ArangoDB is a multi-model database which can be used as a document store.
            • Elastic – While not a perfect choice for every MongoDB workload, for workloads where document data is searched and analyzed ElasticSearch can be a great alternative.
            • Redis is another contender for some MongoDB workloads. Often used as a cache in front of MongoDB, it can also be used as a JSON store through extensions. While such extensions from RedisLabs are no longer open source, the GoodForm project provides open source alternatives.
            • ClickHouse may be a great contender for moving analytical workloads from MongoDB. Much faster, and with JSON support and nested data structures, it can be a great choice for storing and analyzing document data.
            • Cassandra does not have a document data model, but it has proven to be extremely successful for building scalable distributed clusters. If this is your main use case for MongoDB, then you should consider Cassandra.
            • ScyllaDB is a protocol compatible Cassandra alternative which claims to offer much higher per node performance.
            • HBase is another option worth considering, especially if you already have a Hadoop/HDFS infrastructure.

            Public Cloud Document Stores

            Most major cloud providers offer some variant of a native document database for you to consider.

            • Microsoft Azure Cosmos DB is an interesting engine that provides multiple NoSQL APIs, including for MongoDB and Cassandra.
            • Amazon DynamoDB supports key value and document based APIs. While not offering MongoDB compatibility, DynamoDB has been around for a long time, and is the most battle tested of the public cloud database offerings.
            • Google Cloud DataStore  – Google Cloud offers a number of data storage options for you to consider, and Cloud DataStore offers a data model and query language that is the most similar to MongoDB.

            If you’re not ready for a major migration effort, there is one more solution for you – Percona Server for MongoDB.  Based on MongoDB Community, and enhanced by Percona with Enterprise Features, Percona Server for MongoDB offers 100% compatibility. As we wrote in a previous post, we commit to shipping a supported AGPL version until the situation around SSPL is clearly resolved.

            Want help on deciding what is the best option for you, or with migration heavy lifting? Percona Professional Services can help!

            Have an idea for another feasible MongoDB alternative? Please comment, and I will consider adding it to the list!


            Image by JOSHUA COLEMAN on Unsplash

            by Peter Zaitsev at October 30, 2018 02:52 PM

            PostgreSQL locking, part 3: lightweight locks

            PostgreSQL lightweight locks, or LWLocks, control access to shared memory structures. PostgreSQL uses a multi-process architecture and must allow only consistent reads and writes to shared memory. LWLocks have two levels of locking: shared and exclusive. It’s also possible to release all acquired LWLocks to simplify clean up. Other databases often call primitives similar to LWLocks “latches”. Because LWLocks are an implementation detail, application developers shouldn’t pay much attention to this kind of locking.

            This is the third and final part of a series on PostgreSQL locking, related to latches protecting internal database structures. Here are the previous parts: Row-level locks and table-level locks.

            Instrumentation

            Starting from PostgreSQL 9.6, LWLocks activity can be investigated with the pg_stat_activity system view. It could be useful under high CPU utilization. There are system settings to help with contention on specific lightweight locks.
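
            For example, to see which lightweight locks sessions are currently waiting on (PostgreSQL 9.6 and later):

            SELECT wait_event, count(*)
              FROM pg_stat_activity
             WHERE wait_event_type = 'LWLock'
             GROUP BY wait_event
             ORDER BY count(*) DESC;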

            Before PostgreSQL 9.5, the LWLocks implementation used spin-locks. It was a bottleneck. This was fixed in 9.5 with an atomic state variable.

            Potential heavy contention places

            • WALInsertLock: protects WAL buffers. You can increase the number of WAL buffers to get a slight improvement. Incidentally, synchronous_commit=off increases pressure on the lock even more, but it’s not a bad thing. full_page_writes=off reduces contention, but it’s generally not recommended.
            • WALWriteLock: acquired by PostgreSQL processes while WAL records are flushed to disk or during a WAL segment switch. synchronous_commit=off removes the wait for disk flush, full_page_writes=off reduces the amount of data to flush.
            • LockMgrLock: appears in top waits during a read-only workload. It latches relations regardless of their size. It’s not a single lock, but consists of at least 16 partitions. Thus it’s important to use multiple tables during benchmarks and avoid the single-table anti-pattern in production.
            • ProcArrayLock: protects the ProcArray structure. Before PostgreSQL 9.0, every transaction acquired this lock exclusively before commit.
            • CLogControlLock: protects the CLogControl structure. If it shows at the top of pg_stat_activity, you should check the location of $PGDATA/pg_clog – it should be on a buffered file system.
            • SInvalidReadLock: protects the sinval array. Readers use a shared lock; SICleanupQueue, and other array-wide updates, require an exclusive lock. It shows at the top of pg_stat_activity when the shared buffer pool is under stress. Using a higher number of shared_buffers helps to reduce contention.
            • BufMappingLocks: protect regions of buffers. There are 128 regions (16 before 9.5) of buffers to handle the whole buffer cache.

            Spinlocks

            The lowest level of locking is the spinlock, which is implemented with CPU-specific atomic instructions. PostgreSQL tries to change an atomic variable value in a loop. If the value is changed from zero to one, the process has obtained the spinlock. If it’s not possible to get the spinlock immediately, the process will increase its wait delay exponentially. There is no monitoring of spinlocks and it’s not possible to release all acquired spinlocks at once. Due to the single state change, a spinlock is also always an exclusive lock. To simplify the porting of PostgreSQL to exotic CPU and OS variants, PostgreSQL can fall back to OS semaphores for its spinlock implementation, which is of course significantly slower than the native CPU instruction ports.

            Summary

            • Use pg_stat_activity to find which queries or LWLocks are causing lock waits
            • Use recent releases of PostgreSQL, as developers have been working on performance improvements and trying to reduce locking contention on hot mutexes.

            Locks Photo by Warren Sammut on Unsplash

            by Nickolay Ihalainen at October 30, 2018 01:31 PM

            Jean-Jerome Schmidt

            SQL Firewalling Made Easy with ClusterControl & ProxySQL

            Reading the title of this blog post may raise some questions. SQL firewall - what is that? What does it do? Why would I need something like that in the first place? Well, the ability to block certain queries could come in handy in certain situations. When using ProxySQL in front of your database servers, the proxy is able to inspect all SQL statements being sent. ProxySQL has a sophisticated rules engine, and can match queries that are to be allowed, blocked, re-written on the fly or routed to a specific database server. Let’s go through some examples.

            You have a dedicated slave which is used by developers to test their queries against production data. You want to make sure the developers can only connect to that particular host and execute only SELECT queries.

            Another case: let’s say that you encountered one too many accidents with people running schema changes, and you would like to limit which users can execute ALTER statements.

            Finally, let’s think about a paranoid approach in which users are allowed to execute just a pre-defined whitelisted set of queries.

            In our environment we have a replication setup with a master and two slaves.

            In front of our databases, we have three ProxySQL nodes with Keepalived managing a Virtual IP. We also have a ProxySQL cluster configured (as explained in this previous blog) so we don’t have to worry about making configuration or query rule changes three times on all three ProxySQL nodes. For the query rules, a simple read-write split is set up:

            Let’s take a look at how ProxySQL, with its extensive query rules mechanism, can help us to achieve our goals in all those three cases.

            Locking user access to a single hostgroup

            A dedicated slave available to developers - this is not an uncommon practice. As long as your developers can access production data (and if they are not allowed, e.g., due to compliance reasons, data masking as explained in our ProxySQL tutorial may help), this can help them to test and optimize queries on a real-world data set. It may also help to verify data before executing some of the schema changes. For example, is my column really unique before adding a unique index?

            With ProxySQL it is fairly easy to restrict access. For starters, let’s assume that the hostgroup 30 contains the slave we want developers to access.

            We need a user which will be used by the developers to access that slave. If you have it already in ProxySQL, that’s fine. If not, you may either need to import it into ProxySQL (if it is created in MySQL but not in ProxySQL) or create it in both locations (if you’ll be creating a new user). Let’s go with the latter option, creating a new user.

            Let’s create a new user with limited privileges on both MySQL and ProxySQL. We will use it in query rules to identify traffic coming from the developers.

            In this query rule we are going to redirect all of the queries which are executed by the dev_test user to hostgroup 30. We want this rule to be active and it should be the final one to parse, therefore we enabled ‘Apply’. We also configured the Rule ID to be smaller than the ID of the first existing rule, as we want this rule to be evaluated before the regular read/write split setup.
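
            For reference, the equivalent rule expressed directly against the ProxySQL admin interface would look roughly like this sketch; rule_id 5 is an arbitrary value lower than the existing read/write split rules:

            INSERT INTO mysql_query_rules (rule_id, active, username, destination_hostgroup, apply)
            VALUES (5, 1, 'dev_test', 30, 1);
            LOAD MYSQL QUERY RULES TO RUNTIME;
            SAVE MYSQL QUERY RULES TO DISK;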

            As you can see, we used a username, but there are also other options.

            If you can predict which development hosts will send the traffic to the database (for example, you can have developers use a specific proxy before they can reach the database), you can also use the “Client Address” option to match queries executed by that single host and redirect them to a correct hostgroup.

            Disallowing user from executing certain queries

            Now, let’s consider the case where we want to limit execution of some particular commands to a given user. This could be handy to ensure that the right people can run some of the performance-impacting queries like schema changes. ALTER will be the query which we will use in this example. For starters, let’s add a new user which will be allowed to run schema changes. We will call it ‘admin_user’. Next, we need to create the required query rules.

            We will create a query rule which uses the ‘.*ALTER TABLE.*’ regular expression to match the queries. This query rule should be executed before other, read/write split rules. We assigned a rule ID of 20 to it. We define an error message that will be returned to the client in case this query rule is triggered. Once done, we proceed to another query rule.

            Here we use the same regular expression to catch the query, but we don’t define any error text (which means that the query will not return an error). We also define which user is allowed to execute it (admin_user in our case). We make sure this rule is checked before the previous one, so we assigned a lower rule ID of 19 to it.
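
            Expressed against the ProxySQL admin interface, this pair of rules would look roughly like the following sketch of what ClusterControl configures for you:

            -- rule 19: admin_user may run ALTER TABLE, no error defined
            INSERT INTO mysql_query_rules (rule_id, active, username, match_pattern, apply)
            VALUES (19, 1, 'admin_user', '.*ALTER TABLE.*', 1);
            -- rule 20: everyone else gets an error message back
            INSERT INTO mysql_query_rules (rule_id, active, match_pattern, error_msg, apply)
            VALUES (20, 1, '.*ALTER TABLE.*', 'You are not allowed to execute ALTER', 1);
            LOAD MYSQL QUERY RULES TO RUNTIME;
            SAVE MYSQL QUERY RULES TO DISK;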

            Once these two query rules are in place, we can test how they work. Let’s try to log in as an application user and run an ALTER TABLE query:

            root@vagrant:~# mysql -P6033 -usbtest -ppass -h10.0.0.111
            mysql: [Warning] Using a password on the command line interface can be insecure.
            Welcome to the MySQL monitor.  Commands end with ; or \g.
            Your MySQL connection id is 43160
            Server version: 5.5.30 (ProxySQL)
            
            Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved.
            
            Oracle is a registered trademark of Oracle Corporation and/or its
            affiliates. Other names may be trademarks of their respective
            owners.
            
            Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
            
            mysql> use sbtest;
            Reading table information for completion of table and column names
            You can turn off this feature to get a quicker startup with -A
            
            Database changed
            mysql> alter table sbtest1 add index (pad);
            ERROR 1148 (42000): You are not allowed to execute ALTER
            mysql> ^DBye

            As expected, we couldn’t execute this query and we received an error message. Let’s now try to connect using our ‘admin_user’:

            root@vagrant:~# mysql -P6033 -uadmin_user -ppass -h10.0.0.111
            mysql: [Warning] Using a password on the command line interface can be insecure.
            Welcome to the MySQL monitor.  Commands end with ; or \g.
            Your MySQL connection id is 43180
            Server version: 5.5.30 (ProxySQL)
            
            Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved.
            
            Oracle is a registered trademark of Oracle Corporation and/or its
            affiliates. Other names may be trademarks of their respective
            owners.
            
            Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
            
            mysql> use sbtest;
            Reading table information for completion of table and column names
            You can turn off this feature to get a quicker startup with -A
            
            Database changed
            mysql> alter table sbtest1 add index (pad);
            Query OK, 0 rows affected (0.99 sec)
            Records: 0  Duplicates: 0  Warnings: 0

            We managed to execute the ALTER as we logged in using ‘admin_user’. This is a very simple way of ensuring that only appointed people can run schema changes on your databases.

            Creating a whitelist of allowed queries

            Finally, let’s consider a tightly locked environment where only predefined queries can be executed. ProxySQL can be easily utilized to implement such setup.

            First of all, we need to remove all existing query rules before we can implement what we need. Then, we need to create a catch-all query rule, which will block all the queries:

            All that’s left to do is to create query rules for all of the queries which are allowed. You can create one rule per query, or you can use regular expressions if, for example, SELECTs are always OK to run. The only thing you have to remember is that the rule ID has to be smaller than the rule ID of the catch-all rule, and to ensure that the query will eventually hit a rule with ‘Apply’ enabled.
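
            A rough admin interface sketch of this setup, with a single whitelisted SELECT pattern (hostgroup 20 standing in for your reader hostgroup):

            -- whitelist entry: plain SELECTs are allowed through
            INSERT INTO mysql_query_rules (rule_id, active, match_pattern, destination_hostgroup, apply)
            VALUES (50, 1, '^SELECT .*', 20, 1);
            -- catch-all: block everything that did not match an earlier rule
            INSERT INTO mysql_query_rules (rule_id, active, match_pattern, error_msg, apply)
            VALUES (100, 1, '.*', 'Query not on the whitelist', 1);
            LOAD MYSQL QUERY RULES TO RUNTIME;
            SAVE MYSQL QUERY RULES TO DISK;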

            We hope that this blog post gave you some insight into how you can utilize ClusterControl and ProxySQL to improve security and ensure compliance of your databases.

            by krzysztof at October 30, 2018 08:14 AM

            October 29, 2018

            Jean-Jerome Schmidt

            MySQL on Docker: Running ProxySQL as Kubernetes Service

            When running distributed database clusters, it is quite common to front them with load balancers. The advantages are clear - load balancing, connection failover and decoupling of the application tier from the underlying database topologies. For more intelligent load balancing, a database-aware proxy like ProxySQL or MaxScale would be the way to go. In our previous blog, we showed you how to run ProxySQL as a helper container in Kubernetes. In this blog post, we’ll show you how to deploy ProxySQL as a Kubernetes service. We’ll use Wordpress as an example application and the database backend is running on a two-node MySQL Replication deployed using ClusterControl. The following diagram illustrates our infrastructure:

            Since we are going to deploy a similar setup as in this previous blog post, do expect duplication in some parts of the blog post to keep the post more readable.

            ProxySQL on Kubernetes

            Let’s start with a bit of recap. Designing a ProxySQL architecture is a subjective topic and highly dependent on the placement of the application, database containers as well as the role of ProxySQL itself. Ideally, we can configure ProxySQL to be managed by Kubernetes with two configurations:

            1. ProxySQL as a Kubernetes service (centralized deployment)
            2. ProxySQL as a helper container in a pod (distributed deployment)

            Both deployments can be distinguished easily by looking at the following diagram:

            This blog post will cover the first configuration - running ProxySQL as a Kubernetes service. The second configuration is already covered here. In contrast to the helper container approach, running as a service makes ProxySQL pods live independently from the applications and can be easily scaled and clustered together with the help of Kubernetes ConfigMap. This is definitely a different clustering approach than ProxySQL native clustering support which relies on configuration checksum across ProxySQL instances (a.k.a proxysql_servers). Check out this blog post if you want to learn about ProxySQL clustering made easy with ClusterControl.

            In Kubernetes, ProxySQL's multi-layer configuration system makes pod clustering possible with ConfigMap. However, there are a number of shortcomings and workarounds to make it work smoothly as what ProxySQL's native clustering feature does. At the moment, signalling a pod upon ConfigMap update is a feature in the works. We will cover this topic in much greater detail in an upcoming blog post.

            Basically, we need to create ProxySQL pods and attach a Kubernetes service to be accessed by the other pods within the Kubernetes network or externally. Applications will then connect to the ProxySQL service via TCP/IP networking on the configured ports, which default to 6033 for MySQL load-balanced connections and 6032 for the ProxySQL administration console. With more than one replica, the connections to the pods will be load balanced automatically by the Kubernetes kube-proxy component running on every Kubernetes node.

            ProxySQL as Kubernetes Service

            In this setup, we run both ProxySQL and Wordpress as pods and services. The following diagram illustrates our high-level architecture:

            In this setup, we will deploy two pods and services - "wordpress" and "proxysql". We will merge Deployment and Service declaration in one YAML file per application and manage them as one unit. To keep the application containers' content persistent across multiple nodes, we have to use a clustered or remote file system, which in this case is NFS.

            Deploying ProxySQL as a service brings a couple of good things over the helper container approach:

            • Using Kubernetes ConfigMap approach, ProxySQL can be clustered with immutable configuration.
            • Kubernetes handles ProxySQL recovery and balances the connections to the instances automatically.
            • Single endpoint with Kubernetes Virtual IP address implementation called ClusterIP.
            • Centralized reverse proxy tier with shared nothing architecture.
            • Can be used with external applications outside Kubernetes.

            We will start the deployment as two replicas for ProxySQL and three for Wordpress to demonstrate running at scale and load-balancing capabilities that Kubernetes offers.

            Preparing the Database

            Create the wordpress database and user on the master and assign with correct privilege:

            mysql-master> CREATE DATABASE wordpress;
            mysql-master> CREATE USER wordpress@'%' IDENTIFIED BY 'passw0rd';
            mysql-master> GRANT ALL PRIVILEGES ON wordpress.* TO wordpress@'%';

            Also, create the ProxySQL monitoring user:

            mysql-master> CREATE USER proxysql@'%' IDENTIFIED BY 'proxysqlpassw0rd';

            Then, reload the grant table:

            mysql-master> FLUSH PRIVILEGES;

            ProxySQL Pod and Service Definition

            The next step is to prepare our ProxySQL deployment. Create a file called proxysql-rs-svc.yml and add the following lines:

            apiVersion: apps/v1
            kind: Deployment
            metadata:
              name: proxysql
              labels:
                app: proxysql
            spec:
              replicas: 2
              selector:
                matchLabels:
                  app: proxysql
                  tier: frontend
              strategy:
                type: RollingUpdate
              template:
                metadata:
                  labels:
                    app: proxysql
                    tier: frontend
                spec:
                  restartPolicy: Always
                  containers:
                  - image: severalnines/proxysql:1.4.12
                    name: proxysql
                    volumeMounts:
                    - name: proxysql-config
                      mountPath: /etc/proxysql.cnf
                      subPath: proxysql.cnf
                    ports:
                    - containerPort: 6033
                      name: proxysql-mysql
                    - containerPort: 6032
                      name: proxysql-admin
                  volumes:
                  - name: proxysql-config
                    configMap:
                      name: proxysql-configmap
            ---
            apiVersion: v1
            kind: Service
            metadata:
              name: proxysql
              labels:
                app: proxysql
                tier: frontend
            spec:
              type: NodePort
              ports:
              - nodePort: 30033
                port: 6033
                name: proxysql-mysql
              - nodePort: 30032
                port: 6032
                name: proxysql-admin
              selector:
                app: proxysql
                tier: frontend

            Let's see what those definitions are all about. The YAML consists of two resources combined in a file, separated by the "---" delimiter. The first resource is the Deployment, for which we define the following specification:

            spec:
              replicas: 2
              selector:
                matchLabels:
                  app: proxysql
                  tier: frontend
              strategy:
                type: RollingUpdate

            The above means we would like to deploy two ProxySQL pods as a ReplicaSet that matches containers labelled with "app=proxysql,tier=frontend". The deployment strategy specifies how old pods are replaced by new ones. In this deployment, we picked RollingUpdate, which means the pods will be updated in a rolling update fashion, one pod at a time.

            The next part is the container's template:

                  - image: severalnines/proxysql:1.4.12
                    name: proxysql
                    volumeMounts:
                    - name: proxysql-config
                      mountPath: /etc/proxysql.cnf
                      subPath: proxysql.cnf
                    ports:
                    - containerPort: 6033
                      name: proxysql-mysql
                    - containerPort: 6032
                      name: proxysql-admin
                  volumes:
                  - name: proxysql-config
                    configMap:
                      name: proxysql-configmap

            In the spec.template.spec.containers.* section, we are telling Kubernetes to deploy ProxySQL using the severalnines/proxysql image version 1.4.12. We also want Kubernetes to mount our custom, pre-configured configuration file and map it to /etc/proxysql.cnf inside the container. The running pods will publish two ports - 6033 and 6032. We also define the "volumes" section, where we instruct Kubernetes to mount the ConfigMap as a volume inside the ProxySQL pods, to be mounted by volumeMounts.

            The second resource is the service. A Kubernetes service is an abstraction layer which defines the logical set of pods and a policy by which to access them. In this section, we define the following:

            apiVersion: v1
            kind: Service
            metadata:
              name: proxysql
              labels:
                app: proxysql
                tier: frontend
            spec:
              type: NodePort
              ports:
              - nodePort: 30033
                port: 6033
                name: proxysql-mysql
              - nodePort: 30032
                port: 6032
                name: proxysql-admin
              selector:
                app: proxysql
                tier: frontend

            In this case, we want our ProxySQL to be accessed from the external network, thus NodePort is the chosen service type. This will publish the nodePort on every Kubernetes node in the cluster. The range of valid ports for a NodePort resource is 30000-32767. We chose port 30033 for MySQL load-balanced connections, which is mapped to port 6033 of the ProxySQL pods, and port 30032 for the ProxySQL administration port, mapped to 6032.

            Therefore, based on our YAML definition above, we have to prepare the following Kubernetes resource before we can begin to deploy the "proxysql" pod:

            • ConfigMap - To store ProxySQL configuration file as a volume so it can be mounted to multiple pods and can be remounted again if the pod is being rescheduled to the other Kubernetes node.

            Preparing ConfigMap for ProxySQL

            Similar to the previous blog post, we are going to use the ConfigMap approach to decouple the configuration file from the container, and also for scalability purposes. Take note that in this setup, we consider our ProxySQL configuration immutable.

            Firstly, create the ProxySQL configuration file, proxysql.cnf and add the following lines:

            datadir="/var/lib/proxysql"
            admin_variables=
            {
                    admin_credentials="proxysql-admin:adminpassw0rd"
                    mysql_ifaces="0.0.0.0:6032"
                    refresh_interval=2000
            }
            mysql_variables=
            {
                    threads=4
                    max_connections=2048
                    default_query_delay=0
                    default_query_timeout=36000000
                    have_compress=true
                    poll_timeout=2000
                    interfaces="0.0.0.0:6033;/tmp/proxysql.sock"
                    default_schema="information_schema"
                    stacksize=1048576
                    server_version="5.1.30"
                    connect_timeout_server=10000
                    monitor_history=60000
                    monitor_connect_interval=200000
                    monitor_ping_interval=200000
                    ping_interval_server_msec=10000
                    ping_timeout_server=200
                    commands_stats=true
                    sessions_sort=true
                    monitor_username="proxysql"
                    monitor_password="proxysqlpassw0rd"
            }
            mysql_replication_hostgroups =
            (
                    { writer_hostgroup=10, reader_hostgroup=20, comment="MySQL Replication 5.7" }
            )
            mysql_servers =
            (
                    { address="192.168.55.171" , port=3306 , hostgroup=10, max_connections=100 },
                    { address="192.168.55.172" , port=3306 , hostgroup=10, max_connections=100 },
                    { address="192.168.55.171" , port=3306 , hostgroup=20, max_connections=100 },
                    { address="192.168.55.172" , port=3306 , hostgroup=20, max_connections=100 }
            )
            mysql_users =
            (
                    { username = "wordpress" , password = "passw0rd" , default_hostgroup = 10 , active = 1 }
            )
            mysql_query_rules =
            (
                    {
                            rule_id=100
                            active=1
                            match_pattern="^SELECT .* FOR UPDATE"
                            destination_hostgroup=10
                            apply=1
                    },
                    {
                            rule_id=200
                            active=1
                            match_pattern="^SELECT .*"
                            destination_hostgroup=20
                            apply=1
                    },
                    {
                            rule_id=300
                            active=1
                            match_pattern=".*"
                            destination_hostgroup=10
                            apply=1
                    }
            )

            Pay attention to the admin_variables.admin_credentials variable, where we used a non-default user, "proxysql-admin". ProxySQL reserves the default "admin" user for local connections via localhost only. Therefore, we have to use other users to access the ProxySQL instance remotely. Otherwise, you would get the following error:

            ERROR 1040 (42000): User 'admin' can only connect locally
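
            With the credentials above, connecting to the admin console from outside the pod would look like this (a sketch; 192.168.100.10 stands in for any Kubernetes node IP, using the 30032 NodePort defined earlier):

            $ mysql -u proxysql-admin -padminpassw0rd -h 192.168.100.10 -P30032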

            Our ProxySQL configuration is based on our two database servers running in MySQL Replication as summarized in the following Topology screenshot taken from ClusterControl:

            All writes should go to the master node while reads are forwarded to hostgroup 20, as defined under the "mysql_query_rules" section. That's the basics of read/write splitting, and we want to utilize it here.

            Then, import the configuration file into ConfigMap:

            $ kubectl create configmap proxysql-configmap --from-file=proxysql.cnf
            configmap/proxysql-configmap created

            Verify if the ConfigMap is loaded into Kubernetes:

            $ kubectl get configmap
            NAME                 DATA   AGE
            proxysql-configmap   1      45s
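
            Nothing stops you from bringing up the ProxySQL tier already at this point to verify it (a sketch; you can equally create all the resources together at the end):

            $ kubectl create -f proxysql-rs-svc.yml
            $ kubectl get pods -l app=proxysql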

            Wordpress Pod and Service Definition

            Now, paste the following lines into a file called wordpress-rs-svc.yml on the host where kubectl is configured:

            apiVersion: apps/v1
            kind: Deployment
            metadata:
              name: wordpress
              labels:
                app: wordpress
            spec:
              replicas: 3
              selector:
                matchLabels:
                  app: wordpress
                  tier: frontend
              strategy:
                type: RollingUpdate
              template:
                metadata:
                  labels:
                    app: wordpress
                    tier: frontend
                spec:
                  restartPolicy: Always
                  containers:
                  - image: wordpress:4.9-apache
                    name: wordpress
                    env:
                    - name: WORDPRESS_DB_HOST
                      value: proxysql:6033 # proxysql.default.svc.cluster.local:6033
                    - name: WORDPRESS_DB_USER
                      value: wordpress
                    - name: WORDPRESS_DB_DATABASE
                      value: wordpress
                    - name: WORDPRESS_DB_PASSWORD
                      valueFrom:
                        secretKeyRef:
                          name: mysql-pass
                          key: password
                    ports:
                    - containerPort: 80
                      name: wordpress
            ---
            apiVersion: v1
            kind: Service
            metadata:
              name: wordpress
              labels:
                app: wordpress
                tier: frontend
            spec:
              type: NodePort
              ports:
              - name: wordpress
                nodePort: 30088
                port: 80
              selector:
                app: wordpress
                tier: frontend

            Similar to our ProxySQL definition, the YAML consists of two resources combined in one file, separated by the "---" delimiter. The first one is the Deployment resource, which will be deployed as a ReplicaSet, as shown under the "spec.*" section:

            spec:
              replicas: 3
              selector:
                matchLabels:
                  app: wordpress
                  tier: frontend
              strategy:
                type: RollingUpdate

            This section provides the Deployment specification - 3 pods to start, matching the label "app=wordpress,tier=frontend". The deployment strategy is RollingUpdate, which means Kubernetes will replace the pods in a rolling update fashion, the same as in our ProxySQL deployment.

            The next part is the "spec.template.spec.*" section:

                  restartPolicy: Always
                  containers:
                  - image: wordpress:4.9-apache
                    name: wordpress
                    env:
                    - name: WORDPRESS_DB_HOST
                      value: proxysql:6033
                    - name: WORDPRESS_DB_USER
                      value: wordpress
                    - name: WORDPRESS_DB_PASSWORD
                      valueFrom:
                        secretKeyRef:
                          name: mysql-pass
                          key: password
                    ports:
                    - containerPort: 80
                      name: wordpress
                    volumeMounts:
                    - name: wordpress-persistent-storage
                      mountPath: /var/www/html


            In this section, we are telling Kubernetes to deploy Wordpress 4.9 using the Apache web server, and we gave the container the name "wordpress". The container will be restarted every time it is down, regardless of the status. We also want Kubernetes to pass a number of environment variables:

            • WORDPRESS_DB_HOST - The MySQL database host. Since we are using ProxySQL as a service, the service name will be the value of metadata.name, which is "proxysql". ProxySQL listens on port 6033 for MySQL load-balanced connections, while the ProxySQL administration console is on 6032.
            • WORDPRESS_DB_USER - The wordpress database user that was created under the "Preparing the Database" section.
            • WORDPRESS_DB_NAME - The name of the database Wordpress should use, "wordpress".
            • WORDPRESS_DB_PASSWORD - The password for WORDPRESS_DB_USER. Since we do not want to expose the password in this file, we hide it using Kubernetes Secrets. Here we instruct Kubernetes to read the "mysql-pass" Secret resource instead. The Secret has to be created in advance of the pod deployment, as explained further down.

            We also want to publish port 80 of the pod for the end user. The Wordpress content stored inside /var/www/html in the container will be mounted onto our persistent storage running on NFS. We will use the PersistentVolume and PersistentVolumeClaim resources for this purpose, as shown under the "Preparing Persistent Storage for Wordpress" section.

            After the "---" break line, we define another resource called Service:

            apiVersion: v1
            kind: Service
            metadata:
              name: wordpress
              labels:
                app: wordpress
                tier: frontend
            spec:
              type: NodePort
              ports:
              - name: wordpress
                nodePort: 30088
                port: 80
              selector:
                app: wordpress
                tier: frontend

            In this configuration, we would like Kubernetes to create a service called "wordpress" that listens on port 30088 on every node (a.k.a. NodePort) for external traffic and forwards it to port 80 on all pods labelled with "app=wordpress,tier=frontend".

            Therefore, based on our YAML definition above, we have to prepare a number of Kubernetes resources before we can begin to deploy the "wordpress" pod and service:

            • PersistentVolume and PersistentVolumeClaim - To store the web content of our Wordpress application, so that when the pod is rescheduled to another worker node, we won't lose the latest changes.
            • Secrets - To hide the Wordpress database user password inside the YAML file.

            Preparing Persistent Storage for Wordpress

            A good persistent storage for Kubernetes should be accessible by all Kubernetes nodes in the cluster. For the sake of this blog post, we used NFS as the PersistentVolume (PV) provider because it's easy and supported out-of-the-box. The NFS server is located somewhere outside of our Kubernetes network (as shown in the first architecture diagram) and we have configured it to allow all Kubernetes nodes with the following line inside /etc/exports:

            /nfs    192.168.55.*(rw,sync,no_root_squash,no_all_squash)

            Take note that the NFS client package must be installed on all Kubernetes nodes. Otherwise, Kubernetes won't be able to mount the NFS volume correctly. On all nodes:

            $ sudo apt install nfs-common #Ubuntu/Debian
            $ yum install nfs-utils #RHEL/CentOS

            Also, make sure on the NFS server, the target directory exists:

            (nfs-server)$ mkdir /nfs/kubernetes/wordpress
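
            If you added the export line just now, re-export the shares so the clients can see the new directory (standard exportfs usage, shown here as a quick sanity check):

            (nfs-server)$ sudo exportfs -ra #re-read /etc/exports
            (nfs-server)$ sudo exportfs -v  #verify the /nfs export is active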

            Then, create a file called wordpress-pv-pvc.yml and add the following lines:

            apiVersion: v1
            kind: PersistentVolume
            metadata:
              name: wp-pv
              labels:
                app: wordpress
            spec:
              accessModes:
                - ReadWriteOnce
              capacity:
                storage: 3Gi
              mountOptions:
                - hard
                - nfsvers=4.1
              nfs:
                path: /nfs/kubernetes/wordpress
                server: 192.168.55.200
            ---
            kind: PersistentVolumeClaim
            apiVersion: v1
            metadata:
              name: wp-pvc
            spec:
              accessModes:
                - ReadWriteOnce
              resources:
                requests:
                  storage: 3Gi
              selector:
                matchLabels:
                  app: wordpress
                  tier: frontend

            In the above definition, we are telling Kubernetes to allocate 3GB of volume space on the NFS server for our Wordpress container. Take note that for production usage, NFS should be configured with an automatic provisioner and a storage class.
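
            For illustration only, here is what the claim could look like with dynamic provisioning; the "nfs-client" storage class name is an assumption that depends on the provisioner you deploy:

            kind: PersistentVolumeClaim
            apiVersion: v1
            metadata:
              name: wp-pvc
            spec:
              storageClassName: nfs-client # hypothetical class served by an NFS dynamic provisioner
              accessModes:
                - ReadWriteOnce
              resources:
                requests:
                  storage: 3Gi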

            Create the PV and PVC resources:

            $ kubectl create -f wordpress-pv-pvc.yml

            Verify that those resources are created; the status must be "Bound":

            $ kubectl get pv,pvc
            NAME                     CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM            STORAGECLASS   REASON   AGE
            persistentvolume/wp-pv   3Gi        RWO            Recycle          Bound    default/wp-pvc                           22h
            
            
            NAME                           STATUS   VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
            persistentvolumeclaim/wp-pvc   Bound    wp-pv    3Gi        RWO                           22h

            Preparing Secrets for Wordpress

            Create a secret to be used by the Wordpress container for the WORDPRESS_DB_PASSWORD environment variable. The reason is simple: we don't want to expose the password in clear text inside the YAML file.

            Create a secret resource called mysql-pass and pass the password accordingly:

            $ kubectl create secret generic mysql-pass --from-literal=password=passw0rd

            Verify that our secret is created:

            $ kubectl get secrets mysql-pass
            NAME         TYPE     DATA   AGE
            mysql-pass   Opaque   1      7h12m
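
            Keep in mind the value is only base64-encoded, not encrypted. If you ever need to double-check what the Secret holds, you can decode it with standard kubectl options:

            $ kubectl get secret mysql-pass -o jsonpath='{.data.password}' | base64 --decode
            passw0rd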

            Deploying ProxySQL and Wordpress

            Finally, we can begin the deployment. Deploy ProxySQL first, followed by Wordpress:

            $ kubectl create -f proxysql-rs-svc.yml
            $ kubectl create -f wordpress-rs-svc.yml

            We can then list out all pods and services that have been created under "frontend" tier:

            $ kubectl get pods,services -l tier=frontend -o wide
            NAME                             READY   STATUS    RESTARTS   AGE   IP          NODE          NOMINATED NODE
            pod/proxysql-95b8d8446-qfbf2     1/1     Running   0          12m   10.36.0.2   kube2.local   <none>
            pod/proxysql-95b8d8446-vljlr     1/1     Running   0          12m   10.44.0.6   kube3.local   <none>
            pod/wordpress-59489d57b9-4dzvk   1/1     Running   0          37m   10.36.0.1   kube2.local   <none>
            pod/wordpress-59489d57b9-7d2jb   1/1     Running   0          30m   10.44.0.4   kube3.local   <none>
            pod/wordpress-59489d57b9-gw4p9   1/1     Running   0          30m   10.36.0.3   kube2.local   <none>
            
            NAME                TYPE       CLUSTER-IP       EXTERNAL-IP   PORT(S)                         AGE   SELECTOR
            service/proxysql    NodePort   10.108.195.54    <none>        6033:30033/TCP,6032:30032/TCP   10m   app=proxysql,tier=frontend
            service/wordpress   NodePort   10.109.144.234   <none>        80:30088/TCP                    37m   app=wordpress,tier=frontend

            The above output verifies our deployment architecture: we currently have three Wordpress pods exposed publicly on port 30088, and our ProxySQL pods exposed on ports 30033 and 30032 externally, plus 6033 and 6032 internally.

            At this point, our architecture is looking something like this:

            Port 80 published by the Wordpress pods is now mapped to the outside world via port 30088. We can access our blog at http://{any_kubernetes_host}:30088/ and should be redirected to the Wordpress installation page. If we proceed with the installation, it skips the database connection step and directly shows this page:

            This indicates that our MySQL and ProxySQL settings are correctly configured inside the wp-config.php file. Otherwise, you would be redirected to the database configuration page.
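
            As a quick check from the command line (the hostname below is just one of our Kubernetes nodes), a fresh Wordpress should answer with a redirect to its installer, along these lines:

            $ curl -I http://kube2.local:30088/
            HTTP/1.1 302 Found
            Location: http://kube2.local:30088/wp-admin/install.php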

            Our deployment is now complete.

            ProxySQL Pods and Service Management

            Failover and recovery are expected to be handled automatically by Kubernetes. For example, if a Kubernetes worker goes down, the pod will be recreated on the next available node after --pod-eviction-timeout (defaults to 5 minutes). If the container crashes or is killed, Kubernetes will replace it almost instantly.

            Some common management tasks are expected to be different when running within Kubernetes, as shown in the next sections.

            Connecting to ProxySQL

            While ProxySQL is exposed externally on port 30033 (MySQL) and 30032 (Admin), it is also accessible internally via the published ports, 6033 and 6032 respectively. Thus, to access the ProxySQL instances within the Kubernetes network, use the CLUSTER-IP or the service name "proxysql" as the host value. For example, within a Wordpress pod, you may access the ProxySQL admin console by using the following command:

            $ mysql -uproxysql-admin -p -hproxysql -P6032

            If you want to connect externally, use the port defined under the nodePort value in the service YAML and pick any of the Kubernetes nodes as the host value:

            $ mysql -uproxysql-admin -p -hkube3.local -P30032

            The same applies to the MySQL load-balanced connection on port 30033 (external) and 6033 (internal).
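
            For example, once connected to the admin console, you can verify that ProxySQL sees its backend MySQL servers. runtime_mysql_servers is a standard ProxySQL admin table; the hostnames returned depend on your setup:

            $ mysql -uproxysql-admin -p -hkube3.local -P30032 -e 'SELECT hostgroup_id, hostname, port, status FROM runtime_mysql_servers'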

            Scaling Up and Down

            Scaling up is easy with Kubernetes:

            $ kubectl scale deployment proxysql --replicas=5
            deployment.extensions/proxysql scaled

            Verify the rollout status:

            $ kubectl rollout status deployment proxysql
            deployment "proxysql" successfully rolled out

            Scaling down is similar. Here we want to go back from 5 to 2 replicas:

            $ kubectl scale deployment proxysql --replicas=2
            deployment.extensions/proxysql scaled
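
            As a further sketch, you could let Kubernetes scale the deployment automatically with a HorizontalPodAutoscaler; this assumes CPU requests are set on the container and a metrics source (such as metrics-server) is available in the cluster:

            $ kubectl autoscale deployment proxysql --min=2 --max=5 --cpu-percent=80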

            We can also look at the deployment events for ProxySQL to get a better picture of what has happened to this deployment, by using the "describe" option:

            $ kubectl describe deployment proxysql
            ...
            Events:
              Type    Reason             Age    From                   Message
              ----    ------             ----   ----                   -------
              Normal  ScalingReplicaSet  20m    deployment-controller  Scaled up replica set proxysql-769895fbf7 to 1
              Normal  ScalingReplicaSet  20m    deployment-controller  Scaled down replica set proxysql-95b8d8446 to 1
              Normal  ScalingReplicaSet  20m    deployment-controller  Scaled up replica set proxysql-769895fbf7 to 2
              Normal  ScalingReplicaSet  20m    deployment-controller  Scaled down replica set proxysql-95b8d8446 to 0
              Normal  ScalingReplicaSet  7m10s  deployment-controller  Scaled up replica set proxysql-6c55f647cb to 1
              Normal  ScalingReplicaSet  7m     deployment-controller  Scaled down replica set proxysql-769895fbf7 to 1
              Normal  ScalingReplicaSet  7m     deployment-controller  Scaled up replica set proxysql-6c55f647cb to 2
              Normal  ScalingReplicaSet  6m53s  deployment-controller  Scaled down replica set proxysql-769895fbf7 to 0
              Normal  ScalingReplicaSet  54s    deployment-controller  Scaled up replica set proxysql-6c55f647cb to 5
              Normal  ScalingReplicaSet  21s    deployment-controller  Scaled down replica set proxysql-6c55f647cb to 2

            The connections to the pods will be load balanced automatically by Kubernetes.

            Configuration Changes

            One way to make configuration changes to our ProxySQL pods is to version the configuration using another ConfigMap name. First, modify the configuration file directly with your favourite text editor:

            $ vim /root/proxysql.cnf

            Then, load it up into Kubernetes ConfigMap with a different name. In this example, we append "-v2" in the resource name:

            $ kubectl create configmap proxysql-configmap-v2 --from-file=proxysql.cnf

            Verify that the ConfigMap is loaded correctly:

            $ kubectl get configmap
            NAME                    DATA   AGE
            proxysql-configmap      1      3d15h
            proxysql-configmap-v2   1      19m
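
            You can also inspect the content that was loaded, as a quick check with standard kubectl:

            $ kubectl describe configmap proxysql-configmap-v2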

            Open the ProxySQL deployment file, proxysql-rs-svc.yml, and change the following line under the configMap section to the new version:

                  volumes:
                  - name: proxysql-config
                    configMap:
                      name: proxysql-configmap-v2 #change this line

            Then, apply the changes to our ProxySQL deployment:

            $ kubectl apply -f proxysql-rs-svc.yml
            deployment.apps/proxysql configured
            service/proxysql configured

            Verify the rollout by looking at the ReplicaSet events using the "describe" command:

            $ kubectl describe deployment proxysql
            ...
            Pod Template:
              Labels:  app=proxysql
                       tier=frontend
              Containers:
               proxysql:
                Image:        severalnines/proxysql:1.4.12
                Ports:        6033/TCP, 6032/TCP
                Host Ports:   0/TCP, 0/TCP
                Environment:  <none>
                Mounts:
                  /etc/proxysql.cnf from proxysql-config (rw)
              Volumes:
               proxysql-config:
                Type:      ConfigMap (a volume populated by a ConfigMap)
                Name:      proxysql-configmap-v2
                Optional:  false
            Conditions:
              Type           Status  Reason
              ----           ------  ------
              Available      True    MinimumReplicasAvailable
              Progressing    True    NewReplicaSetAvailable
            OldReplicaSets:  <none>
            NewReplicaSet:   proxysql-769895fbf7 (2/2 replicas created)
            Events:
              Type    Reason             Age   From                   Message
              ----    ------             ----  ----                   -------
              Normal  ScalingReplicaSet  53s   deployment-controller  Scaled up replica set proxysql-769895fbf7 to 1
              Normal  ScalingReplicaSet  46s   deployment-controller  Scaled down replica set proxysql-95b8d8446 to 1
              Normal  ScalingReplicaSet  46s   deployment-controller  Scaled up replica set proxysql-769895fbf7 to 2
              Normal  ScalingReplicaSet  41s   deployment-controller  Scaled down replica set proxysql-95b8d8446 to 0

            Pay attention to the "Volumes" section with the new ConfigMap name. You can also see the deployment events at the bottom of the output. At this point, our new configuration has been loaded into all ProxySQL pods: Kubernetes scaled the old ProxySQL ReplicaSet down to 0 (obeying the RollingUpdate strategy) and brought the new one up to the desired state of 2 replicas.
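
            If the new configuration misbehaves, the Deployment history makes a rollback straightforward, since the old pod template (pointing at the previous ConfigMap) is kept. A sketch with standard kubectl rollout commands:

            $ kubectl rollout history deployment proxysql
            $ kubectl rollout undo deployment proxysql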

            Final Thoughts

            Up until this point, we have covered a possible deployment approach for ProxySQL in Kubernetes. Running ProxySQL with the help of Kubernetes ConfigMap opens up a new possibility for ProxySQL clustering, one that is somewhat different from the native clustering support built into ProxySQL.

            In the upcoming blog post, we will explore ProxySQL Clustering using Kubernetes ConfigMap and how to do it the right way. Stay tuned!

            by ashraf at October 29, 2018 11:07 PM

            Peter Zaitsev

            One Week Until Percona Live Open Source Database Conference Europe 2018

            Percona Live Europe 2018

            It’s almost here! One week until the Percona Live Europe Open Source Database Conference 2018 in Frankfurt, Germany! Are you ready?

            This year’s theme is “Connect. Accelerate. Innovate.” We want to live these words by making sure that the conference allows you to connect with others in the open source community, accelerate your ideas and solutions and innovate when you get back to your projects and companies.

            • There is one day of tutorials (Monday) and two days of sessions (Tuesday and Wednesday). We have multiple tracks: MySQL 8.0, Using MySQL, MongoDB, PostgreSQL, Cloud, Database Security and Compliance, Monitoring and Ops, and Containers and Emerging Technologies. This year also includes a specialized “Business Track” aimed at how open source can solve critical enterprise issues.
            • Each of the session days begins with excellent keynote presentations in the main room by well-known people and players in the open source community. Don’t miss them!
            • Don’t forget to attend our Welcome Reception on Monday.
            • Want to meet with our Product Managers? Join them for Lunch on Wednesday, November 7, where you’ll have a chance to participate in the development of Percona Software!
            • On our community blog, we’ve been highlighting some of the sessions that will be occurring during the conference. You can check them out here.

            The entire conference schedule is up and available here.

            Percona Live Europe provides the community with an opportunity to discover and discuss the latest open source trends, technologies and innovations. The conference includes the best and brightest innovators and influencers in the open source database industry.

            Our daily sessions, day-one tutorials, demonstrations, keynotes and events provide access to what is happening NOW in the world of open source databases. At the conference, you can mingle with all levels of the database community: DBAs, developers, C-level executives and the latest database technology trend-setters.

            Network with peers and technology professionals and unite the open source database community! Share knowledge, experiences and use cases! Learn about how open source database technology can power your applications, improve your websites and solve your critical database issues.

            Come to the conference.

            Don’t miss out, buy your tickets here!

            Connect. Accelerate. Innovate.

            With a lot of focus on the benefits of open source over proprietary models of software delivery, you surely can’t afford to miss this opportunity to connect with leading figures of the open source database world. On Monday, November 5 you can opt to accelerate your knowledge with our in-depth tutorials, or choose to attend our business track geared towards open source innovation and adoption.

            Tuesday and Wednesday’s sessions across eight different tracks provides something for all levels of experience, and addresses a range of business challenges. See the full schedule.

            by Bronwyn Campbell at October 29, 2018 11:05 AM

            October 27, 2018

            Valeriy Kravchuk

            Fun with Bugs #71 - On Some Public Bugs Fixed in MySQL 5.7.24

            Oracle released many new MySQL versions back on Monday, but I had no time during this very busy week to check anything related (besides the fact that MySQL 8.0.13 can be compiled from source on my Fedora 27 box). I am sure you've read a lot about MySQL 8.0.13 elsewhere already; even the patches contributed by the Community are presented in a separate post by Jesper Krogh.

            I am still mostly interested in MySQL 5.7. So, here is my typical quick review of some selected bugs reported in public by MySQL Community users and fixed in MySQL 5.7.24.

            My wife noticed this nice spider in the garden and reported it to me via this photo. A spider is formally not a bug, while in this post I discuss pure bugs...
            Let me start with fixes in Performance Schema (which is supposed to be mostly bug-free):
            • Bug #90264 - "Some file operations in mf_iocache2.c are not instrumented". This bug, reported by Yura Sorokin from Percona, who also contributed patches, is fixed in all recent Oracle releases, from 5.5.62 to 8.0.13.
            • Bug #77519 - "Reported location of Innodb Merge Temp File is wrong". This bug was reported by Daniël van Eeden back in 2015. Let's hope files are properly reported in @@tmpdir now.
            • Bug #80777 - "order by on LAST_SEEN_TRANSACTION results in empty set". Yet another bug report from Daniël van Eeden got fixed.
            Let's continue with InnoDB bugs:
            • Bug #91032 - "InnoDB 5.7 Primary key scan lack data". A really weird bug reported by Raolh Rao back in May.
            • Bug #95045 - The release notes refer to a public bug that does not exist! So, we have a bug in them. The related text:
              "It was possible to perform FLUSH TABLES FOR EXPORT on a partitioned table created with innodb_file_per_table=1 after discarding its tablespace. Attempting to do so now raises ER_TABLESPACE_DISCARDED. (Bug #95045, Bug #27903881)"
              should refer to Bug #80669 - "Failing assert: fil_space_get(table->space) != __null in row0quiesce.cc line 724", reported by Ramesh Sivaraman from Percona. In the comment from Roel there, we see that the actual bug was Bug #90545, which is, surprise, still private!
              Recently I found out (here) that some community members think that keeping crashing bugs private after the fixed version is released is still better than publishing test cases for them before all affected versions are fixed... I am not so sure.
            What about replication (group replication aside, I have enough Galera problems to deal with in my life to even think about it)? There are some interesting bug fixes:
            • Bug #90551 - "[MySQL 8.0 GA Debug Build] Assertion `!thd->has_gtid_consistency_violation'". Good to know that Oracle engineers still pay attention to debug assertions, as in this report (with a nice simple test case involving XA transactions) from Roel Van de Paar from Percona.
            • Bug #89370 - "semi-sync replication doesn't work for minutes after restart replication". This bug was reported by Yan Huang, who had contributed a patch for it.
            • Bug #89143 - "Commit order deadlock + retry logic is not considering trx error cases". Nice bug report from Jean-François Gagné.
            • Bug #83232 - "replication breaks after bug #74145 happens in master". A FLUSH SLOW LOGS that failed on the master (because of a file permission problem, for example) was still written to the binary log. A nice finding by Jericho Rivera from Percona.
            There are interesting bugs fixed in other categories as well. For example:
            • Bug #91914 - "Mysql 5.7.23 cmake fail with 'Unknown CMake command "ADD_COMPILE_FLAGS".'" Thanks to this report by Tomasz Kłoczko, one can now build MySQL 5.7 with gcc 8.
            • Bug #91080 - "sql_safe_updates behaves inconsistently between delete and select". The fix is described as follows:
              "For DELETE and UPDATE that produced an error due to sql_safe_updates being enabled, the error message was insufficiently informative. The message now indicates that data truncation occurred or the range_optimizer_max_mem_size value was exceeded.

              Additionally: (1) Using EXPLAIN for such statements does not produce an error, enabling users to see from EXPLAIN output why an index is not used; (2) For multiple-table deletes and updates, an error is produced with safe updates enabled only if the target table or tables use a table scan."
              I am NOT sure this is the fix the bug reporter, Nakoa Mccullough, was expecting. He asked for DELETE to be consistent with SELECT (which works). The bug is closed nonetheless :(
            • Bug #90624 - "Restore dump created with 5.7.22 on 8.0.11". It seems Emmanuel CARVIN asked for a working way to upgrade from 5.7.x to 8.0.x. The last comment seems to state that an upgrade from 5.7.24 to 8.0.13 is still not possible. I have not checked this.
            • Bug #90505 is private. Release notes say:
              "If flushing the error log failed due to a file permission error, the flush operation did not complete. (Bug #27891472, Bug #90505) References: This issue is a regression of: Bug #26447825"
              OK, we have a private regression bug, fixed. Nice.
            • Bug #90266 - "No warning when truncating a string with data loss". This happened when making BLOB/TEXT columns smaller. A nice finding by Carlos Tutte.
            • Bug #89537 - "Regression in FEDERATED storage engine after GCC 7 fixes". Yet another bug report with a patch contributed by Yura Sorokin.
            • Bug #88670 - "Subquery incorrectly shows duplicate values on subqueries.". A simple wrong results bug in the optimizer affecting all versions starting from 5.6. Fixed now thanks to Mark El-Wakil.
            That's all bugs I wanted to mention today. To summarize my feelings after reading the release notes:
            1. I'd surely consider upgrading to 5.7.24 in any environment where replication is used. Some InnoDB fixes also matter.
            2. We still see not only private bugs (with questionable security impact) mentioned in the release notes, but this time also a typo in a bug number that makes it harder to find out what was really fixed and why.
            3. I think it would be fair for Oracle to mention Percona as a major contributor to MySQL 5.7, in the same way that Facebook is mentioned in many places with regards to 8.0.13.
            4. It's good to know that some debug assertions related bugs are still fixed. More on this later...

            by Valeriy Kravchuk (noreply@blogger.com) at October 27, 2018 04:50 PM

            October 26, 2018

            Peter Zaitsev

            Announcing Keynotes for Percona Live Europe!

            Percona Live Keynotes

            There’s just over one week to go so it’s time to announce the keynote addresses for Percona Live Europe 2018! We’re excited to share our lineup of conference keynotes, featuring talks from Paddy Power Betfair, Amazon Web Services, Facebook, PingCap and more!

            The speakers will address the current status of key open source database projects MySQL®, PostgreSQL, MongoDB®, and MariaDB®. They’ll be sharing with you how organizations are shifting from a single use database to a polyglot strategy, thereby avoiding vendor lock-in and enabling business growth.

            Without further ado, here’s the full keynote line-up for 2018!

            Tuesday, November 6

            Maximizing the Power and Value of Open Source Databases

            Open source database adoption continues to grow in enterprise organizations, as companies look to scale for growth, maintain performance, keep up with changing technologies, control risks and contain costs. In today’s environment, a single database technology or platform is no longer an option, as organizations shift to a best-of-breed, polyglot strategy to avoid vendor lock-in, increase agility and enable business growth. Percona’s CEO Peter Zaitsev shares his perspective.

            Following this keynote, there will be a round of lightning talks featuring the latest releases from PostgreSQL, MongoDB and MariaDB.

            Technology Lightning Talks

            PostgreSQL 11

            PostgreSQL benefits from over 20 years of open source development, and has become the preferred open source relational database for developers. PostgreSQL 11 was released on October 18. It provides users with improvements to the overall performance of the database system, with specific enhancements associated with very large databases and high computational workloads.

            MongoDB 4.0

            Do you love MongoDB? With version 4.0 you have a reason to love it even more! MongoDB 4.0 adds support for multi-document ACID transactions, combining the document model with ACID guarantees. Through snapshot isolation, transactions provide a consistent view of data and enforce all-or-nothing execution to maintain data integrity. And not only transactions – MongoDB 4.0 has more exciting features like non-blocking secondary reads, improved sharding, security improvements, and more.

            MariaDB 10.3

            MariaDB benefits from a thriving community of contributors. The latest release, MariaDB 10.3, provides several new features not found anywhere else, as well as back-ported and reimplemented features from MySQL.

            Paddy Power Betfair, Percona, and MySQL

            This keynote highlights the collaborative journey Paddy Power Betfair and Percona have taken through the adoption of MySQL within the PPB enterprise. The keynote focuses on how Percona has assisted PPB in adopting MySQL, and how PPB has used this partnership to deliver a full DBaaS for a MySQL solution on OpenStack.

            Wednesday 7th November

            State of the Dolphin

            Geir Høydalsvik (Oracle) will talk about the focus, strategy, investments, and innovations evolving MySQL to power next-generation web, mobile, cloud, and embedded applications. He will also discuss the latest and the most significant MySQL database release ever in its history, MySQL 8.0.

            Amazon Relational Database Services (RDS)

            Amazon RDS is a fully managed database service that allows you to launch an optimally configured, secure, and highly available database with just a few clicks. It manages time-consuming database administration tasks, freeing you to focus on your applications and business. This keynote features the latest news and announcements from RDS, including the launches of Aurora Serverless, Parallel Query, Backtrack, RDS MySQL 8.0, PostgreSQL 10.0, Performance Insights, and several other recent innovations.

            TiDB 2.1, MySQL Compatibility, and Multi-Cloud Deployment

            This keynote talk from PingCap will provide an architectural overview of TiDB, how and why it’s MySQL compatible, the latest features and improvements in TiDB 2.1 GA release, and how its multi-cloud fully-managed solution works.

            MyRocks in the Real World

            In this keynote, Yoshinori Matsunobu, Facebook, will share interesting lessons learned from Facebook’s production deployment and operations of MyRocks and future MyRocks development roadmaps. Vadim Tkachenko, Percona’s CTO, will discuss MyRocks in Percona Server for MySQL and share performance benchmark results from on-premise and cloud deployments.

            Don’t miss out, buy your tickets here!

            Connect. Accelerate. Innovate.

            With a lot of focus on the benefits of open source over proprietary models of software delivery, you surely can’t afford to miss this opportunity to connect with leading figures of the open source database world. On Monday, November 5 you can opt to accelerate your knowledge with our in-depth tutorials, or choose to attend our business track geared towards open source innovation and adoption.

            Tuesday and Wednesday with sessions across 8 different tracks, there’s something for all levels of experience, addressing a range of business challenges. See the full schedule.

            With thanks to our sponsors!

            Platinum: AWS, Percona
            Gold: Facebook
            Silver: Altinity, PingCap, Shannon Systems, OlinData, MySQL
            Startup: SeveralNines
            Community: PostgreSQL, MariaDB Foundation
            Contributing: Intel Optane, Idera, Studio3T

            Media Sponsors: Datanami, Enterprise Tech, HPC Wire, ODBMS.org, Database Trends and Applications, Packt


            by Bronwyn Campbell at October 26, 2018 11:01 AM

            MariaDB Foundation

            MariaDB 5.5.62 now available

            The MariaDB Foundation is pleased to announce the availability of MariaDB 5.5.62, the latest stable release in the MariaDB 5.5 series. See the release notes and changelogs for details. Download MariaDB 5.5.62 Release Notes Changelog What is MariaDB 5.5? MariaDB APT and YUM Repository Configuration Generator Contributors to MariaDB 5.5.62 Alexander Barkov (MariaDB Corporation) Daniel […]

            The post MariaDB 5.5.62 now available appeared first on MariaDB.org.

            by Ian Gilfillan at October 26, 2018 08:48 AM

            October 25, 2018

            Peter Zaitsev

            Percona Live Europe 2018: Our Sponsors

            Sponsors PLE 2018

            Without our sponsors, it would be almost out of reach to deliver a conference of the size and format that everyone has come to expect from Percona Live. As well as financial support, our sponsors contribute massively by supporting their teams in presenting at the conference, and by adding to the quality and atmosphere of the event. Having their support means we can present excellent in-depth technical content for the tutorials and talks, which is highly valued by conference delegates. This year, too, Amazon Web Services (AWS) sponsors the cloud track on day two, with a superb line-up of cloud content.

            Here’s a shout out to our sponsors, you’ll find more information on the Percona Live sponsors page:

            Platinum


            For over 12 years, Amazon Web Services has been the world’s most comprehensive and broadly adopted cloud platform. https://aws.amazon.com/

            Gold

            Facebook offers a fantastic contribution to open source databases with MyRocks, and is greatly appreciated for its ongoing support of Percona Live.
            https://www.facebook.com

            Silver

            Altinity is the leading service provider for ClickHouse.
            https://www.altinity.com/

            PingCAP
            PingCAP is the company and core team building TiDB, a popular open-source MySQL-compatible NewSQL hybrid database.
            https://www.pingcap.com/en/

            Shannon Systems

            Shannon Systems is a global leader in providing enterprise-grade Flash storage devices and system solutions.
            http://en.shannon-sys.com/

            OlinData

            OlinData is an open source infrastructure management company providing services to help companies from small to large with their infrastructure.
            https://www.olindata.com/en

            MySQL

            MySQL is the world’s most popular OS database, delivered by Oracle.
            https://www.mysql.com/

            Start Up

            SeveralNines
            SeveralNines provide automation and management software for MySQL, MariaDB and MongoDB clusters
            https://severalnines.com/

            Community Sponsors

            PostgreSQL
            PostgreSQL is a powerful, open source object-relational database system.
            https://www.postgresql.org/
            MariaDB Foundation

            MariaDB Server is one of the most popular database servers in the world.
            https://mariadb.org

            Branding

            Intel OPTANE
            Intel is the world’s leading technology company, powering the cloud and billions of smart, connected computing devices.
            https://www.intel.com

            Idera
            IDERA designs powerful software with one goal in mind – to solve customers’ most complex challenges with elegant solutions.
            https://www.idera.com/

            Studio 3T

            Studio 3T is a GUI and IDE for developers and data engineers who work with MongoDB.
            https://studio3t.com/

            Media

            • datanami online portal for data science, AI and advanced analytics
            • Enterprise Tech online portal addressing high performance computing technologies at scale
            • HPC Wire covering the fastest computers in the world and the people who run them
            • odbms.org a resource portal for big data, new data management technologies, data science and AI
            • Packt online technical publications and videos

            Thanks again to all – appreciated!

            Sponsors PLE 2018

            by Bronwyn Campbell at October 25, 2018 11:14 AM

            October 24, 2018

            Jean-Jerome Schmidt

            Effective Monitoring of MySQL Replication with SCUMM Dashboards - Part 2

            In our previous blog on SCUMM dashboards, we looked at the MySQL overview dashboard. The new version of ClusterControl (ver. 1.7) offers a number of high-resolution graphs of useful metrics, and we went through the meaning of each of the metrics and how they help you troubleshoot your database. In this blog, we will look at the MySQL Replication dashboard. Let’s proceed to the details of this dashboard and what it has to offer.

            MySQL Replication Dashboard

            The MySQL Replication Dashboard offers a very straightforward set of graphs that makes it easier to monitor your MySQL master and replica(s). Starting from the top, it shows the most important variables and information to determine the health of the replica(s) or even the master. This dashboard is very useful when inspecting the health of the slaves, or of a master in a master-master setup. One can also check on this dashboard the master’s binary log creation and determine the overall volume generated over a given period of time.

            First, the dashboard presents the most important information you might need about the health of your replica. See the graph below:

            Basically, it shows the slave’s IO_Thread and SQL_Thread status, any replication error, and whether the read_only variable is enabled. In the sample screenshot above, all the information shows that my slave 192.168.70.20 is healthy and running normally.

            Additionally, ClusterControl has information to gather as well if you go over to Cluster -> Overview. Scroll down and you can see the graph below:

            Another place to view the replication setup is the topology view of the replication setup, accessible at Cluster -> Topology. It gives, at a quick glance, a view of the different nodes in the setup, their roles, replication lag, retrieved GTID and more. See the graph below:

            In addition to this, the Topology View also shows all the different nodes that form part of your database cluster, whether they are database nodes, load balancers (ProxySQL/MaxScale/HAProxy) or arbitrators (garbd), as well as the connections between them. The nodes, connections, and their statuses are discovered by ClusterControl. Since ClusterControl is continuously monitoring the nodes and keeps state information, any changes in the topology are reflected in the web interface. In case node failures are reported, you can use this view along with the SCUMM dashboards to see what impact they might have.

            The Topology View has some similarity with Orchestrator, in that you can manage the nodes, change masters by dragging and dropping an object onto the desired master, restart nodes and synchronize data. To know more about our Topology View, we suggest you read our previous blog, “Visualizing your Cluster Topology in ClusterControl”.

            Let’s now proceed with the graphs.

            • MySQL Replication Delay
              This graph is very familiar to anybody managing MySQL, especially those working on a daily basis with their master-slave setup. It shows the trends of all the lags recorded over the time range specified in the dashboard. Whenever we want to check how regularly our replica falls behind, this graph is the one to look at. There are occasions when a replica lags for odd reasons: your RAID has a degraded BBU and needs a replacement, a table has no unique key (while the master’s copy does), an unwanted full table scan or full index scan, or a bad query was left running by a developer. This is also a good indicator: if slave lag turns out to be a key issue, you may want to take advantage of parallel replication (see the sketch after this list).

            • Binlog Size
              These graphs are related to each other. The Binlog Size graph shows you how your node generates binary logs and helps determine their volume over the period of time you are scanning.

            • Binlog Data Written Hourly
              The Binlog Data Written Hourly graph compares the current day against the previous day recorded. This can be useful whenever you want to identify how much write traffic your node is accepting, which you can later use for capacity planning.

            • Binlogs Count
              Let’s say you expect high traffic for a given week. You want to compare how large writes are going through your master and slaves with the previous week. This graph is very useful for this kind of situation - To determine how high the generated binary logs were on the master itself or even on the slaves if log_slave_updates variable is enabled. You may also use this indicator to determine your master vs slaves binary log data generated, especially if you are filtering some tables or schemas (replicate_ignore_db, replicate_ignore_table, replicate_wild_do_table) on your slaves that were generated while log_slave_updates is enabled.

            • Binlogs Created Hourly
              This graph is a quick overview comparing your hourly binlog creation between yesterday and today.

            • Relay Log Space
              This graph shows the volume of relay logs generated by your replica. When used along with the MySQL Replication Delay graph, it helps determine how many relay logs are being generated, which the administrator has to weigh against the disk space available on the replica. It can cause trouble when your slave lags heavily and generates a large number of relay logs, as these can consume disk space quickly. There are situations where, due to a high number of writes on the master, the slave/replica lags tremendously, and the large amount of logs generated can cause serious problems on that replica. This can help the ops team when talking to their management about capacity planning.

            • Relay Log Written Hourly
              Same as Relay Log Space, but with a quick overview comparing the relay log data written yesterday and today.
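
            As mentioned under MySQL Replication Delay above, parallel replication can help a lagging slave. A minimal sketch for MySQL 5.7 follows; the worker count is illustrative and should be tuned to your workload:

            STOP SLAVE SQL_THREAD;
            SET GLOBAL slave_parallel_type = 'LOGICAL_CLOCK';
            SET GLOBAL slave_parallel_workers = 4;
            START SLAVE SQL_THREAD;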

            Conclusion

            You learned that using SCUMM to monitor your MySQL Replication adds more productivity and efficiency to the operations team. Using the features we have from previous versions combined with the graphs provided with SCUMM is like going to the gym and seeing massive improvements in your productivity. This is what SCUMM can offer: monitoring on steroids! (now, we are not advocating that you should take steroids when going to the gym!)

            In Part 3 of this blog, I will discuss the InnoDB Metrics and MySQL Performance Schema Dashboards.

            by Paul Namuag at October 24, 2018 09:19 PM

            Peter Zaitsev

            Poll: MongoDB License Change

            MongoDB Licensing poll

            As you may have heard, MongoDB recently changed the license for MongoDB Community version from AGPL to SSPL. In order to better serve our users and customers, we’d like to ask about your plans.

            Please select the answer that best describes your current thinking as a MongoDB user:

            Note: There is a poll embedded within this post, please visit the site to participate in this post's poll.

            If you would like to expand on your response, or otherwise talk to me about your thoughts on the MongoDB license change, I’d be pleased to hear from you. You’re welcome to email me.


            by Vadim Tkachenko at October 24, 2018 06:06 PM

            PostgreSQL locking, part 2: heavyweight locks

            Locking in PostgreSQL

            PostgreSQL locking visibility for application developers and DBAs is in most cases related to heavyweight locks. Complex database locking operations require full instrumentation using views from the system catalog. It should be clear which object is locked by a specific database “backend” process. An alternative name for any lock is “bottleneck”. In order to make database operations parallel, we should split a single bottleneck into multiple operation-specific tasks.

            This is the second part of three blog posts related to table level locks. The previous post was about row-level locks, and a subsequent post reviews the latches protecting internal database structures.

            Example environment

            A simple table with several rows:

            CREATE TABLE locktest (c INT);
            INSERT INTO locktest VALUES (1), (2);

            Helper view

            In order to check different types of these locks, let’s create a helper view as suggested by Bruce Momjian in his presentation:

            CREATE VIEW lockview AS
            SELECT pid, virtualtransaction AS vxid, locktype AS lock_type,
                   mode AS lock_mode, granted,
                   CASE
                     WHEN virtualxid IS NOT NULL AND transactionid IS NOT NULL
                       THEN virtualxid || ' ' || transactionid
                     WHEN virtualxid::text IS NOT NULL
                       THEN virtualxid
                     ELSE transactionid::text
                   END AS xid_lock, relname,
                   page, tuple, classid, objid, objsubid
            FROM pg_locks LEFT OUTER JOIN pg_class ON (pg_locks.relation = pg_class.oid)
            WHERE -- do not show our view’s locks
                  pid != pg_backend_pid() AND
                  -- no need to show self-vxid locks
                  virtualtransaction IS DISTINCT FROM virtualxid
            -- granted is ordered earlier
            ORDER BY 1, 2, 5 DESC, 6, 3, 4, 7;

            RowShareLock (ROW SHARE)

            Many applications use the read-modify-write paradigm. For instance, the application fetches single object fields from a table, modifies the data, and saves the changes back to the database. In a multi-user environment, different users could modify the same rows in the course of this transaction. We can get inconsistent data with just a plain select. In response to user demands, almost all SQL databases have SELECT … FOR SHARE locking. This feature prevents the application entity from making data modifications until the locker transaction commits or rolls back.

            For example:

            1. There is a user with multiple bank accounts stored in an accounts table, with total_amount stored in a bank_clients table.
            2. In order to update the total_amount field, we should prevent modification of all rows related to the specific bank client.
            3. It would be better to use a single update statement to calculate total_amount and select it from the accounts table. If the update requires external data, or some action from the user, then several statements are required:

            START TRANSACTION;
            SELECT * FROM accounts WHERE client_id = 55 FOR SHARE;
            SELECT * FROM bank_clients WHERE client_id=55 FOR UPDATE;
            UPDATE bank_clients SET total_amount=38984.33, client_status='gold' WHERE client_id=55;
            COMMIT;

            The SELECT FOR SHARE statement creates a “RowShareLock” lock on the relation locktest.

            Here’s exactly the same lock created with an SQL statement:

            BEGIN;
            LOCK TABLE locktest IN ROW SHARE MODE;

            A single heavyweight RowShareLock is required regardless of the number of rows locked by the query.

            This is illustrated with an unfinished transaction in the following example. Start the unfinished transaction, and select from lockview in a second connection to the database:

            BEGIN;
            SELECT * FROM locktest FOR SHARE;
            -- In second connection:
            postgres=# SELECT pid,vxid,lock_type,lock_mode,granted,xid_lock,relname FROM lockview;
              pid  | vxid |   lock_type   |   lock_mode   | granted | xid_lock | relname
            -------+------+---------------+---------------+---------+----------+----------
             21144 | 3/13 | transactionid | ExclusiveLock | t       | 586      |
             21144 | 3/13 | relation      | RowShareLock  | t       |          | locktest

            RowExclusiveLock (ROW EXCLUSIVE)

            Real queries that modify rows also require heavyweight locks on tables, one per table.

            The next example uses a DELETE query, but an UPDATE will have the same effect.

            All commands that modify data in a table obtain a ROW EXCLUSIVE lock.

            BEGIN;
            DELETE FROM locktest;
            -- second connection
            postgres=# SELECT pid,vxid,lock_type,lock_mode,granted,xid_lock,relname FROM lockview;
              pid  | vxid |   lock_type   |    lock_mode     | granted | xid_lock | relname
            -------+------+---------------+------------------+---------+----------+----------
             10997 | 3/6  | transactionid | ExclusiveLock    | t       | 589      |
             10997 | 3/6  | relation      | RowExclusiveLock | t       |          | locktest

            This new lock is incompatible with the previous FOR SHARE example.

            A SELECT * FROM locktest FOR SHARE waits for the delete transaction to finish or abort:

            postgres=# SELECT pid,vxid,lock_type,lock_mode,granted,xid_lock,relname,page,tuple FROM lockview;
              pid  | vxid |   lock_type   |    lock_mode     | granted | xid_lock | relname  | page | tuple
            -------+------+---------------+------------------+---------+----------+----------+------+-------
             10997 | 3/6  | transactionid | ExclusiveLock    | t       | 589      |          |      |
             10997 | 3/6  | relation      | RowExclusiveLock | t       |          | locktest |      |
             11495 | 5/9  | relation      | RowShareLock     | t       |          | locktest |      |
             11495 | 5/9  | tuple         | RowShareLock     | t       |          | locktest |    0 |     1
             11495 | 5/9  | transactionid | ShareLock        | f       | 589      |          |      |

            Queries modifying table content also lock all indexes, even if the index does not contain modified fields.

            -- preparation
            CREATE INDEX c_idx2 ON locktest (c);
            ALTER TABLE locktest ADD COLUMN c2 INT;
            CREATE INDEX c2_idx ON locktest(c2);
            -- unfinished example transaction
            BEGIN;
            UPDATE locktest SET c=3 WHERE c=1;
            -- second connection
            postgres=# SELECT * FROM lockview;
             pid  |  vxid  | lock_type  |    lock_mode     | granted | xid_lock | relname  | page | tuple | classid | objid | objsubid
            ------+--------+------------+------------------+---------+----------+----------+------+-------+---------+-------+----------
             3998 | 3/7844 | virtualxid | ExclusiveLock    | t       | 3/7844   |          |      |       |         |       |
             3998 | 3/7844 | relation   | RowExclusiveLock | t       |          | c2_idx   |      |       |         |       |
             3998 | 3/7844 | relation   | RowExclusiveLock | t       |          | c_idx    |      |       |         |       |
             3998 | 3/7844 | relation   | RowExclusiveLock | t       |          | c_idx2   |      |       |         |       |
             3998 | 3/7844 | relation   | RowExclusiveLock | t       |          | locktest |      |       |         |       |

            ShareLock (SHARE)

            The non-concurrent version of CREATE INDEX takes a ShareLock and thus prevents statements that modify the table, such as INSERT, UPDATE or DELETE, as well as DROP TABLE.

            BEGIN;
            CREATE INDEX c_idx ON locktest (c);
            -- second connection
            postgres=# SELECT * FROM lockview;
             pid  |  vxid  |   lock_type   |      lock_mode      | granted | xid_lock | relname  | page | tuple | classid | objid | objsubid
            ------+--------+---------------+---------------------+---------+----------+----------+------+-------+---------+-------+----------
             3998 | 3/7835 | virtualxid    | ExclusiveLock       | t       | 3/7835   |          |      |       |         |       |
             3998 | 3/7835 | transactionid | ExclusiveLock       | t       | 564      |          |      |       |         |       |
             3998 | 3/7835 | relation      | AccessExclusiveLock | t       |          |          |      |       |         |       |
             3998 | 3/7835 | relation      | ShareLock           | t       |          | locktest |      |       |         |       |

            You can execute multiple CREATE INDEX queries in parallel unless the index name is exactly the same. The wait happens on the row lock (ShareLock with “transactionid” type) in the pg_class table.

            Note that there is also AccessExclusiveLock lock with type “relation”, but it’s not a table level one.

            ShareUpdateExclusiveLock (SHARE UPDATE EXCLUSIVE)

            These database maintenance operations need to take a ShareUpdateExclusiveLock:

            • ANALYZE
            • VACUUM (without FULL)
            • CREATE INDEX CONCURRENTLY

            The ANALYZE tablename; statement updates table statistics. The query planner/optimizer is able to provide the best plans for query execution only if the statistics are up to date.

            BEGIN;
            ANALYZE locktest;
            -- in second connection
            postgres=# SELECT pid,vxid,lock_type,lock_mode,granted,xid_lock,relname FROM lockview;
              pid  | vxid |   lock_type   |        lock_mode         | granted | xid_lock | relname
            -------+------+---------------+--------------------------+---------+----------+----------
             10997 | 3/7  | transactionid | ExclusiveLock            | t       | 591      |
             10997 | 3/7  | relation      | ShareUpdateExclusiveLock | t       |          | locktest

            There is no conflict between RowExclusiveLock and ShareUpdateExclusiveLock. UPDATE/DELETE/INSERT could still modify rows during ANALYZE.

            VACUUM and CREATE INDEX CONCURRENTLY can be executed only outside a transaction. To see the effects of these statements in lockview, execute a conflicting transaction first e.g. run ANALYZE in a transaction, or run VACUUM against a huge table.

            CREATE INDEX CONCURRENTLY locking can be confusing. Its SHARE UPDATE EXCLUSIVE lock does not conflict with the ROW EXCLUSIVE lock used by DELETE, INSERT and UPDATE. Unfortunately, CREATE INDEX CONCURRENTLY waits twice for active transactions to finish, because of its two full table scans:

            “In a concurrent index build, the index is actually entered into the system catalogs in one transaction, then two table scans occur in two more transactions. Before each table scan, the index build must wait for existing transactions that have modified the table to terminate.” (PostgreSQL Documentation)

            AccessExclusiveLock (ACCESS EXCLUSIVE)

            This lock conflicts with any other locks and is used by these statements:

            • CREATE RULE
            • DROP TABLE
            • DROP INDEX
            • TRUNCATE
            • VACUUM FULL
            • LOCK TABLE (default mode)
            • CLUSTER
            • REINDEX
            • REFRESH MATERIALIZED VIEW (without CONCURRENTLY)

            BEGIN;
            CREATE RULE r_locktest AS ON INSERT TO locktest DO INSTEAD NOTHING;
            -- second connection
            postgres=# select pid,vxid,lock_type,lock_mode,granted,xid_lock,relname from lockview;
              pid  | vxid |   lock_type   |      lock_mode      | granted | xid_lock | relname
            -------+------+---------------+---------------------+---------+----------+----------
             10997 | 3/19 | transactionid | ExclusiveLock       | t       | 596      |
             10997 | 3/19 | relation      | AccessExclusiveLock | t       |          | locktest

More importantly, DROP INDEX requires an AccessExclusiveLock on both the table and the index:

            BEGIN;
            DROP INDEX c_idx;
            -- second connection
            postgres=# SELECT * FROM lockview;
             pid  |  vxid  |   lock_type   |      lock_mode      | granted | xid_lock | relname  | page | tuple | classid | objid | objsubid
            ------+--------+---------------+---------------------+---------+----------+----------+------+-------+---------+-------+----------
             3998 | 3/7839 | virtualxid    | ExclusiveLock       | t       | 3/7839   |          |      |       |         |       |
             3998 | 3/7839 | transactionid | ExclusiveLock       | t       | 569      |          |      |       |         |       |
             3998 | 3/7839 | relation      | AccessExclusiveLock | t       |          | c_idx    |      |       |         |       |
             3998 | 3/7839 | relation      | AccessExclusiveLock | t       |          | locktest |      |       |         |       |

            Note: This is the most dangerous type of lock. Avoid running queries requiring access exclusive lock in production, or at least put the application in maintenance mode.

            ExclusiveLock

SQL commands generally don’t use ExclusiveLock, except with the explicit LOCK TABLE statement. This lock prevents all requests except a non-locking SELECT (i.e. one without FOR SHARE/FOR UPDATE).

            BEGIN;
            LOCK TABLE locktest IN EXCLUSIVE MODE;
            -- second connection
            postgres=# SELECT pid,vxid,lock_type,lock_mode,granted,xid_lock,relname FROM lockview;
              pid  | vxid | lock_type |   lock_mode   | granted | xid_lock | relname
            -------+------+-----------+---------------+---------+----------+----------
             10997 | 3/21 | relation  | ExclusiveLock | t       |          | locktest

            Savepoints

A savepoint produces an additional ExclusiveLock of transactionid type, with a new xid value.

            BEGIN;
            SELECT * FROM locktest FOR SHARE;
            SAVEPOINT s1;
            SELECT * FROM locktest FOR UPDATE;
            -- second connection
            postgres=# SELECT pid,vxid,lock_type,lock_mode,granted,xid_lock,relname FROM lockview;
              pid  | vxid |   lock_type   |    lock_mode    | granted | xid_lock | relname
            -------+------+---------------+-----------------+---------+----------+----------
             10997 | 3/37 | transactionid | ExclusiveLock   | t       | 602      |
             10997 | 3/37 | transactionid | ExclusiveLock   | t       | 603      |
             10997 | 3/37 | relation      | AccessShareLock | t       |          | c_idx
             10997 | 3/37 | relation      | RowShareLock    | t       |          | locktest

            pg_advisory_lock

Sometimes application developers require synchronization between processes. In such systems, the application creates and removes locks frequently. Implementing these locks with table rows tends to cause table bloat; PostgreSQL’s advisory locks avoid this.

            There are many functions related to advisory locks:

            • per session or per transaction
            • wait if lock is not available or immediately return false
            • exclusive or shared
• 64-bit or two 32-bit integer resource identifiers

            Imagine that we have several cron jobs and that the application should prevent simultaneous runs of the same script. Next, each script can check if a lock is available in PostgreSQL for specific integer job identifier:

            postgres=# SELECT pg_try_advisory_lock(10);
             pg_try_advisory_lock
            ----------------------
             t
            -- second connection
            postgres=# SELECT * FROM lockview;
             pid  | vxid | lock_type |   lock_mode   | granted | xid_lock | relname | page | tuple | classid | objid | objsubid
            ------+------+-----------+---------------+---------+----------+---------+------+-------+---------+-------+----------
             3998 | 3/0  | advisory  | ExclusiveLock | t       |          |         |      |       |       0 |    10 |        1
            -- other connections
            SELECT pg_try_advisory_lock(10);
             pg_try_advisory_lock
            ----------------------
             f

The query produces an ExclusiveLock of type advisory.
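
A session-level advisory lock is held until it is explicitly released or the session ends. To release it (and let a waiting job proceed), use pg_advisory_unlock; pg_advisory_xact_lock is the transaction-scoped variant, released automatically at commit or rollback:

postgres=# SELECT pg_advisory_unlock(10);
 pg_advisory_unlock
--------------------
 t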

            Deadlocks

Any system with multiple locks is prone to deadlocks, where queries will never finish. The only way to resolve such an issue is to kill one of the blocked statements. More importantly, deadlock detection is an expensive procedure in PostgreSQL: a check for deadlock only happens after a transaction has been waiting for deadlock_timeout milliseconds (one second by default).
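
You can check the current setting and, if you understand the trade-off, raise it so the detector runs less often on systems with many short lock waits. A quick illustration (note that deadlock_timeout can only be changed by superusers):

postgres=# SHOW deadlock_timeout;
 deadlock_timeout
------------------
 1s
postgres=# SET deadlock_timeout = '2s';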

            Here is an illustration of a deadlock situation for two different connections A and B:

Every deadlock starts with a lock wait.

            A: BEGIN; SELECT c FROM locktest WHERE c=1 FOR UPDATE;
            B: BEGIN; SELECT c FROM locktest WHERE c=2 FOR UPDATE; SELECT c FROM locktest WHERE c=1 FOR UPDATE;

            You are not alone with the identification of deadlocks, as the pg_stat_activity system view helps you to find statements and transactions causing lock waits:

            postgres=# SELECT pg_stat_activity.pid AS pid,
            query, wait_event, vxid, lock_type,
            lock_mode, granted, xid_lock
            FROM lockview JOIN pg_stat_activity ON (lockview.pid = pg_stat_activity.pid);
              pid  |          query             |  wait_event   | vxid |   lock_type   |      lock_mode      | granted | xid_lock
            -------+----------------------------+---------------+------+---------------+---------------------+---------+----------
             10997 | SELECT ... c=1 FOR UPDATE; | ClientRead    | 3/43 | transactionid | ExclusiveLock       | t       | 605
             10997 | SELECT ... c=1 FOR UPDATE; | ClientRead    | 3/43 | advisory      | ExclusiveLock       | t       |
             10997 | SELECT ... c=1 FOR UPDATE; | ClientRead    | 3/43 | relation      | AccessShareLock     | t       |
             10997 | SELECT ... c=1 FOR UPDATE; | ClientRead    | 3/43 | relation      | RowShareLock        | t       |
             11495 | SELECT ... c=1 FOR UPDATE; | transactionid | 5/29 | transactionid | ExclusiveLock       | t       | 606
             11495 | SELECT ... c=1 FOR UPDATE; | transactionid | 5/29 | advisory      | ExclusiveLock       | t       |
             11495 | SELECT ... c=1 FOR UPDATE; | transactionid | 5/29 | relation      | AccessShareLock     | t       |
             11495 | SELECT ... c=1 FOR UPDATE; | transactionid | 5/29 | relation      | RowShareLock        | t       |
             11495 | SELECT ... c=1 FOR UPDATE; | transactionid | 5/29 | tuple         | AccessExclusiveLock | t       |
             11495 | SELECT ... c=1 FOR UPDATE; | transactionid | 5/29 | transactionid | ShareLock           | f       | 605

The SELECT FOR UPDATE on the c=2 row causes the deadlock:

            SELECT c FROM locktest WHERE c=2 FOR UPDATE;

            Afterwards, PostgreSQL reports in server log:

            2018-08-02 08:46:07.793 UTC [10997] ERROR:  deadlock detected
            2018-08-02 08:46:07.793 UTC [10997] DETAIL:  Process 10997 waits for ShareLock on transaction 606; blocked by process 11495.
            Process 11495 waits for ShareLock on transaction 605; blocked by process 10997.
            Process 10997: select c from locktest where c=2 for update;
            Process 11495: select c from locktest where c=1 for update;
            2018-08-02 08:46:07.793 UTC [10997] HINT:  See server log for query details.
            2018-08-02 08:46:07.793 UTC [10997] CONTEXT:  while locking tuple (0,3) in relation "locktest"
            2018-08-02 08:46:07.793 UTC [10997] STATEMENT:  SELECT c FROM locktest WHERE c=2 FOR UPDATE;
            ERROR:  deadlock detected
            DETAIL:  Process 10997 waits for ShareLock on transaction 606; blocked by process 11495.
            Process 11495 waits for ShareLock on transaction 605; blocked by process 10997.
            HINT:  See server log for query details.
            CONTEXT:  while locking tuple (0,3) in relation "locktest"

            As you can see, the database server aborts one blocked transaction automatically.

            Multi-way deadlocks

Normally there are just two transactions creating a deadlock. However, in complex cases an application could cause a deadlock involving multiple transactions that form a dependency circle:

Step 1: A locks row1, B locks row2, C locks row3

Step 2: A tries to get row3, B tries to get row1, C tries to get row2
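
Spelled out in the same notation as the two-way example above (assuming rows with c=1, c=2 and c=3 exist in locktest), the cycle looks like this:

A: BEGIN; SELECT c FROM locktest WHERE c=1 FOR UPDATE;
B: BEGIN; SELECT c FROM locktest WHERE c=2 FOR UPDATE;
C: BEGIN; SELECT c FROM locktest WHERE c=3 FOR UPDATE;
A: SELECT c FROM locktest WHERE c=3 FOR UPDATE; -- waits for C
B: SELECT c FROM locktest WHERE c=1 FOR UPDATE; -- waits for A
C: SELECT c FROM locktest WHERE c=2 FOR UPDATE; -- waits for B: deadlock detected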

            Summary

• Do not put DDL statements in long transactions.
• Avoid DDL during high load on frequently updated tables.
• The CLUSTER command requires exclusive access to the table and all its indexes.
• Monitor the PostgreSQL log for deadlock-related messages.

            Photo by shy sol from Pexels

            by Nickolay Ihalainen at October 24, 2018 11:39 AM

            October 23, 2018

            Peter Zaitsev

            Reclaiming space on your Docker PMM server deployment

Recently we had a customer that had issues with a filled disk on the server hosting their Docker pmm-server environment. They were not able to access the web UI, or even stop the pmm-server container, because they had filled the /var/ mount point.

            Setting correct expectations

The best way to avoid these kinds of issues in the first place is to plan ahead, and to know exactly what you are dealing with in terms of disk space requirements. Michael Coburn has written a great blogpost on this matter:

            https://www.percona.com/blog/2017/05/04/how-much-disk-space-should-i-allocate-for-percona-monitoring-and-management/

PMM Server now uses Prometheus version 2, so take the numbers there with a pinch of salt. On the other hand, it shows how to plan ahead and think about the “steady state” disk usage, so it’s a good read.

That’s the first step to make sure you won’t get into trouble down the line. But what happens if you are already in trouble? We’ll look at two quick ways that may help reclaim space.

Before anything else, you should stop any and all running PMM clients, so that you don’t have a race condition in which metrics coming from the running clients fill up whatever disk space you had just freed.

If pmm-admin stop --all won’t work, you can stop the services manually, or even kill the running processes as a last resort:

            shell> systemctl list-unit-files | grep enabled | grep pmm | awk '{print $1}' | xargs -n 1 systemctl stop
            shell> ps ax | egrep "exporter|qan-agent|pmm" | grep -v "ssh" | awk '{print $1}' | xargs kill

            Removing unused containers

In order for the next steps to be as effective as possible, make sure there are no unused containers, running or stopped:

            shell> docker ps -a

            If you see any container that you know you don’t need anymore:

            shell> docker stop <container_name>
            shell> docker rm -v <container_name>

            WARNING! Do not remove the pmm-data container!

            Reclaiming space from unused Docker images

After you are done cleaning unused containers, we can move forward with removing unused images. Unless you are manually building your own Docker images, it’s really easy to get them again if needed, so you shouldn’t be afraid of deleting the ones that are not being used. In fact, you don’t even need to explicitly download the images: if you run docker run … image_name and the image is not found locally, Docker will pull it for you automatically.

            shell> docker image prune -a
            WARNING! This will remove all images without at least one container associated to them.
            Are you sure you want to continue? [y/N] y
            Deleted Images:
            ...
            Total reclaimed space: 3.97GB

Not too bad, we just reclaimed almost 4GB of disk space. This alone should be enough to restart the Docker service and bring the pmm-server container back up. But we want more, just because we can 🙂

            Reclaiming space from orphaned Docker volumes

By default, when removing a container (with docker rm), Docker will not delete the associated volumes unless you use the -v switch, as we did above. This means that, unless you were aware of this, you probably have several more gigabytes’ worth of data occupying disk space. We can easily reclaim it with the volume prune command:

            shell> docker volume prune
            WARNING! This will remove all local volumes not used by at least one container.
            Are you sure you want to continue? [y/N] y
            Deleted Volumes:
            ...
            Total reclaimed space: 115GB

Yeah… that’s a significant amount of disk space we just reclaimed! Again, make sure you no longer need any of the volumes from your past containers before doing this, since there is obviously no turning back.

            For earlier versions of Docker where this command is not available, you can check this link.
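
For reference, a common manual equivalent on those older versions is to list dangling volumes and remove them (double-check the list before piping it into docker volume rm):

shell> docker volume ls -qf dangling=true
shell> docker volume ls -qf dangling=true | xargs -r docker volume rm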

            Planning ahead

As mentioned before, you should now revisit Michael’s blogpost and set the metrics retention and queries retention variables to whatever makes sense for your environment. Even if you plan ahead, you may not have counted on the additional variable overhead of images and orphaned volumes, so you may want to (warning: shameless plug for my own blogpost ahead) use different mount points for your PMM deployment, and avoid using the shared /var/lib/docker/ mount point for it.
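
As a sketch, in PMM 1.x these retention settings are passed as environment variables when creating the pmm-server container (variable names as documented for PMM 1.x; verify against the documentation for your version):

shell> docker run -d -p 80:80 \
   --volumes-from pmm-data \
   --name pmm-server \
   -e METRICS_RETENTION=720h \
   -e QUERIES_RETENTION=8 \
   --restart always \
   percona/pmm-server:latest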

PMM also includes a Disk Space usage dashboard that you can use to monitor this.

            Don’t forget to start back up your PMM clients, and continue to monitor them 24×7!

            Photo by Andrew Wulf on Unsplash

            by Agustín at October 23, 2018 12:39 PM

            MariaDB Foundation

            MariaDB Foundation at Percona Live Europe 2018

            From 5 to 7 November 2018 will be my first conference visit since taking up my new position.  The sub-title of the conference in Frankfurt is “Open Source Database Conference”, which is great and quite accurate, as there are a lot of different database systems represented.  The name is not new though…. Georg Richter, Zak […]

            The post MariaDB Foundation at Percona Live Europe 2018 appeared first on MariaDB.org.

            by Arjen Lentz at October 23, 2018 02:14 AM

            October 22, 2018

            Jean-Jerome Schmidt

            MySQL on Docker: Running ProxySQL as a Helper Container on Kubernetes

ProxySQL commonly sits between the application and database tiers, in the so-called reverse-proxy tier. When your application containers are orchestrated and managed by Kubernetes, you might want to use ProxySQL in front of your database servers.

            In this post, we’ll show you how to run ProxySQL on Kubernetes as a helper container in a pod. We are going to use Wordpress as an example application. The data service is provided by our two-node MySQL Replication, deployed using ClusterControl and sitting outside of the Kubernetes network on a bare-metal infrastructure, as illustrated in the following diagram:

            ProxySQL Docker Image

In this example, we are going to use the ProxySQL Docker image maintained by Severalnines, a general public image built for multi-purpose usage. The image comes without an entrypoint script, and supports Galera Cluster (in addition to built-in support for MySQL Replication), for which an extra script is required for health-check purposes.

            Basically, to run a ProxySQL container, simply execute the following command:

            $ docker run -d -v /path/to/proxysql.cnf:/etc/proxysql.cnf severalnines/proxysql

The image recommends binding a ProxySQL configuration file to the mount point /etc/proxysql.cnf, although you can skip this and configure it later using the ProxySQL Admin console. Example configurations are provided on the Docker Hub page and the Github page.

            ProxySQL on Kubernetes

            Designing the ProxySQL architecture is a subjective topic and highly dependent on the placement of the application and database containers as well as the role of ProxySQL itself. ProxySQL does not only route queries, it can also be used to rewrite and cache queries. Efficient cache hits might require a custom configuration tailored specifically for the application database workload.

            Ideally, we can configure ProxySQL to be managed by Kubernetes with two configurations:

            1. ProxySQL as a Kubernetes service (centralized deployment).
            2. ProxySQL as a helper container in a pod (distributed deployment).

The first option is pretty straightforward: we create a ProxySQL pod and attach a Kubernetes service to it. Applications then connect to the ProxySQL service via networking on the configured ports, by default 6033 for the MySQL load-balanced port and 6032 for the ProxySQL administration port. This deployment will be covered in an upcoming blog post.

The second option is a bit different. Kubernetes has a concept called a "pod". You can have one or more containers per pod, and these are relatively tightly coupled. A pod’s contents are always co-located and co-scheduled, and run in a shared context. A pod is the smallest manageable unit in Kubernetes.

            Both deployments can be distinguished easily by looking at the following diagram:

            The primary reason that pods can have multiple containers is to support helper applications that assist a primary application. Typical examples of helper applications are data pullers, data pushers, and proxies. Helper and primary applications often need to communicate with each other. Typically this is done through a shared filesystem, as shown in this exercise, or through the loopback network interface, localhost. An example of this pattern is a web server along with a helper program that polls a Git repository for new updates.

            This blog post will cover the second configuration - running ProxySQL as a helper container in a pod.

            ProxySQL as Helper in a Pod

            In this setup, we run ProxySQL as a helper container to our Wordpress container. The following diagram illustrates our high-level architecture:

In this setup, the ProxySQL container is tightly coupled with the Wordpress container, and we named the pod "blog". If rescheduling happens, e.g. because the Kubernetes worker node goes down, these two containers will always be rescheduled together as one logical unit on the next available host. To keep the application containers' content persistent across multiple nodes, we have to use a clustered or remote file system, which in this case is NFS.

ProxySQL's role is to provide a database abstraction layer to the application container. Since we are running two-node MySQL Replication as the backend database service, read-write splitting is vital to make full use of both MySQL servers. ProxySQL excels at this and requires minimal to no changes to the application.

There are a number of other benefits to running ProxySQL in this setup:

            • Bring query caching capability closest to the application layer running in Kubernetes.
            • Secure implementation by connecting through ProxySQL UNIX socket file. It is like a pipe that the server and the clients can use to connect and exchange requests and data.
            • Distributed reverse proxy tier with shared nothing architecture.
            • Less network overhead due to "skip-networking" implementation.
            • Stateless deployment approach by utilizing Kubernetes ConfigMaps.

            Preparing the Database

Create the wordpress database and user on the master, and grant the correct privileges:

            mysql-master> CREATE DATABASE wordpress;
            mysql-master> CREATE USER wordpress@'%' IDENTIFIED BY 'passw0rd';
            mysql-master> GRANT ALL PRIVILEGES ON wordpress.* TO wordpress@'%';

            Also, create the ProxySQL monitoring user:

            mysql-master> CREATE USER proxysql@'%' IDENTIFIED BY 'proxysqlpassw0rd';

            Then, reload the grant table:

            mysql-master> FLUSH PRIVILEGES;
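
Optionally, verify that the privileges are in place (a quick sanity check, not part of the original setup):

mysql-master> SHOW GRANTS FOR wordpress@'%';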

            Preparing the Pod

Now, copy and paste the following lines into a file called blog-deployment.yml on the host where kubectl is configured:

            apiVersion: apps/v1
            kind: Deployment
            metadata:
              name: blog
              labels:
                app: blog
            spec:
              replicas: 1
              selector:
                matchLabels:
                  app: blog
                  tier: frontend
              strategy:
                type: RollingUpdate
              template:
                metadata:
                  labels:
                    app: blog
                    tier: frontend
                spec:
            
                  restartPolicy: Always
            
                  containers:
                  - image: wordpress:4.9-apache
                    name: wordpress
                    env:
                    - name: WORDPRESS_DB_HOST
                      value: localhost:/tmp/proxysql.sock
                    - name: WORDPRESS_DB_USER
                      value: wordpress
                    - name: WORDPRESS_DB_PASSWORD
                      valueFrom:
                        secretKeyRef:
                          name: mysql-pass
                          key: password
                    ports:
                    - containerPort: 80
                      name: wordpress
                    volumeMounts:
                    - name: wordpress-persistent-storage
                      mountPath: /var/www/html
                    - name: shared-data
                      mountPath: /tmp
            
      - image: severalnines/proxysql:1.4.12
        name: proxysql
        volumeMounts:
        - name: proxysql-config
          mountPath: /etc/proxysql.cnf
          subPath: proxysql.cnf
        - name: shared-data
          mountPath: /tmp
        ports:
        - containerPort: 6033
          name: proxysql
            
                  volumes:
                  - name: wordpress-persistent-storage
                    persistentVolumeClaim:
                      claimName: wp-pv-claim
                  - name: proxysql-config
                    configMap:
                      name: proxysql-configmap
                  - name: shared-data
                    emptyDir: {}

The YAML file has many lines, so let's look at the interesting parts only. The first section:

            apiVersion: apps/v1
            kind: Deployment

The first line is the apiVersion. Our Kubernetes cluster is running on v1.12, so we should refer to the Kubernetes v1.12 API documentation and follow the resource declaration according to that API. The next one is the kind, which tells Kubernetes what type of resource we want to deploy. Deployment, Service, ReplicaSet, DaemonSet and PersistentVolume are some examples.

            The next important section is the "containers" section. Here we define all containers that we would like to run together in this pod. The first part is the Wordpress container:

                  - image: wordpress:4.9-apache
                    name: wordpress
                    env:
                    - name: WORDPRESS_DB_HOST
                      value: localhost:/tmp/proxysql.sock
                    - name: WORDPRESS_DB_USER
                      value: wordpress
                    - name: WORDPRESS_DB_PASSWORD
                      valueFrom:
                        secretKeyRef:
                          name: mysql-pass
                          key: password
                    ports:
                    - containerPort: 80
                      name: wordpress
                    volumeMounts:
                    - name: wordpress-persistent-storage
                      mountPath: /var/www/html
                    - name: shared-data
                      mountPath: /tmp

            In this section, we are telling Kubernetes to deploy Wordpress 4.9 using Apache web server and we gave the container the name "wordpress". We also want Kubernetes to pass a number of environment variables:

            • WORDPRESS_DB_HOST - The database host. Since our ProxySQL container resides in the same Pod with the Wordpress container, it's more secure to use a ProxySQL socket file instead. The format to use socket file in Wordpress is "localhost:{path to the socket file}". By default, it's located under /tmp directory of the ProxySQL container. This /tmp path is shared between Wordpress and ProxySQL containers by using "shared-data" volumeMounts as shown further down. Both containers have to mount this volume to share the same content under /tmp directory.
            • WORDPRESS_DB_USER - Specify the wordpress database user.
• WORDPRESS_DB_PASSWORD - The password for WORDPRESS_DB_USER. Since we do not want to expose the password in this file, we hide it using Kubernetes Secrets. Here we instruct Kubernetes to read the "mysql-pass" Secret resource instead. Secrets have to be created in advance of the pod deployment, as explained further down.

            We also want to publish port 80 of the container for the end user. The Wordpress content stored inside /var/www/html in the container will be mounted into our persistent storage running on NFS.

            Next, we define the ProxySQL container:

                  - image: severalnines/proxysql:1.4.12
                    name: proxysql
                    volumeMounts:
                    - name: proxysql-config
                      mountPath: /etc/proxysql.cnf
                      subPath: proxysql.cnf
                    - name: shared-data
                      mountPath: /tmp
                    ports:
                    - containerPort: 6033
                      name: proxysql

In the above section, we are telling Kubernetes to deploy ProxySQL using the severalnines/proxysql image, version 1.4.12. We also want Kubernetes to mount our custom, pre-configured configuration file and map it to /etc/proxysql.cnf inside the container. There is also a volume called "shared-data" which maps to the /tmp directory, shared with the Wordpress image - a temporary directory that shares a pod's lifetime. This allows the ProxySQL socket file (/tmp/proxysql.sock) to be used by the Wordpress container when connecting to the database, bypassing TCP/IP networking.
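
Once the pod is running (see the Deploying the Pod section below), one quick way to confirm that the socket is visible from the Wordpress side is to list it from the wordpress container; replace <pod_name> with your actual pod name:

$ kubectl exec -it <pod_name> -c wordpress -- ls -l /tmp/proxysql.sock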

            The last part is the "volumes" section:

                  volumes:
                  - name: wordpress-persistent-storage
                    persistentVolumeClaim:
                      claimName: wp-pv-claim
                  - name: proxysql-config
                    configMap:
                      name: proxysql-configmap
                  - name: shared-data
                    emptyDir: {}

            Kubernetes will have to create three volumes for this pod:

            • wordpress-persistent-storage - Use the PersistentVolumeClaim resource to map NFS export into the container for persistent data storage for Wordpress content.
            • proxysql-config - Use the ConfigMap resource to map the ProxySQL configuration file.
            • shared-data - Use the emptyDir resource to mount a shared directory for our containers inside the Pod. emptyDir resource is a temporary directory that shares a pod's lifetime.

            Therefore, based on our YAML definition above, we have to prepare a number of Kubernetes resources before we can begin to deploy the "blog" pod:

1. PersistentVolume and PersistentVolumeClaim - To store the web contents of our Wordpress application, so that when the pod is rescheduled to another worker node, we won't lose the latest changes.
2. Secrets - To hide the Wordpress database user password inside the YAML file.
3. ConfigMap - To map the configuration file to the ProxySQL container, so that when the pod is rescheduled to another node, Kubernetes can automatically remount it.

            PersistentVolume and PersistentVolumeClaim

Good persistent storage for Kubernetes should be accessible by all Kubernetes nodes in the cluster. For the sake of this blog post, we used NFS as the PersistentVolume (PV) provider because it's easy and supported out of the box. The NFS server is located somewhere outside of our Kubernetes network, and we have configured it to allow all Kubernetes nodes with the following line inside /etc/exports:

            /nfs    192.168.55.*(rw,sync,no_root_squash,no_all_squash)

Take note that the NFS client package must be installed on all Kubernetes nodes, otherwise Kubernetes won't be able to mount the NFS export correctly. On all nodes:

$ sudo apt-get install nfs-common #Ubuntu/Debian
$ yum install nfs-utils #RHEL/CentOS

            Also, make sure on the NFS server, the target directory exists:

            (nfs-server)$ mkdir /nfs/kubernetes/wordpress
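
Optionally, verify from any Kubernetes node that the export is visible (showmount ships with the NFS client packages installed above; 192.168.55.200 is the NFS server address used in the PV definition below):

$ showmount -e 192.168.55.200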

            Then, create a file called wordpress-pv-pvc.yml and add the following lines:

            apiVersion: v1
            kind: PersistentVolume
            metadata:
              name: wp-pv
              labels:
                app: blog
            spec:
              accessModes:
                - ReadWriteOnce
              capacity:
                storage: 3Gi
              mountOptions:
                - hard
                - nfsvers=4.1
              nfs:
                path: /nfs/kubernetes/wordpress
                server: 192.168.55.200
            ---
            kind: PersistentVolumeClaim
            apiVersion: v1
            metadata:
              name: wp-pvc
            spec:
              accessModes:
                - ReadWriteOnce
              resources:
                requests:
                  storage: 3Gi
              selector:
                matchLabels:
                  app: blog
                  tier: frontend

In the above definition, we ask Kubernetes to allocate 3GB of volume space on the NFS server for our Wordpress container. Take note that for production usage, NFS should be configured with an automatic provisioner and a storage class.

            Create the PV and PVC resources:

            $ kubectl create -f wordpress-pv-pvc.yml

Verify that those resources are created; the status must be "Bound":

            $ kubectl get pv,pvc
            NAME                     CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM            STORAGECLASS   REASON   AGE
            persistentvolume/wp-pv   3Gi        RWO            Recycle          Bound    default/wp-pvc                           22h
            
            NAME                           STATUS   VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
            persistentvolumeclaim/wp-pvc   Bound    wp-pv    3Gi        RWO                           22h

            Secrets

The first step is to create a secret to be used by the Wordpress container for the WORDPRESS_DB_PASSWORD environment variable, simply because we don't want to expose the password in clear text inside the YAML file.

            Create a secret resource called mysql-pass and pass the password accordingly:

            $ kubectl create secret generic mysql-pass --from-literal=password=passw0rd

            Verify that our secret is created:

            $ kubectl get secrets mysql-pass
            NAME         TYPE     DATA   AGE
            mysql-pass   Opaque   1      7h12m
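
If you ever need to read the value back, note that the secret is only base64-encoded, not encrypted (a quick check, assuming a Linux host):

$ kubectl get secret mysql-pass -o jsonpath='{.data.password}' | base64 --decode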

            ConfigMap

            We also need to create a ConfigMap resource for our ProxySQL container. A Kubernetes ConfigMap file holds key-value pairs of configuration data that can be consumed in pods or used to store configuration data. ConfigMaps allow you to decouple configuration artifacts from image content to keep containerized applications portable.

Since our database servers are already running on bare metal with static hostnames and IP addresses, plus a static monitoring username and password, in this use case the ConfigMap will store pre-configured information about the ProxySQL service that we want to use.

            First create a text file called proxysql.cnf and add the following lines:

            datadir="/var/lib/proxysql"
            admin_variables=
            {
                    admin_credentials="admin:adminpassw0rd"
                    mysql_ifaces="0.0.0.0:6032"
                    refresh_interval=2000
            }
            mysql_variables=
            {
                    threads=4
                    max_connections=2048
                    default_query_delay=0
                    default_query_timeout=36000000
                    have_compress=true
                    poll_timeout=2000
                    interfaces="0.0.0.0:6033;/tmp/proxysql.sock"
                    default_schema="information_schema"
                    stacksize=1048576
                    server_version="5.1.30"
                    connect_timeout_server=10000
                    monitor_history=60000
                    monitor_connect_interval=200000
                    monitor_ping_interval=200000
                    ping_interval_server_msec=10000
                    ping_timeout_server=200
                    commands_stats=true
                    sessions_sort=true
                    monitor_username="proxysql"
                    monitor_password="proxysqlpassw0rd"
            }
            mysql_servers =
            (
                    { address="192.168.55.171" , port=3306 , hostgroup=10, max_connections=100 },
                    { address="192.168.55.172" , port=3306 , hostgroup=10, max_connections=100 },
                    { address="192.168.55.171" , port=3306 , hostgroup=20, max_connections=100 },
                    { address="192.168.55.172" , port=3306 , hostgroup=20, max_connections=100 }
            )
            mysql_users =
            (
                    { username = "wordpress" , password = "passw0rd" , default_hostgroup = 10 , active = 1 }
            )
            mysql_query_rules =
            (
                    {
                            rule_id=100
                            active=1
                            match_pattern="^SELECT .* FOR UPDATE"
                            destination_hostgroup=10
                            apply=1
                    },
                    {
                            rule_id=200
                            active=1
                            match_pattern="^SELECT .*"
                            destination_hostgroup=20
                            apply=1
                    },
                    {
                            rule_id=300
                            active=1
                            match_pattern=".*"
                            destination_hostgroup=10
                            apply=1
                    }
            )
            mysql_replication_hostgroups =
            (
                    { writer_hostgroup=10, reader_hostgroup=20, comment="MySQL Replication 5.7" }
            )

            Pay extra attention to the "mysql_servers" and "mysql_users" sections, where you might need to modify the values to suit your database cluster setup. In this case, we have two database servers running in MySQL Replication as summarized in the following Topology screenshot taken from ClusterControl:

All writes should go to the master node, while reads are forwarded to hostgroup 20, as defined under the "mysql_query_rules" section. That's the basis of read/write splitting, and we want to take full advantage of it.

            Then, import the configuration file into ConfigMap:

            $ kubectl create configmap proxysql-configmap --from-file=proxysql.cnf
            configmap/proxysql-configmap created

Verify that the ConfigMap is loaded into Kubernetes:

            $ kubectl get configmap
            NAME                 DATA   AGE
            proxysql-configmap   1      45s

            Deploying the Pod

            Now we should be good to deploy the blog pod. Send the deployment job to Kubernetes:

            $ kubectl create -f blog-deployment.yml

            Verify the pod status:

            $ kubectl get pods
            NAME                           READY   STATUS              RESTARTS   AGE
            blog-54755cbcb5-t4cb7          2/2     Running             0          100s

It must show 2/2 under the READY column, indicating there are two containers running inside the pod. Use the -c option flag to check the logs of the Wordpress and ProxySQL containers inside the blog pod:

            $ kubectl logs blog-54755cbcb5-t4cb7 -c wordpress
            $ kubectl logs blog-54755cbcb5-t4cb7 -c proxysql

            From the ProxySQL container log, you should see the following lines:

            2018-10-20 08:57:14 [INFO] Dumping current MySQL Servers structures for hostgroup ALL
            HID: 10 , address: 192.168.55.171 , port: 3306 , weight: 1 , status: ONLINE , max_connections: 100 , max_replication_lag: 0 , use_ssl: 0 , max_latency_ms: 0 , comment:
            HID: 10 , address: 192.168.55.172 , port: 3306 , weight: 1 , status: OFFLINE_HARD , max_connections: 100 , max_replication_lag: 0 , use_ssl: 0 , max_latency_ms: 0 , comment:
            HID: 20 , address: 192.168.55.171 , port: 3306 , weight: 1 , status: ONLINE , max_connections: 100 , max_replication_lag: 0 , use_ssl: 0 , max_latency_ms: 0 , comment:
            HID: 20 , address: 192.168.55.172 , port: 3306 , weight: 1 , status: ONLINE , max_connections: 100 , max_replication_lag: 0 , use_ssl: 0 , max_latency_ms: 0 , comment:

HID 10 (the writer hostgroup) must have only one ONLINE node (indicating a single master), and the other host must be at least in OFFLINE_HARD status. For HID 20, it's expected to be ONLINE for all nodes (indicating multiple read replicas).
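
One way to double-check this from the ProxySQL Admin console itself (using the admin credentials defined in proxysql.cnf, and the pod name from kubectl get pods) is:

$ kubectl exec -it blog-54755cbcb5-t4cb7 -c proxysql -- mysql -uadmin -p -h127.0.0.1 -P6032 -e 'SELECT hostgroup_id, hostname, status FROM runtime_mysql_servers'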

            To get a summary of the deployment, use the describe flag:

            $ kubectl describe deployments blog

Our blog is now running; however, we can't access it from outside of the Kubernetes network without configuring the service, as explained in the next section.

            Creating the Blog Service

The last step is to attach a service to our pod. This ensures that our Wordpress blog pod is accessible from the outside world. Create a file called blog-svc.yml and paste the following lines:

            apiVersion: v1
            kind: Service
            metadata:
              name: blog
              labels:
                app: blog
                tier: frontend
            spec:
              type: NodePort
              ports:
              - name: blog
                nodePort: 30080
                port: 80
              selector:
                app: blog
                tier: frontend

            Create the service:

            $ kubectl create -f blog-svc.yml

Verify that the service is created correctly:

            root@kube1:~/proxysql-blog# kubectl get svc
            NAME         TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)        AGE
            blog         NodePort    10.96.140.37   <none>        80:30080/TCP   26s
            kubernetes   ClusterIP   10.96.0.1      <none>        443/TCP        43h

Port 80 published by the blog pod is now mapped to the outside world via port 30080. We can access our blog at http://{any_kubernetes_host}:30080/, where we should be redirected to the Wordpress installation page. If we proceed with the installation, it skips the database connection part and directly shows this page:

This indicates that our MySQL and ProxySQL connection is correctly configured inside the wp-config.php file. Otherwise, you would be redirected to the database configuration page.

            Our deployment is now complete.

            Managing ProxySQL Container inside a Pod

Failover and recovery are expected to be handled automatically by Kubernetes. For example, if a Kubernetes worker node goes down, the pod will be recreated on the next available node after --pod-eviction-timeout (which defaults to 5 minutes). If the container crashes or is killed, Kubernetes replaces it almost instantly.

            Some common management tasks are expected to be different when running within Kubernetes, as shown in the next sections.

            Scaling Up and Down

In the above configuration, we deployed one replica. To scale up, change the spec.replicas value accordingly using the kubectl edit command:

            $ kubectl edit deployment blog

It will open up the deployment definition in a default text editor; simply change the spec.replicas value to something higher, for example "replicas: 3". Then, save the file and immediately check the rollout status using the following command:

            $ kubectl rollout status deployment blog
            Waiting for deployment "blog" rollout to finish: 1 of 3 updated replicas are available...
            Waiting for deployment "blog" rollout to finish: 2 of 3 updated replicas are available...
            deployment "blog" successfully rolled out

At this point, we have three blog pods (Wordpress + ProxySQL) running simultaneously in Kubernetes:

            $ kubectl get pods
            NAME                             READY   STATUS              RESTARTS   AGE
            blog-54755cbcb5-6fnqn            2/2     Running             0          11m
            blog-54755cbcb5-cwpdj            2/2     Running             0          11m
            blog-54755cbcb5-jxtvc            2/2     Running             0          22m

            At this point, our architecture is looking something like this:

Take note that it might require more customization than our current configuration to run Wordpress smoothly in a horizontally-scaled production environment (think about static content, session management and so on). That is beyond the scope of this blog post.

            Scaling down procedures are similar.
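
Alternatively, kubectl scale applies the same change without opening an editor:

$ kubectl scale deployment blog --replicas=1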

            Configuration Management

Configuration management is important in ProxySQL. This is where the magic happens: you can define your own set of query rules to do query caching, firewalling and rewriting. Contrary to common practice, where ProxySQL would be configured via the Admin console and persisted using "SAVE .. TO DISK", we will stick with configuration files only, to make things more portable in Kubernetes. That's the reason we are using ConfigMaps.

            Since we are relying on our centralized configuration stored by Kubernetes ConfigMaps, there are a number of ways to perform configuration changes. Firstly, by using the kubectl edit command:

            $ kubectl edit configmap proxysql-configmap

It will open up the configuration in a default text editor, where you can make changes directly and save the file once done. Otherwise, recreating the ConfigMap also works:

            $ vi proxysql.cnf # edit the configuration first
            $ kubectl delete configmap proxysql-configmap
            $ kubectl create configmap proxysql-configmap --from-file=proxysql.cnf

After the configuration is pushed into the ConfigMap, restart the pod or container as shown in the Service Control section. Configuring the container via the ProxySQL admin interface (port 6032) won't make the changes persistent across pod rescheduling by Kubernetes.

            Service Control

Since the two containers inside a pod are tightly coupled, the best way to apply ProxySQL configuration changes is to force Kubernetes to do a pod replacement. Consider that we now have three blog pods after scaling up:

            $ kubectl get pods
            NAME                             READY   STATUS              RESTARTS   AGE
            blog-54755cbcb5-6fnqn            2/2     Running             0          31m
            blog-54755cbcb5-cwpdj            2/2     Running             0          31m
            blog-54755cbcb5-jxtvc            2/2     Running             1          22m

            Use the following command to replace one pod at a time:

            $ kubectl get pod blog-54755cbcb5-6fnqn -n default -o yaml | kubectl replace --force -f -
            pod "blog-54755cbcb5-6fnqn" deleted
            pod/blog-54755cbcb5-6fnqn

            Then, verify with the following:

            $ kubectl get pods
            NAME                             READY   STATUS              RESTARTS   AGE
            blog-54755cbcb5-6fnqn            2/2     Running             0          31m
            blog-54755cbcb5-cwpdj            2/2     Running             0          31m
            blog-54755cbcb5-qs6jm            2/2     Running             1          2m26s

You will notice that the most recent pod has been replaced by looking at the AGE and RESTARTS columns; it came up with a different pod name. Repeat the same steps for the remaining pods. Otherwise, you can also use the "docker kill" command to kill the ProxySQL container manually on the Kubernetes worker node. For example:

            (kube-worker)$ docker kill $(docker ps | grep -i proxysql_blog | awk {'print $1'})

            Kubernetes will then replace the killed ProxySQL container with a new one.

            Monitoring

Use the kubectl exec command to execute SQL statements via the mysql client. For example, to monitor the query digest:

            $ kubectl exec -it blog-54755cbcb5-29hqt -c proxysql -- mysql -uadmin -p -h127.0.0.1 -P6032
            mysql> SELECT * FROM stats_mysql_query_digest;

            Or with a one-liner:

            $ kubectl exec -it blog-54755cbcb5-29hqt -c proxysql -- mysql -uadmin -p -h127.0.0.1 -P6032 -e 'SELECT * FROM stats_mysql_query_digest'

By changing the SQL statement, you can monitor other ProxySQL components or perform any administration tasks via this Admin console. Again, such changes will only persist during the ProxySQL container's lifetime and won't survive pod rescheduling.

            Final Thoughts

ProxySQL holds a key role if you want to scale your application containers and have an intelligent way to access a distributed database backend. There are a number of ways to deploy ProxySQL on Kubernetes to support application growth when running at scale. This blog post covers only one of them.

            In an upcoming blog post, we are going to look at how to run ProxySQL in a centralized approach by using it as a Kubernetes service.

            by ashraf at October 22, 2018 09:06 PM

            Henrik Ingo

            Video x2: Measuring performance variability of EC2

            I was recently invited to speak at Fwdays Highload in Kyiv. This was my first ever visit to Ukraine, so I was excited to go and visit this large and beautiful European capital. Over a thousand years ago Vikings would row their boats through the rivers in Russia, and take the Dniepr southward to Kyiv and ultimately Turkey. It was exciting to travel in the footsteps of my forefathers.

            My talk isn't really MongoDB specific, rather about an EC2 performance tuning project we did in 2017:

            read more

            by hingo at October 22, 2018 05:59 PM

            Peter Zaitsev

            Upcoming Webinar Thurs 10/25: Why Do Developers Prefer MongoDB?

Please join Percona’s Sr. Technical Operations Architect Tim Vaillancourt as he presents Why Do Developers Prefer MongoDB? on Thursday, October 25th, 2018, at 10:00 AM PDT (UTC-7) / 1:00 PM EDT (UTC-4).

            Register Now

            As the fastest growing database technology today, MongoDB® helps organizations and businesses across industries create scalable applications that would have been deemed impossible a few years ago.

            The world is on the brink of an information overload. As a result, this information requires huge databases to store and manipulate its data. This is possible with MongoDB, which is as flexible as it is powerful. Accordingly, you can build extremely high-performance apps with the joy of a schemaless lifestyle. Even more, it’s easy to adopt and deploy, which is why developers like the database.

            We’ll examine how MongoDB compares with other NoSQL database platforms. Moreover, we’ll provide a side-by-side comparison that will help you decide if MongoDB is the right match for your applications.

            Register for this webinar to learn why developers prefer MongoDB.

            by Tim Vaillancourt at October 22, 2018 05:00 PM

            One Billion Tables in MySQL 8.0 with ZFS

            The short version

            I created > one billion InnoDB tables in MySQL 8.0 (tables, not rows) just for fun. Here is the proof:

            $ mysql -A
            Welcome to the MySQL monitor.  Commands end with ; or \g.
            Your MySQL connection id is 1425329
            Server version: 8.0.12 MySQL Community Server - GPL
            Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved.
            Oracle is a registered trademark of Oracle Corporation and/or its
            affiliates. Other names may be trademarks of their respective
            owners.
            Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
            mysql> select count(*) from information_schema.tables;
            +------------+
            | count(*)   |
            +------------+
            | 1011570298 |
            +------------+
            1 row in set (6 hours 57 min 6.31 sec)

            Yes, it took 6 hours and 57 minutes to count them all!

            Why does anyone need one billion tables?

            In my previous blog post, I created and tested MySQL 8.0 with 40 million tables (that was a real case study). The One Billion Tables project is not a real world scenario, however. I was challenged by Billion Tables Project (BTP) in PostgreSQL, and decided to repeat it with MySQL, creating 1 billion InnoDB tables.

            As an aside: I think MySQL 8.0 is the first MySQL version where creating 1 billion InnoDB tables is even practically possible.

            Challenges with one billion InnoDB tables

            Disk space

The first, and one of the most important, challenges is disk space. InnoDB allocates data pages on disk when creating .ibd files. Without disk-level compression we need > 25TB of disk. The good news: we have ZFS, which provides transparent disk compression. Here’s how the disk utilization looks:

            Actual data (apparent-size):

            # du -sh --apparent-size /mysqldata/
            26T     /mysqldata/

            Compressed data:

            # du -sh /mysqldata/
            2.4T    /mysqldata/

            Compression ratio:

            # zfs get compression,compressratio
            ...
            mysqldata/mysql/data             compressratio         7.14x                      -
            mysqldata/mysql/data             compression           gzip                       inherited from mysqldata/mysql

(It looks like the reported compression ratio is not 100% correct; we expected a ~10x compression ratio.)
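
For reference, transparent compression is enabled per dataset and inherited by children, along the lines of (assuming the dataset layout shown above):

# zfs set compression=gzip mysqldata/mysql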

            Too many tiny files

            This is usually the big issue with databases that create a file per table. With MySQL 8.0 we can create a shared tablespace and “assign” a table to it. I created a tablespace per database, and created 1000 tables in each database.
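
In miniature, the pattern looks like this (ts1 and t1 are illustrative names; the deploy script below automates it):

mysql> CREATE TABLESPACE ts1 ADD DATAFILE 'ts1.ibd' ENGINE=InnoDB;
mysql> CREATE TABLE t1 (id INT UNSIGNED NOT NULL PRIMARY KEY) TABLESPACE ts1;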

            The result:

            mysql> select count(*) from information_schema.schemata;
            +----------+
            | count(*) |
            +----------+
            |  1011575 |
            +----------+
            1 row in set (1.31 sec)

            Creating tables

            Another big challenge is how to create tables fast enough so it will not take months. I have used three approaches:

1. Disabled all possible consistency checks in MySQL, and decreased the InnoDB page size to 4K (these config options are NOT for production use).
2. Created tables in parallel: as the mutex contention bug in MySQL 8.0 has been fixed, creating tables in parallel works fine.
3. Used local NVMe cards on top of an AWS EC2 i3.8xlarge instance.

            my.cnf config file (I repeat: do not use this in production):

            [mysqld]
            default-authentication-plugin = mysql_native_password
            performance_schema=0
            datadir=/mysqldata/mysql/data
            socket=/mysqldata/mysql/data/mysql.sock
            log-error = /mysqldata/mysql/log/error.log
            skip-log-bin=1
            innodb_log_group_home_dir = /mysqldata/mysql/log/
            innodb_doublewrite = 0
            innodb_checksum_algorithm=none
            innodb_log_checksums=0
            innodb_flush_log_at_trx_commit=0
            innodb_log_file_size=2G
            innodb_buffer_pool_size=100G
            innodb_page_size=4k
            innodb_flush_method=nosync
            innodb_io_capacity_max=20000
            innodb_io_capacity=5000
            innodb_buffer_pool_instances=32
            innodb_stats_persistent = 0
            tablespace_definition_cache = 524288
            schema_definition_cache = 524288
            table_definition_cache = 524288
            table_open_cache=524288
            table_open_cache_instances=32
            open-files-limit=1000000

            ZFS pool:

            # zpool status
              pool: mysqldata
             state: ONLINE
              scan: scrub repaired 0B in 1h49m with 0 errors on Sun Oct 14 02:13:17 2018
            config:
                    NAME        STATE     READ WRITE CKSUM
                    mysqldata   ONLINE       0     0     0
                      nvme0n1   ONLINE       0     0     0
                      nvme1n1   ONLINE       0     0     0
                      nvme2n1   ONLINE       0     0     0
                      nvme3n1   ONLINE       0     0     0
            errors: No known data errors
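
The pool is a plain stripe over the four NVMe devices with no redundancy, which is fine for a throwaway experiment. A pool like this could have been created with something like the following (my assumption; the exact commands are not shown in the original):

# zpool create mysqldata nvme0n1 nvme1n1 nvme2n1 nvme3n1
# zfs create mysqldata/mysql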

            A simple “deploy” script to create tables in parallel (includes the sysbench table structure):

#!/bin/bash
# Each do_db invocation creates one database with its own general tablespace
# and 1000 sysbench-style tables inside it.
function do_db {
        db_exist=$(mysql -A -s -Nbe "select 1 from information_schema.schemata where schema_name = '$db'")
        if [ "$db_exist" == "1" ]; then echo "Already exists: $db"; return 0; fi;
        tbspace="create database $db; use $db; CREATE TABLESPACE $db ADD DATAFILE '$db.ibd' engine=InnoDB;"
        #echo "Tablespace $db.ibd created!"
        tables=""
        for i in {1..1000}
        do
                table="CREATE TABLE sbtest$i ( id int(10) unsigned NOT NULL AUTO_INCREMENT, k int(10) unsigned NOT NULL DEFAULT '0', c varchar(120) NOT NULL DEFAULT '', pad varchar(60) NOT NULL DEFAULT '', PRIMARY KEY (id), KEY k_1 (k) ) ENGINE=InnoDB DEFAULT CHARSET=latin1 tablespace $db;"
                tables="$tables $table"
        done
        echo "$tbspace $tables" | mysql
}
c=0
echo "starting..."
# resume from the highest existing sbtest_N database number
c=$(mysql -A -s -Nbe "select max(cast(SUBSTRING_INDEX(schema_name, '_', -1) as unsigned)) from information_schema.schemata where schema_name like 'sbtest_%'")
for m in {1..100000}
do
        echo "m=$m"
        # create 30 databases in parallel per batch
        for i in {1..30}
        do
                let c=$c+1
                echo $c
                db="sbtest_$c"
                do_db &
        done
        wait
done

            How fast did we create tables? Here are some stats:

            # mysqladmin -i 10 -r ex | grep Com_create_table
            ...
            | Com_create_table | 6497 |
            | Com_create_table | 6449 |

            So we created ~650 tables per second; the numbers above are counter deltas reported by mysqladmin over 10-second intervals.

            Counting the tables

            It took > 6 hours to run “select count(*) from information_schema.tables”! Here is why:

            1. MySQL 8.0 uses a new data dictionary (this is great as it avoids creating 1 billion frm files). Everything is stored in this file:
              # ls -lah /mysqldata/mysql/data/mysql.ibd
              -rw-r----- 1 mysql mysql 6.1T Oct 18 15:02 /mysqldata/mysql/data/mysql.ibd
            2. information_schema.tables is actually a view:

            mysql> show create table information_schema.tables\G
            *************************** 1. row ***************************
                            View: TABLES
                     Create View: CREATE ALGORITHM=UNDEFINED DEFINER=`mysql.infoschema`@`localhost` SQL SECURITY DEFINER VIEW `information_schema`.`TABLES` AS select `cat`.`name` AS `TABLE_CATALOG`,`sch`.`name` AS `TABLE_SCHEMA`,`tbl`.`name` AS `TABLE_NAME`,`tbl`.`type` AS `TABLE_TYPE`,if((`tbl`.`type` = 'BASE TABLE'),`tbl`.`engine`,NULL) AS `ENGINE`,if((`tbl`.`type` = 'VIEW'),NULL,10) AS `VERSION`,`tbl`.`row_format` AS `ROW_FORMAT`,internal_table_rows(`sch`.`name`,`tbl`.`name`,if(isnull(`tbl`.`partition_type`),`tbl`.`engine`,''),`tbl`.`se_private_id`,(`tbl`.`hidden` <> 'Visible'),`ts`.`se_private_data`,coalesce(`stat`.`table_rows`,0),coalesce(cast(`stat`.`cached_time` as unsigned),0)) AS `TABLE_ROWS`,internal_avg_row_length(`sch`.`name`,`tbl`.`name`,if(isnull(`tbl`.`partition_type`),`tbl`.`engine`,''),`tbl`.`se_private_id`,(`tbl`.`hidden` <> 'Visible'),`ts`.`se_private_data`,coalesce(`stat`.`avg_row_length`,0),coalesce(cast(`stat`.`cached_time` as unsigned),0)) AS `AVG_ROW_LENGTH`,internal_data_length(`sch`.`name`,`tbl`.`name`,if(isnull(`tbl`.`partition_type`),`tbl`.`engine`,''),`tbl`.`se_private_id`,(`tbl`.`hidden` <> 'Visible'),`ts`.`se_private_data`,coalesce(`stat`.`data_length`,0),coalesce(cast(`stat`.`cached_time` as unsigned),0)) AS `DATA_LENGTH`,internal_max_data_length(`sch`.`name`,`tbl`.`name`,if(isnull(`tbl`.`partition_type`),`tbl`.`engine`,''),`tbl`.`se_private_id`,(`tbl`.`hidden` <> 'Visible'),`ts`.`se_private_data`,coalesce(`stat`.`max_data_length`,0),coalesce(cast(`stat`.`cached_time` as unsigned),0)) AS `MAX_DATA_LENGTH`,internal_index_length(`sch`.`name`,`tbl`.`name`,if(isnull(`tbl`.`partition_type`),`tbl`.`engine`,''),`tbl`.`se_private_id`,(`tbl`.`hidden` <> 'Visible'),`ts`.`se_private_data`,coalesce(`stat`.`index_length`,0),coalesce(cast(`stat`.`cached_time` as unsigned),0)) AS `INDEX_LENGTH`,internal_data_free(`sch`.`name`,`tbl`.`name`,if(isnull(`tbl`.`partition_type`),`tbl`.`engine`,''),`tbl`.`se_private_id`,(`tbl`.`hidden` <> 'Visible'),`ts`.`se_private_data`,coalesce(`stat`.`data_free`,0),coalesce(cast(`stat`.`cached_time` as unsigned),0)) AS `DATA_FREE`,internal_auto_increment(`sch`.`name`,`tbl`.`name`,if(isnull(`tbl`.`partition_type`),`tbl`.`engine`,''),`tbl`.`se_private_id`,(`tbl`.`hidden` <> 'Visible'),`ts`.`se_private_data`,coalesce(`stat`.`auto_increment`,0),coalesce(cast(`stat`.`cached_time` as unsigned),0),`tbl`.`se_private_data`) AS `AUTO_INCREMENT`,`tbl`.`created` AS `CREATE_TIME`,internal_update_time(`sch`.`name`,`tbl`.`name`,if(isnull(`tbl`.`partition_type`),`tbl`.`engine`,''),`tbl`.`se_private_id`,(`tbl`.`hidden` <> 'Visible'),`ts`.`se_private_data`,coalesce(cast(`stat`.`update_time` as unsigned),0),coalesce(cast(`stat`.`cached_time` as unsigned),0)) AS `UPDATE_TIME`,internal_check_time(`sch`.`name`,`tbl`.`name`,if(isnull(`tbl`.`partition_type`),`tbl`.`engine`,''),`tbl`.`se_private_id`,(`tbl`.`hidden` <> 'Visible'),`ts`.`se_private_data`,coalesce(cast(`stat`.`check_time` as unsigned),0),coalesce(cast(`stat`.`cached_time` as unsigned),0)) AS `CHECK_TIME`,`col`.`name` AS `TABLE_COLLATION`,internal_checksum(`sch`.`name`,`tbl`.`name`,if(isnull(`tbl`.`partition_type`),`tbl`.`engine`,''),`tbl`.`se_private_id`,(`tbl`.`hidden` <> 'Visible'),`ts`.`se_private_data`,coalesce(`stat`.`checksum`,0),coalesce(cast(`stat`.`cached_time` as unsigned),0)) AS `CHECKSUM`,if((`tbl`.`type` = 'VIEW'),NULL,get_dd_create_options(`tbl`.`options`,if((ifnull(`tbl`.`partition_expression`,'NOT_PART_TBL') = 'NOT_PART_TBL'),0,1))) AS 
`CREATE_OPTIONS`,internal_get_comment_or_error(`sch`.`name`,`tbl`.`name`,`tbl`.`type`,`tbl`.`options`,`tbl`.`comment`) AS `TABLE_COMMENT` from (((((`mysql`.`tables` `tbl` join `mysql`.`schemata` `sch` on((`tbl`.`schema_id` = `sch`.`id`))) join `mysql`.`catalogs` `cat` on((`cat`.`id` = `sch`.`catalog_id`))) left join `mysql`.`collations` `col` on((`tbl`.`collation_id` = `col`.`id`))) left join `mysql`.`tablespaces` `ts` on((`tbl`.`tablespace_id` = `ts`.`id`))) left join `mysql`.`table_stats` `stat` on(((`tbl`.`name` = `stat`.`table_name`) and (`sch`.`name` = `stat`.`schema_name`)))) where (can_access_table(`sch`.`name`,`tbl`.`name`) and is_visible_dd_object(`tbl`.`hidden`))
            character_set_client: utf8
            collation_connection: utf8_general_ci

            and the explain plan looks like this:

            mysql> explain select count(*) from information_schema.tables \G
            *************************** 1. row ***************************
                       id: 1
              select_type: SIMPLE
                    table: cat
               partitions: NULL
                     type: index
            possible_keys: PRIMARY
                      key: name
                  key_len: 194
                      ref: NULL
                     rows: 1
                 filtered: 100.00
                    Extra: Using index
            *************************** 2. row ***************************
                       id: 1
              select_type: SIMPLE
                    table: tbl
               partitions: NULL
                     type: ALL
            possible_keys: schema_id
                      key: NULL
                  key_len: NULL
                      ref: NULL
                     rows: 1023387060
                 filtered: 100.00
                    Extra: Using where; Using join buffer (Block Nested Loop)
            *************************** 3. row ***************************
                       id: 1
              select_type: SIMPLE
                    table: sch
               partitions: NULL
                     type: eq_ref
            possible_keys: PRIMARY,catalog_id
                      key: PRIMARY
                  key_len: 8
                      ref: mysql.tbl.schema_id
                     rows: 1
                 filtered: 11.11
                    Extra: Using where
            *************************** 4. row ***************************
                       id: 1
              select_type: SIMPLE
                    table: stat
               partitions: NULL
                     type: eq_ref
            possible_keys: PRIMARY
                      key: PRIMARY
                  key_len: 388
                      ref: mysql.sch.name,mysql.tbl.name
                     rows: 1
                 filtered: 100.00
                    Extra: Using index
            *************************** 5. row ***************************
                       id: 1
              select_type: SIMPLE
                    table: ts
               partitions: NULL
                     type: eq_ref
            possible_keys: PRIMARY
                      key: PRIMARY
                  key_len: 8
                      ref: mysql.tbl.tablespace_id
                     rows: 1
                 filtered: 100.00
                    Extra: Using index
            *************************** 6. row ***************************
                       id: 1
              select_type: SIMPLE
                    table: col
               partitions: NULL
                     type: eq_ref
            possible_keys: PRIMARY
                      key: PRIMARY
                  key_len: 8
                      ref: mysql.tbl.collation_id
                     rows: 1
                 filtered: 100.00
                    Extra: Using index

            Conclusions

            1. I have created more than 1 billion real InnoDB tables with indexes in MySQL 8.0, just for fun, and it worked. It took ~2 weeks to create.
            2. MySQL 8.0 is probably the first version where it is even practically possible to create a billion InnoDB tables
            3. ZFS compression together with NVMe cards makes it reasonably cheap to do, for example, by using i3.4xlarge or i3.8xlarge instances on AWS.


            by Alexander Rubin at October 22, 2018 02:22 PM

            Percona Live Europe Presents … In Their Own Words

            For those who are looking forward to Percona Live Europe in just two weeks time—and for those yet to make up their minds—some of our presenters have shared some insight into their talks and what they’re most looking forward to themselves. Make no mistake, this is one of the most exciting events in the conference calendar for those of us who work with open source databases.

            This year, our conference previews are being hosted over on the Percona community blog and the posts have been written by the presenters.

            Percona Live Europe presents…

            Here are the first six posts in this series of Percona Live Europe presents. There are more to come, so do come back over the next few days to see if any of the writers can help you pinpoint the talks that you are most interested in attending this year:

            • Dinesh Joshi will be taking a look at boosting Apache Cassandra’s performance using Netty
            • Federico Razzoli on why he’s investigating MariaDB system versioned tables
            • Jaime Crespo of Wikimedia Foundation will be presenting an entry-level (but detailed) tutorial on query optimization, and a breakout talk on TLS security; you can find out more in his blog post
            • Tiago Jorge of Oracle on his talk about MySQL 8.0 replication
            • There’s going to be an ElasticSearch 101 tutorial presented by three of the team from ObjectRocket—Antonios Giannopoulos tells you more about that stellar opportunity—while last but by no means least…
            • Arjen Lentz, new CEO of MariaDB Foundation, is keen to share with you the latest information on MariaDB 10.3

            Tantalized? Keep meaning to book your seat? There’s not long left now, so head straight to the registration page and book your place. Percona Live Europe will be in Frankfurt from November 5-7, 2018.

            About the community blog

            We’re really pleased that the community blog is gaining some great support. It offers a platform for all to write on the general topic of open source databases. Commercial and non-commercial. Those who are already prolific bloggers, and those who maybe only want to write a blog or two on a topic that they feel strongly about. If you’d like to join us and write for the community blog, please get in touch! You can email me.

            by Lorraine Pocklington, Community Manager at October 22, 2018 11:07 AM

            October 21, 2018

            MariaDB Foundation

            MariaDB Foundation at the Google Mentor Summit

            The MariaDB Foundation has had 2 projects accepted for Google Summer of Code 2018, one of which we deemed successful. Teodor Niculescu (teodorvicentiuniculescu@gmail.com)’s work was part of an effort to improve MariaDB’s query optimiser by providing faster histogram collection using equal-width histograms. His project is not yet in a release-worthy state, yet we are […]

            The post MariaDB Foundation at the Google Mentor Summit appeared first on MariaDB.org.

            by Vicențiu Ciorbaru at October 21, 2018 10:47 AM

            Valeriy Kravchuk

            On Some Recent MySQL Optimizer Bugs

            Yet another topic I missed in my blog post on problematic MySQL features back in July is the MySQL optimizer. Unlike with XA transactions, this was done on purpose, as the known bugs, limitations and deficiencies of the MySQL optimizer are a topic for a series of blog posts, if not a separate blog. At the moment the list of known active bug reports in the optimizer category consists of 380(!) items (mostly "Verified"), aside from feature requests and bugs not considered "production" ones by the current "rules" of the MySQL public bugs database. I check optimizer bugs often in my posts, and I reported many of them, but I am still not ready to cover this topic entirely.

            What I can do within the frame of one blog post is a quick review of some "Verified" optimizer bugs reported over the last year. I'll present them one by one in a list, with some comments (mostly related to my checks of the same test cases with MariaDB 10.3.7, which I have at hand) and, hopefully, some conclusions about the current state of the MySQL optimizer.

            I'll try to shed some light on the current state of the MySQL optimizer, but it's a huge and dark area, with many details hidden...
            So, here is the list, starting from most recently reported bugs:
            • Bug #92654 - "GROUP BY fails for queries when a temporary table is involved". This bug affects recent MySQL 8.0.12 and 5.7.23, but does not affect MariaDB 10.3, for example, from what I see:
              MariaDB [test]> insert into domain_tree values (1), (2), (3);
              Query OK, 3 rows affected (0.080 sec)
              Records: 3  Duplicates: 0  Warnings: 0

              MariaDB [test]> insert into host_connection_info values (1), (3);
              Query OK, 2 rows affected (0.054 sec)
              Records: 2  Duplicates: 0  Warnings: 0

              MariaDB [test]> SELECT
                  ->   COUNT(1),
                  ->   host_connection_status.connection_time
                  -> FROM
                  ->   (SELECT id
                  ->    FROM domain_tree) AS hosts_with_status
                  ->   LEFT OUTER JOIN
                  ->   (SELECT
                  ->      domain_id,
                  ->      'recent' AS connection_time
                  ->    FROM
                  ->      host_connection_info) AS host_connection_status
                  ->     ON hosts_with_status.id = host_connection_status.domain_id
                  -> GROUP BY host_connection_status.connection_time;
              +----------+-----------------+
              | COUNT(1) | connection_time |
              +----------+-----------------+
              |        1 | NULL            |
              |        2 | recent          |
              +----------+-----------------+
              2 rows in set (0.003 sec)
            • Bug #92524 - "Left join with datetime join condition produces wrong results". The bug was reported by Wei Zhao, who contributed a patch. Again, MariaDB 10.3 is not affected:
              MariaDB [test]> select B.* from h1 left join g B on h1.a=B.a where B.d=str_to_date('99991231',"%Y%m%d") and h1.a=1;
              +---+---------------------+
              | a | d                   |
              +---+---------------------+
              | 1 | 9999-12-31 00:00:00 |
              +---+---------------------+
              1 row in set (0.151 sec)

              MariaDB [test]> select B.* from h1 left join g B on h1.a=B.a and B.d=str_to_date
              ('99991231',"%Y%m%d") where h1.a=1;
              +---+---------------------+
              | a | d                   |
              +---+---------------------+
              | 1 | 9999-12-31 00:00:00 |
              +---+---------------------+
              1 row in set (0.002 sec)
            • Bug #92466 - "Case function error on randomly generated values". See also the related older Bug #86624 - "Subquery's RAND() column re-evaluated at every reference". These are either regressions compared to MySQL 5.6 (and MariaDB), or an unclear and weird change in behavior that can be worked around with some tricks (suggested by Oracle developers) to force materialization of the derived table. Essentially, the result depends on the execution plan - what else could we dream about?
            • Bug #92421 - "Queries with views and operations over local variables don't use indexes". Yet another case when MySQL 5.6 worked differently. As Roy Lyseng explained in comments:
              "... this is due to a deliberate choice that was taken when rewriting derived tables and views in 5.7: When a user variable was assigned a value in a query block, merging of derived tables was disabled.
              ...
              In 8.0, you can override this with a merge hint: /*+ merge(v_test) */, but this is unfortunately not implemented in 5.7.
              "
            • Bug #92209 - "AVG(YEAR(datetime_field)) makes an error result because of overflow". All recent MySQL versions and MariaDB 10.3.7 are affected.
            • Bug #92020 - "Introduce new SQL mode rejecting queries with results depending on query plan". A great feature request by Sveta Smirnova that properly shows the current state of optimizer development. We need a feature for MySQL to stop accepting queries that may return different results depending on the execution plan - in other words, current MySQL considers it normal for results to vary with the execution plan! Sveta refers to her Bug #91878 - "Wrong results with optimizer_switch='derived_merge=ON';" as an example. MariaDB 10.3 is NOT affected by that bug.
            • Bug #91418 - "derived_merge causing incorrect results with distinct subquery and uuid()". From what I see in my tests, MariaDB 10.3.7 produces wrong results with derived_merge both ON and OFF, unfortunately.
            • Bug #91139 - "use index dives less often". In MySQL 5.7+ the default value of eq_range_index_dive_limit increased from 10 to 200, and this may negatively affect performance. As Mark Callaghan noted, when only one possible index exists, the optimizer doesn't need index dives to figure out how to evaluate the query.
            • Bug #90847 - "Query returns wrong data if order by is present". This is definitely a corner case, but still. MariaDB 10.3 returns correct result in my tests.
            • Bug #90398 - "Duplicate entry for key '<group_key>' error". I can not reproduce the last public test case on MariaDB 10.3.
            • Bug #89419 - "Incorrect use of std::max". It was reported based on code analysis by Zsolt Parragi. See also Bug #90853 - "warning: taking the max of a value and unsigned zero is always equal to the other value [-Wmax-unsigned-zero]". Proper compiler detects this.
            • Bug #89410 - "delete from ...where not exists with table alias creates an ERROR 1064 (42000)". MariaDB 10.3 is also affected. Both Oracle and PostgreSQL accept this syntax, while in MySQL and MariaDB we can use a multi-table delete syntax-based workaround, as suggested by Roy Lyseng (see the sketch after this list).
            • Bug #89367 - "Storing result in a variable(UDV) causes query on a view to use derived tables", was reported by Jaime Sicam. This is a kind of regression in MySQL 5.7. MariaDB 10.3 and MySQL 8.0 are not affected. Let me quote a comment by Roy Lyseng:
              "In 5.7, we added a heuristic so that queries that assign user variables are by default materialized and not merged. However, we should have let the ALGORITHM=MERGE override this decision. This is a bug."
            • Bug #89182 - "Optimizer unable to ignore index part entering less optimal query plan". A nice report from Przemyslaw Malkowski. One of many cases where the "ref" vs "range" decision seems to be wrong based on costs. It looks like the optimizer still has parts that are heuristics/rules based and/or do not take costs into account properly.
            • Bug #89149 - "SELECT DISTINCT on multiple TEXT columns is slow". Yet another regression in MySQL 5.7+.
            That's all optimizer bugs reported in 2018 and still "Verified" that I wanted to discuss.
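
            To illustrate the workaround for Bug #89410 mentioned above, here is a minimal sketch (table and column names are hypothetical):

            -- Rejected by MySQL/MariaDB with ERROR 1064:
            DELETE FROM t1 AS a
            WHERE NOT EXISTS (SELECT 1 FROM t2 b WHERE b.id = a.id);

            -- The multi-table delete syntax works:
            DELETE a FROM t1 AS a
            WHERE NOT EXISTS (SELECT 1 FROM t2 b WHERE b.id = a.id);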

            From the list above I can conclude the following:
            1. There are many simple enough cases when queries return wrong results or get non-optimal execution plans in MySQL. For many of them, MariaDB's optimizer does a better job.
            2. The behavior of the optimizer for some popular use cases changed after MySQL 5.6, so take extra care to check queries and their results after an upgrade to MySQL 5.7+.
            3. The derived_merge optimization seems to cause a lot of problems for users in MySQL 5.7 and 8.0.
            4. It seems optimizer developers care enough to comment on bugs, suggest workarounds and explain decisions made.

            by Valeriy Kravchuk (noreply@blogger.com) at October 21, 2018 08:46 AM

            October 19, 2018

            Peter Zaitsev

            PostgreSQL Q&A: Building an Enterprise-Grade PostgreSQL Setup Using Open Source Tools

            Hello everyone, and thank you to those who attended our webinar on Building an Enterprise-grade PostgreSQL setup using open source tools last Wednesday. You’ll find the recording, as well as the slides we used during our presentation, here.

            We had over forty questions during the webinar but were only able to tackle a handful during the time available, so most remained unanswered. We address the remaining ones below, and they have been grouped in categories for better organization. Thank you for sending them over! We have merged related questions and kept some of our answers concise, but please leave us a comment if you would like to see a particular point addressed further.

            Backups

            Q: In our experience, pg_basebackup with compression is slow due to single-thread gzip compression. How to speed up online compressed full backup?

            Single-thread operation is indeed a limitation of pg_basebackup, and this is not limited to compression only. pgBackRest is an interesting alternative tool in this regard as it does have support for parallel processing.

            Q: Usually one setup database backup on primary DB in a HA setup. Is it possible to automatically activate backup on new primary DB after Patroni failover? (or other HA solutions)

            Yes. This can be done transparently by pointing your backup system to the “master-role” port in the HAProxy instead – or to the “replica-role” port; in fact, it’s more common to use standby replicas as the backup source.

            Q: Do backups and WAL backups work with third party backup managers like NetBackup for example?

            Yes, as usual it depends on how good the vendor support is. NetBackup supports PostgreSQL, and so does Zmanda to mention another one.

            Security and auditing

            Q: Do you know a TDE solution for PostgreSQL? Can you talk a little bit about the encryption at rest solution for Postgres PCI/PII applications from Percona standpoint.

            At this point PostgreSQL does not provide native Transparent Data Encryption (TDE) functionality, relying instead on the underlying file system for data-at-rest encryption. Encryption at the column level can be achieved through the pgcrypto module.
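
            As a minimal sketch of column-level encryption with pgcrypto (the table, column and key are hypothetical, and in practice the key must not be hard-coded in SQL):

            CREATE EXTENSION IF NOT EXISTS pgcrypto;
            CREATE TABLE customer (id serial PRIMARY KEY, ssn bytea);
            -- encrypt on write
            INSERT INTO customer (ssn) VALUES (pgp_sym_encrypt('123-45-6789', 'mysecretkey'));
            -- decrypt on read
            SELECT pgp_sym_decrypt(ssn, 'mysecretkey') FROM customer;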

            Moreover, PostgreSQL offers a number of other security features relevant to PCI compliance.

            Q: How to prevent a superuser account from accessing raw data in Postgres? (…) companies we encounter usually ask that even managed accounts cannot access the real data by any means.

            It is fundamental to maintain a superuser account that is able to access any object in the database for maintenance activities. Having said that, currently it is not possible to deny a superuser direct access to the raw data found in tables. What you can do to protect sensitive data from superuser access is to have it stored encrypted. As mentioned above, pgcrypto offers the necessary functionality for achieving this.

            Furthermore, avoiding connecting to the database as a superuser is a best practice. The set_user extension allows unprivileged users to escalate themselves to superuser for maintenance tasks on demand, while providing an additional layer of logging and control for better auditing. Also, as discussed in the webinar, it’s possible to implement segregation of users using roles and privileges. Remember, it’s best practice to grant a role only the privileges essential to fulfill its duties, including for application users. Additionally, password authentication should be enforced for superusers.
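
            A rough sketch of the set_user workflow (the role name is hypothetical; depending on the extension version, escalation to an actual superuser may require set_user_u() instead):

            CREATE EXTENSION set_user;
            -- connect as an unprivileged user, escalate only for the maintenance task:
            SELECT set_user('maintenance_admin');  -- the escalation is logged
            -- ... perform maintenance ...
            SELECT reset_user();                   -- drop back to the original role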

            Q: How can you make audit logging in Postgres record DMLs while masking data content in these recorded SQLs?

            To the best of our knowledge, there is currently no solution to apply query obfuscation to logs. Bind parameters are always included in both the audit and the logging of DMLs, and that is by design. If you would rather avoid logging bind parameters and want to keep track of the executed statements only, you can use the pg_stat_statements extension instead. Note that while pg_stat_statements provides overall statistics of the executed statements, it does not keep track of when each DML was executed.
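
            For instance, assuming pg_stat_statements is loaded via shared_preload_libraries, the top statements by total execution time can be listed like this (column names as of PostgreSQL 10/11):

            CREATE EXTENSION pg_stat_statements;
            -- statements are stored normalized, so bind values are not exposed here
            SELECT query, calls, total_time
              FROM pg_stat_statements
             ORDER BY total_time DESC
             LIMIT 5;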

            Q: How to setup database audit logging effectively when utilizing pgbouncer or pgpool?

            A key part of auditing is having separate user accounts in the database instead of a single, shared account. The connection to the database should be made by the appropriate user/application account. In pgBouncer we can have multiple pools for each of the user accounts. Every action by a connection from that pool will be audited against the corresponding user.

            High Availability and replication

            Q: Is there anything like Galera for PostgreSQL ?

            Galera replication library provides support for multi-master, active-active MySQL clusters based on synchronous replication, such as Percona XtraDB Cluster. PostgreSQL does have support for synchronous replication but limited to a single active master context only.

            There are, however, clustering solutions for PostgreSQL that address similar business requirements or problem domains, such as scalability and high availability (HA). We presented one of them, Patroni, in our webinar; it focuses on HA and read scaling. For write scaling, there have long been sharding-based solutions, including Citus, and PostgreSQL 10 (and now 11!) brings substantial new features in the partitioning area. Finally, PostgreSQL-based solutions like Greenplum and Amazon Redshift address scalability for analytical processing, while TimescaleDB has been conceived to handle large volumes of time series data.

            Q: Pgpool can load balance – what is the benefit of HAProxy over Pgpool?

            No doubt Pgpool is feature-rich, including load balancing besides connection pooling, among other functionalities. It could be used in place of HAProxy and PgBouncer, yes. But features are just one of the criteria for selecting a solution. In our evaluation we gave more weight to lightweight, fast and scalable solutions. HAProxy is well known for its lightweight connection routing capability without consuming much of the server’s resources.

            Q: How to combine PgBouncer and Pgpool together so that one can achieve transaction pooling + load balancing? Can you let me know between the two scaling solutions which one is better, PgBouncer or Pgpool-II?

            It depends, and must be analyzed on a case-by-case basis. If what we really need is just a connection pooler, PgBouncer will be our first choice because it is more lightweight than Pgpool. PgBouncer is thread-based, while Pgpool is process-based: like PostgreSQL, it forks the main process for each inbound connection, which is a somewhat expensive operation. PgBouncer is more effective on this front.

            However, the relative heavyweight of Pgpool comes with a lot of features, including the capability to manage PostgreSQL replication, and the ability to parse statements fired against PostgreSQL and redirect them to certain cluster nodes for load balancing. Also, when your application cannot differentiate between read and write requests, Pgpool can parse the individual SQL statements and redirect them to the master,  if it is a write, or to a standby replica, if it is a read, as configured in your Pgpool setup. The demo application we used in our webinar setup was able to distinguish reads from writes and use multiple connection strings accordingly, so we employed HAProxy on top of Patroni.

            We have seen environments where Pgpool was used for its load balancing capabilities while connection pooling duties were left for PgBouncer, but this is not a great combination. As described above, HAProxy is more efficient than Pgpool as a load balancer.

            Finally, as discussed in the webinar, any external connection pooler like Pgbouncer is required only if there is no proper application layer connection pooler, or if the application layer connection pooler is not doing a great job in maintaining a proper connection pool, resulting in frequent connections and disconnections.

            Q: Is it possible for Postgres to have a built-in connection pool worker? Maybe merge Pgbouncer into postgres core? That would make it much easier to use advanced authentication mechanisms (e.g. LDAP).

            A great thought. That would indeed be a better approach in many aspects than employing an external connection pooler like Pgbouncer. Recently there were discussions among PostgreSQL contributors on the related topic, as seen here. A few sample patches have been submitted by hackers but nothing has been accepted yet. The PostgreSQL community is very keen to keep the server code lightweight and stable.

            Q: Is rebooting the standby the only way to change master in PostgreSQL?

            A standby-to-master promotion does not involve any restart.

            From the perspective of the user, a standby is promoted by the pg_ctl promote command or by creating a trigger file. During this operation, the replica stops the recovery-related processing and becomes a read-write database.

            Once we have a new master, all the other standby servers need to start replicating from it. This involves changes to the recovery.conf parameters and, yes, a restart: the restart happens only on the standby side, when its current master has to be changed. PostgreSQL currently does not allow changing this parameter with a SIGHUP.
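
            As a quick check after a switchover, any instance can be asked for its current role from SQL:

            -- returns true on a standby, false once the instance has been promoted to master
            SELECT pg_is_in_recovery();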

            Q: Are external connection pooling solutions (PgBouncer, Pgpool) compatible with Java Hibernate ORM ?

            External connection poolers like PgBouncer and Pgpool are compatible with regular PostgreSQL connections. So connections from Hibernate ORM can treat PgBouncer as regular PostgreSQL but running on a different port (or the same, depending on how you configure it). An important point to remember is that they are complementary to connection pools that integrate well with ORM components. For example c3p0 is a well known connection pooler for Hibernate. If an ORM connection pooler can be well tuned to avoid frequent connections and disconnections, then, external pooling solutions like PgBouncer or Pgpool will become redundant and can/should be avoided.

            Q: Question regarding connection pool: I want to understand if the connections are never closed or if there are any settings to force the closing of the connection after some time.

            There is no need to close a connection if it can be reused (recycled) again and again instead of having a new one created. That is the very purpose of the connection pooler. When an application “closes” a connection, the connection pooler will virtually release the connection from the application and recover it back to the pool of connections. On the next connection request, instead of establishing a new connection to the database the connection pooler will pick a connection from the pool of connections and “lend” it to the application. Furthermore, most connection poolers include a parameter to control the release of connections after a specified idle time.

            Q: Question regarding Patroni: can we select in the settings to not failover automatically and only used Patroni for manual failover/failback?

            Yes, Patroni allows users to pause its automation process, leaving them to manually trigger operations such as failover. The actual procedure for achieving this would make an interesting blog post (we put it on our to-do list).

            Q: Where should we install PgBouncer, Patroni and HAProxy to fulfill the 3-layer format: web frontends, app backends and DB servers? What about etcd?

            Patroni and etcd must be installed on the database servers. In fact, etcd can run on other servers as well, since the set of etcd instances just forms the distributed consensus store. HAProxy and PgBouncer can be installed on the application servers for simplicity, or optionally they can run on dedicated servers, especially when you run a large number of those. Having said that, HAProxy is very lightweight and can be maintained on each application server without added impact. If you want to install PgBouncer on dedicated servers, just make sure to avoid a SPOF (single point of failure) by employing active-passive servers.

            Q: How does HAproxy in your demo setup know how to route DML appropriately to the master and slaves (e.g. writes always go to the master and reads are load balanced between the replicas) ?

            HAProxy does not parse SQL statements in the intermediate layer in order to redirect them to the master or to one of the replicas accordingly—this must be done at the application level. In order to benefit from this traffic distribution, your application needs to send write requests to the appropriate HAproxy port; the same with read requests. In our demo setup, the application connected to two different ports, one for reads and another for writes (DML).

            Q: How often does the cluster poll each node/slave? Is it tunable for poor performing networks?

            Patroni uses an underlying distributed consensus mechanism for all heartbeat checks. For example etcd, which can be used for this, has a default heartbeat interval of 100ms, but it is adjustable. Apart from this, in every layer of the stack there are tunable TCP-like timeouts. For connection routing, HAProxy polls by making use of the Patroni API, which also allows further control over how the checks are done. Having said that, please keep in mind that poorly performing networks are often a bad choice for distributed services, with problems spanning beyond timeout checks.

            Miscellaneous

            Q: Hi Avinash/Nando/Jobin, maybe I wasn’t able to catch up with DDLs, but what’s the best way to handle DDLs? In MySQL, we can use pt-online-schema-change and avoid large replica lag; is there a way to achieve the same in PostgreSQL without blocking/downtime, or does Percona have an equivalent tool for PostgreSQL? Looking forward to this!

            Currently, PostgreSQL locks tables for DDLs. Some DDLs, such as creating triggers or indexes, do not block every activity on the table. There isn’t a tool like pt-online-schema-change for PostgreSQL yet. There is, however, an extension called pg_repack, which assists in rebuilding a table online. Additionally, adding the keyword CONCURRENTLY to a CREATE INDEX statement makes it gentle on the system and allows concurrent DMLs and queries to happen while the index is being built. Suppose you want to rebuild the index behind a primary or unique key: a new index can be built concurrently and then swapped in behind the key, requiring only a momentary lock.
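
            Here is a minimal sketch of that index swap (the table and index names are hypothetical):

            -- build the replacement index without blocking concurrent DML
            CREATE UNIQUE INDEX CONCURRENTLY orders_pkey_new ON orders (id);
            -- swap it in behind the primary key; this takes only a brief lock
            ALTER TABLE orders
              DROP CONSTRAINT orders_pkey,
              ADD CONSTRAINT orders_pkey PRIMARY KEY USING INDEX orders_pkey_new;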

            A lot of new features are added in this space with each new release. One of the extreme cases of extended locking used to be adding a NOT NULL column with a DEFAULT value: in most database systems this operation holds a write lock on the table until it completes. The just-released PostgreSQL 11 makes it a brief operation irrespective of the size of the table, achieved with a simple metadata change rather than a complete table rewrite, so excessive I/O and side effects such as replication lag are avoided. As PostgreSQL continues to get better at handling DDLs, the scope for external tools is reducing.
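
            For example, on PostgreSQL 11 the following is a near-instant, catalog-only change even on a very large table (the table and column names are hypothetical):

            ALTER TABLE big_table ADD COLUMN status text NOT NULL DEFAULT 'new';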

            Q: What are the actions that can be performed by the parallelization option in PostgreSQL ?

            This is the area where PostgreSQL has improved significantly in the last few versions. The answer, then, depends on which version you are using. Parallelization was introduced in PostgreSQL 9.6, with more capabilities added in version 10. As of version 11 pretty much everything can make use of parallelization, including index building. The more CPU cores your server has at its disposal, the more you will benefit from the latest versions of PostgreSQL, given that it is properly tuned for parallel execution.
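
            As a quick demonstration (the table name is hypothetical; on a sufficiently large table the plan shows a Gather node over a Parallel Seq Scan):

            SET max_parallel_workers_per_gather = 4;
            EXPLAIN ANALYZE SELECT count(*) FROM big_table;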

            Q: is there any flashback query or flashback database option in PostgreSQL ?

            If flashback queries are an application requirement, please consider using temporal tables to better visualize data from a specific time or period. If the application is handling time series data (like IoT devices), then TimescaleDB may be an interesting option for you.

            Flashback of the database can be achieved in multiple ways, either with the help of backup tools (and point-in-time recovery) or using a delayed standby replica.

            Q: Question regarding pg_repack: we have attempted running pg_repack and for some reason it kept running forever; can we simply cancel/abort its execution ?

            Yes, the execution of pg_repack can be aborted without prejudice. This is safe to do because the tool creates an auxiliary table and uses it to rearrange the data, swapping it with the original table at the end of the process. If its execution is interrupted before it completes, the swapping of tables just doesn’t take place. However, since it works online and doesn’t hold an exclusive lock on the target table, depending on its size and the changes made on the target table during the process, it might take considerable time to complete. Please explore the parallel feature available with pg_repack.

            Q: Will the monitoring tool from Percona be open source ?

            Percona Monitoring and Management (PMM) has been released already as an open source project with its source code being available at GitHub.

            Q: It’s unfortunate that the Master/Slave terminology is still used on slide. Why not use instead leader/follower or orchestrator node/node?

            We agree with you, particularly regarding the reference on “slave” – “replica” is a more generally accepted term (for good reason), with “standby” [server|replica] being more commonly used with PostgreSQL.

            Patroni usually employs the terms “leader” and “followers”.

            The use of “cluster” (and thus “node”) in PostgreSQL, however, contrasts with what is usually the norm (when we think about traditional beowulf clusters, or even Galera and Patroni) as it denotes the set of databases running on a single PostgreSQL instance/server.

            by Jobin Augustine at October 19, 2018 04:50 PM

            Jean-Jerome Schmidt

            Effective Monitoring of MySQL with SCUMM Dashboards Part 1

            We added a number of new dashboards for MySQL in our latest release of ClusterControl 1.7.0 - and in our previous blog, we showed you How to Monitor Your ProxySQL with Prometheus and ClusterControl.

            In this blog, we will look at the MySQL Overview dashboard.

            So, we have enabled Agent Based Monitoring under the Dashboard tab to start collecting metrics from the nodes. Take note that when enabling Agent Based Monitoring, you have the option to set the “Scrape Interval (seconds)” and “Data retention (days)”. Scrape Interval is where you set how aggressively Prometheus will harvest data from the target, and Data Retention is how long you want to keep the data collected by Prometheus before it’s deleted.

            When enabled, you can identify which cluster has agents and which one has agentless monitoring.

            Compared to the agentless approach, the granularity of your data in graphs will be higher with agents.

            The MySQL Graphs

            The latest version of ClusterControl 1.7.0 (which you can download for free - ClusterControl Community) has the following MySQL dashboards with which you can gather information about your MySQL servers. These are MySQL Overview, MySQL InnoDB Metrics, MySQL Performance Schema, and MySQL Replication.

            We’ll cover in detail the graphs available in the MySQL Overview dashboard.

            MySQL Overview Dashboard

            This dashboard contains the usual important variables or information regarding the health of your MySQL node. The graphs contained on this dashboard are specific to the node selected upon viewing the dashboards as seen below:

            It consists of 26 graphs, but you might not need all of them when diagnosing problems. However, these graphs provide a vital representation of the overall metrics for your MySQL servers. Let’s go over the basic ones, as these are probably the most common things that a DBA will routinely look at.

            The first four graphs shown above, along with MySQL’s uptime, queries per second, and buffer pool information, are the most basic pointers we might need. Here is what each of them represents:

            • MySQL Connections
              This is where you want to check your total client connections thus far allocated in a specific period of time.
            • MySQL Client Thread Activity
              There are times when your MySQL server could be very busy. For example, it might be expected to receive a surge in traffic at a specific time, and you want to monitor your running threads’ activity. This graph is really important to look at. There can be times when your query performance goes south if, for example, a large update causes other threads to wait to acquire a lock. This would lead to an increased number of running threads. The cache miss rate is calculated as Threads_created/Connections.
            • MySQL Questions
              These are the queries running in a specific period of time. A thread might be a transaction composed of multiple queries and this can be a good graph to look at.
            • MySQL Thread Cache
              This graph shows the thread_cache_size value, the threads that are cached (reused), and the threads that are created (new threads). You can check this graph in instances where, for example, you need to tune your read queries after noticing a high number of incoming connections while your threads created increases rapidly. For example, if Threads_running / thread_cache_size > 2, then increasing your thread_cache_size may give a performance boost to your server. Take note that creation and destruction of threads are expensive. However, in recent versions of MySQL (>=5.6.8) this variable autosizes by default, so you might consider leaving it untouched.
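
            For reference, the same miss rate can be computed manually from the status counters; a sketch using performance_schema (available in MySQL 5.7+):

            SELECT (SELECT VARIABLE_VALUE FROM performance_schema.global_status
                    WHERE VARIABLE_NAME = 'Threads_created') /
                   (SELECT VARIABLE_VALUE FROM performance_schema.global_status
                    WHERE VARIABLE_NAME = 'Connections') AS thread_cache_miss_rate;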

            The next four graphs are MySQL Temporary Objects, MySQL Select Types, MySQL Sorts, and MySQL Slow Queries. These graphs are related to each other, especially if you are diagnosing long running queries and large queries that need optimization.

            • MySQL Temporary Objects
              This graph is a good source to rely upon if you want to monitor long running queries that end up using disk for temporary tables or files instead of staying in memory. It’s a good place to start looking for periodical occurrences of queries that could add up to create disk space issues, especially during odd times.
            • MySQL Select Types
              One source of bad performance is queries that use full joins, table scans, or range selects that do not use any indexes. This graph shows how your queries perform and which of these - full joins, full range joins, select ranges, table scans - trends highest.
            • MySQL Sorts
              This graph helps diagnose the queries that use sorting, and the ones that take much time to finish.
            • MySQL Slow Queries
              Trends of your slow queries are collected on this graph. This is very useful, especially for diagnosing how often your queries are slow. What are the things that need to be tuned? It could be a too-small buffer pool, tables that lack indexes and go for a full-table scan, logical backups running on an unexpected schedule, etc. Using our Query Monitor in ClusterControl along with this graph is beneficial, as it helps determine slow queries.

            The next graphs cover network activity, table locks, and the internal memory that MySQL consumes during its activity.

            • MySQL Aborted Connections
              The number of aborted connections will render on this graph. This covers the aborted clients such as where the network was closed abruptly or where the internet connection was down or interrupted. It also records the aborted connects or attempts such as wrong passwords or bad packets upon establishing a connection from the client.
            • MySQL Table Locks
              Trends for tables that request for a table lock that has been granted immediately and for tables that request for a lock that has not been acquired immediately. For example, if you have table-level locks on MyISAM tables and incoming requests of the same table, these cannot be granted immediately.
            • MySQL Network Traffic
              This graph shows the trends of inbound and outbound network activity on the MySQL server. “Inbound” is the data received by the MySQL server, while “Outbound” is the data sent by it. This graph is best checked when you want to monitor your network traffic, especially when diagnosing why traffic that is otherwise moderate shows a very high volume of outbound transferred data, as can happen with BLOB data, for example.
            • MySQL Network Usage Hourly
              Same as the network traffic which shows the Received and Sent data. Take note that it’s based on ‘per hour’ and labeled with ‘last day’ which will not follow the period of time you selected in the date picker.
            • MySQL Internal Memory Overview
              This graph is familiar for a seasoned MySQL DBA. Each of these legends in the bar graph are very important especially if you want to monitor your memory usage, your buffer pool usage, or your adaptive hash index size.

            The following graphs show counters that a DBA can rely upon, such as the statistics for selects, inserts and updates, the number of times master status or SHOW VARIABLES has been executed, and checks for bad queries doing table scans or not using indexes by looking over the read_* counters, etc.


            • Top Command Counters (Hourly)
              These are the graphs you would likely check whenever you would like to see statistics for your inserts, deletes, updates, and executed commands such as gathering the processlist, slave status, and show status (health statistics of the MySQL server). This is a good place to check which MySQL command counters are topmost and whether some performance tuning or query optimization is needed. It might also allow you to identify which commands are run aggressively when they do not need to be.
            • MySQL Handlers
              Oftentimes, a DBA would go over these handlers and check how the queries are performing in your MySQL server. Basically, this graph covers the counters from the Handler API of MySQL. Most common handler counters for a DBA for the storage API in MySQL are Handler_read_first, Handler_read_key, Handler_read_last, Handler_read_next, Handler_read_prev, Handler_read_rnd, and Handler_read_rnd_next. There are lots of MySQL Handlers to check upon. You can read about them in the documentation here.
            • MySQL Transaction Handlers
              If your MySQL server uses XA transactions, SAVEPOINT, or ROLLBACK TO SAVEPOINT statements, then this graph is a good reference to look at. You can also use this graph to monitor all your server’s internal commits. Take note that the counter for Handler_commit increments even for SELECT statements, unlike insert/update/delete statements, which go to the binary log during a call to the COMMIT statement.

            The next graph shows trends of process states and their hourly usage. There are lots of key points in the bar graph legend that a DBA would check: disk space issues, connection issues, whether the connection pool is working as expected, high disk I/O, network issues, etc.

            • Process States/Top Process States Hourly
              This graph is where you can monitor the top thread states of your queries running in the processlist. This is very informative and helpful for DBA tasks where you can examine any outstanding statuses that need resolution. For example, if the opening tables state is very high and its minimum value is close to the maximum, it could indicate that you need to adjust the table_open_cache. If the statistic is high and you’re noticing a slowdown of your server, it could indicate that your server is disk-bound and you might need to consider increasing your buffer pool. If you have a high number of creating tmp table states, then you might have to check your slow log and optimize the offending queries. You can check out the manual for the complete list of MySQL thread states here.

            The next graphs we’ll be checking cover the query cache, the MySQL table definition cache, and how often MySQL opens system files.


            • MySQL Query Cache Memory/Activity
              These graphs are related to each other. If you have query_cache_size <> 0 and query_cache_type <> 0, then these graphs can be of help. However, in newer versions of MySQL the query cache has been deprecated, as it is known to cause performance issues; you might not need it in the future. MySQL 8.0 removes it altogether, relying instead on several strategies for handling information in the memory buffers to increase performance.
            • MySQL File Openings
              This graph shows the trend of opened files since the MySQL server’s uptime, excluding files such as sockets or pipes. Nor does it include files opened by the storage engine, since those have their own counter, Innodb_num_open_files.
            • MySQL Open Files
              This graph is where you want to check your InnoDB files currently held open, the current MySQL open files, and your open_files_limit variable.
            • MySQL Table Open Cache Status
              If you have set a very low table_open_cache, this graph will tell you about tables that fail the cache (newly opened tables) or miss due to overflow. If you encounter a high number of “Opening tables” states in your processlist, this graph will serve as your reference to determine it. It will tell you if there’s a need to increase your table_open_cache variable (the underlying status counters can also be queried directly; see the snippet after this list).
            • MySQL Open Tables
              Relative to MySQL Table Open Cache Status, this graph is useful on certain occasions, such as when you want to decide whether to increase your table_open_cache or lower it because you notice a high increase in open tables or in the Open_tables status variable. Note that table_open_cache could take a large amount of memory space, so you have to set it with care, especially in production systems.
            • MySQL Table Definition Cache
              If you want to check the number of your Open_table_definitions and Opened_table_definitions status variables, this graph is what you need. For newer versions of MySQL (>=5.6.8), you might not need to change the value of this variable and can use the default, since it autoresizes.
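
            To inspect several of the counters behind these last graphs directly, you can query the server status, for example:

            SHOW GLOBAL STATUS LIKE 'Table_open_cache_%';   -- hits, misses, overflows
            SHOW GLOBAL STATUS LIKE 'Open%tables';          -- Open_tables, Opened_tables
            SHOW GLOBAL VARIABLES LIKE 'table_open_cache';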

            Conclusion

            The SCUMM addition in the latest version of ClusterControl 1.7.0 provides significant new benefits for a number of key DBA tasks. The new graphs can help easily pinpoint the cause of issues that DBAs or sysadmins would typically have to deal with and help find appropriate solutions faster.

            We would love to hear your experience and thoughts on using ClusterControl 1.7.0 with SCUMM (which you can download for free - ClusterControl Community).

            In part 2 of this blog, I will discuss Effective Monitoring of MySQL Replication with SCUMM Dashboards.

            by Paul Namuag at October 19, 2018 09:17 AM

            October 18, 2018

            Peter Zaitsev

            ProxySQL 1.4.11 and Updated proxysql-admin Tool Now in the Percona Repository

            ProxySQL 1.4.11, released by ProxySQL, is now available for download in the Percona Repository along with an updated version of Percona’s proxysql-admin tool.

            ProxySQL is a high-performance proxy, currently for MySQL and its forks (like Percona Server for MySQL and MariaDB). It acts as an intermediary for client requests seeking resources from the database. René Cannaò created ProxySQL for DBAs as a means of solving complex replication topology issues.

            The ProxySQL 1.4.11 source and binary packages available at https://percona.com/downloads/proxysql include ProxySQL Admin – a tool, developed by Percona to configure Percona XtraDB Cluster nodes into ProxySQL. Docker images for release 1.4.11 are available as well: https://hub.docker.com/r/percona/proxysql/. You can download the original ProxySQL from https://github.com/sysown/proxysql/releases. The documentation is hosted on GitHub in the wiki format.

            Improvements

            • mysql_query_rules_fast_routing is enabled in ProxySQL Cluster. For more information, see #1674 at GitHub.
            • In this release, the rpmdb checksum error is ignored when building ProxySQL in Docker.
            • By default, the permissions for proxysql.cnf are set to 600 (only the owner of the file can read it or make changes to it).

            Bugs Fixed

            • Fixed the bug that could cause crashing of ProxySQL if IPv6 listening was enabled. For more information, see #1646 at GitHub.

            ProxySQL is available under Open Source license GPLv3.

            by Borys Belinsky at October 18, 2018 09:19 PM

            Percona Statement on MongoDB Community Server License Change

            MongoDB, Inc. announced it has elected to change its license for MongoDB Community Server from AGPLv3 to a new license type they have created called a “Server Side Public License (SSPL)” citing the need to have a license better suited for the age of Software-as-a-Service.

            First, it is important to state that MongoDB, Inc. is fully within its rights as a software copyright holder to change the license of MongoDB Community Server to a license which better reflects its business interests.

            In our opinion, however, announcing the license and making the change effective immediately is not respectful to users of MongoDB Community Server. For many organizations, while AGPL may be an approved software license, the SSPL is not, and their respective internal review processes may take weeks. During this time users can’t get access, even to patch versions of old major releases, which might be required to ensure security in their environment, among other potential issues.

This issue is compounded by the fact that the SSPL has only recently been submitted for evaluation by the Open Source Initiative (OSI), and it is not yet clear if it will be considered an Open Source license.

We believe it would have been much better for the MongoDB community and the open source community at large if MongoDB, Inc. had chosen to release the SSPL and announce the move to this license with some future effective date, allowing for a more orderly transition.

            This is a developing situation, and I’m sure over the next few days and weeks we will both hear from OSI with their decision, as well as have further clarification on many points of the SSPL in the FAQ, and possibly the license itself.  At Percona we’re watching this situation closely and will provide additional updates regarding potential impacts to our community and customers.

            At this point we can state the following:

            • Percona will continue to support the latest AGPL versions of MongoDB Community Server and Percona Server for MongoDB until more clarity in regards to SSPL is available, giving companies time to complete their assessment of whether moving to the SSPL software version is feasible for them.
            • Being based on MongoDB Community Server, we anticipate that our Percona Server for MongoDB will change its license to SSPL when we move to the SSPL codebase released by MongoDB, Inc.
            • We believe this change does not impact other Percona software which interfaces with MongoDB, such as Percona Toolkit and Percona Monitoring and Management. At this point, we do not anticipate a license change for this software.
            • This license change does not impact Percona support customers, who will receive the same level of comprehensive, responsive, and cost-effective support as before. We encourage customers to evaluate the impact of this license change for their own software.

            by Peter Zaitsev at October 18, 2018 03:45 PM

            PostgreSQL 11! Our First Take On The New Release


You may be aware that the new major version of PostgreSQL has been released today. PostgreSQL 11 is going to be one of the most vibrant releases in recent times. It incorporates many features found in proprietary, industry-leading database systems, further qualifying PostgreSQL as a strong open source alternative.

            Without further ado, let’s have a look at some killer features in this new release.

            Just In Time (JIT) Compilation of SQL Statements

This is a cutting edge feature in PostgreSQL: SQL statements can get compiled into native code for execution. It’s well known how much the Google V8 JIT revolutionized the JavaScript language. JIT in PostgreSQL 11 supports accelerating two important factors—expression evaluation and tuple deforming during query execution—and helps CPU bound queries perform faster. Hopefully this is a new era in the SQL world.
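A hedged sketch of trying it out (this assumes a PostgreSQL 11 server built with LLVM support; jit is off by default in 11, and jit_above_cost is lowered here only to force JIT on a cheap demo query):

postgres=# SET jit = on;
postgres=# SET jit_above_cost = 0;
postgres=# EXPLAIN (ANALYZE) SELECT sum(i) FROM generate_series(1, 1000000) AS t(i);
-- when JIT kicks in, the plan output ends with a "JIT:" section showing
-- the number of functions compiled and the time spent compiling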

            Parallel B-tree Index build

This could be the most sought-after feature by DBAs, especially those migrating large databases from other database systems to PostgreSQL. Gone are the days when a lot of time was spent on building indexes during data migration. Index maintenance (rebuild) for very large tables can now make effective use of multiple cores in the server by parallelizing the operation, taking considerably less time to complete.
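A sketch with a hypothetical table; the number of workers actually used depends on the table size and the max_parallel_maintenance_workers setting:

postgres=# SET max_parallel_maintenance_workers = 4;
postgres=# CREATE INDEX idx_big_table_id ON big_table (id);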

            Lightweight and super fast ALTER TABLE for NOT NULL column with DEFAULT values

            In the process of continuous enhancement and adding new features, we see several application developments that involve schema changes to the database. Most such changes include adding new columns to a table. This can be a nightmare if a new column needs to be added to a large table with a default value and a NOT NULL constraint. This is because an ALTER statement can hold a write lock on the table for a long period. It can also involve excessive IO due to table rewrite. PostgreSQL 11 addresses this issue by ensuring that the column addition with a default value and a NOT NULL constraint avoids a table rewrite.  
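For illustration (table name hypothetical), the following now completes almost instantly on PostgreSQL 11, because no table rewrite takes place:

postgres=# ALTER TABLE big_table ADD COLUMN status text NOT NULL DEFAULT 'new';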

            Stored procedures with transaction control

PostgreSQL 11 includes stored procedures. What existed in PostgreSQL until now were functions. The lack of native stored procedures made migrating database code from other databases complex, often requiring extensive manual work from experts. Since stored procedures might include transaction blocks with BEGIN, COMMIT, and ROLLBACK, workarounds were necessary to meet this requirement in past PostgreSQL versions, but not anymore.
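A minimal sketch of a procedure committing between batches (the audit_log table and its id column are hypothetical):

postgres=# CREATE PROCEDURE purge_in_batches() LANGUAGE plpgsql AS $$
BEGIN
  LOOP
    DELETE FROM audit_log WHERE id IN (SELECT id FROM audit_log LIMIT 1000);
    EXIT WHEN NOT FOUND;
    COMMIT;   -- ends one transaction and starts the next, inside the procedure
  END LOOP;
END $$;
postgres=# CALL purge_in_batches();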

            Load cache upon crash or restart – pg_prewarm

            Memory is becoming cheaper and faster, year over year. The latest generation of servers is commonly available with several hundreds of GBs of RAM, making it easy to employ large caches (shared_buffers) in PostgreSQL. Until today, you might have used pg_prewarm to warm up the cache manually (or automatically at server start). PostgreSQL 11 now includes a background worker thread that will take care of that for you, recording the contents of the shared_buffers—in fact, the “address” of those—to a file autoprewarm.blocks. Upon crash recovery or normal server restart, two such threads work in the background, reloading those blocks into the cache.
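A hedged sketch of enabling the autoprewarm worker (append to any existing shared_preload_libraries value; a server restart is required, and the table name below is hypothetical):

postgres=# ALTER SYSTEM SET shared_preload_libraries = 'pg_prewarm';
-- after the restart, manual warming also remains available:
postgres=# CREATE EXTENSION IF NOT EXISTS pg_prewarm;
postgres=# SELECT pg_prewarm('my_table');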

            Hash Partition

            Until PostgreSQL 9.6 we used table inheritance for partitioning a table. PostgreSQL 10 came up with declarative partitioning, using two of the three most common partitioning methods: list and range. And now, PostgreSQL 11 has introduced the missing piece: hash partitioning.
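A minimal hash partitioning sketch (table and partition names are illustrative):

postgres=# CREATE TABLE clients (id bigint, name text) PARTITION BY HASH (id);
postgres=# CREATE TABLE clients_p0 PARTITION OF clients FOR VALUES WITH (MODULUS 4, REMAINDER 0);
postgres=# CREATE TABLE clients_p1 PARTITION OF clients FOR VALUES WITH (MODULUS 4, REMAINDER 1);
postgres=# CREATE TABLE clients_p2 PARTITION OF clients FOR VALUES WITH (MODULUS 4, REMAINDER 2);
postgres=# CREATE TABLE clients_p3 PARTITION OF clients FOR VALUES WITH (MODULUS 4, REMAINDER 3);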

            Advanced partitioning features that were always on demand

            There were a lot of new features committed to the partitioning space in PostgreSQL 11. It now allows us to attach an index to a given partition even though it won’t behave as a global index.

Also, row updates now automatically move rows to new partitions (if necessary) based on the updated fields. During query processing, the optimizer may now simply skip “unwanted” partitions from the execution plan, which greatly simplifies the work to be done. Previously, it had to consider all the partitions, even if the target data was to be found in just a subset of them.

            We will discuss these new features in detail in a future blog post.

            Tables can have default partitions

Until PostgreSQL 10, PostgreSQL had to reject a row being inserted when it did not satisfy any of the existing partition definitions. That changes with the introduction of default partitions in PostgreSQL 11.
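For illustration (names hypothetical):

postgres=# CREATE TABLE measurements (logdate date, value int) PARTITION BY RANGE (logdate);
postgres=# CREATE TABLE measurements_2018 PARTITION OF measurements
             FOR VALUES FROM ('2018-01-01') TO ('2019-01-01');
postgres=# CREATE TABLE measurements_default PARTITION OF measurements DEFAULT;
-- rows outside 2018 now land in measurements_default instead of raising an error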

            Parallel hash join

Most SQL statements with equi-joins perform hash joins in the background. There is a great opportunity to speed up performance if we can leverage the power of the hardware by spinning off multiple parallel workers. PostgreSQL 11 now allows hash joins to be performed in parallel.
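A sketch for spotting it in a plan (the t1/t2 tables are hypothetical; enable_parallel_hash is on by default in PostgreSQL 11):

postgres=# SET max_parallel_workers_per_gather = 4;
postgres=# EXPLAIN SELECT count(*) FROM t1 JOIN t2 USING (id);
-- look for a "Parallel Hash Join" node underneath a "Gather" node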

            Write-Ahead Log (WAL) improvements

Historically, PostgreSQL had a default WAL segment size of 16MB and we had to recompile PostgreSQL in order to operate with WAL segments of a different size. Now it is possible to change the WAL segment size during the initialization of the data directory (initdb) or while resetting WALs using pg_resetwal, by means of the parameter --wal-segsize=<wal_segment_size>.

            Add extensions to convert JSONB data to/from PL/Perl and PL/Python

            Python as a programming language continues to gain popularity. It is always among the top 5 in the TIOBE Index. One of the greatest features of PostgreSQL is that you can write stored procedures and functions in most of the popular programming languages, including Python (with PL/Python). Now it is also possible to transform JSONB (Binary JSON) data type to/from PL/Python. This feature was later made available for PL/Perl too. It can be a great add-on for organizations using PostgreSQL as a document store.
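A hedged sketch, assuming the server was built with PL/Python 3 support:

postgres=# CREATE EXTENSION jsonb_plpython3u CASCADE;   -- CASCADE also installs plpython3u
postgres=# CREATE FUNCTION jsonb_keys(val jsonb) RETURNS text
             TRANSFORM FOR TYPE jsonb
             LANGUAGE plpython3u
           AS $$
               # with the transform, "val" arrives as a native Python dict
               return ','.join(sorted(val.keys()))
           $$;
postgres=# SELECT jsonb_keys('{"a": 1, "b": 2}');   -- returns 'a,b'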

            Command line improvements in psql: autocomplete and quit/exit

            psql has always been friendly to first time PostgreSQL users through the various options like autocomplete and shortcuts. There’s an exception though: users may find it difficult to understand how to effectively quit from psql, and often attempt to use non-existing quit and exit commands. Eventually, they find \q or ctrl + D, but not without frustrating themselves first. Fortunately, that shouldn’t happen anymore: among many recent fixes and improvements to psql is the addition of the intuitive quit and exit commands to safely leave this popular command line client.

            Improved statistics

PostgreSQL 10 introduced the new statement CREATE STATISTICS to collect more statistics information about columns from tables. This has been further improved in PostgreSQL 11. Previously, while collecting optimizer statistics, most-common values (MCVs) were chosen based on their significance compared to all column values. Now, MCVs are chosen based on their significance compared to non-MCV values.
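The CREATE STATISTICS syntax itself dates from PostgreSQL 10 (the MCV selection change in 11 is internal); a sketch with hypothetical names:

postgres=# CREATE STATISTICS city_zip_stats (dependencies) ON city, zip FROM addresses;
postgres=# ANALYZE addresses;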

            The new features of PostgreSQL 11 are not limited to the ones mentioned above. We will be looking further into most of them in future blog posts. We invite you to leave a comment below and let us know if there is any particular feature you would be interested in knowing more about.

            by Jobin Augustine at October 18, 2018 02:19 PM

            October 17, 2018

            Peter Zaitsev

            Upcoming Webinar Thurs 10/18: MongoDB 4.0 Features – Transactions & More

            MongoDB 4.0 Features Webinar

Please join Percona’s Principal Consultant, Alex Rubin, as he presents MongoDB 4.0 Features – Transactions & More on Thursday, October 18th at 11:00 AM PDT (UTC-7) / 2:00 PM EDT (UTC-4).

             

            MongoDB 4.0 adds support for multi-document ACID transactions, combining the document model with ACID guarantees. Through snapshot isolation, transactions provide a consistent view of data and enforce all-or-nothing execution to maintain data integrity.

            This webinar mainly focuses on MongoDB transactions (the major feature of the latest update) and any future transaction improvements. We will also cover other new MongoDB features, such as Non-Blocking Secondary Reads, Security improvements and more.

            After attending the webinar you will learn more about the latest MongoDB features.

            Register for this webinar to learn about MongoDB transactions and other features.

            by Alexander Rubin at October 17, 2018 08:58 PM

            Can MySQL Parallel Replication Help My Slave?

            InnoDB Row Operations per Hour graph from Percona Monitoring and Management performance monitoring tool

Parallel replication has been around for a few years now but is still not that commonly used. I had a customer whose master had a very large write workload. The slave could not keep up, so I recommended using parallel slave threads. But how can I measure whether it really helps and is working?

At my customer, slave_parallel_workers was 0. But how high should I increase it? Maybe to 1? Maybe to 10? There is a blog post about how we can see how many threads are actually used, which is a great help.

            We changed the following variables on the slave:

            slave_parallel_type = LOGICAL_CLOCK;
            slave_parallel_workers = 40;
            slave_preserve_commit_order = ON;
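For reference, a hedged sketch of applying these settings on a running MySQL 5.7 replica: the SQL thread must be stopped first, and slave_preserve_commit_order additionally requires the replica to run with log_bin and log_slave_updates enabled.

STOP SLAVE SQL_THREAD;
SET GLOBAL slave_parallel_type = 'LOGICAL_CLOCK';
SET GLOBAL slave_parallel_workers = 40;   -- the value from this case; workload specific
SET GLOBAL slave_preserve_commit_order = ON;
START SLAVE SQL_THREAD;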

40 threads sounds like a lot, right? Of course, this is workload specific: if the transactions are independent, it might be useful.

Let’s have a look at how many threads are working:

            mysql> SELECT performance_schema.events_transactions_summary_by_thread_by_event_name.THREAD_ID AS THREAD_ID
            , performance_schema.events_transactions_summary_by_thread_by_event_name.COUNT_STAR AS COUNT_STAR
            FROM performance_schema.events_transactions_summary_by_thread_by_event_name
            WHERE performance_schema.events_transactions_summary_by_thread_by_event_name.THREAD_ID IN
                 (SELECT performance_schema.replication_applier_status_by_worker.THREAD_ID
                  FROM performance_schema.replication_applier_status_by_worker);
+-----------+------------+
| THREAD_ID | COUNT_STAR |
+-----------+------------+
|     25882 |     442481 |
|     25883 |     433200 |
|     25884 |     426460 |
|     25885 |     419772 |
|     25886 |     413751 |
|     25887 |     407511 |
|     25888 |     401592 |
|     25889 |     395169 |
|     25890 |     388861 |
|     25891 |     380657 |
|     25892 |     371923 |
|     25893 |     362482 |
|     25894 |     351601 |
|     25895 |     339282 |
|     25896 |     325148 |
|     25897 |     310051 |
|     25898 |     292187 |
|     25899 |     272990 |
|     25900 |     252843 |
|     25901 |     232424 |
+-----------+------------+

You can see that all the threads are working, which is great.

            But did this really speed up the replication? Could the slave write more in the same period of time?

            Let’s see the replication lag:

            MySQL Replication Delay graph from PMM

As we can see, the lag goes down quite quickly. Is this because of the increased thread count? Or because the job which generated the many inserts finished and there are no more writes coming? (The replication delay did not go to 0 because this slave is deliberately delayed by an hour.)

            Luckily in PMM we have other graphs as well that can help us. Like this one showing InnoDB row operations:

            InnoDB Row Operations graph from PMM

             

That looks promising: the slave now inserts many more rows than usual. But how many rows were actually inserted? Let’s create a new graph to see how many rows were inserted per hour. In PMM we already have all this information; we just have to create a new graph using the following query:

            increase(mysql_global_status_innodb_row_ops_total{instance="$host",operation!="read"}[1h])

            And this is the result:

            InnoDB Row Operations per Hour graph from Percona Monitoring and Management performance monitoring tool

We can see a huge jump in the number of inserted rows per hour: it went from ~50 million to 200-400 million per hour. We can say that increasing slave_parallel_workers really helped.

            Conclusion

In this case, parallel replication was extremely useful, and we could confirm that using PMM and Performance Schema. If you tune slave_parallel_workers, check the graphs. You can show the impact to your boss. 🙂

             

             

            by Tibor Korocz at October 17, 2018 12:50 PM

            October 16, 2018

            Peter Zaitsev

            PostgreSQL locking, Part 1: Row Locks

            PostgreSQL row level locks

An understanding of PostgreSQL locking is important for building scalable applications and avoiding downtime. Modern computers and servers have many CPU cores, and it’s possible to execute multiple queries in parallel. Databases contain many structures that must stay consistent, and changes made by queries or background processes running in parallel could crash the database or even corrupt data. Thus we need the ability to prevent access from concurrent processes while changing shared memory structures or rows: one thread updates the structure while all others wait (exclusive lock), or multiple threads read the structure while all writes wait. The side effect of waits is locking contention and wasted server resources. Thus it’s important to understand why waits happen and what locks are involved. In this article, I review PostgreSQL row level locking.

            In follow up posts, I will investigate table-level locks and latches protecting internal database structures.

            Row locks – an overview

PostgreSQL has many locks at different abstraction levels. The most important locks for applications are related to the MVCC implementation – row level locking. In second place are locks appearing during maintenance tasks (backups, database migrations, schema changes) – table level locking. It’s also possible—but rare—to see waits on low level PostgreSQL locks. More often there is high CPU usage, with many concurrent queries running, but overall server performance is reduced in comparison with a normal number of queries running in parallel.

            Example environment

            To follow along, you need a PostgreSQL server with a single-column table containing several rows:

            postgres=# CREATE TABLE locktest (c INT);
            CREATE TABLE
            postgres=# INSERT INTO locktest VALUES (1), (2);
            INSERT 0 2

            Row locks

            Scenario: two concurrent transactions are trying to select a row for update.

PostgreSQL uses row-level locking in this case. Row level locking is tightly integrated with the MVCC implementation, and uses the hidden xmin and xmax fields, which store the transaction id. All statements requiring row-level locks modify the xmax field (even SELECT FOR UPDATE). The modification happens after the query returns its results, so in order to see xmax change we need to run SELECT FOR UPDATE twice. Usually, the xmax field is used to mark a row as expired—either removed by some transaction completely or in favor of an updated row version—but it is also used for the row-level locking infrastructure.

            If you need more details about the xmin and xmax hidden fields and MVCC implementation, please check our “Basic Understanding of Bloat and VACUUM in PostgreSQL” blog post.

postgres=# BEGIN;
BEGIN
postgres=# SELECT xmin,xmax, txid_current(), c FROM locktest WHERE c=1 FOR UPDATE;
             xmin | xmax | txid_current | c
            ------+------+--------------+---
              579 |  581 |          583 | 1
            (1 row)
            postgres=# SELECT xmin,xmax, txid_current(), c FROM locktest WHERE c=1 FOR UPDATE;
             xmin | xmax | txid_current | c
            ------+------+--------------+---
              579 |  583 |          583 | 1
            (1 row)

If a statement is trying to modify the same row, it checks the list of unfinished transactions. The statement has to wait for modification until the transaction with id=xmax is finished.

            There is no infrastructure for waiting on a specific row, but a transaction can wait on transaction id.

            -- second connection
            SELECT xmin,xmax,txid_current() FROM locktest WHERE c=1 FOR UPDATE;

            The SELECT FOR UPDATE query running in the second connection is unfinished, and waiting for the first transaction to complete.

            pg_locks

            Such waits and locks can be seen by querying pg_locks:

            postgres=# SELECT locktype,transactionid,virtualtransaction,pid,mode,granted,fastpath
            postgres-#  FROM pg_locks WHERE transactionid=583;
               locktype    | transactionid | virtualtransaction |  pid  |     mode      | granted | fastpath
            ---------------+---------------+--------------------+-------+---------------+---------+----------
             transactionid |           583 | 4/107              | 31369 | ShareLock     | f       | f
             transactionid |           583 | 3/11               | 21144 | ExclusiveLock | t       | f

            You can see the writer transaction id for locktype=transactionid == 583. Let’s get the pid and backend id for the holding lock:

            postgres=# SELECT id,pg_backend_pid() FROM pg_stat_get_backend_idset() AS t(id)
            postgres-#  WHERE pg_stat_get_backend_pid(id) = pg_backend_pid();
             id | pg_backend_pid
            ----+----------------
              3 |          21144

This backend has its lock granted (t). Each backend has an OS process identifier (PID) and an internal PostgreSQL identifier (backend id). PostgreSQL can process many transactions, but locking can happen only between backends, and each backend executes a single transaction. Internal bookkeeping requires just a virtual transaction identifier: a pair made up of the backend id and a sequence number inside the backend.

            Regardless of the number of rows locked, PostgreSQL will have only a single related lock in the pg_locks table. Queries might modify billions of rows but PostgreSQL does not waste memory for redundant locking structures.

A writer sets ExclusiveLock on its transactionid. All row level lock waiters set ShareLock. The lock manager wakes up all waiting backends as soon as the writer releases its lock.

            Lock release for transactionid occurs on commit or rollback.

            pg_stat_activity

            Another great method to get locking-related details is to select from the pg_stat_activity table:

            postgres=# SELECT pid,backend_xid,wait_event_type,wait_event,state,query FROM pg_stat_activity WHERE pid IN (31369,21144);
            -[ RECORD 1 ]---+---------------------------------------------------------------------------------------------------------------------------
            pid             | 21144
            backend_xid     | 583
            wait_event_type | Client
            wait_event      | ClientRead
            state           | idle in transaction
            query           | SELECT id,pg_backend_pid() FROM pg_stat_get_backend_idset() AS t(id) WHERE pg_stat_get_backend_pid(id) = pg_backend_pid();
            -[ RECORD 2 ]---+---------------------------------------------------------------------------------------------------------------------------
            pid             | 31369
            backend_xid     | 585
            wait_event_type | Lock
            wait_event      | transactionid
            state           | active
            query           | SELECT xmin,xmax,txid_current() FROM locktest WHERE c=1 FOR UPDATE;
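If you only need to know which backend is blocking which, PostgreSQL 9.6 and later can compute that for you. A hedged convenience query (not part of the original walkthrough):

postgres=# SELECT pid, pg_blocking_pids(pid) AS blocked_by, state, query
           FROM pg_stat_activity
           WHERE cardinality(pg_blocking_pids(pid)) > 0;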

            Source code-level investigation

            Let’s check the stack trace for the waiter with gdb and the pt-pmp tool:

            # pt-pmp -p 31369
            Sat Jul 28 10:10:25 UTC 2018
            30	../sysdeps/unix/sysv/linux/epoll_wait.c: No such file or directory.
                  1 epoll_wait,WaitEventSetWaitBlock,WaitEventSetWait,WaitLatchOrSocket,WaitLatch,ProcSleep,WaitOnLock,LockAcquireExtended,LockAcquire,XactLockTableWait,heap_lock_tuple,ExecLockRows,ExecProcNode,ExecutePlan,standard_ExecutorRun,PortalRunSelect,PortalRun,exec_simple_query,PostgresMain,BackendRun,BackendStartup,ServerLoop,PostmasterMain,main

The WaitOnLock function is causing the wait. The function is located in the lock.c file (the POSTGRES primary lock mechanism).

            A lock table is a shared memory hash table. The conflicting process sleeps for the lock in storage/lmgr/proc.c. For the most part, this code should be invoked via lmgr.c or another lock-management module, not directly.

Next, locks listed in pg_stat_activity as “Lock” are also called heavyweight locks, and are controlled by the lock manager. HWLocks are also used for many high level actions.

            By the way, a full description can be found here: https://www.postgresql.org/docs/current/static/explicit-locking.html

            Summary

            • Avoid long running transactions modifying frequently updated rows or too many rows
• Do not use hotspots (a single row, or a few rows, updated in parallel by many application client connections) with MVCC databases. This kind of workload is more suitable for in-memory databases and can usually be separated from the main business logic.

            by Nickolay Ihalainen at October 16, 2018 02:26 PM

            October 15, 2018

            Peter Zaitsev

            Identifying High Load Spots in MySQL Using Slow Query Log and pt-query-digest

            pt-query-digest MySQL slow queries

pt-query-digest is one of the most commonly used tools when it comes to query auditing in MySQL®. By default, pt-query-digest reports the top ten queries consuming the largest amount of time inside MySQL. A query that takes more time than the set threshold for completion is considered slow, but it’s not always true that tuning such queries makes them faster. Sometimes, when resources on the server are busy, it will impact every other operation on the server, and so will impact queries too. In such cases, you will see the proportion of slow queries go up. That can also include queries that work fine in general.

            This article explains a small trick to identify such spots using pt-query-digest and the slow query log. pt-query-digest is a component of Percona Toolkit, open source software that is free to download and use.

            Some sample data

Let’s have a look at sample data in Percona Server 5.7. The slow query log is configured to capture queries longer than ten seconds, with no limit on the rate of logging (rate limiting is generally used to throttle the IO incurred while writing slow queries to the log file).

            mysql> show variables like 'log_slow_rate%' ;
            +---------------------+---------+
| Variable_name       | Value   |
            +---------------------+---------+
            | log_slow_rate_limit | 1       |  --> Log all queries
            | log_slow_rate_type  | session |
            +---------------------+---------+
            2 rows in set (0.00 sec)
            mysql> show variables like 'long_query_time' ;
            +-----------------+-----------+
            | Variable_name   | Value     |
            +-----------------+-----------+
            | long_query_time | 10.000000 |  --> 10 seconds
            +-----------------+-----------+
            1 row in set (0.01 sec)
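For reference, a hedged sketch of applying these settings dynamically (log_slow_rate_limit and log_slow_rate_type are Percona Server variables; the values mirror the output above):

mysql> SET GLOBAL slow_query_log = 1;
mysql> SET GLOBAL long_query_time = 10;
mysql> SET GLOBAL log_slow_rate_limit = 1;   -- log every query, no sampling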

            When I run pt-query-digest, I see in the summary report that 80% of the queries have come from just three query patterns.

            # Profile
            # Rank Query ID                      Response time    Calls R/Call   V/M
            # ==== ============================= ================ ===== ======== =====
            #    1 0x7B92A64478A4499516F46891... 13446.3083 56.1%   102 131.8266  3.83 SELECT performance_schema.events_statements_history
            #    2 0x752E6264A9E73B741D3DC04F...  4185.0857 17.5%    30 139.5029  0.00 SELECT table1
            #    3 0xAFB5110D2C576F3700EE3F7B...  1688.7549  7.0%    13 129.9042  8.20 SELECT table2
            #    4 0x6CE1C4E763245AF56911E983...  1401.7309  5.8%    12 116.8109 13.45 SELECT table4
            #    5 0x85325FDF75CD6F1C91DFBB85...   989.5446  4.1%    15  65.9696 55.42 SELECT tbl1 tbl2 tbl3 tbl4
            #    6 0xB30E9CB844F2F14648B182D0...   420.2127  1.8%     4 105.0532 12.91 SELECT tbl5
            #    7 0x7F7C6EE1D23493B5D6234382...   382.1407  1.6%    12  31.8451 70.36 INSERT UPDATE tbl6
            #    8 0xBC1EE70ABAE1D17CD8F177D7...   320.5010  1.3%     6  53.4168 67.01 REPLACE tbl7
            #   10 0xA2A385D3A76D492144DD219B...   183.9891  0.8%    18  10.2216  0.00 UPDATE tbl8
            #      MISC 0xMISC                     948.6902  4.0%    14  67.7636   0.0 <10 ITEMS>

Query #1 is generated by the qan-agent from PMM and runs approximately once a minute. These results will be handed over to the PMM Server. Similarly, queries #2 & #3 are pretty simple: they scan just one row and return either zero or one rows. They also use indexing, which makes me think that this is not caused by something within MySQL itself. I wanted to know if I could find any common aspect of all these occurrences.

            Let’s take a closer look at the queries recorded in slow query log.

            # grep -B3 DIGEST mysql-slow_Oct2nd_4th.log
            ....
            ....
            # User@Host: ztrend[ztrend] @ localhost []  Id: 6431601021
            # Query_time: 139.279651  Lock_time: 64.502959 Rows_sent: 0  Rows_examined: 0
            SET timestamp=1538524947;
            SELECT DIGEST, CURRENT_SCHEMA, SQL_TEXT FROM performance_schema.events_statements_history;
            # User@Host: ztrend[ztrend] @ localhost []  Id: 6431601029
            # Query_time: 139.282594  Lock_time: 83.140413 Rows_sent: 0  Rows_examined: 0
            SET timestamp=1538524947;
            SELECT DIGEST, CURRENT_SCHEMA, SQL_TEXT FROM performance_schema.events_statements_history;
            # User@Host: ztrend[ztrend] @ localhost []  Id: 6431601031
            # Query_time: 139.314228  Lock_time: 96.679563 Rows_sent: 0  Rows_examined: 0
            SET timestamp=1538524947;
            SELECT DIGEST, CURRENT_SCHEMA, SQL_TEXT FROM performance_schema.events_statements_history;
            ....
            ....

            Now you can see two things.

            • All of them have same Unix timestamp
            • All of them were spending more than 70% of their execution time waiting for some lock.

            Analyzing the data from pt-query-digest

Now I want to check if I can group the count of queries based on their time of execution. If multiple queries at a given time are captured into the slow query log, the time is printed for the first query only, not for all of them. Fortunately, in this case I can rely on the Unix timestamp to compute the counts, since the timestamp gets captured for every query. Luckily, without a long struggle, a combination of the grep and awk utilities displayed what I wanted to display.

            # grep -A1 Query_time mysql-slow_Oct2nd_4th.log | grep SET | awk -F "=" '{ print $2 }' | uniq -c
            2   1538450797;
            1   1538524822;
            3   1538524846;
            7   1538524857;
            167 1538524947;   ---> 72% of queries have happened at this timestamp.
            1   1538551813;
            3   1538551815;
            6   1538602215;
            1   1538617599;
            33  1538631015;
            1   1538631016;
            1   1538631017;

            You can use the command below to check the regular date time format of a given timestamp. So, Oct 3, 05:32 is when there was something wrong on the server:

            # date -d @1538524947
            Wed Oct 3 05:32:27 IST 2018

Query tuning can be carried out alongside this, but identifying such spots helps avoid spending time on query tuning where badly written queries are not the problem. Having said that, from this point further troubleshooting may take different sub-paths, such as checking log files at that particular time, looking at CPU reports, reviewing past pt-stalk reports if set up to run in the background, checking dmesg, and so on. This approach is useful for identifying at what time (or time range) MySQL was more stressed, using just the slow query log, when no robust monitoring tools such as Percona Monitoring and Management (PMM) are deployed.

            Using PMM to monitor queries

            If you have PMM, you can review Query Analytics to see the topmost slow queries, along with details like execution counts, load etc. Below is a sample screen copy for your reference:

            Slow query log from PMM dashboard

NOTE: If you use Percona Server for MySQL, the slow query log can report time in microseconds. It also supports extended logging of other statistics about query execution. These provide extra insight into query processing. You can see more information about these options here.

            by Uday Varagani at October 15, 2018 01:24 PM

            October 12, 2018

            Peter Zaitsev

            Track PostgreSQL Row Changes Using Public/Private Key Signing

            PostgreSQL encryption and authorization

Authorisations and encryption/decryption within a database system establish the basic guidelines for protecting your database by guarding against malicious structural or data changes.

            What are authorisations?

Authorisations are the access privileges that mainly control what a user can and cannot do on the database server for one or more databases. So consider this to be like granting a key to unlock specific doors. Think of it as being like your five star hotel smart card: it allows you to access all the facilities that are meant for you, but doesn’t let you open every door. Privileged staff, on the other hand, have master keys which let them open any door.

Similarly, in the database world, granting permissions secures the system by allowing specific actions by specific users or user groups, yet it allows the database administrator to perform whatever action(s) on the database he/she wishes. PostgreSQL provides user management where you can create users, and grant and revoke their privileges.
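As a minimal sketch with hypothetical names (a reporting user that may read a table but not change it):

postgres=# CREATE USER reporting_user WITH PASSWORD 'secret';
postgres=# GRANT SELECT ON students TO reporting_user;
postgres=# REVOKE UPDATE, DELETE ON students FROM reporting_user;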

            Encryption

Encryption and decryption can protect your data, obfuscate schema structure, and help hide code from prying eyes. Encryption/decryption hides valuable information and ensures that there are no mischievous changes in the code or data that may be considered harmful. In almost all cases, data encryption and decryption happen on the database server. This is more like hiding your stuff somewhere in your room so that nobody can see it, but also making your stuff difficult to access.

PostgreSQL also provides encryption using pgcrypto (a PostgreSQL extension). There are some cases where you don’t want to hide the data, but don’t want people to update it either. You can revoke the privileges to modify the data.
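A minimal sketch of server-side symmetric encryption with pgcrypto (the password and data are illustrative):

postgres=# CREATE EXTENSION IF NOT EXISTS pgcrypto;
postgres=# SELECT pgp_sym_encrypt('my secret data', 'mypassword');   -- returns a bytea ciphertext
postgres=# SELECT pgp_sym_decrypt(pgp_sym_encrypt('my secret data', 'mypassword'), 'mypassword');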

            Data modifications

But what if an admin user modifies the data? How can you identify that the data has changed? If somebody changes the data and you don’t know about it, that is more dangerous than losing your data, as you are relying on data which may no longer be valid.

            Logs in database systems allow us to track back changes and “potentially” identify what was changed—unless, those logs are removed by the administrator.

            So consider if you can leave your stuff openly in your room and in case of any changes, you can identify that something was tampered with. In database terms, that translates to data without encryption, but with your very own signature. One option is to add a column to your database table which keeps a checksum for the data that is generated on the client side using the user’s own private key.  Any changes in the data would mean that checksum doesn’t match anymore, and hence, one can easily identify if the data has changed. The data signing happens on the client-side, thereby ensuring that only users with the required private key can insert the data and anyone with a public key can validate.

            Public/Private Keys

            Asymmetric cryptographic system uses pairs of keys; public keys and private keys. Private keys are known only to the owner(s). It is used for signing or decrypting data. Public keys are shared with other stakeholders who may use it to encrypt messages or validate messages signed by the owner.

            Generate Private / Public Key

            Private Key

            $ openssl genrsa -aes128 -passout pass:password -out key.private.pem
            Generating RSA private key, 2048 bit long modulus

            Public Key

            $ openssl rsa -in key.private.pem -passin pass:password -pubout -out key.public.pem
            writing RSA key

            Signing Data

Create a sample table tbl_marks and insert a sample row into it. We’ll need to add an additional column for signature verification. This will understandably increase the table size, as we are adding an additional column.

            postgres=# CREATE TABLE tbl_marks (id INTEGER, name TEXT, marks INTEGER, hash TEXT);

            Let’s add a row that we’d like to validate.

            postgres=# INSERT INTO tbl_marks VALUES(1, 'Alice', 80);

We will select the data and store the value into the query buffer using the \gset command (https://www.postgresql.org/docs/current/static/app-psql.html). The complete row will be saved into the “row” psql variable.

            postgres=# SELECT row(id,name,marks) FROM tbl_marks WHERE id = 1;
                 row   
            ---------------
            (1,Alice,80)
            (1 row)
            postgres=# \gset
            postgres=# SELECT :'row' as row;
                 row   
            ---------------
            (1,Alice,80)
            (1 row)

            Now let’s generate signature for the data stored in “row” variable.

            postgres=# \set sign_command `echo :'row' | openssl dgst -sha256 -sign key.private.pem | openssl base64 | tr -d '\n' | tr -d '\r'`
            Enter pass phrase for key.private.pem:

            The signed hash is stored into the “sign_command” psql variable. Let’s now add this to the data row in tbl_marks table.

            postgres=# UPDATE tbl_marks SET hash = :'sign_command' WHERE id = 1;
            UPDATE 1

            Validating Data

So our data row now contains data with a valid signature. Let’s try to validate it. We are going to select our data into the “row” psql variable and the signature hash into the “hash” psql variable.

            postgres=# SELECT row(id,name,marks), hash FROM tbl_marks;    
      row      | hash
---------------+-----------------------------------------------
            (1,Alice,80) | U23g3RwaZmbeZpYPmwezP5xvbIs8ILupW7jtrat8ixA ...
            (1 row)
            postgres=# \gset

            Let’s now validate the data using a public key.

            postgres=# \set verify_command `echo :'hash' | awk '{gsub(/.{65}/,"&\n")}1' | openssl base64 -d -out v && echo :'row' | openssl dgst -sha256 -verify key.public.pem -signature v`
            postgres=# select :'verify_command' as verify;
              verify    
            -------------
            Verified OK
            (1 row)

            Perfect! The data is validated and all this happened on the client side. Imagine somebody doesn’t like that Alice got 80 marks, and they decide to reduce Alice’s marks to 30. Nobody knows if the teacher had given Alice 80 or 30 unless somebody goes and checks the database logs. We’ll give Alice 30 marks now.

            postgres=# UPDATE tbl_marks SET marks = 30;
            UPDATE 1

            The school admin now decides to check that all data is correct before giving out the final results. The school admin has the teacher’s public key and tries to validate the data.

            postgres=# SELECT row(id,name,marks), hash FROM tbl_marks;
                row    | hash                                                                                                                                                                                                                                                                  
            --------------+--------------------------------------------------
            (1,Alice,30) | yO20vyPRPR+HgW9D2nMSQstRgyGmCxyS9bVVrJ8tC7nh18iYc...
            (1 row)
            postgres=# \gset

            postgres=# \set verify_command `echo :'hash' | awk '{gsub(/.{65}/,"&\n")}1' | openssl base64 -d -out v && echo :'row' | openssl dgst -sha256 -verify key.public.pem -signature v`
            postgres=# SELECT :'verify_command' AS verify;
                  verify      
            ----------------------
            Verification Failure

            As expected, the validation fails. Nobody other than the teacher had the private key to sign that data, and any tampering is easily identifiable.

This might not be the most efficient way of securing a dataset, but it is definitely an option if you want to keep the data unencrypted yet easily detect any unauthorised changes. All the load is shifted onto the client side for signing and verification, thereby reducing load on the server. It allows only users with private keys to update the data, and anybody with the associated public key to validate it.

The example used psql as the client application for signing, but you can do this from any client which can call the required openssl functions, or use the openssl binaries directly for signing and verification.

            by Ibrar Ahmed at October 12, 2018 01:43 PM

            October 11, 2018

            Peter Zaitsev

            How to Fix ProxySQL Configuration When it Won’t Start

            restart ProxySQL config

With the exception of the three configuration variables described here, ProxySQL will only parse the configuration files the first time it is started, or if the proxysql.db file is missing for some other reason.

If we want to change any of this data, we need to do so via ProxySQL’s admin interface and then save the changes to disk. That’s fine if ProxySQL is running, but what if it won’t start because of these values?
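For contrast, when ProxySQL is running, the change is a couple of statements against the admin interface (port 6032 by default); a sketch:

UPDATE global_variables SET variable_value='127.0.0.1:6033' WHERE variable_name='mysql-interfaces';
SAVE MYSQL VARIABLES TO DISK;   -- persists to proxysql.db; mysql-interfaces still needs a restart to take effect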

            For example, perhaps we accidentally configured ProxySQL to run on port 3306 and restarted it, but there’s already a production MySQL instance running on this port. ProxySQL won’t start, so we can’t edit the value that way:

            2018-10-02 09:18:33 network.cpp:53:listen_on_port(): [ERROR] bind(): Address already in use

            We could delete proxysql.db and have it reload the configuration files, but that would mean any changes we didn’t mirror into the configuration files will be lost.

            Another option is to edit ProxySQL’s database file using sqlite3:

            [root@centos7-pxc57-4 ~]# cd /var/lib/proxysql/
            [root@centos7-pxc57-4 proxysql]# sqlite3 proxysql.db
            sqlite> SELECT * FROM global_variables WHERE variable_name='mysql-interfaces';
            mysql-interfaces|127.0.0.1:3306
            sqlite> UPDATE global_variables SET variable_value='127.0.0.1:6033' WHERE variable_name='mysql-interfaces';
            sqlite> SELECT * FROM global_variables WHERE variable_name='mysql-interfaces';
            mysql-interfaces|127.0.0.1:6033

            Or if we have a few edits to make we may prefer to do so with a text editor:

            [root@centos7-pxc57-4 ~]# cd /var/lib/proxysql/
            [root@centos7-pxc57-4 proxysql]# sqlite3 proxysql.db
            sqlite> .output /tmp/global_variables
            sqlite> .dump global_variables
            sqlite> .exit

            The above commands will dump the global_variables table into a file in SQL format, which we can then edit:

            [root@centos7-pxc57-4 proxysql]# grep mysql-interfaces /tmp/global_variables
INSERT INTO "global_variables" VALUES('mysql-interfaces','127.0.0.1:3306');
            [root@centos7-pxc57-4 proxysql]# vim /tmp/global_variables
            [root@centos7-pxc57-4 proxysql]# grep mysql-interfaces /tmp/global_variables
INSERT INTO "global_variables" VALUES('mysql-interfaces','127.0.0.1:6033');

            Now we need to restore this data. We’ll use the restore command to empty the table (as we’re restoring from a missing backup):

            [root@centos7-pxc57-4 proxysql]# sqlite3 proxysql.db
            sqlite> .restore global_variables
            sqlite> .read /tmp/global_variables
            sqlite> .exit

            Once we’ve made the change, we should be able to start ProxySQL again:

            [root@centos7-pxc57-4 proxysql]# /etc/init.d/proxysql start
            Starting ProxySQL: DONE!
[root@centos7-pxc57-4 proxysql]# lsof -i | grep proxysql
            proxysql 15171 proxysql 19u IPv4 265881 0t0 TCP localhost:6033 (LISTEN)
            proxysql 15171 proxysql 20u IPv4 265882 0t0 TCP localhost:6033 (LISTEN)
            proxysql 15171 proxysql 21u IPv4 265883 0t0 TCP localhost:6033 (LISTEN)
            proxysql 15171 proxysql 22u IPv4 265884 0t0 TCP localhost:6033 (LISTEN)
            proxysql 15171 proxysql 23u IPv4 266635 0t0 TCP *:6032 (LISTEN)

            While you are here

            You might enjoy my recent post Using ProxySQL to connect to IPV6-only databases over IPV4

            You can download ProxySQL from Percona repositories, and you might also want to check out our recorded webinars that feature ProxySQL too.

            by James Lawrie at October 11, 2018 04:07 PM

            Percona Live 2019 – Save the Date!

            Austin Texas

            Austin State Capitol

            After much speculation following the announcement in Santa Clara earlier this year, we are delighted to announce Percona Live 2019 will be taking place in Austin, Texas.

            Save the dates in your diary for May, 28-30 2019!

            The conference will take place just after Memorial Day at The Hyatt Regency, Austin on the shores of Lady Bird Lake.

            This is also an ideal central location for those who wish to extend their stay and explore what Austin has to offer! Call for papers, ticket sales and sponsorship opportunities will be announced soon, so stay tuned!

            In other Percona Live news, we’re less than 4 weeks away from this year’s European conference taking place in Frankfurt, Germany on 5-7 November. The tutorials and breakout sessions have been announced, and you can view the full schedule here. Tickets are still on sale so don’t miss out, book yours here today!

             

            by Bronwyn Campbell at October 11, 2018 12:38 PM

            October 10, 2018

            Peter Zaitsev

            Percona Monitoring and Management (PMM) 1.15.0 Is Now Available

            Percona Monitoring and Management

            Percona Monitoring and Management (PMM) is a free and open-source platform for managing and monitoring MySQL® and MongoDB® performance. You can run PMM in your own environment for maximum security and reliability. It provides thorough time-based analysis for MySQL® and MongoDB® servers to ensure that your data works as efficiently as possible.

            Percona Monitoring and Management

            This release offers two new features for both the MySQL Community and Percona Customers:

            • MySQL Custom Queries – Turn a SELECT into a dashboard!
            • Server and Client logs – Collect troubleshooting logs for Percona Support

            We addressed 17 new features and improvements, and fixed 17 bugs.

            MySQL Custom Queries

In 1.15 we are introducing the ability to take a SQL SELECT statement and turn the result set into a metric series in PMM. The queries are executed at the LOW RESOLUTION level, which by default is every 60 seconds. A key advantage is that you can extend PMM to profile metrics unique to your environment (see the users table example), or to introduce support for a table that isn’t part of PMM yet. This feature is on by default and only requires that you edit the configuration file and use valid YAML syntax. The configuration file is /usr/local/percona/pmm-client/queries-mysqld.yml.

            Example – Application users table

            We’re going to take a fictional MySQL users table that also tracks the number of upvotes and downvotes, and we’ll convert this into two metric series, with a set of seven labels, where each label can also store a value.

            Browsing metrics series using Advanced Data Exploration Dashboard

Let’s look at the output first so we understand the goal: take data from a MySQL table, store it in PMM, then display it as a metric series. Using the Advanced Data Exploration Dashboard you can review your metric series. Exploring the metric series app1_users_metrics_downvotes we see the following:

            PMM Advanced Data Exploration Dashboard

            MySQL table

Let’s assume you have the following users table that includes true/false, string, and integer types.

            SELECT * FROM `users`
            +----+------+--------------+-----------+------------+-----------+---------------------+--------+---------+-----------+
            | id | app  | user_type    | last_name | first_name | logged_in | active_subscription | banned | upvotes | downvotes |
            +----+------+--------------+-----------+------------+-----------+---------------------+--------+---------+-----------+
            |  1 | app2 | unprivileged | Marley    | Bob        |         1 |                   1 |      0 |     100 |        25 |
            |  2 | app3 | moderator    | Young     | Neil       |         1 |                   1 |      1 |     150 |        10 |
            |  3 | app4 | unprivileged | OConnor   | Sinead     |         1 |                   1 |      0 |      25 |        50 |
            |  4 | app1 | unprivileged | Yorke     | Thom       |         0 |                   1 |      0 |     100 |       100 |
            |  5 | app5 | admin        | Buckley   | Jeff       |         1 |                   1 |      0 |     175 |         0 |
            +----+------+--------------+-----------+------------+-----------+---------------------+--------+---------+-----------+

            Explaining the YAML syntax

We’ll go through a simple example and mention what’s required for each line. The metric series name is constructed from the leading section name with the column name appended. Therefore the number of metric series per table will be the count of columns that are of type GAUGE or COUNTER. This metric series will be called app1_users_metrics_downvotes:

app1_users_metrics:                                 ## leading section of your metric series.
  query: "SELECT * FROM app1.users"                 ## Your query. Don't forget the schema name.
  metrics:                                          ## Required line to start the list of metric items
    - downvotes:                                    ## Name of the column returned by the query. Will be appended to the metric series.
        usage: "COUNTER"                            ## Column value type.  COUNTER will make this a metric series.
        description: "Number of downvotes"          ## Helpful description of the column.

            Full queries-mysqld.yml example

Each column in the SELECT is named in this example, but that isn’t required; you can use a SELECT * as well. Notice that the schema.table format is included in the query.

            ---
            app1_users_metrics:
              query: "SELECT app,first_name,last_name,logged_in,active_subscription,banned,upvotes,downvotes FROM app1.users"
              metrics:
                - app:
                    usage: "LABEL"
                    description: "Name of the Application"
                - user_type:
                    usage: "LABEL"
                    description: "User's privilege level within the Application"
                - first_name:
                    usage: "LABEL"
                    description: "User's First Name"
                - last_name:
                    usage: "LABEL"
                    description: "User's Last Name"
                - logged_in:
                    usage: "LABEL"
                    description: "User's logged in or out status"
                - active_subscription:
                    usage: "LABEL"
                    description: "Whether User has an active subscription or not"
                - banned:
                    usage: "LABEL"
                    description: "Whether user is banned or not"
                - upvotes:
                    usage: "COUNTER"
                    description: "Count of upvotes the User has earned.  Upvotes once granted cannot be revoked, so the number can only increase."
                - downvotes:
                    usage: "GAUGE"
                    description: "Count of downvotes the User has earned.  Downvotes can be revoked so the number can increase as well as decrease."
            ...

            We hope you enjoy this feature, and we welcome your feedback via the Percona forums!

            Server and Client logs

            We’ve enhanced the volume of data collected from both the Server and Client perspectives.  Each service provides a set of files designed to be shared with Percona Support while you work on an issue.

            Server

            From the Server, we’ve improved the logs.zip service to include:

            • Prometheus targets
            • Consul nodes, QAN API instances
            • Amazon RDS and Aurora instances
            • Version
            • Server configuration
            • Percona Toolkit commands

You retrieve the link from your PMM server using this format: https://pmmdemo.percona.com/managed/logs.zip

            Client

On the Client side we’ve added a new action called summary, which fetches logs, network information, and Percona Toolkit output in order to share it with Percona Support. To initiate a Client side collection, execute:

            pmm-admin summary

            The output will be a file you can use to attach to your Support ticket.  The single file will look something like this:

            summary__2018_10_10_16_20_00.tar.gz

            New Features and Improvements

• PMM-2913 – Provide ability to execute Custom Queries against MySQL – Credit to wrouesnel for the framework of this feature in wrouesnel/postgres_exporter!
• PMM-2904 – Improve PMM Server Diagnostics for Support
• PMM-2860 – Improve pmm-client Diagnostics for Support
• PMM-1754 – Provide functionality to easily select query and copy it to clipboard in QAN
• PMM-1855 – Add swap to AMI
• PMM-3013 – Rename PXC Overview graph Sequence numbers of transactions to IST Progress
• PMM-2726 – Abort data collection in Exporters based on Prometheus Timeout – MySQLd Exporter
• PMM-3003 – PostgreSQL Overview Dashboard Tooltip fixes
• PMM-2936 – Some improvements for Query Analytics Settings screen
• PMM-3029 – PostgreSQL Dashboard Improvements

            Fixed Bugs

• PMM-2976 – Upgrading to PMM 1.14.x fails if dashboards from Grafana 4.x are present on an installation
• PMM-2969 – rds_exporter becomes throttled by CloudWatch API
• PMM-1443 – The credentials for a secured server are exposed without explicit request
• PMM-3006 – Monitoring over 1000 instances is displayed imperfectly on the label
• PMM-3011 – PMM’s default MongoDB DSN is localhost, which is not resolved to IPv4 on modern systems
• PMM-2211 – Bad display when using old range in QAN
• PMM-1664 – Infinite loading with wrong queryID
• PMM-2715 – Since pmm-client-1.9.0, pmm-admin detects CentOS/RHEL 6 installations using linux-upstart as service manager and ignores SysV scripts
• PMM-2839 – Tablestats safety precaution does not work for RDS/Aurora instances
• PMM-2845 – pmm-admin purge causes client to panic
• PMM-2968 – pmm-admin list shows empty data source column for mysql:metrics
• PMM-3043 – Total Time percentage is incorrectly shown as a decimal fraction
• PMM-3082 – Prometheus Scrape Interval Variance chart doesn’t display data

            How to get PMM Server

PMM is available for installation using three methods:

• On Docker Hub – docker pull percona/pmm-server
• Through the AWS Marketplace, as an Amazon Machine Image (AMI)
• As an Open Virtualization Format (OVF) image

            Help us improve our software quality by reporting any Percona Monitoring and Management bugs you encounter using our bug tracking system.

            by Dmitriy Kostiuk at October 10, 2018 06:31 PM

            MariaDB AB

            MariaDB OpenWorks 2019 – Early-Bird Registration Now Open


            MariaDB OpenWorks, our 2019 user and developer conference, takes place in February – but now is the best time to get your passes. Early-bird registration recently opened. Don't miss this opportunity for the lowest rates and first dibs on workshop seats. 

            Early-bird savings of 30% (or more)

            By reserving your passes now, you save 30% off regular prices. That means $175 for the 1-day Workshops pass, $350 for the 2-day Conference pass, and $525 for the 3-day All-Access pass. At those rates, your attendance is easy to justify – plus, you can save an additional 20% by registering a group of three or more people. Prices go up on December 1, so be sure to take advantage of early-bird pricing.

            Guaranteed access to in-demand workshops

            MariaDB OpenWorks 2019 features a full day dedicated to hands-on workshops. Learn directly from MariaDB engineers, remote DBAs and consultants – ask questions, do the labs and increase your MariaDB expertise. Workshops sold out at the 2018 conference, so claim your seats now for the workshops that will best help you achieve your business goals:

            • High availability
            • Containerization
            • High-performance analytics
            • Advanced security
            • Performance optimization

            Something for everyone

            MariaDB OpenWorks 2019 will offer 45+ sessions on a wide range of topics, from the latest MariaDB product and community innovations to real-world customer stories, ops tools and processes, insights for developers, and the power of analytics.

            Check out the 1-minute recap video from the 2018 conference and picture yourself joining the festivities in 2019.

            We hope to see you at MariaDB OpenWorks in New York, February 25-27, 2019. Don't forget to register early for 30% off!



            by MariaDB Team at October 10, 2018 06:16 PM

            Peter Zaitsev

            Instrumenting Read Only Transactions in InnoDB

Probably not well known, but quite an important optimization was introduced in MySQL 5.6: reduced overhead for “read only transactions”. While usually by a “transaction” we mean a query or a group of queries that change data, with transactional engines like InnoDB, every data read or write operation is a transaction.

Now, as a non-locking read operation obviously has less impact on the data, it does not need all the instrumentation overhead a write transaction has. The main thing that can be avoided, as described in the documentation, is assigning a transaction ID. So, since MySQL 5.6, a read only transaction does not have a transaction ID. Moreover, such a transaction is not visible in the SHOW ENGINE INNODB STATUS output, though I will not go deeper into what that really means under the hood in this article. The fact is that this optimization allows for better scaling of workloads with many RO threads. An example RO benchmark, where the difference between 5.5 and 5.6/5.7 is clearly visible, may be found here: https://www.percona.com/blog/2016/04/07/mysql-5-7-sysbench-oltp-read-results-really-faster/

            To benefit from this optimization in MySQL 5.6, either a transaction has to start with the explicit START TRANSACTION READ ONLY clause or it must be an autocommit, non-locking SELECT statement. In version 5.7 and newer, it goes further, as a new transaction is treated as read-only until a locking read or write is executed, at which point it gets “upgraded” to a read-write one.

            Information Schema Instrumentation

Let’s see how this looks (on MySQL 8.0.12) by examining the information_schema.innodb_trx and information_schema.innodb_metrics tables. The second of these has the transaction counters disabled by default, so before the test we have to enable them with:

            SET GLOBAL innodb_monitor_enable = 'trx%comm%';

or by adding a parameter to the [mysqld] section of the configuration file and restarting the instance:

[mysqld]
innodb_monitor_enable = "trx_%"
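As a quick sanity check, you can confirm which counters are now enabled, and reset their values between tests; both statements below are standard MySQL:

-- Show the transaction commit counters and whether they are enabled
SELECT name, status FROM information_schema.innodb_metrics WHERE name LIKE 'trx%comm%';

-- Reset the counter values between tests (the counters stay enabled)
SET GLOBAL innodb_monitor_reset_all = 'trx%comm%';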

            Now, let’s start a transaction which should be read only according to the rules:

            mysql [localhost] {msandbox} (db1) > START TRANSACTION; SELECT count(*) FROM db1.t1;
            Query OK, 0 rows affected (0.00 sec)
            +----------+
            | count(*) |
            +----------+
            |        3 |
            +----------+
1 row in set (0.00 sec)
            mysql [localhost] {msandbox} (db1) > SELECT trx_id,trx_weight,trx_rows_locked,trx_rows_modified,trx_is_read_only,trx_autocommit_non_locking
            FROM information_schema.innodb_trx\G
            *************************** 1. row ***************************
                                trx_id: 421988493944672
                            trx_weight: 0
                       trx_rows_locked: 0
                     trx_rows_modified: 0
                      trx_is_read_only: 0
            trx_autocommit_non_locking: 0
            1 row in set (0.00 sec)

The transaction started as above did not appear in SHOW ENGINE INNODB STATUS, and its trx_id looks strangely high. And the first surprise: for some reason, trx_is_read_only is 0. Now, what happens to the counters if we commit such a transaction? (I reset them before the test):

            mysql [localhost] {msandbox} (db1) > commit;
            Query OK, 0 rows affected (0.00 sec)
            mysql [localhost] {msandbox} (db1) > SELECT name, comment, status, count
            FROM information_schema.innodb_metrics   WHERE name like 'trx%comm%';
            +---------------------------+--------------------------------------------------------------------+---------+-------+
            | name                      | comment                                                            | status  | count |
            +---------------------------+--------------------------------------------------------------------+---------+-------+
            | trx_rw_commits            | Number of read-write transactions  committed                       | enabled |     0 |
            | trx_ro_commits            | Number of read-only transactions committed                         | enabled |     1 |
            | trx_nl_ro_commits         | Number of non-locking auto-commit read-only transactions committed | enabled |     0 |
            | trx_commits_insert_update | Number of transactions committed with inserts and updates          | enabled |     0 |
            +---------------------------+--------------------------------------------------------------------+---------+-------+
            4 rows in set (0.01 sec)

OK, so clearly it was counted as a read-only transaction overall; it’s just that the trx_is_read_only property wasn’t set as expected. I reported this problem here: https://bugs.mysql.com/bug.php?id=92558

            What about an explicit RO transaction:

            mysql [localhost] {msandbox} (db1) > START TRANSACTION READ ONLY; SELECT count(*) FROM db1.t1;
            Query OK, 0 rows affected (0.00 sec)
            +----------+
            | count(*) |
            +----------+
            |        3 |
            +----------+
1 row in set (0.00 sec)
            mysql [localhost] {msandbox} (db1) > SELECT trx_id,trx_weight,trx_rows_locked,trx_rows_modified,trx_is_read_only,trx_autocommit_non_locking
            FROM information_schema.innodb_trx\G
            *************************** 1. row ***************************
                                trx_id: 421988493944672
                            trx_weight: 0
                       trx_rows_locked: 0
                     trx_rows_modified: 0
                      trx_is_read_only: 1
            trx_autocommit_non_locking: 0
            1 row in set (0.00 sec)
            mysql [localhost] {msandbox} (db1) > commit;
            Query OK, 0 rows affected (0.00 sec)
            mysql [localhost] {msandbox} (db1) > SELECT name, comment, status, count
            FROM information_schema.innodb_metrics   WHERE name like 'trx%comm%';
            +---------------------------+--------------------------------------------------------------------+---------+-------+
            | name                      | comment                                                            | status  | count |
            +---------------------------+--------------------------------------------------------------------+---------+-------+
            | trx_rw_commits            | Number of read-write transactions  committed                       | enabled |     0 |
            | trx_ro_commits            | Number of read-only transactions committed                         | enabled |     2 |
            | trx_nl_ro_commits         | Number of non-locking auto-commit read-only transactions committed | enabled |     0 |
            | trx_commits_insert_update | Number of transactions committed with inserts and updates          | enabled |     0 |
            +---------------------------+--------------------------------------------------------------------+---------+-------+
            4 rows in set (0.01 sec)

            OK, both transactions are counted as the same type. Moreover, the two transactions shared the same strange trx_id, which appears to be a fake one. For a simple read executed in autocommit mode, the counters increase as expected too:

            mysql [localhost] {msandbox} (db1) > select @@autocommit; SELECT count(*) FROM db1.t1;
            +--------------+
            | @@autocommit |
            +--------------+
            |            1 |
            +--------------+
            1 row in set (0.00 sec)
            +----------+
            | count(*) |
            +----------+
            |        3 |
            +----------+
            1 row in set (0.00 sec)
            mysql [localhost] {msandbox} (db1) > SELECT name, comment, status, count
            FROM information_schema.innodb_metrics   WHERE name like 'trx%comm%';
            +---------------------------+--------------------------------------------------------------------+---------+-------+
            | name                      | comment                                                            | status  | count |
            +---------------------------+--------------------------------------------------------------------+---------+-------+
            | trx_rw_commits            | Number of read-write transactions  committed                       | enabled |     0 |
            | trx_ro_commits            | Number of read-only transactions committed                         | enabled |     2 |
            | trx_nl_ro_commits         | Number of non-locking auto-commit read-only transactions committed | enabled |     1 |
            | trx_commits_insert_update | Number of transactions committed with inserts and updates          | enabled |     0 |
            +---------------------------+--------------------------------------------------------------------+---------+-------+
            4 rows in set (0.00 sec)

            Now, let’s test how a transaction looks when we upgrade it to RW later:

            mysql [localhost] {msandbox} (db1) > START TRANSACTION; SELECT count(*) FROM db1.t1;
            Query OK, 0 rows affected (0.00 sec)
            +----------+
            | count(*) |
            +----------+
            |        3 |
            +----------+
            1 row in set (0.00 sec)
            mysql [localhost] {msandbox} (db1) > SELECT trx_id,trx_weight,trx_rows_locked,trx_rows_modified,trx_is_read_only,trx_autocommit_non_locking
            FROM information_schema.innodb_trx\G
            *************************** 1. row ***************************
                                trx_id: 421988493944672
                            trx_weight: 0
                       trx_rows_locked: 0
                     trx_rows_modified: 0
                      trx_is_read_only: 0
            trx_autocommit_non_locking: 0
            1 row in set (0.00 sec)
            mysql [localhost] {msandbox} (db1) > SELECT count(*) FROM db1.t1 FOR UPDATE;
            +----------+
            | count(*) |
            +----------+
            |        3 |
            +----------+
            1 row in set (0.00 sec)
            mysql [localhost] {msandbox} (db1) > SELECT trx_id,trx_weight,trx_rows_locked,trx_rows_modified,trx_is_read_only,trx_autocommit_non_locking
            FROM information_schema.innodb_trx\G
            *************************** 1. row ***************************
                                trx_id: 4106
                            trx_weight: 2
                       trx_rows_locked: 4
                     trx_rows_modified: 0
                      trx_is_read_only: 0
            trx_autocommit_non_locking: 0
            1 row in set (0.00 sec)
            mysql [localhost] {msandbox} (db1) > commit;
            Query OK, 0 rows affected (0.00 sec)
            mysql [localhost] {msandbox} (db1) > SELECT name, comment, status, count
            FROM information_schema.innodb_metrics   WHERE name like 'trx%comm%';
            +---------------------------+--------------------------------------------------------------------+---------+-------+
            | name                      | comment                                                            | status  | count |
            +---------------------------+--------------------------------------------------------------------+---------+-------+
            | trx_rw_commits            | Number of read-write transactions  committed                       | enabled |     1 |
            | trx_ro_commits            | Number of read-only transactions committed                         | enabled |     2 |
            | trx_nl_ro_commits         | Number of non-locking auto-commit read-only transactions committed | enabled |     1 |
            | trx_commits_insert_update | Number of transactions committed with inserts and updates          | enabled |     0 |
            +---------------------------+--------------------------------------------------------------------+---------+-------+
            4 rows in set (0.00 sec)

            OK, as seen above, after a locking read was done, our transaction has transformed: it got a real, unique trx_id assigned. Then, when committed, the RW counter increased.

            Performance Schema Problem

            Nowadays it may feel natural to use performance_schema for monitoring everything. And, indeed, we can monitor types of transactions with it as well. Let’s enable the needed consumers and instruments:

            mysql [localhost] {msandbox} (db1) > UPDATE performance_schema.setup_consumers SET ENABLED = 'YES' WHERE NAME LIKE '%transactions%';
            Query OK, 0 rows affected (0.00 sec)
            Rows matched: 3  Changed: 0  Warnings: 0
            mysql [localhost] {msandbox} (db1) > UPDATE performance_schema.setup_instruments SET ENABLED = 'YES', TIMED = 'YES' WHERE NAME = 'transaction';
            Query OK, 0 rows affected (0.01 sec)
            Rows matched: 1  Changed: 0  Warnings: 0
            mysql [localhost] {msandbox} (db1) > SELECT * FROM performance_schema.setup_instruments WHERE NAME = 'transaction';
            +-------------+---------+-------+------------+------------+---------------+
            | NAME        | ENABLED | TIMED | PROPERTIES | VOLATILITY | DOCUMENTATION |
            +-------------+---------+-------+------------+------------+---------------+
            | transaction | YES     | YES   |            |          0 | NULL          |
            +-------------+---------+-------+------------+------------+---------------+
            1 row in set (0.00 sec)
            mysql [localhost] {msandbox} (db1) > SELECT * FROM performance_schema.setup_consumers WHERE NAME LIKE '%transactions%';
            +----------------------------------+---------+
            | NAME                             | ENABLED |
            +----------------------------------+---------+
            | events_transactions_current      | YES     |
            | events_transactions_history      | YES     |
            | events_transactions_history_long | YES     |
            +----------------------------------+---------+
            3 rows in set (0.01 sec)
            mysql [localhost] {msandbox} (db1) > SELECT COUNT_STAR,COUNT_READ_WRITE,COUNT_READ_ONLY
            FROM performance_schema.events_transactions_summary_global_by_event_name\G
            *************************** 1. row ***************************
                  COUNT_STAR: 0
            COUNT_READ_WRITE: 0
             COUNT_READ_ONLY: 0
            1 row in set (0.00 sec)

            And let’s do some simple tests:

            mysql [localhost] {msandbox} (db1) > START TRANSACTION; COMMIT;
            Query OK, 0 rows affected (0.01 sec)
            Query OK, 0 rows affected (0.00 sec)
            mysql [localhost] {msandbox} (db1) > SELECT COUNT_STAR,COUNT_READ_WRITE,COUNT_READ_ONLY
            FROM performance_schema.events_transactions_summary_global_by_event_name\G
            *************************** 1. row ***************************
                  COUNT_STAR: 1
            COUNT_READ_WRITE: 1
             COUNT_READ_ONLY: 0
            1 row in set (0.00 sec)
            mysql [localhost] {msandbox} (db1) > SELECT name, comment, status, count
            FROM information_schema.innodb_metrics   WHERE name like 'trx%comm%';
            +---------------------------+--------------------------------------------------------------------+---------+-------+
            | name                      | comment                                                            | status  | count |
            +---------------------------+--------------------------------------------------------------------+---------+-------+
            | trx_rw_commits            | Number of read-write transactions  committed                       | enabled |     0 |
            | trx_ro_commits            | Number of read-only transactions committed                         | enabled |     0 |
            | trx_nl_ro_commits         | Number of non-locking auto-commit read-only transactions committed | enabled |     0 |
            | trx_commits_insert_update | Number of transactions committed with inserts and updates          | enabled |     0 |
            +---------------------------+--------------------------------------------------------------------+---------+-------+
            4 rows in set (0.00 sec)

An empty transaction caused an increase to the RW counter in the Performance Schema view! Moreover, a simple autocommit SELECT increases it too:

            mysql [localhost] {msandbox} (db1) > SELECT count(*) FROM db1.t1;
            +----------+
            | count(*) |
            +----------+
            |        3 |
            +----------+
            1 row in set (0.01 sec)
            mysql [localhost] {msandbox} (db1) > SELECT COUNT_STAR,COUNT_READ_WRITE,COUNT_READ_ONLY
            FROM performance_schema.events_transactions_summary_global_by_event_name\G
            *************************** 1. row ***************************
                  COUNT_STAR: 2
            COUNT_READ_WRITE: 2
             COUNT_READ_ONLY: 0
            1 row in set (0.00 sec)
            mysql [localhost] {msandbox} (db1) > START TRANSACTION READ ONLY; COMMIT;
            Query OK, 0 rows affected (0.00 sec)
            Query OK, 0 rows affected (0.00 sec)
            mysql [localhost] {msandbox} (db1) > SELECT COUNT_STAR,COUNT_READ_WRITE,COUNT_READ_ONLY
            FROM performance_schema.events_transactions_summary_global_by_event_name\G
            *************************** 1. row ***************************
                  COUNT_STAR: 3
            COUNT_READ_WRITE: 2
             COUNT_READ_ONLY: 1
            1 row in set (0.00 sec)
            mysql [localhost] {msandbox} (db1) > SELECT name, comment, status, count
            FROM information_schema.innodb_metrics   WHERE name like 'trx%comm%';
            +---------------------------+--------------------------------------------------------------------+---------+-------+
            | name                      | comment                                                            | status  | count |
            +---------------------------+--------------------------------------------------------------------+---------+-------+
            | trx_rw_commits            | Number of read-write transactions  committed                       | enabled |     0 |
            | trx_ro_commits            | Number of read-only transactions committed                         | enabled |     0 |
            | trx_nl_ro_commits         | Number of non-locking auto-commit read-only transactions committed | enabled |     1 |
            | trx_commits_insert_update | Number of transactions committed with inserts and updates          | enabled |     0 |
            +---------------------------+--------------------------------------------------------------------+---------+-------+
            4 rows in set (0.01 sec)

As seen above, monitoring transactions via Performance Schema seems completely broken: empty transactions increase the counters, and the only way to increase the RO counter is to start a read-only transaction explicitly. But again, it should not count when no real read was done from a table. For this reason I filed another bug report: https://bugs.mysql.com/bug.php?id=92364

            PMM Dashboard

We implemented a transactions information view in PMM, based on information_schema.innodb_metrics, which, as presented above, is reliable and shows the correct counters. Therefore, I encourage everyone to use the innodb_monitor_enable setting to enable the counters and have PMM graph them.

            by Przemysław Malkowski at October 10, 2018 03:09 PM

            MongoDB Replica set Scenarios and Internals

The MongoDB® replica set is a group of nodes with one node set as the primary, and all other nodes set as secondaries. Only the primary node accepts “write” operations, while the other nodes can only serve “read” operations, according to the read preferences defined. In this blog post, we’ll focus on some MongoDB replica set scenarios, and take a look at the internals.

            Example configuration

            We will refer to a three node replica set that includes one primary node and two secondary nodes running as:

            "members" : [
            {
            "_id" : 0,
            "name" : "192.168.103.100:25001",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 3533,
            "optime" : {
            "ts" : Timestamp(1537800584, 1),
            "t" : NumberLong(1)
            },
            "optimeDate" : ISODate("2018-09-24T14:49:44Z"),
            "electionTime" : Timestamp(1537797392, 2),
            "electionDate" : ISODate("2018-09-24T13:56:32Z"),
            "configVersion" : 3,
            "self" : true
            },
            {
            "_id" : 1,
            "name" : "192.168.103.100:25002",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 3063,
            "optime" : {
            "ts" : Timestamp(1537800584, 1),
            "t" : NumberLong(1)
            },
            "optimeDurable" : {
            "ts" : Timestamp(1537800584, 1),
            "t" : NumberLong(1)
            },
            "optimeDate" : ISODate("2018-09-24T14:49:44Z"),
            "optimeDurableDate" : ISODate("2018-09-24T14:49:44Z"),
            "lastHeartbeat" : ISODate("2018-09-24T14:49:45.539Z"),
            "lastHeartbeatRecv" : ISODate("2018-09-24T14:49:44.664Z"),
            "pingMs" : NumberLong(0),
            "syncingTo" : "192.168.103.100:25001",
            "configVersion" : 3
            },
            {
            "_id" : 2,
            "name" : "192.168.103.100:25003",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 2979,
            "optime" : {
            "ts" : Timestamp(1537800584, 1),
            "t" : NumberLong(1)
            },
            "optimeDurable" : {
            "ts" : Timestamp(1537800584, 1),
            "t" : NumberLong(1)
            },
            "optimeDate" : ISODate("2018-09-24T14:49:44Z"),
            "optimeDurableDate" : ISODate("2018-09-24T14:49:44Z"),
            "lastHeartbeat" : ISODate("2018-09-24T14:49:45.539Z"),
            "lastHeartbeatRecv" : ISODate("2018-09-24T14:49:44.989Z"),
            "pingMs" : NumberLong(0),
            "syncingTo" : "192.168.103.100:25002",
            "configVersion" : 3
}
]

            Here, the primary is running on port 25001, and the two secondaries are running on ports 25002 and 25003 on the same host.

            Secondary nodes can only sync from Primary?

No, it’s not mandatory. Each secondary can replicate data either from the primary or from any other secondary node. This is known as chaining, and it is enabled by default.

In the above replica set, you can see that the secondary node "_id" : 2 is syncing from another secondary node, "_id" : 1, as shown by "syncingTo" : "192.168.103.100:25002".

This can also be found in the logs, where the replica set settings show chainingAllowed: true as the default:

            settings: { chainingAllowed: true, heartbeatIntervalMillis: 2000, heartbeatTimeoutSecs: 10, electionTimeoutMillis: 10000, catchUpTimeoutMillis: 60000, getLastErrorModes: {}, getLastErrorDefaults: { w: 1, wtimeout: 0 }, replicaSetId: ObjectId('5ba8ed10d4fddccfedeb7492') } }

            Chaining?

That means that a secondary member node is able to replicate from another secondary member node instead of from the primary node. This helps to reduce the load on the primary. If the resulting replication lag is not tolerable, chaining can be disabled, as sketched below.

            For more details about chaining and the steps to disable it please refer to my earlier blog post here.
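As a quick illustration of the reconfiguration involved (a sketch only; the member addresses and remaining settings come from your own set), chaining can be switched off like this from the mongo shell:

// Fetch the current replica set configuration
cfg = rs.config()
// Disable chained replication: secondaries will sync from the primary only
cfg.settings.chainingAllowed = false
// Apply the modified configuration
rs.reconfig(cfg)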

            Ok, then how does the secondary node select the source to sync from?

            If Chaining is False

When chaining is explicitly set to false, the secondary node will sync from the primary node only, unless this is temporarily overridden, as shown below.
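Such a temporary override can be done with the replSetSyncFrom admin command; a small sketch, using one of the member addresses from the example set above:

// Temporarily point this node at a specific sync source; the override
// lasts until the node restarts or the chosen source becomes invalid
db.adminCommand({ replSetSyncFrom: "192.168.103.100:25001" })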

            If Chaining is True

            • Before choosing any sync node, TopologyCoordinator performs validations like:
              • Whether chaining is set to true or false.
              • If that particular node is part of the current replica set configurations.
  • It identifies a node whose oplog is ahead of its own and which has the lowest ping time.
              • The source code that includes validation is here.
            • Once the validation is done, SyncSourceSelector relies on SyncSourceResolver which contains the result and details for the new sync source
            • To get the details and response, SyncSourceResolver coordinates with ReplicationCoordinator
            • This ReplicationCoordinator is responsible for the replication, and co-ordinates with TopologyCoordinator
            • The TopologyCoordinator is responsible for topology of the cluster. It finds the primary oplog time and checks for the maxSyncSourceLagSecs
• It will reject a candidate sync source whose newest oplog entry lags behind by more than maxSyncSourceLagSecs. The code for this can be found here
            • If the criteria for the source selection is not fulfilled, then BackgroundSync thread waits and restarts the whole process again to get the sync source.

Example: “unable to find a member to sync from”, then finding a candidate on the next attempt

This can be found in the log as shown below. On receiving the message "could not find member to sync from" from the rsBackgroundSync thread, the whole internal process restarts and finds a member to sync from, i.e. "sync source candidate: 192.168.103.100:25001", which means it is now syncing from the node 192.168.103.100 running on port 25001.

            2018-09-24T13:58:43.197+0000 I REPL     [rsSync] transition to RECOVERING
            2018-09-24T13:58:43.198+0000 I REPL     [rsBackgroundSync] could not find member to sync from
            2018-09-24T13:58:43.201+0000 I REPL     [rsSync] transition to SECONDARY
            2018-09-24T13:58:59.208+0000 I REPL     [rsBackgroundSync] sync source candidate: 192.168.103.100:25001

            • Once the sync source node is selected, SyncSourceResolver probes the sync source to confirm that it is able to fetch the oplogs.
• The RollbackID, i.e. rbid, is also fetched after the first batch is returned by the OplogFetcher.
• If all eligible sync sources are too fresh, such as during initial sync, then the sync source status is “Oplog start is missing” and earliestOpTimeSeen will set a new minValid.
            • This minValid is also set in the case of rollback and abrupt shutdown.
            • If the node has a minValid entry then this is checked for the eligible sync source node.

            Example showing the selection of a new sync source when the existing source is found to be invalid

Here, as the logs show, the node chooses a new sync source during sync. This is because it found that the original sync source is not ahead, so it does not contain newer oplog entries to sync from.

            2018-09-25T15:20:55.424+0000 I REPL     [replication-1] Choosing new sync source because our current sync source, 192.168.103.100:25001, has an OpTime ({ ts: Timestamp 1537879296000|1, t: 4 }) which is not ahead of ours ({ ts: Timestamp 1537879296000|1, t: 4 }), it does not have a sync source, and it's not the primary (sync source does not know the primary)

            2018-09-25T15:20:55.425+0000 W REPL [rsBackgroundSync] Fetcher stopped querying remote oplog with error: InvalidSyncSource: sync source 192.168.103.100:25001 (config version: 3; last applied optime: { ts: Timestamp 1537879296000|1, t: 4 }; sync source index: -1; primary index: -1) is no longer valid

• If the secondary node is too far behind the eligible sync source node, then the node will enter maintenance mode, and a resync will need to be called manually.
            • Once the sync source is chosen, BackgroundSync starts oplogFetcher.

            Example for oplogFetcher

Here is an example of fetching oplog entries from the “oplog.rs” collection, filtering for entries with a timestamp greater than or equal to the required one.

            2018-09-26T10:35:07.372+0000 I COMMAND  [conn113] command local.oplog.rs command: getMore { getMore: 20830044306, collection: "oplog.rs", maxTimeMS: 5000, term: 7, lastKnownCommittedOpTime: { ts: Timestamp 1537955038000|1, t: 7 } } originatingCommand: { find: "oplog.rs", filter: { ts: { $gte: Timestamp 1537903865000|1 } }, tailable: true, oplogReplay: true, awaitData: true, maxTimeMS: 60000, term: 7, readConcern: { afterOpTime: { ts: Timestamp 1537903865000|1, t: 6 } } } planSummary: COLLSCAN cursorid:20830044306 keysExamined:0 docsExamined:0 numYields:1 nreturned:0 reslen:451 locks:{ Global: { acquireCount: { r: 6 } }, Database: { acquireCount: { r: 3 } }, oplog: { acquireCount: { r: 3 } } } protocol:op_command 3063398ms
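The same filter can be reproduced manually from the shell; a small sketch, using the $gte timestamp from the originating “find” above:

rplint:PRIMARY> use local
rplint:PRIMARY> db.oplog.rs.find({ ts: { $gte: Timestamp(1537903865, 1) } })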

When, and what details, do replica set nodes communicate with each other?

At a regular interval, all the nodes communicate with each other to check the status of the primary node, check the status of the sync source, get the oplogs, and so on.

ReplicationCoordinator has a ReplicaSetConfig that contains a list of all the replica set nodes, and each node has a copy of it. This makes nodes aware of the other nodes under the same replica set.

            This is how nodes communicate in more detail:

Heartbeats: This checks the status of other nodes, i.e. alive or dead

            heartbeatInterval: Every node, at an interval of two seconds, sends the other nodes a heartbeat to make them aware that “yes I am alive!”

heartbeatTimeoutSecs: This is a timeout meaning that, if a heartbeat is not returned within 10 seconds, that node is marked as inaccessible, or simply dead.

            Every heartbeat is identified by these replica set details:

            • replica set config version
            • replica set name
            • Sender host address
            • id from the replicasetconfig

The source code can be referred to here.

When the remote node receives the heartbeat, it processes this data and validates whether the details are correct. It then prepares a ReplSetHeartbeatResponse that includes:

            • Name of the replica set, config version, and optime details
            • Details about primary node as per the receiving node.
            • Sync source details and state of receiving node

            This heartbeat data is processed, and if primary details are found then the election gets postponed.

TopologyCoordinator checks the heartbeat data and confirms whether the node is OK or not. If the node is OK, then no action is taken. Otherwise, the replica set needs to be reconfigured, or a priority takeover is initiated, based on the config.

            Response from oplog fetcher

To get the oplogs from the sync source, nodes communicate with each other. The oplog fetcher fetches oplogs through “find” and “getMore” commands. This only affects the downstream node, which gets metadata from its sync source and uses it to update its view of the replica set.

OplogQueryMetadata only comes with OplogFetcher responses, while ReplSetMetadata comes with all the replica set details, including the config version, and with replication commands.

Communication to update position:

This is to get an update on replication progress. ReplicationCoordinatorExternalState creates SyncSourceFeedback, which sends replSetUpdatePosition commands.

These include oplog details, the replica set config version, and replica set metadata.

            If a new node is added to the existing replica set, how will that node get the data?

            If a new node is added to the existing replica set then the “initial sync” process takes place. This initial sync can be done in two ways:

1. Just add the new node to the replica set and let the initial sync threads restore the data. Then it syncs from the oplogs until it reaches the secondary state.
            2. Copy the data from the recent data directory to the node, and restart this new node. Then it will also sync from the oplogs until it reaches the secondary state.

            This is how it works internally

When “initial sync” or “resync” is called by ReplicationCoordinator, the node goes into the “STARTUP2” state, and this initial sync is done in DataReplicator.

            • A sync source is selected to get the data from, then it drops all the databases except the local database, and oplogs are recreated.
            • DatabasesCloner asks syncsource for a list of the databases, and for each database it creates DatabaseCloner.
            • For each DatabaseCloner it creates CollectionCloner to clone the collections
            • This CollectionCloner calls ListIndexes on the syncsource and creates a CollectionBulkLoader for parallel index creation while data cloning
• The node also checks for the sync source rollback id. If a rollback occurred, then it restarts the initial sync. Otherwise, DataReplicator is done with its work, and ReplicationCoordinator assumes the role for ongoing replication.

            Example for the “initial sync” :

Here the node enters the "STARTUP2" state, logged as "transition to STARTUP2".

Then the sync source gets selected, and all the databases except the local database are dropped. Next, the replication oplog is created and CollectionCloner is called.

Why is the local database not dropped? Every node has its own “local” database, holding information about itself and the other nodes from its own perspective, so this database is not replicated to other nodes.

            2018-09-26T17:57:09.571+0000 I REPL     [ReplicationExecutor] transition to STARTUP2
            2018-09-26T17:57:14.589+0000 I REPL     [replication-1] sync source candidate: 192.168.103.100:25003
            2018-09-26T17:57:14.590+0000 I STORAGE  [replication-1] dropAllDatabasesExceptLocal 1
2018-09-26T17:57:14.592+0000 I REPL     [replication-1] creating replication oplog of size: 990MB...
2018-09-26T17:57:14.633+0000 I REPL     [replication-0] CollectionCloner::start called, on ns:admin.system.version

            Finished fetching all the oplogs, and finishing up initial sync.

            2018-09-26T17:57:15.685+0000 I REPL     [replication-0] Finished fetching oplog during initial sync: CallbackCanceled: Callback canceled. Last fetched optime and hash: { ts: Timestamp 1537984626000|1, t: 9 }[-1139925876765058240]
            2018-09-26T17:57:15.685+0000 I REPL     [replication-0] Initial sync attempt finishing up.

            What are oplogs and where do these reside?

“oplog” stands for “operation log”. We have used this term many times in this blog post, as these are the mandatory logs for the replica set. The operations are stored in a capped collection called “oplog.rs” that resides in the “local” database.

Below is how oplogs are stored in the collection “oplog.rs”, with details of the timestamp, operation, namespace, and output:

            rplint:PRIMARY> use local
            rplint:PRIMARY> show collections
            oplog.rs
            rplint:PRIMARY> db.oplog.rs.findOne()
            {
             "ts" : Timestamp(1537797392, 1),
             "h" : NumberLong("-169301588285533642"),
             "v" : 2,
             "op" : "n",
             "ns" : "",
             "o" : {
             "msg" : "initiating set"
             }
            }

It consists of the rolling update operations coming into the database. These oplogs are then replicated to the secondary node(s) to maintain the high availability of the data in case of failover.

When a MongoDB replica instance starts, it creates an oplog of default size. For WiredTiger, the default size is 5% of the disk space, with a lower bound of 990MB. So here, in the example, it creates an oplog of 990MB. If you’d like to learn more about oplog sizing, please refer here

            2018-09-26T17:57:14.592+0000 I REPL     [replication-1] creating replication oplog of size: 990MB...
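As a side note, you can check the configured oplog size, and the time window it currently covers, directly from the shell with the standard helper:

rplint:PRIMARY> db.printReplicationInfo()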

            What if the same oplog is applied multiple times, will that not lead to inconsistent data?

Fortunately, oplog entries are idempotent: the value will remain unchanged, producing the same output, even when an entry is applied multiple times.

            Let’s check an example:

Consider the $inc operator, which increments the value of the field “item” by 1. If this operation were written to the oplog as-is and the entry applied multiple times, the result could be an inconsistent record. However, rather than increasing the item value multiple times, the change is effectively applied only once.

            rplint:PRIMARY> use db1
            //inserting one document
            rplint:PRIMARY> db.col1.insert({item:1, name:"abc"})
            //updating document by incrementing item value with 1
            rplint:PRIMARY> db.col1.update({name:"abc"},{$inc:{item:1}})
            //updated value is now item:2
            rplint:PRIMARY> db.col1.find()
            { "_id" : ObjectId("5babd57cce2ef78096ac8e16"), "item" : 2, "name" : "abc" }

This is how these operations are stored in the oplog; here the $inc operation is stored as "$set" : { "item" : 2 }, i.e. the resulting value:

            rplint:PRIMARY> db.oplog.rs.find({ns:"db1.col1"})
            //insert operation
            { "ts" : Timestamp(1537987964, 2), "t" : NumberLong(9), "h" : NumberLong("8083740413874479202"), "v" : 2, "op" : "i", "ns" : "db1.col1", "o" : { "_id" : ObjectId("5babd57cce2ef78096ac8e16"), "item" : 1, "name" : "abc" } }
            //$inc operation is changed as ""$set" : { "item" : 2"
            { "ts" : Timestamp(1537988022, 1), "t" : NumberLong(9), "h" : NumberLong("-1432987813358665721"), "v" : 2, "op" : "u", "ns" : "db1.col1", "o2" : { "_id" : ObjectId("5babd57cce2ef78096ac8e16") }, "o" : { "$set" : { "item" : 2 } } }

That means that however many times the entry is applied, it will generate the same result: no inconsistent data!

            I hope this blog post helps you to understand multiple scenarios for MongoDB replica sets, and how data replicates to the nodes.

            by Aayushi Mangal at October 10, 2018 11:08 AM

            October 09, 2018

            Peter Zaitsev

            PostgreSQL Monitoring: Set Up an Enterprise-Grade Server (and Sign Up for Webinar Weds 10/10…)

This is the last post in our series on building an enterprise-grade PostgreSQL set up using open source tools, and we’ll be covering monitoring.

The previous posts in this series discussed aspects such as security, backup strategy, high availability, connection pooling and load balancing, extensions, and detailed logging in PostgreSQL. Tomorrow, Wednesday, October 10 at 10AM EST, we will be reviewing these topics together, and showcasing them in practice in a webinar format: we hope you can join us!


            Monitoring databases

The importance of monitoring the activity and health of production systems is unquestionable. When it comes to the database, with its high number of customizable settings, the ability to track its various metrics (status counters and gauges) allows for the maintenance of a historical record of its performance over time. This can be used for capacity planning, troubleshooting, and validation.

            When it comes to capacity planning, a monitoring solution is a helpful tool to help you assess how the current setup is faring. At the same time, it can help predict future needs based on trends, such as the increase of active connections, queries, and CPU usage. For example, an increase in CPU usage might be due to a genuine increase in workload, but it could also be a sign of unoptimized queries growing in popularity. In which case, comparing CPU with disk access might provide a more complete view of what is going on.

Being able to easily correlate data like this helps you to catch minor issues and to plan accordingly, sometimes allowing you to avoid the easier but more costly solution of scaling up to mitigate problems like this. But having the right monitoring solution is really invaluable when it comes to investigative work and root cause analysis. Trying to understand a problem that has already taken place is a rather complicated, and often unenviable, task unless you have established a continuous, watchful eye on the setup the whole time.

            Finally, a monitoring solution can help you validate changes made in the business logic in general or in the database configuration in specific. By comparing prior and post results for a given metric or for overall performance, you can observe the impact of such changes in practice.

            Monitoring PostgreSQL with open source solutions

            There is a number of monitoring solutions for PostgreSQL and postgresql.org’s Wiki provides an extensive list, albeit a little outdated. It categorizes the main monitoring solutions into two distinct categories: those that can be identified as generic solutions—and can be extended to cover different technologies through custom plugins—and those labeled as Postgres-centric, which are specific to PostgreSQL.

In the first group, we find venerated open source monitoring tools such as Munin, Zabbix, and Cacti. Nagios could have also been added to this group, but it was instead indirectly included in the “Checkers” group. That category includes monitoring scripts that can be used both in stand-alone mode or as feeders (plugins) for “Nagios like software“. Examples of these are check_pgactivity and check_postgres.

            One omission from this list is Grafana, a modern time series analytics platform conceived to display metrics from a number of different data sources. Grafana includes a solution packaged as a PostgreSQL native plugin. Percona has built its Percona Monitoring and Management (PMM) platform around Grafana, using Prometheus as its data source. Since version 1.14.0, PMM supports PostgreSQL. Query Analytics (QAN) integration is coming soon.

            An important factor that all these generic solutions have in common is that they are widely used for the monitoring of a diverse collection of services, like you’d normally find in enterprise-like environments. It’s common for a given company to adopt one, or sometimes two, such solutions with the aim of monitoring their entire infrastructure. This infrastructure often includes a heterogeneous combination of databases and application servers.

            Nevertheless, there is a place for complementary Postgres-centric monitoring solutions in such enterprise environments too. These solutions are usually implemented with a specific goal in mind. Two examples we can mention in this context are PGObserver, which has a focus on monitoring stored procedures, and pgCluu, with its focus on auditing.

            Monitoring PostgreSQL with PMM

We built an enterprise-grade PostgreSQL set up for the webinar, and use PMM for monitoring. We will be showcasing some of PMM’s main features, and highlighting some of the most important metrics to watch, during our demo. You may want to have a look at this demo setup to get a feel for how our PostgreSQL Overview dashboard looks.

            You can find instructions on how to setup PMM for monitoring your PostgreSQL server in our documentation space. And if there’s still time, sign up for tomorrow’s webinar!
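As a hedged sketch of the client-side step (check the documentation linked above for the exact service name and connection options in your PMM version), registering a PostgreSQL instance for metrics collection looks something like this:

# Register a local PostgreSQL instance with the PMM Client
pmm-admin add postgresql:metrics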


            by Fernando Laudares Camargos at October 09, 2018 04:23 PM

            Upcoming Webinar Thurs 10/11: Build Highly Scalable IoT Architectures with Percona Server for MongoDB

Please join Percona’s Product Manager for Percona Server for MongoDB, Jeff Sandstrom; Sr. Tech Ops Architect for MongoDB, Tim Vaillancourt; and Mesosphere’s Senior Director of Community and Evangelism, Matt Jarvis, on Thursday, October 11, 2018 at 10:00 AM PDT (UTC–7) / 1:00 PM EDT (UTC–4), as they demonstrate how to build highly scalable Internet of Things architectures with Percona Server for MongoDB on DC/OS.


            Percona Server for MongoDB is a free and open-source drop-in replacement for MongoDB Community Edition. It combines all the features and benefits of MongoDB Community Edition with enterprise-class features from Percona, including an in-memory engine, log redaction, auditing, and hot backups.

            Mesosphere DC/OS is an enterprise-grade, datacenter-scale operating system, providing a single platform for running containers, data services, and distributed applications on a single unified computing environment.

            In this webinar, we’ll:

            • Review the benefits of Percona Server for MongoDB
            • Discuss a variety of use cases for Percona Server for MongoDB on DC/OS
            • Demonstrate exactly how you can use Percona Server for MongoDB on DC/OS to capture data from Internet of Things devices
            • Tell you how you can participate in the beta program for this exciting solution

            Register for this webinar to learn how to build highly scalable IoT architectures with Percona Server for MongoDB on DC/OS.

            by Jeff Sandstrom at October 09, 2018 02:30 PM

            Announcement: Second Alpha Build of Percona XtraBackup 8.0 Is Available

The second alpha build of Percona XtraBackup 8.0.2 is now available in the Percona experimental software repositories.

            Note that, due to the new MySQL redo log and data dictionary formats, the Percona XtraBackup 8.0.x versions will only be compatible with MySQL 8.0.x and Percona Server for MySQL 8.0.x. This release supports backing up Percona Server 8.0 Alpha.

For experimental migrations from earlier database server versions, you will need to back up and restore using XtraBackup 2.4, and then use mysql_upgrade from MySQL 8.0.x.

            PXB 8.0.2 alpha is available for the following platforms:

            • RHEL/Centos 6.x
            • RHEL/Centos 7.x
            • Ubuntu 14.04 Trusty*
            • Ubuntu 16.04 Xenial
            • Ubuntu 18.04 Bionic
            • Debian 8 Jessie*
            • Debian 9 Stretch

            Information on how to configure the Percona repositories for apt and yum systems and access the Percona experimental software is here.

            * We might drop these platforms before GA release.

            Improvements

            • PXB-1658: Import keyring vault plugin from Percona Server 8
            • PXB-1609: Make version_check optional at build time
            • PXB-1626: Support encrypted redo logs
            • PXB-1627: Support obtaining binary log coordinates from performance_schema.log_status

            Fixed Bugs

            • PXB-1634: The CREATE TABLE statement could fail with the DUPLICATE KEY error
            • PXB-1643: Memory issues reported by ASAN in PXB 8
            • PXB-1651: Buffer pool dump could create a (null) file during prepare stage of Mysql8.0.12 data
            • PXB-1671: A backup could fail when the MySQL user was not specified
            • PXB-1660: InnoDB: Log block N at lsn M has valid header, but checksum field contains Q, should be P

Other bugs fixed: PXB-1623, PXB-1648, PXB-1669, PXB-1639, and PXB-1661.

            by Borys Belinsky at October 09, 2018 06:33 AM

            MariaDB Foundation

            My first week, Looking Forward

            First of all, thank you for your warm words of welcome – coming from so many people across different media, I think this is a very positive sign for working with the organisations and individuals within the MariaDB community. Regardless of the situation, a change like this, or actually the person that comes in, is […]

            The post My first week, Looking Forward appeared first on MariaDB.org.

            by Arjen Lentz at October 09, 2018 03:00 AM

            October 08, 2018

            Peter Zaitsev

            Persistence of autoinc fixed in MySQL 8.0

The release of MySQL 8.0 has brought a lot of bold implementations that touched on things that have been avoided before, such as added support for common table expressions and window functions. Another example is the change in how AUTO_INCREMENT (autoinc) sequences are persisted, and thus replicated.

This new implementation carries the fix for bug #73563 (Replace result in auto_increment value less or equal than max value in row-based), which we only found out about recently. The surprising part is that the use case we were analyzing is a somewhat common one; this must be affecting a good number of people out there.

            Understanding the bug

The business logic of the use case is such that a UNIQUE column in a table whose id is managed by an AUTO_INCREMENT sequence needs to be updated, and this is done with a REPLACE operation:

            “REPLACE works exactly like INSERT, except that if an old row in the table has the same value as a new row for a PRIMARY KEY or a UNIQUE index, the old row is deleted before the new row is inserted.”

            So, what happens in practice in this particular case is a DELETE followed by an INSERT of the target row.

            We will explore this scenario here in the context of an oversimplified currency converter application that uses USD as base reference:

            CREATE TABLE exchange_rate (
            id INT PRIMARY KEY AUTO_INCREMENT,
            currency VARCHAR(3) UNIQUE,
            rate FLOAT(5,3)
            ) ENGINE=InnoDB;

            Let’s add a trio of rows to this new table:

            INSERT INTO exchange_rate (currency,rate) VALUES ('EUR',0.854), ('GBP',0.767), ('BRL',4.107);

            which gives us the following initial set:

            master (test) > select * from exchange_rate;
            +----+----------+-------+
            | id | currency | rate  |
            +----+----------+-------+
            |  1 | EUR      | 0.854 |
            |  2 | GBP      | 0.767 |
            |  3 | BRL      | 4.107 |
            +----+----------+-------+
            3 rows in set (0.00 sec)

            Now we update the rate for Brazilian Reais using a REPLACE operation:

            REPLACE INTO exchange_rate SET currency='BRL', rate=4.500;

            With currency being a UNIQUE field the row is fully replaced:

            master (test) > select * from exchange_rate;
            +----+----------+-------+
            | id | currency | rate  |
            +----+----------+-------+
            |  1 | EUR      | 0.854 |
            |  2 | GBP      | 0.767 |
            |  4 | BRL      | 4.500 |
            +----+----------+-------+
            3 rows in set (0.00 sec)

            and thus the autoinc sequence is updated:

            master (test) > SHOW CREATE TABLE exchange_rate\G
            *************************** 1. row ***************************
                 Table: exchange_rate
            Create Table: CREATE TABLE `exchange_rate` (
            `id` int(11) NOT NULL AUTO_INCREMENT,
            `currency` varchar(3) DEFAULT NULL,
            `rate` float(5,3) DEFAULT NULL,
            PRIMARY KEY (`id`),
            UNIQUE KEY `currency` (`currency`)
            ) ENGINE=InnoDB AUTO_INCREMENT=5 DEFAULT CHARSET=latin1
            1 row in set (0.00 sec)

            The problem is that the autoinc sequence is not updated in the replica as well:

            slave1 (test) > select * from exchange_rate;show create table exchange_rate\G
            +----+----------+-------+
            | id | currency | rate  |
            +----+----------+-------+
            |  1 | EUR      | 0.854 |
            |  2 | GBP      | 0.767 |
            |  4 | BRL      | 4.500 |
            +----+----------+-------+
            3 rows in set (0.00 sec)
            *************************** 1. row ***************************
                 Table: exchange_rate
            Create Table: CREATE TABLE `exchange_rate` (
            `id` int(11) NOT NULL AUTO_INCREMENT,
            `currency` varchar(3) DEFAULT NULL,
            `rate` float(5,3) DEFAULT NULL,
            PRIMARY KEY (`id`),
            UNIQUE KEY `currency` (`currency`)
            ) ENGINE=InnoDB AUTO_INCREMENT=4 DEFAULT CHARSET=latin1
            1 row in set (0.00 sec)

            Now, the moment we promote that replica as master and start writing to this table we’ll hit a duplicate key error:

            slave1 (test) > REPLACE INTO exchange_rate SET currency='BRL', rate=4.600;
            ERROR 1062 (23000): Duplicate entry '4' for key 'PRIMARY'

            Note that:

a) the transaction fails and the row is not replaced; however, the autoinc sequence is incremented:

            slave1 (test) > SELECT AUTO_INCREMENT FROM information_schema.TABLES WHERE table_schema='test' AND table_name='exchange_rate';
            +----------------+
            | AUTO_INCREMENT |
            +----------------+
            |              5 |
            +----------------+
            1 row in set (0.00 sec)

            b) this problem only happens with row-based replication (binlog_format=ROW), where REPLACE in this case is logged as a row UPDATE:

            # at 6129
            #180829 18:29:55 server id 100  end_log_pos 5978 CRC32 0x88da50ba Update_rows: table id 117 flags: STMT_END_F
            ### UPDATE `test`.`exchange_rate`
            ### WHERE
            ###   @1=3 /* INT meta=0 nullable=0 is_null=0 */
            ###   @2='BRL' /* VARSTRING(3) meta=3 nullable=1 is_null=0 */
            ###   @3=4.107                /* FLOAT meta=4 nullable=1 is_null=0 */
            ### SET
            ###   @1=4 /* INT meta=0 nullable=0 is_null=0 */
            ###   @2='BRL' /* VARSTRING(3) meta=3 nullable=1 is_null=0 */
            ###   @3=4.5                  /* FLOAT meta=4 nullable=1 is_null=0 */

            With statement-based replication—or even mixed format—the REPLACE statement is replicated as is: it will trigger a DELETE+INSERT in the background on the replica and thus update the autoinc sequence in the same way it did on the master.

            This example (tested with Percona Server versions 5.5.61, 5.6.36 and 5.7.22) helps illustrate the issue with autoinc sequences not being persisted as they should be with row-based replication. However, MySQL’s Worklog #6204 includes a couple of scarier scenarios involving the master itself, such as when the server crashes while a transaction is writing to a table similar to the one used in the example above. MySQL 8.0 remedies this bug.

            Workarounds

There are a few possible workarounds to consider if this problem is impacting you and if neither upgrading to the 8 series nor resorting to statement-based or mixed replication format is a viable option.

We’ll be discussing three of them here: one that revolves around executing checks before a failover (to detect and fix autoinc inconsistencies in replicas); another that requires reviewing all REPLACE statements like the one from our example and adapting them to include the id field, thus avoiding the bug; and finally one that requires changing the schema of affected tables in such a way that the target field is made the Primary Key of the table while id (autoinc) is converted into a UNIQUE key.

            a) Detect and fix

The least intrusive of the workarounds we conceived for the problem at hand, in terms of query and schema changes, is to run a check on each of the tables that might be facing this issue in a replica before we promote it to master in a failover scenario:

            slave1 (test) > SELECT ((SELECT MAX(id) FROM exchange_rate)>=(SELECT AUTO_INCREMENT FROM information_schema.TABLES WHERE table_schema='test' AND table_name='exchange_rate')) as `check`;
            +-------+
            | check |
            +-------+
            |     1 |
            +-------+
            1 row in set (0.00 sec)

If the table does not pass the test, like ours didn't at first (just before we attempted a REPLACE after failing over to the replica), then update autoinc accordingly. For our example table, where MAX(id) is 4, a one-off fix would be:
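ALTER TABLE exchange_rate AUTO_INCREMENT = 5;

The full routine (check + update of autoinc) could be made into a single stored procedure: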

DELIMITER //
CREATE PROCEDURE CheckAndFixAutoinc()
BEGIN
 DECLARE done TINYINT UNSIGNED DEFAULT 0;
 DECLARE tableschema VARCHAR(64);
 DECLARE tablename VARCHAR(64);
 DECLARE columnname VARCHAR(64);
 -- iterate over every AUTO_INCREMENT column outside the system schemas
 DECLARE cursor1 CURSOR FOR SELECT TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME FROM information_schema.COLUMNS WHERE TABLE_SCHEMA NOT IN ('mysql', 'information_schema', 'performance_schema', 'sys') AND EXTRA LIKE '%auto_increment%';
 DECLARE CONTINUE HANDLER FOR NOT FOUND SET done=1;
 OPEN cursor1;
 start_loop: LOOP
  -- fetch first, then test for the end: this way the last row is not processed twice
  FETCH cursor1 INTO tableschema, tablename, columnname;
  IF done THEN
    LEAVE start_loop;
  END IF;
  -- @check1 = 1 when MAX(id) has caught up with (or passed) the table's AUTO_INCREMENT
  SET @get_autoinc = CONCAT('SELECT @check1 := ((SELECT MAX(', columnname, ') FROM ', tableschema, '.', tablename, ')>=(SELECT AUTO_INCREMENT FROM information_schema.TABLES WHERE TABLE_SCHEMA=\'', tableschema, '\' AND TABLE_NAME=\'', tablename, '\')) as `check`');
  PREPARE stm FROM @get_autoinc;
  EXECUTE stm;
  DEALLOCATE PREPARE stm;
  IF @check1>0 THEN
    BEGIN
      -- bump AUTO_INCREMENT to MAX(id)+1 so the next insert cannot collide
      SET @select_max_id = CONCAT('SELECT @max_id := MAX(', columnname, ')+1 FROM ', tableschema, '.', tablename);
      PREPARE select_max_id FROM @select_max_id;
      EXECUTE select_max_id;
      DEALLOCATE PREPARE select_max_id;
      SET @update_autoinc = CONCAT('ALTER TABLE ', tableschema, '.', tablename, ' AUTO_INCREMENT=', @max_id);
      PREPARE update_autoinc FROM @update_autoinc;
      EXECUTE update_autoinc;
      DEALLOCATE PREPARE update_autoinc;
    END;
  END IF;
 END LOOP start_loop;
 CLOSE cursor1;
END//
DELIMITER ;

It doesn’t allow for as clean a failover as we would like, but it can be helpful if you're stuck with MySQL<8.0 and binlog_format=ROW and cannot make changes to your queries or schema.

            b) Include Primary Key in REPLACE statements

If we had explicitly included the id (Primary Key) in the REPLACE operation from our example, it would also have been replicated as a DELETE+INSERT, even when binlog_format=ROW:

            master (test) > REPLACE INTO exchange_rate SET currency='BRL', rate=4.500, id=3;
            # at 16151
            #180905 13:32:17 server id 100  end_log_pos 15986 CRC32 0x1d819ae9  Write_rows: table id 117 flags: STMT_END_F
            ### DELETE FROM `test`.`exchange_rate`
            ### WHERE
            ###   @1=3 /* INT meta=0 nullable=0 is_null=0 */
            ###   @2='BRL' /* VARSTRING(3) meta=3 nullable=1 is_null=0 */
            ###   @3=4.107                /* FLOAT meta=4 nullable=1 is_null=0 */
            ### INSERT INTO `test`.`exchange_rate`
            ### SET
            ###   @1=3 /* INT meta=0 nullable=0 is_null=0 */
            ###   @2='BRL' /* VARSTRING(3) meta=3 nullable=1 is_null=0 */
            ###   @3=4.5                  /* FLOAT meta=4 nullable=1 is_null=0 */
            # at 16199
            #180905 13:32:17 server id 100  end_log_pos 16017 CRC32 0xf11fed56  Xid = 184
            COMMIT/*!*/;

We could point out that we were doing it wrong by not including the id in the REPLACE statement in the first place; the reason for not doing so is mostly to avoid an extra lookup for each replace (to obtain the id for the currency we want to update). On the other hand, what if your business logic does expect the id to change with each REPLACE? You should keep such requirements in mind when considering this workaround, as it is effectively a functional change to what we had initially.
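For illustration, a sketch of what that extra lookup could look like, assuming the currency row already exists (@id is a session variable used only for this example):

-- look up the id first (the extra round trip), then include it in the REPLACE
SELECT id INTO @id FROM exchange_rate WHERE currency='BRL';
REPLACE INTO exchange_rate SET id=@id, currency='BRL', rate=4.600;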

            c) Make the target field the Primary Key and keep autoinc as a UNIQUE key

            If we make currency the Primary Key of our table and id a UNIQUE key instead:

            CREATE TABLE exchange_rate (
            id INT UNIQUE AUTO_INCREMENT,
            currency VARCHAR(3) PRIMARY KEY,
            rate FLOAT(5,3)
            ) ENGINE=InnoDB;

            the same REPLACE operation will be replicated as a DELETE+INSERT too:

            # at 19390
            #180905 14:03:56 server id 100  end_log_pos 19225 CRC32 0x7042dcd5  Write_rows: table id 131 flags: STMT_END_F
            ### DELETE FROM `test`.`exchange_rate`
            ### WHERE
            ###   @1=3 /* INT meta=0 nullable=0 is_null=0 */
            ###   @2='BRL' /* VARSTRING(3) meta=3 nullable=0 is_null=0 */
            ###   @3=4.107                /* FLOAT meta=4 nullable=1 is_null=0 */
            ### INSERT INTO `test`.`exchange_rate`
            ### SET
            ###   @1=4 /* INT meta=0 nullable=0 is_null=0 */
            ###   @2='BRL' /* VARSTRING(3) meta=3 nullable=0 is_null=0 */
            ###   @3=4.5                  /* FLOAT meta=4 nullable=1 is_null=0 */
            # at 19438
            #180905 14:03:56 server id 100  end_log_pos 19256 CRC32 0x79efc619  Xid = 218
            COMMIT/*!*/;

Of course, the same would be true if we had just removed id entirely from the table and kept currency as the Primary Key. This would work in our particular test example, but that won't always be the case. Please note, though, that if you do keep id on the table you must make it a UNIQUE key: this workaround is based on the fact that this key becomes a second unique constraint, which triggers a different code path to log a replace operation. Had we made it a simple, non-unique key instead, that wouldn't be the case.

If you have any comments or suggestions about the issue addressed in this post, the workarounds we propose, or even a different view of the problem you would like to share, please leave a comment in the section below.

            Co-Author: Trey Raymond

Trey Raymond is a Sr. Database Engineer for Oath Inc. (née Yahoo!), specializing in MySQL. Since 2010, he has worked to build the company’s database platform and supporting team into industry leaders.

            While a performance guru at heart, his experience and responsibilities range from hardware and capacity planning all through the stack to database tool and utility development.

            He has a reputation for breaking things to learn something new.

            Co-Author: Fernando Laudares

Fernando is a Senior Support Engineer with Percona. Fernando’s work experience includes the architecture, deployment and maintenance of IT infrastructures based on Linux, open source software and a layer of server virtualization. He’s now focusing on the universe of MySQL, MongoDB and PostgreSQL with a particular interest in understanding the intricacies of database systems, and contributes regularly to this blog. You can read his other articles here.

            by Fernando Laudares Camargos at October 08, 2018 04:00 PM

Detailed Logging for Enterprise-Grade PostgreSQL


In this penultimate post from our series on building an enterprise-grade PostgreSQL environment, we cover the parameters we have enabled to configure detailed logging in the demo setup we will showcase in our upcoming webinar.

            Detailed logging in PostgreSQL and log analyzer

Like other RDBMSs, PostgreSQL allows you to maintain a log of activities and error messages. Until PostgreSQL 9.6, log files were generated in the pg_log directory (inside the data directory) by default. Since PostgreSQL 10, pg_log has been renamed to simply log. This location can be changed by modifying the log_directory parameter.

Unlike MySQL, PostgreSQL writes the error and activity log to the same log file, which may thus grow to several GBs when detailed logging is enabled. In these cases, logging becomes IO-intensive, so it is recommended to store log files on storage separate from that hosting the data directory.

            Parameters to enable detailed logging

            Here’s a list of parameters used to customize logging in PostgreSQL. All of them need to be modified in the postgresql.conf or postgresql.auto.conf files.

logging_collector: in order to log any activity in PostgreSQL this parameter must be enabled. The backend process responsible for logging database activity is called logger; it gets started when logging_collector is set to ON. Changing this parameter requires a PostgreSQL restart.
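Example, enabling the logger at server startup:

logging_collector = on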

log_min_duration_statement: this parameter is used primarily to set a time threshold: queries running longer than this are logged (as “slow queries”). Setting it to -1 disables the logging of statements. Setting it to 0 enables the logging of every statement running in the database, regardless of its duration. The time unit should follow the actual value, for example: 250ms, 250s, 250min, 1h. Changing this parameter does not require a PostgreSQL restart; a simple reload of the configuration is enough. For example:

            log_min_duration_statement = 5s
              logs every statement running for 5 seconds or longer.

            log_line_prefix: helps you customize every log line being printed in the PostgreSQL log file. You can log the process id, application name, database name and other details for every statement as required. The following log_line_prefix may be helpful in most scenarios:

            log_line_prefix = '%t [%p]: [%l-1] user=%u,db=%d,app=%a,client=%h'

            The above setting records the following for every statement being logged:

            %t : Time stamp without milliseconds
            %p : Process id
            %l-1 : Number of the log line for each session or process, starting at 1
            %u : User name
            %d : Database name
            %a : Application name
            %h : Remote host name or IP address

            With the above settings employed for log_line_prefix, the log appears as follows:

            2018-06-08 12:36:26 EDT [21315]: [1-1] user=postgres,db=percona,app=psql,client=192.168.0.12 LOG: duration: 2.611 ms statement: select * from departments where id = 26;
            2018-06-08 12:36:49 EDT [21323]: [1-1] user=postgres,db=percona,app=psql,client=192.168.0.12 LOG: duration: 2.251 ms statement: select count(*) from departments;

You can refer to the PostgreSQL documentation on log_line_prefix for further details on this feature.

log_duration: enabling this parameter records the duration of every completed statement in the PostgreSQL log, irrespective of log_min_duration_statement. Keep in mind that, as with log_min_duration_statement, enabling log_duration may increase log file usage and affect the server’s general performance. For this reason, if you already have log_min_duration_statement enabled, it is often suggested that you disable log_duration, unless there’s a specific need to keep track of both.

log_lock_waits: when log_lock_waits is enabled, a log message is recorded whenever a session waits longer than deadlock_timeout to acquire a lock.
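Example, recording a message whenever a session waits longer than one second for a lock (log_lock_waits uses deadlock_timeout as its threshold):

log_lock_waits = on
deadlock_timeout = 1s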

            log_checkpoints: logs all checkpoints and restart points to the PostgreSQL log file.

            log_rotation_size: defines the size limit for each log file; once it reaches this threshold the log file is rotated.
            Example: 

            log_rotation_size = '500MB'
              – every log file is limited to a size of 500 MB.

log_rotation_age: determines the maximum life span of a log file, forcing its rotation once this threshold is reached. This parameter is usually set in terms of hours, or maybe days; the minimum granularity is a minute. If log_rotation_size is reached first, the log gets rotated anyway, irrespective of this setting.
            Example: 

            log_rotation_age = 1d

log_statement: controls which types of SQL statements are logged. The recommended setting is ddl, which logs all DDL statements that are executed. Tracking these allows you to later audit when a given DDL was executed, and by whom. Monitor how much information this writes to the log file before considering a broader setting. Other possible values are none, mod (which includes DDLs plus DMLs) and all.
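Example, using the recommended setting:

log_statement = 'ddl'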

log_temp_files: logs information related to temporary files whose size is greater than this value (in KB).
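Example, logging every temporary file of 10 MB (10240 KB) or larger:

log_temp_files = 10240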

            log_directory: defines the directory in which log files are created. Once more, please note that if you have enabled detailed logging it is recommended to have a separate disk—different from the data directory disk—allocated for log_directory.
            Example: 

log_directory = '/this_is_a_new_disk/pg_log'

            Log Analyzer – pgBadger

You cannot have a separate error log and a slow query log in PostgreSQL: everything is written to one log file, which may be periodically rotated based on time and size. Over a period of time, this log file may grow to several MBs or even GBs, depending on the amount of logging that has been enabled. It can get difficult for a DBA or developer to parse the log files and get a clear view of what is running slowly and how many times a query has run. To help with this task you may use pgBadger, a log analyzer for PostgreSQL, to parse log files and generate a rich HTML-based report that you can access from a browser.
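A minimal sketch of generating a report (the log path is a placeholder; check pgBadger's options to match your log_line_prefix):

$ pgbadger /path/to/postgresql.log -o report.html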

We’ll be showing detailed logging and pgBadger in action in a couple of days, so if there's still time, sign up for our October webinar!

            by Fernando Laudares Camargos at October 08, 2018 01:13 PM

            October 06, 2018

            Valeriy Kravchuk

            On MySQL XA Transactions

One of the features I missed in my blog post on problematic MySQL features back in July is XA transactions. Probably I was too much in a hurry, as this feature is known to be somewhat buggy, limited and not widely used outside of Java applications. My first related feature request, Bug #40445 - "Provide C functions library implementing X/OPEN XA interface for Bea Tuxedo", was created almost 10 years ago, based on an issue from one of the MySQL/Sun customers of that time. I remember some internal debates on how much time and effort the implementation might require, but the decision was never made, and one still cannot directly integrate MySQL with the Tuxedo transaction manager (which is where the idea of XA transactions originally came from). It's even funnier to see that feature request still just "Verified" when taking into account the fact that BEA Tuxedo software has been Oracle's software since... 2008.
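For reference, the client-side flow of an XA transaction in MySQL looks like this (the xid 'trx1' and the table in the UPDATE are just example names):

XA START 'trx1';
UPDATE account SET balance = balance - 100 WHERE id = 1; -- hypothetical table
XA END 'trx1';
XA PREPARE 'trx1';
XA COMMIT 'trx1'; -- or: XA ROLLBACK 'trx1'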

XA Transactions support is a useful MySQL feature, but I wonder if one day it may just become as abandoned as that West Pier in Brighton, or overwhelmed with many small bugs in the same way as these stairs to the beach in Hove...

            But maybe XA transactions are not widely used and nobody cares much about them?

Let me try to do a quick review of related active bug reports and feature requests before drawing any conclusions:
• Bug #91702 - "Feature Request: JOIN support for XA START command". This feature request was added less than 3 months ago and is still "Open". It means there are users interested in this feature, but Oracle engineers do not care much even to verify related requests, much less to give them some priority.
              See also Bug #78498 - "XA issue or limitation with 5.6.19 engine", reported 3 years ago, that is essentially about the same limitation. As bug reporter explained:
              "... it prevents us to use MySQL with Weblogic on 2 phase commit scenarii..."
• Yet another example of a request ignored for a long time is Bug #90659 - "implicit commit and unlocking with xa start", which is about an inconsistency in the current implementation. They care even less (as we already know) about XA support outside of Java, as one can conclude from the fact that the Connector/Net-related request, Bug #70587 - "Dot Net Distributed transaction not supported in MySql Server 5.6", has not received any attention since July 2015...
• Bug #91646 - "xa command still operation when super_read_only is true". This bug was reported in July by Zhenghu Wen. It seems nobody cares much about XA transaction integration when new features are added to the MySQL server.
            • Bug #89860 - "XA may lost prepared transaction and cause different between master and slave". This bug reported by Michael Yang (See also Bug #88534) sounds really serious and was previously reported by Andrei Elkin (who works for MariaDB now) as Bug #76233 - "XA prepare is logged ahead of engine prepare". See also Bug #87560 - "XA PREPARE log order error in replication and binlog recovery" by Wei Zhao, who also contributed a patch. See also Bug #83983 - "innodb fail to recover the prepared xa transaction" (the bug reported by Dennis Gao is still "Open", while it's clearly related to or is a duplicate of "Verified" bugs mentioned above).
              So many related/duplicate problem reports, but no fix so far!
            • Bug #88748 - "InnoDB: Failing assertion: trx->conc_state == 1". This assertion failure was reported by Roel Van de Paar back in December, 2017. See also his Bug #84754 - "oid String::chop(): Assertion `strlen(m_ptr) == m_length' failed."
I noted that Oracle recently invented new "low" severity levels, and this bug is S6 (Debug Builds). I do not really agree that assertions in debug builds are of such low severity - they are in the code for a reason: to prevent crashes in non-debug builds and all kinds of inconsistencies.
            • Bug #87526 - "The output of 'XA recover convert xid' is not useful". This bug reported by Sveta Smirnova caused a lot of troubles to poor users with prepared transactions hanging around for weeks after crash, as it prevented any easy way to get rid of them (and related locks) in some cases. The bug is still "Verified" in MySQL and "On hold" in Percona Server, while MariaDB fixed it in 10.3, see MDEV-14593.
            • Bug #87130 - "XA COMMIT not taken as transaction boundary". Yet another bug report with a patch from Wei Zhao.
            • Bug #75205 - "Master should write a LOST_EVENTS entry on xa commit after recovery." Daniël van Eeden reported this at early 5.7 pre-GA stage, and manual explains now that:
              "In MySQL 5.7.7 and later, there is a change in behavior and an XA transaction is written to the binary log in two parts. When XA PREPARE is issued, the first part of the transaction up to XA PREPARE is written using an initial GTID. A XA_prepare_log_event is used to identify such transactions in the binary log. When XA COMMIT or XA ROLLBACK is issued, a second part of the transaction containing only the XA COMMIT or XA ROLLBACK statement is written using a second GTID. Note that the initial part of the transaction, identified by XA_prepare_log_event, is not necessarily followed by its XA COMMIT or XA ROLLBACK, which can cause interleaved binary logging of any two XA transactions. The two parts of the XA transaction can even appear in different binary log files. This means that an XA transaction in PREPARED state is now persistent until an explicit XA COMMIT or XA ROLLBACK statement is issued, ensuring that XA transactions are compatible with replication."
              but the bug report is still "Verified".
By the way, the need to deal with such prepared transactions recovered from the binary log caused problems like those listed above (with XA RECOVER CONVERT, and with the order of preparing in the binary log vs. engines that support XA)...
            • Bug #71351 - "select hit query cache after xa commit, no result return". This bug probably affects only MySQL 5.5, so no wonder it's ignored now. Nobody tried to fix it while MySQL 5.5 was still supported, though.
            There are some more bugs originally filed in other categories, but still related to XA:
• Bug #72036 - "XA isSameRM() shouldn't take database into account". This Connector/J bug was reported in 2014 by Jess Balint.
• Bug #78050 - "Crash on when XA functions activated by a storage engine". It happens when the binary log is not enabled. This bug was reported by Zhenye Xie, who also contributed a patch later. Still, this crashing bug remains "Verified".
            • Bug #87385 - "Partial external XA transactions are not rolled back correctly". Yet another bug report with a patch from Wei Zhao. See also his Bug #87389 - "Replication position not persisted correctly for XA transactions".
• Bug #91633 - "Replication failure (errno 1399) on update in XA tx after deadlock". This bug reported by Lukas Sydorowski got a recent comment from another community member yesterday. So, the feature is still being used these days.
            Now time for conclusions:
            1. Take extra care while using XA transactions in replication environments or with point in time recovery - you may easily end up with slaves out of sync with master and data lost.
            2. Feature requests related to XA transactions are mostly ignored, sometimes for a decade... 
            3. Patches contributed do not seem to speed up XA bug fixing.
            4. I'd say that Oracle does not care much about XA Transactions since MySQL 5.7 GA release in 2015.
            5. MySQL Community still use XA transactions with MySQL (and they will be used even more as corporate users migrate from Oracle RDBMS), find bugs and even try to fix them. But probably will have to use forks rather than MySQL itself if current attitude towards XA bugs processing and fixing remains.

            by Valeriy Kravchuk (noreply@blogger.com) at October 06, 2018 03:59 PM

            October 05, 2018

            MariaDB AB

            New Certified Docker Images + Kubernetes Scripts Simplify MariaDB Cloud Deployments


            In the last few years, enterprise development teams have been focused on reducing the cost of production-grade applications while improving the velocity and agility of development. That’s led to massive public and private cloud adoption – and deployment of databases in containers. To address this growing need, we’ve released new Docker images and Kubernetes scripts that make it easy to deploy and manage MariaDB databases. Now organizations can focus on building their applications rather than on managing and optimizing container infrastructure.

            On-demand webinar: The Future of MariaDB on Containers
            Watch this recorded webinar to get a look at official Docker images, learn how to run stateful MariaDB clusters on Kubernetes and more.
            Watch Now

            New – MariaDB Docker Images, Kubernetes Scripts & Sandbox Environments

We’ve released certified Docker images that enable customers to seamlessly deploy MariaDB servers in Kubernetes and Docker environments. We are delivering three standalone Docker images (one each for MariaDB Server, ColumnStore and MaxScale) and two sandboxes (one for MariaDB AX and one for MariaDB TX). Customers can deploy the standalone Docker images in a standard Docker environment or create complex topologies in a Kubernetes environment using YAML scripts.
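As a quick illustration (the image tag and environment variable below are assumptions to verify against the image documentation on Docker Hub), running a standalone server container could look like:

# image tag and env var are illustrative; verify on Docker Hub
$ docker run --name mariadb-server -d \
    -e MYSQL_ROOT_PASSWORD=secret \
    -p 3306:3306 \
    mariadb/server:10.3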

For those just getting started with MariaDB, or wanting to learn the capabilities offered, the sandboxes make it easy to experiment with MariaDB AX and TX. Sandboxes are self-contained, with all the documentation needed to quickly bring up TX or AX and experiment with sample apps (a bookstore in the case of TX, and a Zeppelin notebook in the case of AX). You have the flexibility to quickly deploy them on a laptop or desktop using Docker Compose and immediately run the sample applications.

We’ve also released a Kubernetes script to create a master/slave (one master + two slaves) cluster with MaxScale at the front end. (This script is showcased in the on-demand webinar.) Now customers can easily deploy MariaDB TX with MaxScale in high availability mode. The script deploys the master/slave cluster in such a way that when the master fails, one of the slaves is automatically promoted to be the new master. Kubernetes will try to bring the pod back up and maintain the configuration's integrity. When the old master comes back, it automatically becomes a slave. The script also supports easily expanding the number of slave nodes with a simple command.
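For instance, scaling out the slaves might be as simple as the following (the StatefulSet name is hypothetical, as it depends on how the script names its objects):

# "mariadb-slave" is a placeholder; use the name your script creates
$ kubectl scale statefulset mariadb-slave --replicas=3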


            by Saravana Krishnamurthy at October 05, 2018 06:33 PM

            Peter Zaitsev

            PostgreSQL Extensions for an Enterprise-Grade System


In this current series of blog posts we have been discussing various relevant aspects of building an enterprise-grade PostgreSQL setup, such as security, backup strategy, high availability, and different methods to scale PostgreSQL. In this blog post, we review some of the most popular open source extensions for PostgreSQL, used to expand its capabilities and address specific needs. We’ll cover some of them during a demo in our upcoming webinar on October 10.

            Expanding with PostgreSQL Extensions

PostgreSQL is one of the world’s most feature-rich and advanced open source RDBMSs. Its features are not limited to those released by the community through major/minor releases. There are hundreds of additional features developed using the extension capabilities of PostgreSQL, which can cater to the needs of specific users. Some of these extensions are very popular and useful for building an enterprise-grade PostgreSQL environment. We previously blogged about a couple of FDW extensions (mysql_fdw and postgres_fdw) which allow PostgreSQL databases to talk to remote homogeneous/heterogeneous databases like PostgreSQL, MySQL, MongoDB, etc. We will now cover a few other additional extensions that can expand your PostgreSQL server's capabilities.

            pg_stat_statements

            The pg_stat_statements module provides a means for tracking execution statistics of all SQL statements executed by a server. The statistics gathered by the module are made available via a view named pg_stat_statements. This extension must be installed in each of the databases you want to track, and like many of the extensions in this list, it is available in the contrib package from the PostgreSQL PGDG repository.
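A minimal setup sketch: the module must be preloaded (which requires a restart) before the extension can be created, and the view can then be queried for, say, the five most expensive statements:

# postgresql.conf (requires a server restart)
shared_preload_libraries = 'pg_stat_statements'

postgres=# CREATE EXTENSION pg_stat_statements;
postgres=# SELECT query, calls, total_time FROM pg_stat_statements ORDER BY total_time DESC LIMIT 5;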

            pg_repack

Tables in PostgreSQL may end up with fragmentation and bloat due to the specific MVCC implementation in PostgreSQL, or simply due to a high number of rows being naturally removed. This could lead not only to unused space being held inside the table but also to sub-optimal execution of SQL statements. pg_repack is the most popular way to address this problem, by reorganizing and repacking the table. It can reorganize the table’s content without placing an exclusive lock on it during the process: DMLs and queries can continue while repacking is happening. Version 1.2 of pg_repack introduces new features such as parallel index builds and the ability to rebuild just the indexes. Please refer to the official documentation for more details.
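A hedged usage sketch (database and table names are placeholders; the extension must also be created in the database being repacked):

postgres=# CREATE EXTENSION pg_repack;

$ pg_repack --table=mytable mydb   # names are placeholders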

            pgaudit

PostgreSQL has a basic statement logging feature. It can be implemented using the standard logging facility with log_statement = all. But this is not sufficient for many audit requirements. One of the essential features for enterprise deployments is the capability for fine-grained auditing of the user interactions/statements issued to the database. This is a major compliance requirement for many security standards. The pgaudit extension caters to these requirements.

            The PostgreSQL Audit Extension (pgaudit) provides detailed session and/or object audit logging via the standard PostgreSQL logging facility. Please refer to the settings section of its official documentation for more details.
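A minimal configuration sketch, assuming the package is installed (the pgaudit.log classes shown are just one possible audit scope):

# postgresql.conf (requires a restart)
shared_preload_libraries = 'pgaudit'
pgaudit.log = 'ddl, role'

postgres=# CREATE EXTENSION pgaudit;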

            pldebugger

This is a must-have extension for developers who work on stored functions written in PL/pgSQL. This extension is well integrated with GUI tools like pgAdmin, which allows developers to step through their code and debug it. Packages for pldebugger are also available in the PGDG repository and installation is straightforward. Once it is set up, we can step through and debug the code remotely.
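A setup sketch, assuming the PGDG packages: the debugger library is preloaded, then the API extension is created in the database holding the functions to debug:

# postgresql.conf (requires a restart)
shared_preload_libraries = 'plugin_debugger'

postgres=# CREATE EXTENSION pldbgapi;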

            The official git repo is available here

            plprofiler

This is a wonderful extension for finding out where your code is slowing down. It is very helpful, particularly during complex migrations from proprietary databases, like from Oracle to PostgreSQL, when these affect application performance. This extension can prepare a report on overall execution times, in tabular form and as flamegraphs, with clear information about each line of code. This extension is not, however, available from the PGDG repo: you will need to build it from source. Details on building and installing plprofiler will be covered in a future blog post. Meanwhile, the official repository and documentation is available here

            PostGIS

            PostGIS is arguably the most versatile implementation of the specifications of the Open Geospatial Consortium. We can see a large list of features in PostGIS that are rarely available in any other RDBMSs.

There are many users who have primarily opted to use PostgreSQL because of the features supported by PostGIS. In fact, these features are not all implemented as a single extension, but are instead delivered by a collection of extensions. This makes PostGIS one of the most complex extensions to build from source. Luckily, everything is available from the PGDG repository:

            $ sudo yum install postgis24_10.x86_64

            Once the postgis package is installed, we are able to create the extensions on our target database:

            postgres=# CREATE EXTENSION postgis;
            CREATE EXTENSION
            postgres=# CREATE EXTENSION postgis_topology;
            CREATE EXTENSION
            postgres=# CREATE EXTENSION postgis_sfcgal;
            CREATE EXTENSION
            postgres=# CREATE EXTENSION fuzzystrmatch;
            CREATE EXTENSION
            postgres=# CREATE EXTENSION postgis_tiger_geocoder;
            CREATE EXTENSION
            postgres=# CREATE EXTENSION address_standardizer;
            CREATE EXTENSION

            Language Extensions : PL/Python, PL/Perl, PL/V8,PL/R etc.

            Another powerful feature of PostgreSQL is its programming languages support. You can code database functions/procedures in pretty much every popular language.

Thanks to the enormous number of libraries available, which include machine learning ones, and its vibrant community, Python has claimed the third spot amongst the most popular languages of choice according to the TIOBE Programming index. Your team's skills and libraries remain valid for PostgreSQL server coding too! Teams that regularly code in JavaScript for Node.js or Angular can easily write PostgreSQL server code in PL/V8. All of the packages required are readily available from the PGDG repository.
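As a small illustration, a PL/Python function sketch (PL/Python is an untrusted language, so a superuser must create such functions; the function itself is just an example):

postgres=# CREATE EXTENSION plpythonu;
postgres=# CREATE FUNCTION pymax(a integer, b integer) RETURNS integer
AS $$ return max(a, b) $$ LANGUAGE plpythonu;
postgres=# SELECT pymax(2, 3);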

            cstore_fdw

cstore_fdw is an open source columnar store extension for PostgreSQL. Columnar stores provide notable benefits for analytics use cases where data is loaded in batches. cstore_fdw's columnar nature delivers performance by reading only relevant data from disk. It may compress data by 6 to 10 times to reduce the space requirements for data archiving. The official repository and documentation is available here
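A usage sketch based on its documented FDW interface (the table definition is a placeholder):

postgres=# CREATE EXTENSION cstore_fdw;
postgres=# CREATE SERVER cstore_server FOREIGN DATA WRAPPER cstore_fdw;
postgres=# CREATE FOREIGN TABLE events (ts timestamptz, payload text) -- placeholder columns
SERVER cstore_server OPTIONS (compression 'pglz');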

            HypoPG

HypoPG is an extension that adds support for hypothetical indexes – that is, indexes that can be modelled without actually being created. This helps us to answer questions such as "how will the execution plan change if there is an index on column X?". Installation and setup instructions are part of its official documentation
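A quick sketch of the idea (table and column names are placeholders); the hypothetical index shows up in EXPLAIN without ever being built:

postgres=# CREATE EXTENSION hypopg;
postgres=# SELECT * FROM hypopg_create_index('CREATE INDEX ON orders (customer_id)'); -- hypothetical table
postgres=# EXPLAIN SELECT * FROM orders WHERE customer_id = 42;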

            mongo_fdw

mongo_fdw presents collections from MongoDB as tables in PostgreSQL. This is a case where the NoSQL world meets the SQL world and their features combine. We will be covering this extension in a future blog post. The official repository is available here

            tds_fdw

Another important FDW (foreign data wrapper) extension in the PostgreSQL world is tds_fdw. Both Microsoft SQL Server and Sybase use the TDS (Tabular Data Stream) format. This FDW allows PostgreSQL to use tables stored in a remote SQL Server or Sybase database as local tables. It makes use of the FreeTDS libraries.

            orafce

As previously mentioned, there are a lot of migrations underway from Oracle to PostgreSQL. Incompatible functions in PostgreSQL are often painful for those who are migrating server code. The "orafce" project implements some of the functions from the Oracle database. The functionality was verified on Oracle 10g and the module is useful for production work. Please refer to the list in its official documentation of the Oracle functions implemented in PostgreSQL

            TimescaleDB

In this new world of IoT and connected devices, there is a growing need for time-series data stores. TimescaleDB can convert PostgreSQL into a scalable time-series data store. The official site is available here with all relevant links.
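A minimal sketch (the table and time column are placeholders): after preloading the library, a regular table is converted into a hypertable:

# postgresql.conf (requires a restart)
shared_preload_libraries = 'timescaledb'

postgres=# CREATE EXTENSION timescaledb;
postgres=# SELECT create_hypertable('conditions', 'time'); -- placeholder table and column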

            pg_bulkload

Is loading a large volume of data into a database in an efficient and fast way a challenge for you? If so, pg_bulkload may help you solve that problem. Official documentation is available here

            pg_partman

            PostgreSQL 10 introduced declarative partitions. But creating new partitions and maintaining existing ones, including purging unwanted partitions, requires a good dose of manual effort. If you are looking to automate part of this maintenance you should have a look at what pg_partman offers. The repository with documentation is available here.

            wal2json

PostgreSQL has logical replication functionality built in. Extra information is recorded in the WAL, which facilitates logical decoding. wal2json is a popular output plugin for logical decoding. It can be utilized for different purposes, including change data capture. In addition to wal2json, there are other output plugins: a concise list is available in the PostgreSQL wiki.
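A sketch of testing the plugin from SQL, assuming wal_level = logical and the plugin installed (the slot name is arbitrary):

postgres=# SELECT * FROM pg_create_logical_replication_slot('test_slot', 'wal2json');
postgres=# SELECT data FROM pg_logical_slot_get_changes('test_slot', NULL, NULL);
postgres=# SELECT pg_drop_replication_slot('test_slot'); -- clean up the test slot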

            There are many more extensions that help us build an enterprise-grade PostgreSQL set up using open source solutions. Please feel free to comment and ask us if we know about one that satisfies your particular needs. Or, if there’s still time, sign up for our October webinar and ask us in person!

            by Jobin Augustine at October 05, 2018 04:43 PM

            MongoDB-Disable Chained Replication


In this blog post, we will learn what MongoDB chained replication is, why you might choose to disable it, and the steps you need to take to do so.

            What is chain replication?

            Chain Replication in MongoDB, as the name suggests, means that a secondary member is able to replicate from another secondary member instead of a primary.

            Default settings

By default, chained replication is enabled in MongoDB. It helps to reduce the load on the primary, but it may lead to replication lag. When enabled, a secondary node selects its sync target based on ping time, choosing the closest node.

            Reasons to disable chained replication

The main reason to disable chained replication is replication lag; in other words, the length of the delay between MongoDB writing an operation on the primary and replicating that same operation to a secondary.

In either case—chained replication enabled or disabled—replication works in the same way when the primary node fails: one of the secondaries is promoted to primary. Therefore, writing and reading of data from the application is not affected.

            Steps to disable chained replication

1) Check the current status of chained replication in the replica set configuration, under "settings":

            PRIMARY> rs.config().settings
            {
            "chainingAllowed" : true,
            }

2) To disable chained replication, set "chainingAllowed" to false and then reconfigure to apply the change:

            PRIMARY> cg = rs.config()
            PRIMARY> cg.settings.chainingAllowed = false
            false
            PRIMARY> rs.reconfig(cg)

3) Check the current status of chained replication again; it's done:

            PRIMARY> rs.config().settings
            {
            	"chainingAllowed" : false,
            }

Can I override the sync source even after disabling chaining?

Yes, even after you have disabled chained replication, you can still override the sync source, though only temporarily. That means it will remain overridden until:

• the mongod instance restarts
• the connection between the secondary and its sync source closes
• additionally, if chaining is enabled and the sync source falls more than 30 seconds behind another member, the SyncSourceResolver will choose another member with more recent oplog entries to sync from

            Override sync source

The "replSetSyncFrom" command can be used for this. For example, say the secondary node is syncing from host 192.168.103.100:27001 and we would like it to sync from 192.168.103.100:27003 instead.

            1) Check for the current host it is syncing from:

            PRIMARY> rs.status()
            {
            			"_id" : 1,
            			"name" : "192.168.103.100:27002",
            			"syncingTo" : "192.168.103.100:27001",
            			"syncSourceHost" : "192.168.103.100:27001",
            		},

2) Log in to that mongod and execute:

            SECONDARY> db.adminCommand( { replSetSyncFrom: "192.168.103.100:27003" })

            3) Check replica set status again

            SECONDARY> rs.status()
            {
            			"_id" : 1,
            			"name" : "192.168.103.100:27002",
            			"syncingTo" : "192.168.103.100:27003",
            			"syncSourceHost" : "192.168.103.100:27003",
            		},

This is how we can override the sync source for testing, for maintenance, or when the replica is not syncing from the required host.

I hope this blog helps you understand how to disable chained replication, or how to override the sync source for a specific purpose or reason. The preferred setting for the chainingAllowed parameter is true, as it reduces the load on the primary node, and it is also the default setting.

            by Aayushi Mangal at October 05, 2018 10:53 AM