Planet MariaDB

October 12, 2018

Peter Zaitsev

Track PostgreSQL Row Changes Using Public/Private Key Signing

PostgreSQL encryption and authorization

Authorisations and encryption/decryption within a database system are the basic safeguards that protect your database against malicious structural or data changes.

What are authorisations?

Authorisations are the access privileges that control what a user can and cannot do on the database server, for one or more databases. Think of it as granting a key that unlocks specific doors, much like a five star hotel smart card: it gives you access to all the facilities that are meant for you, but it doesn’t open every door. Privileged staff, on the other hand, have master keys which let them open any door.

Similarly, in the database world, granting permissions secures the system by allowing specific actions by specific users or user groups, while the database administrator can still perform whatever actions he or she wishes. PostgreSQL provides user management where you can create users, and grant and revoke their privileges.

Encryption

Encryption and decryption can protect your data, obfuscate the schema structure, and help hide code from prying eyes. Encryption/decryption hides valuable information and ensures that there are no mischievous changes to the code or data that could be harmful. In almost all cases, data encryption and decryption happen on the database server. This is more like hiding your belongings somewhere in your room so that nobody can see them, but also making them difficult to access.

PostgreSQL also provides encryption using pgcrypto (a PostgreSQL extension). There are some cases where you don’t want to hide the data, but don’t want people to update it either; for those, you can revoke the privileges to modify the data.
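For instance, here is a minimal sketch of both ideas, using a hypothetical table tbl_salaries and a hypothetical read-only role reporting:

postgres=# CREATE EXTENSION IF NOT EXISTS pgcrypto;
postgres=# SELECT pgp_sym_encrypt('confidential value', 'passphrase');   -- encrypt data with pgcrypto
postgres=# REVOKE UPDATE, DELETE ON tbl_salaries FROM reporting;         -- or keep it readable but not modifiable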

Data modifications

But what if an admin user modifies the data? How can you identify that the data has changed? If somebody changes the data and you don’t know about it, that is more dangerous than losing the data altogether, as you are relying on data which may no longer be valid.

Logs in database systems allow us to trace back changes and “potentially” identify what was changed—unless those logs are removed by the administrator.

So consider whether you could leave your stuff out in the open in your room and still, in case of any changes, be able to tell that something was tampered with. In database terms, that translates to data without encryption, but with your very own signature. One option is to add a column to your database table which keeps a checksum for the data, generated on the client side using the user’s own private key. Any change in the data means the checksum no longer matches, so one can easily identify whether the data has changed. The signing happens on the client side, thereby ensuring that only users with the required private key can insert the data, while anyone with the public key can validate it.

Public/Private Keys

An asymmetric cryptographic system uses pairs of keys: public keys and private keys. The private key is known only to the owner(s) and is used for signing or decrypting data. The public key is shared with other stakeholders, who may use it to encrypt messages or to validate messages signed by the owner.

Generate Private / Public Key

Private Key

$ openssl genrsa -aes128 -passout pass:password -out key.private.pem
Generating RSA private key, 2048 bit long modulus

Public Key

$ openssl rsa -in key.private.pem -passin pass:password -pubout -out key.public.pem
writing RSA key
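Optionally, you can confirm that the two files really form a key pair by comparing their moduli; the two digests should match. This check is not part of the original steps:

$ openssl rsa -noout -modulus -in key.private.pem -passin pass:password | openssl md5
$ openssl rsa -noout -modulus -pubin -in key.public.pem | openssl md5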

Signing Data

Create a sample table, tbl_marks, and insert a sample row into it. We’ll need an additional column to hold the signature used for verification, which will understandably increase the table size.

postgres=# CREATE TABLE tbl_marks (id INTEGER, name TEXT, marks INTEGER, hash TEXT);

Let’s add a row that we’d like to validate.

postgres=# INSERT INTO tbl_marks VALUES(1, 'Alice', 80);

We will select the data and store the result into psql variables using the \gset command (https://www.postgresql.org/docs/current/static/app-psql.html). The complete row will be saved into the “row” psql variable.

postgres=# SELECT row(id,name,marks) FROM tbl_marks WHERE id = 1;
     row   
---------------
(1,Alice,80)
(1 row)
postgres=# \gset
postgres=# SELECT :'row' as row;
     row   
---------------
(1,Alice,80)
(1 row)

Now let’s generate a signature for the data stored in the “row” variable.

postgres=# \set sign_command `echo :'row' | openssl dgst -sha256 -sign key.private.pem | openssl base64 | tr -d '\n' | tr -d '\r'`
Enter pass phrase for key.private.pem:

The signed hash is stored in the “sign_command” psql variable. Let’s now add it to the data row in the tbl_marks table.

postgres=# UPDATE tbl_marks SET hash = :'sign_command' WHERE id = 1;
UPDATE 1

Validating Data

So our data row now carries a valid signature. Let’s try to validate it. We are going to select the data into the “row” psql variable and the signature hash into the “hash” psql variable.

postgres=# SELECT row(id,name,marks), hash FROM tbl_marks;    
     row      |                     hash
---------------+-----------------------------------------------
(1,Alice,80) | U23g3RwaZmbeZpYPmwezP5xvbIs8ILupW7jtrat8ixA ...
(1 row)
postgres=# \gset

Let’s now validate the data using a public key.

postgres=# \set verify_command `echo :'hash' | awk '{gsub(/.{65}/,"&\n")}1' | openssl base64 -d -out v && echo :'row' | openssl dgst -sha256 -verify key.public.pem -signature v`
postgres=# select :'verify_command' as verify;
  verify    
-------------
Verified OK
(1 row)

Perfect! The data is validated and all this happened on the client side. Imagine somebody doesn’t like that Alice got 80 marks, and they decide to reduce Alice’s marks to 30. Nobody knows if the teacher had given Alice 80 or 30 unless somebody goes and checks the database logs. We’ll give Alice 30 marks now.

postgres=# UPDATE tbl_marks SET marks = 30;
UPDATE 1

The school admin now decides to check that all data is correct before giving out the final results. The school admin has the teacher’s public key and tries to validate the data.

postgres=# SELECT row(id,name,marks), hash FROM tbl_marks;
    row    | hash                                                                                                                                                                                                                                                                  
--------------+--------------------------------------------------
(1,Alice,30) | yO20vyPRPR+HgW9D2nMSQstRgyGmCxyS9bVVrJ8tC7nh18iYc...
(1 row)
postgres=# \gset

postgres=# \set verify_command `echo :'hash' | awk '{gsub(/.{65}/,"&\n")}1' | openssl base64 -d -out v && echo :'row' | openssl dgst -sha256 -verify key.public.pem -signature v`
postgres=# SELECT :'verify_command' AS verify;
      verify      
----------------------
Verification Failure

As expected, the validation fails. Nobody other than the teacher had the private key to sign that data, and any tampering is easily identifiable.

This might not be the most efficient way of securing a dataset, but it is definitely an option if you want to keep the data unencrypted, yet easily detect any unauthorised changes. The signing and verification load is shifted entirely to the client side, thereby reducing load on the server. Only users holding the private key can produce validly signed data, and anybody with the associated public key can validate it.

The example used psql as the client application for signing, but you can do this with any client that can call the required OpenSSL functions, or use the openssl binaries directly for signing and verification.
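For instance, here is a rough sketch of the same signing and verification flow done entirely with the openssl binaries from a shell; the file names are illustrative:

$ echo '(1,Alice,80)' > row.txt
$ openssl dgst -sha256 -sign key.private.pem -passin pass:password -out row.sig row.txt
$ openssl dgst -sha256 -verify key.public.pem -signature row.sig row.txt
Verified OK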

by Ibrar Ahmed at October 12, 2018 01:43 PM

October 11, 2018

Peter Zaitsev

How to Fix ProxySQL Configuration When it Won’t Start


With the exception of the three configuration variables described here, ProxySQL will only parse the configuration files the first time it is started, or if the proxysql.db file is missing for some other reason.

If we want to change any of this data we need to do so via ProxySQL’s admin interface and then save it to disk. That’s fine if ProxySQL is running, but what if it won’t start because of these values?
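For reference, when ProxySQL is up, such a change would normally look something like this through the admin interface; default admin credentials and port are assumed, and mysql-interfaces itself only takes effect after a restart:

$ mysql -u admin -padmin -h 127.0.0.1 -P 6032
mysql> UPDATE global_variables SET variable_value='127.0.0.1:6033' WHERE variable_name='mysql-interfaces';
mysql> SAVE MYSQL VARIABLES TO DISK;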

For example, perhaps we accidentally configured ProxySQL to run on port 3306 and restarted it, but there’s already a production MySQL instance running on this port. ProxySQL won’t start, so we can’t edit the value that way:

2018-10-02 09:18:33 network.cpp:53:listen_on_port(): [ERROR] bind(): Address already in use

We could delete proxysql.db and have it reload the configuration files, but that would mean any changes we didn’t mirror into the configuration files will be lost.
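Before editing proxysql.db directly, it is worth taking a copy of the file first; a simple precaution that is not part of the original steps:

[root@centos7-pxc57-4 ~]# cp /var/lib/proxysql/proxysql.db /var/lib/proxysql/proxysql.db.bak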

Another option is to edit ProxySQL’s database file using sqlite3:

[root@centos7-pxc57-4 ~]# cd /var/lib/proxysql/
[root@centos7-pxc57-4 proxysql]# sqlite3 proxysql.db
sqlite> SELECT * FROM global_variables WHERE variable_name='mysql-interfaces';
mysql-interfaces|127.0.0.1:3306
sqlite> UPDATE global_variables SET variable_value='127.0.0.1:6033' WHERE variable_name='mysql-interfaces';
sqlite> SELECT * FROM global_variables WHERE variable_name='mysql-interfaces';
mysql-interfaces|127.0.0.1:6033

Or if we have a few edits to make we may prefer to do so with a text editor:

[root@centos7-pxc57-4 ~]# cd /var/lib/proxysql/
[root@centos7-pxc57-4 proxysql]# sqlite3 proxysql.db
sqlite> .output /tmp/global_variables
sqlite> .dump global_variables
sqlite> .exit

The above commands will dump the global_variables table into a file in SQL format, which we can then edit:

[root@centos7-pxc57-4 proxysql]# grep mysql-interfaces /tmp/global_variables
INSERT INTO "global_variables" VALUES('mysql-interfaces','127.0.0.1:3306');
[root@centos7-pxc57-4 proxysql]# vim /tmp/global_variables
[root@centos7-pxc57-4 proxysql]# grep mysql-interfaces /tmp/global_variables
INSERT INTO "global_variables" VALUES('mysql-interfaces','127.0.0.1:6033');

Now we need to restore this data. We’ll use the restore command to empty the table (as we’re restoring from a missing backup):

[root@centos7-pxc57-4 proxysql]# sqlite3 proxysql.db
sqlite> .restore global_variables
sqlite> .read /tmp/global_variables
sqlite> .exit

Once we’ve made the change, we should be able to start ProxySQL again:

[root@centos7-pxc57-4 proxysql]# /etc/init.d/proxysql start
Starting ProxySQL: DONE!
[root@centos7-pxc57-4 proxysql]# lsof -i | grep proxysql
proxysql 15171 proxysql 19u IPv4 265881 0t0 TCP localhost:6033 (LISTEN)
proxysql 15171 proxysql 20u IPv4 265882 0t0 TCP localhost:6033 (LISTEN)
proxysql 15171 proxysql 21u IPv4 265883 0t0 TCP localhost:6033 (LISTEN)
proxysql 15171 proxysql 22u IPv4 265884 0t0 TCP localhost:6033 (LISTEN)
proxysql 15171 proxysql 23u IPv4 266635 0t0 TCP *:6032 (LISTEN)

While you are here

You might enjoy my recent post Using ProxySQL to connect to IPV6-only databases over IPV4

You can download ProxySQL from Percona repositories, and you might also want to check out our recorded webinars that feature ProxySQL too.

by James Lawrie at October 11, 2018 04:07 PM

Percona Live 2019 – Save the Date!

Austin Texas

Austin State Capitol

After much speculation following the announcement in Santa Clara earlier this year, we are delighted to announce Percona Live 2019 will be taking place in Austin, Texas.

Save the dates in your diary: May 28-30, 2019!

The conference will take place just after Memorial Day at The Hyatt Regency, Austin on the shores of Lady Bird Lake.

This is also an ideal central location for those who wish to extend their stay and explore what Austin has to offer! Call for papers, ticket sales and sponsorship opportunities will be announced soon, so stay tuned!

In other Percona Live news, we’re less than 4 weeks away from this year’s European conference taking place in Frankfurt, Germany on 5-7 November. The tutorials and breakout sessions have been announced, and you can view the full schedule here. Tickets are still on sale so don’t miss out, book yours here today!

 

by Bronwyn Campbell at October 11, 2018 12:38 PM

October 10, 2018

Peter Zaitsev

Percona Monitoring and Management (PMM) 1.15.0 Is Now Available

Percona Monitoring and Management

Percona Monitoring and Management (PMM) is a free and open-source platform for managing and monitoring MySQL® and MongoDB® performance. You can run PMM in your own environment for maximum security and reliability. It provides thorough time-based analysis for MySQL® and MongoDB® servers to ensure that your data works as efficiently as possible.

Percona Monitoring and Management

This release offers two new features for both the MySQL Community and Percona Customers:

  • MySQL Custom Queries – Turn a SELECT into a dashboard!
  • Server and Client logs – Collect troubleshooting logs for Percona Support

We addressed 17 new features and improvements, and fixed 17 bugs.

MySQL Custom Queries

In 1.15 we are introducing the ability to take a SQL SELECT statement and turn the result set into metric series in PMM.  The queries are executed at the LOW RESOLUTION level, which by default is every 60 seconds.  A key advantage is that you can extend PMM to profile metrics unique to your environment (see the users table example), or to introduce support for a table that isn’t part of PMM yet. This feature is on by default and only requires that you edit the configuration file using valid YAML syntax.  The configuration file is located at /usr/local/percona/pmm-client/queries-mysqld.yml.

Example – Application users table

We’re going to take a fictional MySQL users table that also tracks the number of upvotes and downvotes, and we’ll convert this into two metric series, with a set of seven labels, where each label can also store a value.

Browsing metrics series using Advanced Data Exploration Dashboard

Let’s look at the output first so we understand the goal: take data from a MySQL table, store it in PMM, then display it as a metric series.  Using the Advanced Data Exploration Dashboard you can review your metric series. Exploring the metric series app1_users_metrics_downvotes we see the following:

PMM Advanced Data Exploration Dashboard

MySQL table

Let’s assume you have the following users table that includes true/false, string, and integer types.

SELECT * FROM `users`
+----+------+--------------+-----------+------------+-----------+---------------------+--------+---------+-----------+
| id | app  | user_type    | last_name | first_name | logged_in | active_subscription | banned | upvotes | downvotes |
+----+------+--------------+-----------+------------+-----------+---------------------+--------+---------+-----------+
|  1 | app2 | unprivileged | Marley    | Bob        |         1 |                   1 |      0 |     100 |        25 |
|  2 | app3 | moderator    | Young     | Neil       |         1 |                   1 |      1 |     150 |        10 |
|  3 | app4 | unprivileged | OConnor   | Sinead     |         1 |                   1 |      0 |      25 |        50 |
|  4 | app1 | unprivileged | Yorke     | Thom       |         0 |                   1 |      0 |     100 |       100 |
|  5 | app5 | admin        | Buckley   | Jeff       |         1 |                   1 |      0 |     175 |         0 |
+----+------+--------------+-----------+------------+-----------+---------------------+--------+---------+-----------+

Explaining the YAML syntax

We’ll go through a simple example and mention what’s required for each line.  The metric series name is constructed from the leading section name with the column name appended.  Therefore the number of metric series per table will be the count of columns that are of type GAUGE or COUNTER.  This metric series will be called app1_users_metrics_downvotes:

app1_users_metrics:                                 ## leading section of your metric series.
  query: "SELECT * FROM app1.users"                 ## Your query. Don't forget the schema name.
  metrics:                                          ## Required line to start the list of metric items
    - downvotes:                                    ## Name of the column returned by the query. Will be appended to the metric series.
        usage: "COUNTER"                            ## Column value type.  COUNTER will make this a metric series.
        description: "Number of downvotes"          ## Helpful description of the column.

Full queries-mysqld.yml example

Each column in the SELECT is named in this example, but that isn’t required; you can use SELECT * as well.  Notice that the query uses the schema.table format.

---
app1_users_metrics:
  query: "SELECT app,user_type,first_name,last_name,logged_in,active_subscription,banned,upvotes,downvotes FROM app1.users"
  metrics:
    - app:
        usage: "LABEL"
        description: "Name of the Application"
    - user_type:
        usage: "LABEL"
        description: "User's privilege level within the Application"
    - first_name:
        usage: "LABEL"
        description: "User's First Name"
    - last_name:
        usage: "LABEL"
        description: "User's Last Name"
    - logged_in:
        usage: "LABEL"
        description: "User's logged in or out status"
    - active_subscription:
        usage: "LABEL"
        description: "Whether User has an active subscription or not"
    - banned:
        usage: "LABEL"
        description: "Whether user is banned or not"
    - upvotes:
        usage: "COUNTER"
        description: "Count of upvotes the User has earned.  Upvotes once granted cannot be revoked, so the number can only increase."
    - downvotes:
        usage: "GAUGE"
        description: "Count of downvotes the User has earned.  Downvotes can be revoked so the number can increase as well as decrease."
...

We hope you enjoy this feature, and we welcome your feedback via the Percona forums!

Server and Client logs

We’ve enhanced the volume of data collected from both the Server and Client perspectives.  Each service provides a set of files designed to be shared with Percona Support while you work on an issue.

Server

From the Server, we’ve improved the logs.zip service to include:

  • Prometheus targets
  • Consul nodes, QAN API instances
  • Amazon RDS and Aurora instances
  • Version
  • Server configuration
  • Percona Toolkit commands

You retrieve the link from your PMM server using this format:   https://pmmdemo.percona.com/managed/logs.zip
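For example, substituting your own PMM server address (and adding credentials if you have HTTP authentication enabled), you could fetch it from the command line. This is just an illustration, not an official procedure:

$ curl -k -o pmm-logs.zip https://<your-pmm-server>/managed/logs.zip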

Client

On the Client side we’ve added a new action called summary which fetches logs, network, and Percona Toolkit output in order to share with Percona Support. To initiate a Client side collection, execute:

pmm-admin summary

The output will be a single file that you can attach to your Support ticket.  It will look something like this:

summary__2018_10_10_16_20_00.tar.gz

New Features and Improvements

  • PMM-2913 – Provide ability to execute Custom Queries against MySQL – Credit to wrouesnel for the framework of this feature in wrouesnel/postgres_exporter!
  • PMM-2904 – Improve PMM Server Diagnostics for Support
  • PMM-2860 – Improve pmm-client Diagnostics for Support
  • PMM-1754 – Provide functionality to easily select query and copy it to clipboard in QAN
  • PMM-1855 – Add swap to AMI
  • PMM-3013 – Rename PXC Overview graph Sequence numbers of transactions to IST Progress
  • PMM-2726 – Abort data collection in Exporters based on Prometheus Timeout – MySQLd Exporter
  • PMM-3003 – PostgreSQL Overview Dashboard Tooltip fixes
  • PMM-2936 – Some improvements for Query Analytics Settings screen
  • PMM-3029 – PostgreSQL Dashboard Improvements

Fixed Bugs

  • PMM-2976 – Upgrading to PMM 1.14.x fails if dashboards from Grafana 4.x are present on an installation
  • PMM-2969 – rds_exporter becomes throttled by CloudWatch API
  • PMM-1443 – The credentials for a secured server are exposed without explicit request
  • PMM-3006 – Monitoring over 1000 instances is displayed imperfectly on the label
  • PMM-3011 – PMM’s default MongoDB DSN is localhost, which is not resolved to IPv4 on modern systems
  • PMM-2211 – Bad display when using old range in QAN
  • PMM-1664 – Infinite loading with wrong queryID
  • PMM-2715 – Since pmm-client-1.9.0, pmm-admin detects CentOS/RHEL 6 installations using linux-upstart as service manager and ignores SysV scripts
  • PMM-2839 – Tablestats safety precaution does not work for RDS/Aurora instances
  • PMM-2845 – pmm-admin purge causes client to panic
  • PMM-2968 – pmm-admin list shows empty data source column for mysql:metrics
  • PMM-3043 – Total Time percentage is incorrectly shown as a decimal fraction
  • PMM-3082 – Prometheus Scrape Interval Variance chart doesn’t display data

How to get PMM Server

PMM is available for installation using three methods:

Help us improve our software quality by reporting any Percona Monitoring and Management bugs you encounter using our bug tracking system.

by Dmitriy Kostiuk at October 10, 2018 06:31 PM

MariaDB AB

MariaDB OpenWorks 2019 – Early-Bird Registration Now Open


MariaDB OpenWorks, our 2019 user and developer conference, takes place in February – but now is the best time to get your passes. Early-bird registration recently opened. Don't miss this opportunity for the lowest rates and first dibs on workshop seats. 

Early-bird savings of 30% (or more)

By reserving your passes now, you save 30% off regular prices. That means $175 for the 1-day Workshops pass, $350 for the 2-day Conference pass, and $525 for the 3-day All-Access pass. At those rates, your attendance is easy to justify – plus, you can save an additional 20% by registering a group of three or more people. Prices go up on December 1, so be sure to take advantage of early-bird pricing.

Guaranteed access to in-demand workshops

MariaDB OpenWorks 2019 features a full day dedicated to hands-on workshops. Learn directly from MariaDB engineers, remote DBAs and consultants – ask questions, do the labs and increase your MariaDB expertise. Workshops sold out at the 2018 conference, so claim your seats now for the workshops that will best help you achieve your business goals:

  • High availability
  • Containerization
  • High-performance analytics
  • Advanced security
  • Performance optimization

Something for everyone

MariaDB OpenWorks 2019 will offer 45+ sessions on a wide range of topics, from the latest MariaDB product and community innovations to real-world customer stories, ops tools and processes, insights for developers, and the power of analytics.

Check out the 1-minute recap video from the 2018 conference and picture yourself joining the festivities in 2019.

We hope to see you at MariaDB OpenWorks in New York, February 25-27, 2019. Don't forget to register early for 30% off!



by MariaDB Team at October 10, 2018 06:16 PM

Peter Zaitsev

Instrumenting Read Only Transactions in InnoDB

Instrumenting read only transactions MySQL

Probably not well known, but quite an important optimization was introduced in MySQL 5.6: reduced overhead for “read only transactions”. While by a “transaction” we usually mean a query or a group of queries that change data, with transactional engines like InnoDB every data read or write operation is a transaction.

Now, as a non-locking read operation obviously has less impact on the data, it does not need all the instrumenting overhead a write transaction has. The main thing that can be avoided, as described by the documentation, is the transaction ID. So, since MySQL 5.6, a read only transaction does not have a transaction ID. Moreover, such a transaction is not visible in the SHOW ENGINE INNODB STATUS output, though I will not go deeper into what that really means under the hood in this article. The fact is that this optimization allows for better scaling of workloads with many RO threads. An example RO benchmark, where the difference between 5.5 and 5.6/5.7 is clearly visible, may be found here: https://www.percona.com/blog/2016/04/07/mysql-5-7-sysbench-oltp-read-results-really-faster/

To benefit from this optimization in MySQL 5.6, either a transaction has to start with the explicit START TRANSACTION READ ONLY clause or it must be an autocommit, non-locking SELECT statement. In version 5.7 and newer, it goes further, as a new transaction is treated as read-only until a locking read or write is executed, at which point it gets “upgraded” to a read-write one.

Information Schema Instrumentation

Let’s see how it looks (on MySQL 8.0.12) by looking at the information_schema.innodb_trx and information_schema.innodb_metrics tables. The second of these has transaction counters disabled by default, so before the test we have to enable them with:

SET GLOBAL innodb_monitor_enable = 'trx%comm%';

or by adding a parameter to the [mysqld] section of the configuration file and restarting the instance:

innodb_monitor_enable = "trx_%"
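Either way, you can confirm which counters ended up enabled by querying innodb_metrics; a quick check that is not part of the original session:

mysql [localhost] {msandbox} (db1) > SELECT name, status FROM information_schema.innodb_metrics WHERE name LIKE 'trx%comm%';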

Now, let’s start a transaction which should be read only according to the rules:

mysql [localhost] {msandbox} (db1) > START TRANSACTION; SELECT count(*) FROM db1.t1;
Query OK, 0 rows affected (0.00 sec)
+----------+
| count(*) |
+----------+
|        3 |
+----------+
1 row in set (0.00 sec)
mysql [localhost] {msandbox} (db1) > SELECT trx_id,trx_weight,trx_rows_locked,trx_rows_modified,trx_is_read_only,trx_autocommit_non_locking
FROM information_schema.innodb_trx\G
*************************** 1. row ***************************
                    trx_id: 421988493944672
                trx_weight: 0
           trx_rows_locked: 0
         trx_rows_modified: 0
          trx_is_read_only: 0
trx_autocommit_non_locking: 0
1 row in set (0.00 sec)

The transaction started as above did not appear in SHOW ENGINE INNODB STATUS, and its trx_id looks strangely high. And here is the first surprise: for some reason, trx_is_read_only is 0. Now, what if we commit such a transaction—how do the counters change? (I reset the counters before the test):

mysql [localhost] {msandbox} (db1) > commit;
Query OK, 0 rows affected (0.00 sec)
mysql [localhost] {msandbox} (db1) > SELECT name, comment, status, count
FROM information_schema.innodb_metrics   WHERE name like 'trx%comm%';
+---------------------------+--------------------------------------------------------------------+---------+-------+
| name                      | comment                                                            | status  | count |
+---------------------------+--------------------------------------------------------------------+---------+-------+
| trx_rw_commits            | Number of read-write transactions  committed                       | enabled |     0 |
| trx_ro_commits            | Number of read-only transactions committed                         | enabled |     1 |
| trx_nl_ro_commits         | Number of non-locking auto-commit read-only transactions committed | enabled |     0 |
| trx_commits_insert_update | Number of transactions committed with inserts and updates          | enabled |     0 |
+---------------------------+--------------------------------------------------------------------+---------+-------+
4 rows in set (0.01 sec)

OK, so clearly it was a read-only transaction overall, just the trx_is_read_only property wasn’t set as expected. I had to report this problem here: https://bugs.mysql.com/bug.php?id=92558

What about an explicit RO transaction:

mysql [localhost] {msandbox} (db1) > START TRANSACTION READ ONLY; SELECT count(*) FROM db1.t1;
Query OK, 0 rows affected (0.00 sec)
+----------+
| count(*) |
+----------+
|        3 |
+----------+
1 row in set (0.00 sec)
mysql [localhost] {msandbox} (db1) > SELECT trx_id,trx_weight,trx_rows_locked,trx_rows_modified,trx_is_read_only,trx_autocommit_non_locking
FROM information_schema.innodb_trx\G
*************************** 1. row ***************************
                    trx_id: 421988493944672
                trx_weight: 0
           trx_rows_locked: 0
         trx_rows_modified: 0
          trx_is_read_only: 1
trx_autocommit_non_locking: 0
1 row in set (0.00 sec)
mysql [localhost] {msandbox} (db1) > commit;
Query OK, 0 rows affected (0.00 sec)
mysql [localhost] {msandbox} (db1) > SELECT name, comment, status, count
FROM information_schema.innodb_metrics   WHERE name like 'trx%comm%';
+---------------------------+--------------------------------------------------------------------+---------+-------+
| name                      | comment                                                            | status  | count |
+---------------------------+--------------------------------------------------------------------+---------+-------+
| trx_rw_commits            | Number of read-write transactions  committed                       | enabled |     0 |
| trx_ro_commits            | Number of read-only transactions committed                         | enabled |     2 |
| trx_nl_ro_commits         | Number of non-locking auto-commit read-only transactions committed | enabled |     0 |
| trx_commits_insert_update | Number of transactions committed with inserts and updates          | enabled |     0 |
+---------------------------+--------------------------------------------------------------------+---------+-------+
4 rows in set (0.01 sec)

OK, both transactions are counted as the same type. Moreover, the two transactions shared the same strange trx_id, which appears to be a fake one. For a simple read executed in autocommit mode, the counters increase as expected too:

mysql [localhost] {msandbox} (db1) > select @@autocommit; SELECT count(*) FROM db1.t1;
+--------------+
| @@autocommit |
+--------------+
|            1 |
+--------------+
1 row in set (0.00 sec)
+----------+
| count(*) |
+----------+
|        3 |
+----------+
1 row in set (0.00 sec)
mysql [localhost] {msandbox} (db1) > SELECT name, comment, status, count
FROM information_schema.innodb_metrics   WHERE name like 'trx%comm%';
+---------------------------+--------------------------------------------------------------------+---------+-------+
| name                      | comment                                                            | status  | count |
+---------------------------+--------------------------------------------------------------------+---------+-------+
| trx_rw_commits            | Number of read-write transactions  committed                       | enabled |     0 |
| trx_ro_commits            | Number of read-only transactions committed                         | enabled |     2 |
| trx_nl_ro_commits         | Number of non-locking auto-commit read-only transactions committed | enabled |     1 |
| trx_commits_insert_update | Number of transactions committed with inserts and updates          | enabled |     0 |
+---------------------------+--------------------------------------------------------------------+---------+-------+
4 rows in set (0.00 sec)

Now, let’s test how a transaction looks when we upgrade it to RW later:

mysql [localhost] {msandbox} (db1) > START TRANSACTION; SELECT count(*) FROM db1.t1;
Query OK, 0 rows affected (0.00 sec)
+----------+
| count(*) |
+----------+
|        3 |
+----------+
1 row in set (0.00 sec)
mysql [localhost] {msandbox} (db1) > SELECT trx_id,trx_weight,trx_rows_locked,trx_rows_modified,trx_is_read_only,trx_autocommit_non_locking
FROM information_schema.innodb_trx\G
*************************** 1. row ***************************
                    trx_id: 421988493944672
                trx_weight: 0
           trx_rows_locked: 0
         trx_rows_modified: 0
          trx_is_read_only: 0
trx_autocommit_non_locking: 0
1 row in set (0.00 sec)
mysql [localhost] {msandbox} (db1) > SELECT count(*) FROM db1.t1 FOR UPDATE;
+----------+
| count(*) |
+----------+
|        3 |
+----------+
1 row in set (0.00 sec)
mysql [localhost] {msandbox} (db1) > SELECT trx_id,trx_weight,trx_rows_locked,trx_rows_modified,trx_is_read_only,trx_autocommit_non_locking
FROM information_schema.innodb_trx\G
*************************** 1. row ***************************
                    trx_id: 4106
                trx_weight: 2
           trx_rows_locked: 4
         trx_rows_modified: 0
          trx_is_read_only: 0
trx_autocommit_non_locking: 0
1 row in set (0.00 sec)
mysql [localhost] {msandbox} (db1) > commit;
Query OK, 0 rows affected (0.00 sec)
mysql [localhost] {msandbox} (db1) > SELECT name, comment, status, count
FROM information_schema.innodb_metrics   WHERE name like 'trx%comm%';
+---------------------------+--------------------------------------------------------------------+---------+-------+
| name                      | comment                                                            | status  | count |
+---------------------------+--------------------------------------------------------------------+---------+-------+
| trx_rw_commits            | Number of read-write transactions  committed                       | enabled |     1 |
| trx_ro_commits            | Number of read-only transactions committed                         | enabled |     2 |
| trx_nl_ro_commits         | Number of non-locking auto-commit read-only transactions committed | enabled |     1 |
| trx_commits_insert_update | Number of transactions committed with inserts and updates          | enabled |     0 |
+---------------------------+--------------------------------------------------------------------+---------+-------+
4 rows in set (0.00 sec)

OK, as seen above, after a locking read was done, our transaction has transformed: it got a real, unique trx_id assigned. Then, when committed, the RW counter increased.

Performance Schema Problem

Nowadays it may feel natural to use performance_schema for monitoring everything. And, indeed, we can monitor types of transactions with it as well. Let’s enable the needed consumers and instruments:

mysql [localhost] {msandbox} (db1) > UPDATE performance_schema.setup_consumers SET ENABLED = 'YES' WHERE NAME LIKE '%transactions%';
Query OK, 0 rows affected (0.00 sec)
Rows matched: 3  Changed: 0  Warnings: 0
mysql [localhost] {msandbox} (db1) > UPDATE performance_schema.setup_instruments SET ENABLED = 'YES', TIMED = 'YES' WHERE NAME = 'transaction';
Query OK, 0 rows affected (0.01 sec)
Rows matched: 1  Changed: 0  Warnings: 0
mysql [localhost] {msandbox} (db1) > SELECT * FROM performance_schema.setup_instruments WHERE NAME = 'transaction';
+-------------+---------+-------+------------+------------+---------------+
| NAME        | ENABLED | TIMED | PROPERTIES | VOLATILITY | DOCUMENTATION |
+-------------+---------+-------+------------+------------+---------------+
| transaction | YES     | YES   |            |          0 | NULL          |
+-------------+---------+-------+------------+------------+---------------+
1 row in set (0.00 sec)
mysql [localhost] {msandbox} (db1) > SELECT * FROM performance_schema.setup_consumers WHERE NAME LIKE '%transactions%';
+----------------------------------+---------+
| NAME                             | ENABLED |
+----------------------------------+---------+
| events_transactions_current      | YES     |
| events_transactions_history      | YES     |
| events_transactions_history_long | YES     |
+----------------------------------+---------+
3 rows in set (0.01 sec)
mysql [localhost] {msandbox} (db1) > SELECT COUNT_STAR,COUNT_READ_WRITE,COUNT_READ_ONLY
FROM performance_schema.events_transactions_summary_global_by_event_name\G
*************************** 1. row ***************************
      COUNT_STAR: 0
COUNT_READ_WRITE: 0
 COUNT_READ_ONLY: 0
1 row in set (0.00 sec)

And let’s do some simple tests:

mysql [localhost] {msandbox} (db1) > START TRANSACTION; COMMIT;
Query OK, 0 rows affected (0.01 sec)
Query OK, 0 rows affected (0.00 sec)
mysql [localhost] {msandbox} (db1) > SELECT COUNT_STAR,COUNT_READ_WRITE,COUNT_READ_ONLY
FROM performance_schema.events_transactions_summary_global_by_event_name\G
*************************** 1. row ***************************
      COUNT_STAR: 1
COUNT_READ_WRITE: 1
 COUNT_READ_ONLY: 0
1 row in set (0.00 sec)
mysql [localhost] {msandbox} (db1) > SELECT name, comment, status, count
FROM information_schema.innodb_metrics   WHERE name like 'trx%comm%';
+---------------------------+--------------------------------------------------------------------+---------+-------+
| name                      | comment                                                            | status  | count |
+---------------------------+--------------------------------------------------------------------+---------+-------+
| trx_rw_commits            | Number of read-write transactions  committed                       | enabled |     0 |
| trx_ro_commits            | Number of read-only transactions committed                         | enabled |     0 |
| trx_nl_ro_commits         | Number of non-locking auto-commit read-only transactions committed | enabled |     0 |
| trx_commits_insert_update | Number of transactions committed with inserts and updates          | enabled |     0 |
+---------------------------+--------------------------------------------------------------------+---------+-------+
4 rows in set (0.00 sec)

An empty transaction caused this RW counter to increase in the Performance Schema view! Moreover, a simple autocommit select increases it too:

mysql [localhost] {msandbox} (db1) > SELECT count(*) FROM db1.t1;
+----------+
| count(*) |
+----------+
|        3 |
+----------+
1 row in set (0.01 sec)
mysql [localhost] {msandbox} (db1) > SELECT COUNT_STAR,COUNT_READ_WRITE,COUNT_READ_ONLY
FROM performance_schema.events_transactions_summary_global_by_event_name\G
*************************** 1. row ***************************
      COUNT_STAR: 2
COUNT_READ_WRITE: 2
 COUNT_READ_ONLY: 0
1 row in set (0.00 sec)
mysql [localhost] {msandbox} (db1) > START TRANSACTION READ ONLY; COMMIT;
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.00 sec)
mysql [localhost] {msandbox} (db1) > SELECT COUNT_STAR,COUNT_READ_WRITE,COUNT_READ_ONLY
FROM performance_schema.events_transactions_summary_global_by_event_name\G
*************************** 1. row ***************************
      COUNT_STAR: 3
COUNT_READ_WRITE: 2
 COUNT_READ_ONLY: 1
1 row in set (0.00 sec)
mysql [localhost] {msandbox} (db1) > SELECT name, comment, status, count
FROM information_schema.innodb_metrics   WHERE name like 'trx%comm%';
+---------------------------+--------------------------------------------------------------------+---------+-------+
| name                      | comment                                                            | status  | count |
+---------------------------+--------------------------------------------------------------------+---------+-------+
| trx_rw_commits            | Number of read-write transactions  committed                       | enabled |     0 |
| trx_ro_commits            | Number of read-only transactions committed                         | enabled |     0 |
| trx_nl_ro_commits         | Number of non-locking auto-commit read-only transactions committed | enabled |     1 |
| trx_commits_insert_update | Number of transactions committed with inserts and updates          | enabled |     0 |
+---------------------------+--------------------------------------------------------------------+---------+-------+
4 rows in set (0.01 sec)

As seen above, with regard to monitoring transactions via Performance Schema, everything seems completely broken: empty transactions increase the counters, and the only way to increase the RO counter is to start a read-only transaction explicitly; but again, it should not count when no real read was done from a table. For this reason I filed another bug report: https://bugs.mysql.com/bug.php?id=92364

PMM Dashboard

We implemented a transactions information view in PMM, based on information_schema.innodb_metrics, which—as presented above—is reliable and shows the correct counters. Therefore, I encourage everyone to use the innodb_monitor_enable setting to enable it and have PMM graph it. It will look something like this:

by Przemysław Malkowski at October 10, 2018 03:09 PM

MongoDB Replica set Scenarios and Internals

MongoDB replica sets replication internals

The MongoDB® replica set is a group of nodes with one set as the primary node, and all other nodes set as secondary nodes. Only the primary node accepts “write” operations, while other nodes can only serve “read” operations according to the read preferences defined. In this blog post, we’ll focus on some MongoDB replica set scenarios, and take a look at the internals.

Example configuration

We will refer to a three-node replica set that includes one primary node and two secondary nodes, running as:

"members" : [
{
"_id" : 0,
"name" : "192.168.103.100:25001",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY",
"uptime" : 3533,
"optime" : {
"ts" : Timestamp(1537800584, 1),
"t" : NumberLong(1)
},
"optimeDate" : ISODate("2018-09-24T14:49:44Z"),
"electionTime" : Timestamp(1537797392, 2),
"electionDate" : ISODate("2018-09-24T13:56:32Z"),
"configVersion" : 3,
"self" : true
},
{
"_id" : 1,
"name" : "192.168.103.100:25002",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 3063,
"optime" : {
"ts" : Timestamp(1537800584, 1),
"t" : NumberLong(1)
},
"optimeDurable" : {
"ts" : Timestamp(1537800584, 1),
"t" : NumberLong(1)
},
"optimeDate" : ISODate("2018-09-24T14:49:44Z"),
"optimeDurableDate" : ISODate("2018-09-24T14:49:44Z"),
"lastHeartbeat" : ISODate("2018-09-24T14:49:45.539Z"),
"lastHeartbeatRecv" : ISODate("2018-09-24T14:49:44.664Z"),
"pingMs" : NumberLong(0),
"syncingTo" : "192.168.103.100:25001",
"configVersion" : 3
},
{
"_id" : 2,
"name" : "192.168.103.100:25003",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 2979,
"optime" : {
"ts" : Timestamp(1537800584, 1),
"t" : NumberLong(1)
},
"optimeDurable" : {
"ts" : Timestamp(1537800584, 1),
"t" : NumberLong(1)
},
"optimeDate" : ISODate("2018-09-24T14:49:44Z"),
"optimeDurableDate" : ISODate("2018-09-24T14:49:44Z"),
"lastHeartbeat" : ISODate("2018-09-24T14:49:45.539Z"),
"lastHeartbeatRecv" : ISODate("2018-09-24T14:49:44.989Z"),
"pingMs" : NumberLong(0),
"syncingTo" : "192.168.103.100:25002",
"configVersion" : 3
}

Here, the primary is running on port 25001, and the two secondaries are running on ports 25002 and 25003 on the same host.

Secondary nodes can only sync from Primary?

No, it’s not mandatory. A secondary can replicate data from the primary or from any other secondary. This is known as chaining, and it is enabled by default.

In the above replica set, you can see that the secondary node with "_id":2 is syncing from another secondary node, "_id":1, as shown by "syncingTo" : "192.168.103.100:25002".

This can also be found in the logs, where the parameter chainingAllowed: true appears as the default setting:

settings: { chainingAllowed: true, heartbeatIntervalMillis: 2000, heartbeatTimeoutSecs: 10, electionTimeoutMillis: 10000, catchUpTimeoutMillis: 60000, getLastErrorModes: {}, getLastErrorDefaults: { w: 1, wtimeout: 0 }, replicaSetId: ObjectId('5ba8ed10d4fddccfedeb7492') } }

Chaining?

It means that a secondary member node is able to replicate from another secondary member node instead of from the primary node. This helps to reduce the load on the primary. If the replication lag is not tolerable, then chaining can be disabled.

For more details about chaining and the steps to disable it please refer to my earlier blog post here.
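In short, disabling it boils down to flipping chainingAllowed in the replica set settings via a reconfig from the primary; a minimal sketch (the usual caveats about rs.reconfig() apply):

rplint:PRIMARY> cfg = rs.conf()
rplint:PRIMARY> cfg.settings.chainingAllowed = false
rplint:PRIMARY> rs.reconfig(cfg)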

Ok, then how does the secondary node select the source to sync from?

If Chaining is False

When chaining is explicitly set to false, the secondary node will sync only from the primary node, unless this is overridden temporarily.

If Chaining is True

  • Before choosing any sync node, TopologyCoordinator performs validations like:
    • Whether chaining is set to true or false.
    • If that particular node is part of the current replica set configurations.
    • It identifies the node that is ahead in the oplog and has the lowest ping time.
    • The source code that includes validation is here.
  • Once the validation is done, SyncSourceSelector relies on SyncSourceResolver which contains the result and details for the new sync source
  • To get the details and response, SyncSourceResolver coordinates with ReplicationCoordinator
  • This ReplicationCoordinator is responsible for the replication, and co-ordinates with TopologyCoordinator
  • The TopologyCoordinator is responsible for topology of the cluster. It finds the primary oplog time and checks for the maxSyncSourceLagSecs
  • It will reject a sync source if it lags behind the newest oplog entry by more than maxSyncSourceLagSecs. The code for this can be found here
  • If the criteria for the source selection is not fulfilled, then BackgroundSync thread waits and restarts the whole process again to get the sync source.

Example: “unable to find a member to sync from”, then finding a candidate to sync from on the next attempt

This can be found in the log like this: on receiving the message could not find member to sync from from the rsBackgroundSync thread, the whole internal process restarts and finds a member to sync from, i.e. sync source candidate: 192.168.103.100:25001, which means it is now syncing from the node 192.168.103.100 running on port 25001.

2018-09-24T13:58:43.197+0000 I REPL     [rsSync] transition to RECOVERING
2018-09-24T13:58:43.198+0000 I REPL     [rsBackgroundSync] could not find member to sync from
2018-09-24T13:58:43.201+0000 I REPL     [rsSync] transition to SECONDARY
2018-09-24T13:58:59.208+0000 I REPL     [rsBackgroundSync] sync source candidate: 192.168.103.100:25001

  • Once the sync source node is selected, SyncSourceResolver probes the sync source to confirm that it is able to fetch the oplogs.
  • RollbackID is also fetched i.e. rbid  after the first batch is returned by oplogfetcher.
  • If all eligible sync sources are too fresh, such as during initial sync, then the syncSourceStatus Oplog start is missing and earliestOpTimeSeen will set a new minValid.
  • This minValid is also set in the case of rollback and abrupt shutdown.
  • If the node has a minValid entry then this is checked for the eligible sync source node.

Example showing the selection of a new sync source when the existing source is found to be invalid

Here, as the logs show, the node chooses a new sync source during sync. This is because it found that the original sync source is not ahead of it, and so does not contain recent oplogs from which to sync.

2018-09-25T15:20:55.424+0000 I REPL     [replication-1] Choosing new sync source because our current sync source, 192.168.103.100:25001, has an OpTime ({ ts: Timestamp 1537879296000|1, t: 4 }) which is not ahead of ours ({ ts: Timestamp 1537879296000|1, t: 4 }), it does not have a sync source, and it's not the primary (sync source does not know the primary)

2018-09-25T15:20:55.425+0000 W REPL [rsBackgroundSync] Fetcher stopped querying remote oplog with error: InvalidSyncSource: sync source 192.168.103.100:25001 (config version: 3; last applied optime: { ts: Timestamp 1537879296000|1, t: 4 }; sync source index: -1; primary index: -1) is no longer valid

  • If the secondary node is too far behind the eligible sync source node, then the node will enter maintenance mode, and a resync needs to be called manually.
  • Once the sync source is chosen, BackgroundSync starts oplogFetcher.

Example for oplogFetcher

Here is an example of fetching oplog from the “oplog.rs” collection, and checking for the greater than required timestamp.

2018-09-26T10:35:07.372+0000 I COMMAND  [conn113] command local.oplog.rs command: getMore { getMore: 20830044306, collection: "oplog.rs", maxTimeMS: 5000, term: 7, lastKnownCommittedOpTime: { ts: Timestamp 1537955038000|1, t: 7 } } originatingCommand: { find: "oplog.rs", filter: { ts: { $gte: Timestamp 1537903865000|1 } }, tailable: true, oplogReplay: true, awaitData: true, maxTimeMS: 60000, term: 7, readConcern: { afterOpTime: { ts: Timestamp 1537903865000|1, t: 6 } } } planSummary: COLLSCAN cursorid:20830044306 keysExamined:0 docsExamined:0 numYields:1 nreturned:0 reslen:451 locks:{ Global: { acquireCount: { r: 6 } }, Database: { acquireCount: { r: 3 } }, oplog: { acquireCount: { r: 3 } } } protocol:op_command 3063398ms

When, and with what details, do replica set nodes communicate with each other?

At regular intervals, all the nodes communicate with each other to check the status of the primary node, check the status of the sync source, get the oplogs, and so on.

ReplicationCoordinator has a ReplicaSetConfig that holds a list of all the replica set nodes, and each node has a copy of it. This makes the nodes aware of the other nodes in the same replica set.

This is how nodes communicate in more detail:

Heartbeats: this checks the status of other nodes, i.e. whether they are alive or dead.

heartbeatInterval: Every node, at an interval of two seconds, sends the other nodes a heartbeat to make them aware that “yes I am alive!”

heartbeatTimeoutSecs: this is a timeout, meaning that if the heartbeat is not returned within 10 seconds then that node is marked as inaccessible, or simply dead.

Every heartbeat is identified by these replica set details:

  • replica set config version
  • replica set name
  • Sender host address
  • id from the replicasetconfig

The source code could be referred to from here.

When the remote node receives the heartbeat, it processes this data and validates if the details are correct. It then prepares a ReplSetHeartbeatResponse, that includes:

  • Name of the replica set, config version, and optime details
  • Details about primary node as per the receiving node.
  • Sync source details and state of receiving node

This heartbeat data is processed, and if primary details are found then the election gets postponed.

TopologyCoordinator checks the heartbeat data and confirms whether the node is OK or not. If the node is OK then no action is taken. Otherwise it needs to be reconfigured, or a priority takeover is initiated based on the config.

Response from oplog fetcher

To get the oplogs from the sync source, nodes communicate with each other. The oplog fetcher fetches oplogs through “find” and “getMore”. This only affects the downstream node, which gets metadata from its sync source to update its view of the replica set.

OplogQueryMetadata only comes with OplogFetcher responses

OplogQueryMetadata comes with OplogFetcher response and ReplSetMetadata comes with all the replica set details including configversion and replication commands.

Communicate to update Position commands:

This is to get an update on replication progress. ReplicationCoordinatorExternalState creates SyncSourceFeedback, which sends replSetUpdatePosition commands.

It includes Oplog details, Replicaset config version, and replica set metadata.

If a new node is added to the existing replica set, how will that node get the data?

If a new node is added to the existing replica set, then the “initial sync” process takes place. This initial sync can be done in two ways:

  1. Just add the new node to the replica set (see the sketch after this list) and let the initial sync threads restore the data. Then it syncs from the oplogs until it reaches the secondary state.
  2. Copy the data from the recent data directory to the node, and restart this new node. Then it will also sync from the oplogs until it reaches the secondary state.
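For the first option, adding the node is a single rs.add() from the primary, after which the new member runs through initial sync on its own. The host and port below are illustrative:

rplint:PRIMARY> rs.add("192.168.103.100:25004")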

This is how it works internally

When “initial sync” or “resync” is called by the ReplicationCoordinator, the node goes into the “STARTUP2” state, and the initial sync is done by the DataReplicator.

  • A sync source is selected to get the data from, then it drops all the databases except the local database, and oplogs are recreated.
  • DatabasesCloner asks syncsource for a list of the databases, and for each database it creates DatabaseCloner.
  • For each DatabaseCloner it creates CollectionCloner to clone the collections
  • This CollectionCloner calls ListIndexes on the syncsource and creates a CollectionBulkLoader for parallel index creation while data cloning
  • The node also checks for the sync source rollback id. If rollback occurred, then it restarts the initial sync. Otherwise, datareplicator is done with its work and then replicationCoordinator assumes the role for ongoing replication.

Example for the “initial sync” :

Here the node enters the "STARTUP2" state ("transition to STARTUP2").

Then the sync source gets selected, and all the databases except the local database are dropped. Next, the replication oplog is created and CollectionCloner is called.

Local database not dropped: every node has its own “local” database, holding information about itself and the other nodes, so this database is not replicated to other nodes.

2018-09-26T17:57:09.571+0000 I REPL     [ReplicationExecutor] transition to STARTUP2
2018-09-26T17:57:14.589+0000 I REPL     [replication-1] sync source candidate: 192.168.103.100:25003
2018-09-26T17:57:14.590+0000 I STORAGE  [replication-1] dropAllDatabasesExceptLocal 1
2018-09-26T17:57:14.592+0000 I REPL     [replication-1] creating replication oplog of size: 990MB...
2018-09-26T17:57:14.633+0000 I REPL     [replication-0] CollectionCloner::start called, on ns:admin.system.version

Finished fetching all the oplogs, and finishing up initial sync.

2018-09-26T17:57:15.685+0000 I REPL     [replication-0] Finished fetching oplog during initial sync: CallbackCanceled: Callback canceled. Last fetched optime and hash: { ts: Timestamp 1537984626000|1, t: 9 }[-1139925876765058240]
2018-09-26T17:57:15.685+0000 I REPL     [replication-0] Initial sync attempt finishing up.

What are oplogs and where do these reside?

Oplog stands for “operation log”. We have used this term many times in this blog post, as these are the mandatory logs for the replica set. The operations are stored in a capped collection called “oplog.rs”, which resides in the “local” database.

Below is how operations are stored in the “oplog.rs” collection, including details such as the timestamp, operation, namespace, and the operation document:

rplint:PRIMARY> use local
rplint:PRIMARY> show collections
oplog.rs
rplint:PRIMARY> db.oplog.rs.findOne()
{
 "ts" : Timestamp(1537797392, 1),
 "h" : NumberLong("-169301588285533642"),
 "v" : 2,
 "op" : "n",
 "ns" : "",
 "o" : {
 "msg" : "initiating set"
 }
}

The oplog consists of a rolling record of the operations applied to the database. These oplog entries are replicated to the secondary node(s) to maintain the high availability of the data in case of failover.

When a replica set member starts, it creates an oplog of the default size. For WiredTiger, the default size is 5% of the disk space, with a lower bound of 990MB. So here, in the example, it creates a 990MB oplog. If you’d like to learn more about oplog size then please refer here

2018-09-26T17:57:14.592+0000 I REPL     [replication-1] creating replication oplog of size: 990MB...
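If you want to check the configured oplog size, and the time window the oplog currently covers, rs.printReplicationInfo() reports both; an optional check that is not part of the original post:

rplint:PRIMARY> rs.printReplicationInfo()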

What if the same oplog entry is applied multiple times? Will that not lead to inconsistent data?

Fortunately, oplog entries are idempotent: they produce the same result no matter how many times they are applied.

Let’s check an example:

Take the $inc operator, which increments the value of the field “item” by 1. If this oplog entry were applied multiple times and were not idempotent, the result could be an inconsistent record. In practice, rather than increasing the item value on each application, the operation has the same effect no matter how many times it is applied.

rplint:PRIMARY> use db1
//inserting one document
rplint:PRIMARY> db.col1.insert({item:1, name:"abc"})
//updating document by incrementing item value with 1
rplint:PRIMARY> db.col1.update({name:"abc"},{$inc:{item:1}})
//updated value is now item:2
rplint:PRIMARY> db.col1.find()
{ "_id" : ObjectId("5babd57cce2ef78096ac8e16"), "item" : 2, "name" : "abc" }

This is how these operations are stored in the oplog; note how the $inc operation has been recorded as a $set of the resulting value:

rplint:PRIMARY> db.oplog.rs.find({ns:"db1.col1"})
//insert operation
{ "ts" : Timestamp(1537987964, 2), "t" : NumberLong(9), "h" : NumberLong("8083740413874479202"), "v" : 2, "op" : "i", "ns" : "db1.col1", "o" : { "_id" : ObjectId("5babd57cce2ef78096ac8e16"), "item" : 1, "name" : "abc" } }
//$inc operation is changed as ""$set" : { "item" : 2"
{ "ts" : Timestamp(1537988022, 1), "t" : NumberLong(9), "h" : NumberLong("-1432987813358665721"), "v" : 2, "op" : "u", "ns" : "db1.col1", "o2" : { "_id" : ObjectId("5babd57cce2ef78096ac8e16") }, "o" : { "$set" : { "item" : 2 } } }

That means that however many times the entry is applied, it will generate the same result, so no inconsistent data!

I hope this blog post helps you to understand multiple scenarios for MongoDB replica sets, and how data replicates to the nodes.

by Aayushi Mangal at October 10, 2018 11:08 AM

October 09, 2018

Peter Zaitsev

PostgreSQL Monitoring: Set Up an Enterprise-Grade Server (and Sign Up for Webinar Weds 10/10…)

PostgreSQL Monitoring

This is the last post in our series on building an enterprise-grade PostgreSQL set up using open source tools, and we’ll be covering monitoring.

The previous posts in this series discussed aspects such as security, backup strategy, high availability, connection pooling and load balancing, extensions, and detailed logging in PostgreSQL. Tomorrow, Wednesday, October 10 at 10AM EST, we will be reviewing these topics together, and showcasing them in practice in a webinar format: we hope you can join us!

 

Monitoring databases

The importance of monitoring the activity and health of production systems is unquestionable. When it comes to the database, with its high number of customizable settings, the ability to track its various metrics (status counters and gauges) allows for the maintenance of a historical record of its performance over time. This can be used for capacity planning, troubleshooting and validation.

When it comes to capacity planning, a monitoring solution helps you assess how the current setup is faring. At the same time, it can help predict future needs based on trends, such as the increase of active connections, queries, and CPU usage. For example, an increase in CPU usage might be due to a genuine increase in workload, but it could also be a sign of unoptimized queries growing in popularity. In which case, comparing CPU with disk access might provide a more complete view of what is going on.

Being able to easily correlate data like this helps you catch minor issues and plan accordingly, sometimes sparing you the easier but more costly option of scaling up to mitigate such problems. And the right monitoring solution is truly invaluable when it comes to investigative work and root cause analysis: trying to understand a problem that has already taken place is a complicated, often unenviable task unless you have kept a continuous, watchful eye on the setup the whole time.

Finally, a monitoring solution can help you validate changes made in the business logic in general or in the database configuration in particular. By comparing results from before and after a change, for a given metric or for overall performance, you can observe the impact of such changes in practice.

Monitoring PostgreSQL with open source solutions

There are a number of monitoring solutions for PostgreSQL, and postgresql.org’s Wiki provides an extensive, albeit slightly outdated, list. It categorizes the main monitoring solutions into two distinct categories: those that can be identified as generic solutions—and can be extended to cover different technologies through custom plugins—and those labeled as Postgres-centric, which are specific to PostgreSQL.

In the first group, we find venerated open source monitoring tools such as Munin, Zabbix, and Cacti. Nagios could also have been added to this group, but it was instead indirectly included in the “Checkers” category, which covers monitoring scripts that can be used both in stand-alone mode and as feeders (plugins) for “Nagios like software“. Examples of these are check_pgactivity and check_postgres.

One omission from this list is Grafana, a modern time series analytics platform conceived to display metrics from a number of different data sources. Grafana includes a solution packaged as a PostgreSQL native plugin. Percona has built its Percona Monitoring and Management (PMM) platform around Grafana, using Prometheus as its data source. Since version 1.14.0, PMM supports PostgreSQL. Query Analytics (QAN) integration is coming soon.

An important factor that all these generic solutions have in common is that they are widely used for the monitoring of a diverse collection of services, like you’d normally find in enterprise-like environments. It’s common for a given company to adopt one, or sometimes two, such solutions with the aim of monitoring their entire infrastructure. This infrastructure often includes a heterogeneous combination of databases and application servers.

Nevertheless, there is a place for complementary Postgres-centric monitoring solutions in such enterprise environments too. These solutions are usually implemented with a specific goal in mind. Two examples we can mention in this context are PGObserver, which has a focus on monitoring stored procedures, and pgCluu, with its focus on auditing.

Monitoring PostgreSQL with PMM

We built an enterprise-grade PostgreSQL set up for the webinar, and use PMM for monitoring. We will be showcasing some of PMM’s main features, and highlighting some of the most important metrics to watch, during our demo. You may want to have a look at this demo setup to get a feel for how our PostgreSQL Overview dashboard looks:

You can find instructions on how to set up PMM for monitoring your PostgreSQL server in our documentation space. And if there’s still time, sign up for tomorrow’s webinar!

 

by Fernando Laudares Camargos at October 09, 2018 04:23 PM

Upcoming Webinar Thurs 10/11: Build Highly Scalable IoT Architectures with Percona Server for MongoDB

Highly Scalable IoT Architectures

Please join Percona’s Product Manager for Percona Server for MongoDB, Jeff Sandstrom; Sr. Tech Ops Architect for MongoDB, Tim Vaillancourt; and Mesosphere’s Senior Director of Community and Evangelism, Matt Jarvis, on Thursday, October 11, 2018 at 10:00 AM PDT (UTC–7) / 1:00 PM EDT (UTC–4), as they demonstrate how to build highly scalable Internet of Things architectures with Percona Server for MongoDB on DC/OS.

 

Percona Server for MongoDB is a free and open-source drop-in replacement for MongoDB Community Edition. It combines all the features and benefits of MongoDB Community Edition with enterprise-class features from Percona, including an in-memory engine, log redaction, auditing, and hot backups.

Mesosphere DC/OS is an enterprise-grade, datacenter-scale operating system, providing a single platform for running containers, data services, and distributed applications on a single unified computing environment.

In this webinar, we’ll:

  • Review the benefits of Percona Server for MongoDB
  • Discuss a variety of use cases for Percona Server for MongoDB on DC/OS
  • Demonstrate exactly how you can use Percona Server for MongoDB on DC/OS to capture data from Internet of Things devices
  • Tell you how you can participate in the beta program for this exciting solution

Register for this webinar to learn how to build highly scalable IoT architectures with Percona Server for MongoDB on DC/OS.

by Jeff Sandstrom at October 09, 2018 02:30 PM

Announcement: Second Alpha Build of Percona XtraBackup 8.0 Is Available

Percona XtraBackup 8.0

The second alpha build of Percona XtraBackup 8.0.2 is now available in the Percona experimental software repositories.

Note that, due to the new MySQL redo log and data dictionary formats, the Percona XtraBackup 8.0.x versions will only be compatible with MySQL 8.0.x and Percona Server for MySQL 8.0.x. This release supports backing up Percona Server 8.0 Alpha.

For experimental migrations from earlier database server versions, you will need to back up and restore using XtraBackup 2.4 and then run mysql_upgrade from MySQL 8.0.x.

PXB 8.0.2 alpha is available for the following platforms:

  • RHEL/Centos 6.x
  • RHEL/Centos 7.x
  • Ubuntu 14.04 Trusty*
  • Ubuntu 16.04 Xenial
  • Ubuntu 18.04 Bionic
  • Debian 8 Jessie*
  • Debian 9 Stretch

Information on how to configure the Percona repositories for apt and yum systems and access the Percona experimental software is here.

* We might drop these platforms before GA release.

Improvements

  • PXB-1658: Import keyring vault plugin from Percona Server 8
  • PXB-1609: Make version_check optional at build time
  • PXB-1626: Support encrypted redo logs
  • PXB-1627: Support obtaining binary log coordinates from performance_schema.log_status

Fixed Bugs

  • PXB-1634: The CREATE TABLE statement could fail with the DUPLICATE KEY error
  • PXB-1643: Memory issues reported by ASAN in PXB 8
  • PXB-1651: Buffer pool dump could create a (null) file during the prepare stage of MySQL 8.0.12 data
  • PXB-1671: A backup could fail when the MySQL user was not specified
  • PXB-1660: InnoDB: Log block N at lsn M has valid header, but checksum field contains Q, should be P

Other bugs fixed: PXB-1623, PXB-1648, PXB-1669, PXB-1639, and PXB-1661.

by Borys Belinsky at October 09, 2018 06:33 AM

MariaDB Foundation

My first week, Looking Forward

First of all, thank you for your warm words of welcome – coming from so many people across different media, I think this is a very positive sign for working with the organisations and individuals within the MariaDB community. Regardless of the situation, a change like this, or actually the person that comes in, is […]

The post My first week, Looking Forward appeared first on MariaDB.org.

by Arjen Lentz at October 09, 2018 03:00 AM

October 08, 2018

Peter Zaitsev

Persistence of autoinc fixed in MySQL 8.0

MySQL 8.0 autoinc persistence fixed

The release of MySQL 8.0 has brought a lot of bold implementations that touched on things that have been avoided before, such as added support for common table expressions and window functions. Another example is the change in how AUTO_INCREMENT (autoinc) sequences are persisted, and thus replicated.

This new implementation carries the fix for bug #73563 (Replace result in auto_increment value less or equal than max value in row-based), which we only found out about recently. The surprising part is that the use case we were analyzing is a somewhat common one; this must be affecting a good number of people out there.

Understanding the bug

The business logic of the use case is such that a UNIQUE column in a table whose id is managed by an AUTO_INCREMENT sequence needs to be updated, and this is done with a REPLACE operation:

“REPLACE works exactly like INSERT, except that if an old row in the table has the same value as a new row for a PRIMARY KEY or a UNIQUE index, the old row is deleted before the new row is inserted.”

So, what happens in practice in this particular case is a DELETE followed by an INSERT of the target row.

We will explore this scenario here in the context of an oversimplified currency converter application that uses USD as base reference:

CREATE TABLE exchange_rate (
id INT PRIMARY KEY AUTO_INCREMENT,
currency VARCHAR(3) UNIQUE,
rate FLOAT(5,3)
) ENGINE=InnoDB;

Let’s add a trio of rows to this new table:

INSERT INTO exchange_rate (currency,rate) VALUES ('EUR',0.854), ('GBP',0.767), ('BRL',4.107);

which gives us the following initial set:

master (test) > select * from exchange_rate;
+----+----------+-------+
| id | currency | rate  |
+----+----------+-------+
|  1 | EUR      | 0.854 |
|  2 | GBP      | 0.767 |
|  3 | BRL      | 4.107 |
+----+----------+-------+
3 rows in set (0.00 sec)

Now we update the rate for Brazilian Reais using a REPLACE operation:

REPLACE INTO exchange_rate SET currency='BRL', rate=4.500;

With currency being a UNIQUE field the row is fully replaced:

master (test) > select * from exchange_rate;
+----+----------+-------+
| id | currency | rate  |
+----+----------+-------+
|  1 | EUR      | 0.854 |
|  2 | GBP      | 0.767 |
|  4 | BRL      | 4.500 |
+----+----------+-------+
3 rows in set (0.00 sec)

and thus the autoinc sequence is updated:

master (test) > SHOW CREATE TABLE exchange_rate\G
*************************** 1. row ***************************
     Table: exchange_rate
Create Table: CREATE TABLE `exchange_rate` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`currency` varchar(3) DEFAULT NULL,
`rate` float(5,3) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `currency` (`currency`)
) ENGINE=InnoDB AUTO_INCREMENT=5 DEFAULT CHARSET=latin1
1 row in set (0.00 sec)

The problem is that the autoinc sequence is not updated in the replica as well:

slave1 (test) > select * from exchange_rate;show create table exchange_rate\G
+----+----------+-------+
| id | currency | rate  |
+----+----------+-------+
|  1 | EUR      | 0.854 |
|  2 | GBP      | 0.767 |
|  4 | BRL      | 4.500 |
+----+----------+-------+
3 rows in set (0.00 sec)
*************************** 1. row ***************************
     Table: exchange_rate
Create Table: CREATE TABLE `exchange_rate` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`currency` varchar(3) DEFAULT NULL,
`rate` float(5,3) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `currency` (`currency`)
) ENGINE=InnoDB AUTO_INCREMENT=4 DEFAULT CHARSET=latin1
1 row in set (0.00 sec)

Now, the moment we promote that replica as master and start writing to this table we’ll hit a duplicate key error:

slave1 (test) > REPLACE INTO exchange_rate SET currency='BRL', rate=4.600;
ERROR 1062 (23000): Duplicate entry '4' for key 'PRIMARY'

Note that:

a) the transaction fails and the row is not replaced, however the autoinc sequence is incremented:

slave1 (test) > SELECT AUTO_INCREMENT FROM information_schema.TABLES WHERE table_schema='test' AND table_name='exchange_rate';
+----------------+
| AUTO_INCREMENT |
+----------------+
|              5 |
+----------------+
1 row in set (0.00 sec)

b) this problem only happens with row-based replication (binlog_format=ROW), where REPLACE in this case is logged as a row UPDATE:

# at 6129
#180829 18:29:55 server id 100  end_log_pos 5978 CRC32 0x88da50ba Update_rows: table id 117 flags: STMT_END_F
### UPDATE `test`.`exchange_rate`
### WHERE
###   @1=3 /* INT meta=0 nullable=0 is_null=0 */
###   @2='BRL' /* VARSTRING(3) meta=3 nullable=1 is_null=0 */
###   @3=4.107                /* FLOAT meta=4 nullable=1 is_null=0 */
### SET
###   @1=4 /* INT meta=0 nullable=0 is_null=0 */
###   @2='BRL' /* VARSTRING(3) meta=3 nullable=1 is_null=0 */
###   @3=4.5                  /* FLOAT meta=4 nullable=1 is_null=0 */

With statement-based replication—or even mixed format—the REPLACE statement is replicated as is: it will trigger a DELETE+INSERT in the background on the replica and thus update the autoinc sequence in the same way it did on the master.

This example (tested with Percona Server versions 5.5.61, 5.6.36 and 5.7.22) helps illustrate the issue with autoinc sequences not being persisted as they should be with row-based replication. However, MySQL’s Worklog #6204 includes a couple of scarier scenarios involving the master itself, such as when the server crashes while a transaction is writing to a table similar to the one used in the example above. MySQL 8.0 remedies this bug.

Workarounds

There are a few possible workarounds to consider if this problem is impacting you and if neither upgrading to the 8 series nor resorting to statement-based or mixed replication format are viable options.

We’ll be discussing three of them here: one that revolves around running checks before a failover (to detect and fix autoinc inconsistencies in replicas); another that requires a review of all REPLACE statements like the one from our example, adapting them to include the id field and thus avoid the bug; and finally one that requires changing the schema of affected tables in such a way that the target field is made the Primary Key of the table while id (autoinc) is converted into a UNIQUE key.

a) Detect and fix

The least intrusive of the workarounds we conceived for the problem at hand, in terms of query and schema changes, is to run a check for each of the tables that might be facing this issue in a replica before we promote it as master in a failover scenario:

slave1 (test) > SELECT ((SELECT MAX(id) FROM exchange_rate)>=(SELECT AUTO_INCREMENT FROM information_schema.TABLES WHERE table_schema='test' AND table_name='exchange_rate')) as `check`;
+-------+
| check |
+-------+
|     1 |
+-------+
1 row in set (0.00 sec)

If the table does not pass the test, like ours didn’t at first (just before we attempted a REPLACE after we failed over to the replica), then update autoinc accordingly. The full routine (check + update of autoinc) could be made into a single stored procedure:

DELIMITER //
CREATE PROCEDURE CheckAndFixAutoinc()
BEGIN
 DECLARE done TINYINT UNSIGNED DEFAULT 0;
 DECLARE tableschema VARCHAR(64);
 DECLARE tablename VARCHAR(64);
 DECLARE columnname VARCHAR(64);  
 DECLARE cursor1 CURSOR FOR SELECT TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME FROM information_schema.COLUMNS WHERE TABLE_SCHEMA NOT IN ('mysql', 'information_schema', 'performance_schema', 'sys') AND EXTRA LIKE '%auto_increment%';
 DECLARE CONTINUE HANDLER FOR NOT FOUND SET done=1;
 OPEN cursor1;  
 start_loop: LOOP
  IF done THEN
    LEAVE start_loop;
  END IF;
  FETCH cursor1 INTO tableschema, tablename, columnname;
  SET @get_autoinc = CONCAT('SELECT @check1 := ((SELECT MAX(', columnname, ') FROM ', tableschema, '.', tablename, ')>=(SELECT AUTO_INCREMENT FROM information_schema.TABLES WHERE TABLE_SCHEMA=\'', tableschema, '\' AND TABLE_NAME=\'', tablename, '\')) as `check`');
  PREPARE stm FROM @get_autoinc;
  EXECUTE stm;
  DEALLOCATE PREPARE stm;
  IF @check1>0 THEN
    BEGIN
      SET @select_max_id = CONCAT('SELECT @max_id := MAX(', columnname, ')+1 FROM ', tableschema, '.', tablename);
      PREPARE select_max_id FROM @select_max_id;
      EXECUTE select_max_id;
      DEALLOCATE PREPARE select_max_id;
      SET @update_autoinc = CONCAT('ALTER TABLE ', tableschema, '.', tablename, ' AUTO_INCREMENT=', @max_id);
      PREPARE update_autoinc FROM @update_autoinc;
      EXECUTE update_autoinc;
      DEALLOCATE PREPARE update_autoinc;
    END;
  END IF;
 END LOOP start_loop;  
 CLOSE cursor1;
END//
DELIMITER ;

It doesn’t allow for as clean a failover as we would like but it can be helpful if you’re stuck with MySQL<8.0 and binlog_format=ROW and cannot make changes to your queries or schema.

b) Include Primary Key in REPLACE statements

If we had explicitly included the id (Primary Key) in the REPLACE operation from our example it would have also been replicated as a DELETE+INSERT even when binlog_format=ROW:

master (test) > REPLACE INTO exchange_rate SET currency='BRL', rate=4.500, id=3;
# at 16151
#180905 13:32:17 server id 100  end_log_pos 15986 CRC32 0x1d819ae9  Write_rows: table id 117 flags: STMT_END_F
### DELETE FROM `test`.`exchange_rate`
### WHERE
###   @1=3 /* INT meta=0 nullable=0 is_null=0 */
###   @2='BRL' /* VARSTRING(3) meta=3 nullable=1 is_null=0 */
###   @3=4.107                /* FLOAT meta=4 nullable=1 is_null=0 */
### INSERT INTO `test`.`exchange_rate`
### SET
###   @1=3 /* INT meta=0 nullable=0 is_null=0 */
###   @2='BRL' /* VARSTRING(3) meta=3 nullable=1 is_null=0 */
###   @3=4.5                  /* FLOAT meta=4 nullable=1 is_null=0 */
# at 16199
#180905 13:32:17 server id 100  end_log_pos 16017 CRC32 0xf11fed56  Xid = 184
COMMIT/*!*/;

We could point out that we are doing it wrong by not having the id included in the REPLACE statement in the first place; the reason for not doing so would be mostly related to avoiding an extra lookup for each replace (to obtain the id for the currency we want to update). On the other hand, what if your business logic does expect the id to change with each REPLACE? You should keep such a requirement in mind when considering this workaround, as it is effectively a functional change to what we had initially.

c) Make the target field the Primary Key and keep autoinc as a UNIQUE key

If we make currency the Primary Key of our table and id a UNIQUE key instead:

CREATE TABLE exchange_rate (
id INT UNIQUE AUTO_INCREMENT,
currency VARCHAR(3) PRIMARY KEY,
rate FLOAT(5,3)
) ENGINE=InnoDB;

the same REPLACE operation will be replicated as a DELETE+INSERT too:

# at 19390
#180905 14:03:56 server id 100  end_log_pos 19225 CRC32 0x7042dcd5  Write_rows: table id 131 flags: STMT_END_F
### DELETE FROM `test`.`exchange_rate`
### WHERE
###   @1=3 /* INT meta=0 nullable=0 is_null=0 */
###   @2='BRL' /* VARSTRING(3) meta=3 nullable=0 is_null=0 */
###   @3=4.107                /* FLOAT meta=4 nullable=1 is_null=0 */
### INSERT INTO `test`.`exchange_rate`
### SET
###   @1=4 /* INT meta=0 nullable=0 is_null=0 */
###   @2='BRL' /* VARSTRING(3) meta=3 nullable=0 is_null=0 */
###   @3=4.5                  /* FLOAT meta=4 nullable=1 is_null=0 */
# at 19438
#180905 14:03:56 server id 100  end_log_pos 19256 CRC32 0x79efc619  Xid = 218
COMMIT/*!*/;

Of course, the same would be true if we had just removed id entirely from the table and kept currency as the Primary Key. This would work in our particular test example but that won’t always be the case. Please note though that if you do keep id on the table you must make it a UNIQUE key: this workaround is based on the fact that this key becomes a second unique constraint, which triggers a different code path to log a replace operation. Had we made it a simple, non-unique key instead that wouldn’t be the case.

If you have any comments or suggestions about the issue addressed in this post, the workarounds we propose, or even a different view of the problem you would like to share please leave a comment in the section below.

Co-Author: Trey Raymond

Trey Raymond is a Sr. Database Engineer for Oath Inc. (née Yahoo!), specializing in MySQL. Since 2010, he has worked to build the company’s database platform and supporting team into industry leaders.

While a performance guru at heart, his experience and responsibilities range from hardware and capacity planning all through the stack to database tool and utility development.

He has a reputation for breaking things to learn something new.

Co-Author: Fernando Laudares

Fernando is a Senior Support Engineer with Percona. Fernando’s work experience includes the architecture, deployment and maintenance of IT infrastructures based on Linux, open source software and a layer of server virtualization. He’s now focusing on the universe of MySQL, MongoDB and PostgreSQL with a particular interest in understanding the intricacies of database systems, and contributes regularly to this blog. You can read his other articles here.

by Fernando Laudares Camargos at October 08, 2018 04:00 PM

Detailed Logging for Enterprise-Grade PostgreSQL

detailed logging PostgreSQL

In this penultimate post from our series on building an enterprise-grade PostgreSQL environment we cover the parameters we have enabled to configure detailed logging in the demo setup we will showcase in our upcoming webinar.

Detailed logging in PostgreSQL and log analyzer

Like other RDBMS, PostgreSQL allows you to maintain a log of activities and error messages. Until PostgreSQL 9.6, PostgreSQL log files were generated in pg_log directory (inside the data directory) by default. Since PostgreSQL 10, pg_log has been renamed to simply log. However, this directory can be modified to a different location by modifying the parameter log_directory.

Unlike MySQL, PostgreSQL writes the error and activity log to the same log file, so it may grow to several GBs when detailed logging is enabled. In these cases, logging becomes IO-intensive, and it is therefore recommended to store log files on different storage from that hosting the data directory.

Parameters to enable detailed logging

Here’s a list of parameters used to customize logging in PostgreSQL. All of them need to be modified in the postgresql.conf or postgresql.auto.conf files.

logging_collector: in order to log any activity in PostgreSQL this parameter must be enabled. The backend process responsible for logging database activity is called logger; it gets started when logging_collector is set to ON. Changing this parameter requires a PostgreSQL restart.

log_min_duration_statement: this parameter is used primarily to set a time threshold: queries running longer than this are logged (as “slow queries”). Setting it to -1 disables the logging of statements. Setting it to 0 enables the logging of every statement running in the database, regardless of its duration. The time unit should follow the actual value, for example: 250ms, 250s, 250min, 1h. Changing this parameter does not require a PostgreSQL restart – a simple reload of the configuration is enough. For example:

log_min_duration_statement = 5s
  logs every statement running for 5 seconds or longer.

log_line_prefix: helps you customize every log line being printed in the PostgreSQL log file. You can log the process id, application name, database name and other details for every statement as required. The following log_line_prefix may be helpful in most scenarios:

log_line_prefix = '%t [%p]: [%l-1] user=%u,db=%d,app=%a,client=%h'

The above setting records the following for every statement being logged:

%t : Time stamp without milliseconds
%p : Process id
%l-1 : Number of the log line for each session or process, starting at 1
%u : User name
%d : Database name
%a : Application name
%h : Remote host name or IP address

With the above settings employed for log_line_prefix, the log appears as follows:

2018-06-08 12:36:26 EDT [21315]: [1-1] user=postgres,db=percona,app=psql,client=192.168.0.12 LOG: duration: 2.611 ms statement: select * from departments where id = 26;
2018-06-08 12:36:49 EDT [21323]: [1-1] user=postgres,db=percona,app=psql,client=192.168.0.12 LOG: duration: 2.251 ms statement: select count(*) from departments;

You can refer to the PostgreSQL documentation for further details on log_line_prefix.

log_duration: enabling this parameter records the duration of every completed statement in the PostgreSQL log, irrespective of log_min_duration_statement. Keep in mind that, as with log_min_duration_statement, enabling log_duration may increase log file usage and affect the server’s general performance. For this reason, if you already have log_min_duration_statement enabled it is often suggested to disable log_duration, unless there’s a specific need to keep track of both.

log_lock_waits: when log_lock_waits is enabled a log message is recorded when a session waits longer than deadlock_timeout to acquire a lock.

log_checkpoints: logs all checkpoints and restart points to the PostgreSQL log file.

log_rotation_size: defines the size limit for each log file; once it reaches this threshold the log file is rotated.
Example: 

log_rotation_size = '500MB'
  – every log file is limited to a size of 500 MB.

log_rotation_age: determines the maximum life span of a log file, forcing its rotation once this threshold is reached. This parameter is usually set in terms of hours, or maybe days, although the minimum granularity is a minute. If log_rotation_size is reached first, the log gets rotated anyway, irrespective of this setting.
Example: 

log_rotation_age = 1d

log_statement: controls which types of SQL statements are logged. The recommended setting is ddl, which logs all DDL statements that are executed. Tracking those allows you to later audit when a given DDL was executed, and by whom. By monitoring and understanding the amount of information it may write to the log file you may consider modifying this setting. Other possible values are none, mod (which includes DDLs plus DMLs) and all.

log_temp_files: logs information about temporary files whose size is greater than this value (in KB).

log_directory: defines the directory in which log files are created. Once more, please note that if you have enabled detailed logging it is recommended to have a separate disk—different from the data directory disk—allocated for log_directory.
Example: 

log_directory = /this_is_a_new_disk/pg_log
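As a consolidated sketch, the same settings can also be applied from a superuser session with ALTER SYSTEM, which writes them to postgresql.auto.conf; remember that logging_collector still requires a restart to take effect, while the rest only need a reload:

ALTER SYSTEM SET logging_collector = on;                  -- needs a restart
ALTER SYSTEM SET log_min_duration_statement = '5s';
ALTER SYSTEM SET log_line_prefix = '%t [%p]: [%l-1] user=%u,db=%d,app=%a,client=%h';
ALTER SYSTEM SET log_statement = 'ddl';
ALTER SYSTEM SET log_directory = '/this_is_a_new_disk/pg_log';
SELECT pg_reload_conf();                                  -- reload the remaining settings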

Log Analyzer – pgBadger

You cannot have a separate error log and a slow query log: everything is written to one log file, which may be periodically rotated based on time and size. Over a period of time, this log file may grow to several MBs or even GBs, depending on the amount of logging that has been enabled. It can get difficult for a DBA or developer to parse the log files and form a clear view of what is running slowly and how many times a query has run. To help with this task you may use pgBadger, a log analyzer for PostgreSQL, to parse log files and generate a rich HTML-based report that you can access from a browser. An example report can be seen in the screenshots below:

We’ll be showing detailed logging and pgBadger in action in a couple of days so, if there’s still time, sign up for our October webinar !

by Fernando Laudares Camargos at October 08, 2018 01:13 PM

October 06, 2018

Valeriy Kravchuk

On MySQL XA Transactions

One of the features I missed in my blog post on problematic MySQL features back in July is XA transactions. Probably I was too much in a hurry, as this feature is known to be somewhat buggy, limited and not widely used outside of Java applications. My first related feature request, Bug #40445 - "Provide C functions library implementing X/OPEN XA interface for Bea Tuxedo", was created almost 10 years ago, based on an issue from one of the MySQL/Sun customers of that time. I remember some internal debates on how much time and effort the implementation might require, but the decision was never made, and one still cannot directly integrate MySQL with the Tuxedo transaction manager (which is where the idea of XA transactions originally came from). It's even funnier to see that feature request still just "Verified" when taking into account the fact that BEA Tuxedo has been Oracle's software since... 2008.
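For readers who have not used the feature, here is a minimal sketch of the statement flow for an externally coordinated (two-phase commit) transaction in MySQL; the xid value 'trx1' and table t1 are arbitrary placeholders:

XA START 'trx1';
INSERT INTO t1 VALUES (1);
XA END 'trx1';
XA PREPARE 'trx1';
-- the transaction now sits in PREPARED state until the transaction
-- manager decides its fate; XA RECOVER lists such transactions
XA COMMIT 'trx1';
-- or: XA ROLLBACK 'trx1';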

XA transactions support is a useful MySQL feature, but I wonder if one day it may just become as abandoned as that West Pier in Brighton, or overwhelmed with many small bugs in the same way as these stairs to the beach in Hove...

But maybe XA transactions are not widely used and nobody cares much about them?

Let me try to do a quick review of related active bug reports and feature requests before making any conclusions:
  • Bug #91702 - "Feature Request: JOIN support for XA START command". This feature request was added less than 3 months ago and is still "Open". It means there are users interested in this feature, but Oracle engineers do not care much even to verify related requests, even less - to give them some priority. 
    See also Bug #78498 - "XA issue or limitation with 5.6.19 engine", reported 3 years ago, that is essentially about the same limitation. As bug reporter explained:
    "... it prevents us to use MySQL with Weblogic on 2 phase commit scenarii..."
  • Yet another example of a request ignored for a long time is Bug #90659 - "implicit commit and unlocking with xa start", which is about an inconsistency in the current implementation. They care even less (as we already know) about XA support outside of Java, as one can conclude from the fact that the Connector/Net related request, Bug #70587 - "Dot Net Distributed transaction not supported in MySql Server 5.6", has not received any attention since July, 2015...
  • Bug #91646 - "xa command still operation when super_read_only is true". This bug was reported in July by Zhenghu Wen. It seems nobody cares much about XA transactions integration when new features are added to MySQL server.
  • Bug #89860 - "XA may lost prepared transaction and cause different between master and slave". This bug reported by Michael Yang (See also Bug #88534) sounds really serious and was previously reported by Andrei Elkin (who works for MariaDB now) as Bug #76233 - "XA prepare is logged ahead of engine prepare". See also Bug #87560 - "XA PREPARE log order error in replication and binlog recovery" by Wei Zhao, who also contributed a patch. See also Bug #83983 - "innodb fail to recover the prepared xa transaction" (the bug reported by Dennis Gao is still "Open", while it's clearly related to or is a duplicate of "Verified" bugs mentioned above).
    So many related/duplicate problem reports, but no fix so far!
  • Bug #88748 - "InnoDB: Failing assertion: trx->conc_state == 1". This assertion failure was reported by Roel Van de Paar back in December, 2017. See also his Bug #84754 - "oid String::chop(): Assertion `strlen(m_ptr) == m_length' failed."
    I noted that Oracle recently invented new "low" severity levels, and this bug is S6 (Debug Builds). I do not really agree that assertions in debug builds are of so low severity - they are in the code for a reason, to prevent crashes in non-debug builds and all kinds of inconsistencies.
  • Bug #87526 - "The output of 'XA recover convert xid' is not useful". This bug reported by Sveta Smirnova caused a lot of troubles to poor users with prepared transactions hanging around for weeks after crash, as it prevented any easy way to get rid of them (and related locks) in some cases. The bug is still "Verified" in MySQL and "On hold" in Percona Server, while MariaDB fixed it in 10.3, see MDEV-14593.
  • Bug #87130 - "XA COMMIT not taken as transaction boundary". Yet another bug report with a patch from Wei Zhao.
  • Bug #75205 - "Master should write a LOST_EVENTS entry on xa commit after recovery." Daniël van Eeden reported this at early 5.7 pre-GA stage, and manual explains now that:
    "In MySQL 5.7.7 and later, there is a change in behavior and an XA transaction is written to the binary log in two parts. When XA PREPARE is issued, the first part of the transaction up to XA PREPARE is written using an initial GTID. A XA_prepare_log_event is used to identify such transactions in the binary log. When XA COMMIT or XA ROLLBACK is issued, a second part of the transaction containing only the XA COMMIT or XA ROLLBACK statement is written using a second GTID. Note that the initial part of the transaction, identified by XA_prepare_log_event, is not necessarily followed by its XA COMMIT or XA ROLLBACK, which can cause interleaved binary logging of any two XA transactions. The two parts of the XA transaction can even appear in different binary log files. This means that an XA transaction in PREPARED state is now persistent until an explicit XA COMMIT or XA ROLLBACK statement is issued, ensuring that XA transactions are compatible with replication."
    but the bug report is still "Verified".
    By the way, the need to deal with such prepared transactions recovered from the binary log caused problems like those listed above (with XA RECOVER CONVERT and the order of preparing in the binary log vs engines that support XA...).
  • Bug #71351 - "select hit query cache after xa commit, no result return". This bug probably affects only MySQL 5.5, so no wonder it's ignored now. Nobody tried to fix it while MySQL 5.5 was still supported, though.
There are some more bugs originally filed in other categories, but still related to XA:
  • Bug #72036 - "XA isSameRM() shouldn't take database into account". This Connector/J bug was reported in 2014 by Jess Balint.
  • Bug #78050 - "Crash on when XA functions activated by a storage engine". It happens when the binary log is not enabled. This bug was reported by Zhenye Xie, who also contributed a patch later. Still, this crashing bug remains "Verified".
  • Bug #87385 - "Partial external XA transactions are not rolled back correctly". Yet another bug report with a patch from Wei Zhao. See also his Bug #87389 - "Replication position not persisted correctly for XA transactions".
  • Bug #91633 - "Replication failure (errno 1399) on update in XA tx after deadlock". This bug reported by Lukas Sydorowski got a comment from another community member just yesterday. So the feature is still being used these days.
Now time for conclusions:
  1. Take extra care while using XA transactions in replication environments or with point in time recovery - you may easily end up with slaves out of sync with master and data lost.
  2. Feature requests related to XA transactions are mostly ignored, sometimes for a decade... 
  3. Patches contributed do not seem to speed up XA bug fixing.
  4. I'd say that Oracle does not care much about XA Transactions since MySQL 5.7 GA release in 2015.
  5. The MySQL Community still uses XA transactions with MySQL (and they will be used even more as corporate users migrate from Oracle RDBMS), finds bugs and even tries to fix them. But they will probably have to use forks rather than MySQL itself if the current attitude towards processing and fixing XA bugs remains.

by Valeriy Kravchuk (noreply@blogger.com) at October 06, 2018 03:59 PM

October 05, 2018

MariaDB AB

New Certified Docker Images + Kubernetes Scripts Simplify MariaDB Cloud Deployments


In the last few years, enterprise development teams have been focused on reducing the cost of production-grade applications while improving the velocity and agility of development. That’s led to massive public and private cloud adoption – and deployment of databases in containers. To address this growing need, we’ve released new Docker images and Kubernetes scripts that make it easy to deploy and manage MariaDB databases. Now organizations can focus on building their applications rather than on managing and optimizing container infrastructure.

On-demand webinar: The Future of MariaDB on Containers
Watch this recorded webinar to get a look at official Docker images, learn how to run stateful MariaDB clusters on Kubernetes and more.
Watch Now

New – MariaDB Docker Images, Kubernetes Scripts & Sandbox Environments

We’ve released certified Docker images and enabled customers to seamlessly deploy MariaDB servers in Kubernetes and Docker environments. We are delivering three standalone Docker images (one each for MariaDB Server, ColumnStore and MaxScale) and two sandboxes (one for MariaDB AX and one for MariaDB TX). Customers can deploy the standalone Docker images in a standard Docker environment or create complex topologies in Kubernetes environment using YAML scripts.

For someone just getting started with MariaDB, or wanting to learn about the capabilities offered, the sandboxes make it easy to experiment with MariaDB AX and TX. Sandboxes are self-contained, with all the documentation needed to quickly bring up TX or AX and experiment with sample apps (a bookstore in the case of TX, and a Zeppelin notebook in the case of AX). You have the flexibility to quickly deploy them on a laptop or desktop using Docker Compose and immediately run the sample applications.

We’ve also released a Kubernetes script to create a master/slave (one master + two slaves) cluster with MaxScale at the front end. (This script is showcased in the on-demand webinar.) Now customers can deploy MariaDB TX with MaxScale easily in high availability mode. The script deploys the master/slave cluster in such a way that when the master fails one of the slaves will be automatically promoted as the new master. Kubernetes will try to bring up the pod and maintain the configuration integrity. When the old master comes back it will automatically become a slave. The script also supports expanding the number of slave nodes easily, using a simple command. 


 

How to Get Started with Docker and Kubernetes



by Saravana Krishnamurthy at October 05, 2018 06:33 PM

Peter Zaitsev

PostgreSQL Extensions for an Enterprise-Grade System

PostgreSQL extensions for logging

In this current series of blog posts we have been discussing various relevant aspects of building an enterprise-grade PostgreSQL setup, such as security, backup strategy, high availability, and different methods to scale PostgreSQL. In this blog post, we’ll get to review some of the most popular open source extensions for PostgreSQL, used to expand its capabilities and address specific needs. We’ll cover some of them during a demo in our upcoming webinar on October 10.

Expanding with PostgreSQL Extensions

PostgreSQL is one of the world’s most feature-rich and advanced open source RDBMSs. Its features are not limited to those released by the community through major/minor releases. There are hundreds of additional features developed using PostgreSQL’s extension capabilities, which can cater to the needs of specific users. Some of these extensions are very popular and useful for building an enterprise-grade PostgreSQL environment. We previously blogged about a couple of FDW extensions (mysql_fdw and postgres_fdw), which allow PostgreSQL databases to talk to remote homogeneous/heterogeneous databases like PostgreSQL, MySQL and others. We will now cover a few other extensions that can expand your PostgreSQL server’s capabilities.

pg_stat_statements

The pg_stat_statements module provides a means for tracking execution statistics of all SQL statements executed by a server. The statistics gathered by the module are made available via a view named pg_stat_statements. This extension must be installed in each of the databases you want to track, and like many of the extensions in this list, it is available in the contrib package from the PostgreSQL PGDG repository.
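As a quick, hedged illustration (assuming pg_stat_statements has been added to shared_preload_libraries and the server restarted), the view can be queried like this to find the most expensive statements; total_time is reported in milliseconds in the PostgreSQL 10 era versions:

CREATE EXTENSION pg_stat_statements;
-- top 5 statements by total execution time
SELECT query, calls, total_time, rows
FROM pg_stat_statements
ORDER BY total_time DESC
LIMIT 5;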

pg_repack

Tables in PostgreSQL may end up with fragmentation and bloat due to the specific MVCC implementation in PostgreSQL, or simply due to a high number of rows being naturally removed. This can lead not only to unused space being held inside the table, but also to sub-optimal execution of SQL statements. pg_repack is the most popular way to address this problem, by reorganizing and repacking the table. It can reorganize the table’s content without placing an exclusive lock on it during the process: DMLs and queries can continue while repacking is happening. Version 1.2 of pg_repack introduces new features such as parallel index builds and the ability to rebuild just the indexes. Please refer to the official documentation for more details.

pgaudit

PostgreSQL has a basic statement logging feature. It can be implemented using the standard logging facility with log_statement = all. But this is not sufficient for many audit requirements. One of the essential features for enterprise deployments is the capability for fine-grained auditing of the user interactions/statements issued to the database. This is a major compliance requirement for many security standards. The pgaudit extension caters to these requirements.

The PostgreSQL Audit Extension (pgaudit) provides detailed session and/or object audit logging via the standard PostgreSQL logging facility. Please refer to the settings section of its official documentation for more details.
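As a rough sketch of how it is typically enabled (assuming the pgaudit library has been added to shared_preload_libraries), session auditing of DDL and write statements could be configured like this:

CREATE EXTENSION pgaudit;
-- log all DDL and data-modifying statements
ALTER SYSTEM SET pgaudit.log = 'ddl, write';
SELECT pg_reload_conf();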

pldebugger

This is a must-have extension for developers who work on stored functions written in PL/pgSQL. This extension is well integrated with GUI tools like pgAdmin, which allows developers to step through their code and debug it. Packages for pldebugger are also available in the PGDG repository and installation is straightforward. Once it is set up, we can step through and debug the code remotely.

The official git repo is available here

plprofiler

This is a wonderful extension for finding out where code is slowing down. It is very helpful, particularly during complex migrations from proprietary databases, like from Oracle to PostgreSQL, where application performance is affected. This extension can prepare a report on the overall execution time in tabular form, including flamegraphs, with clear information about each line of code. It is not, however, available from the PGDG repo: you will need to build it from source. Details on building and installing plprofiler will be covered in a future blog post. Meanwhile, the official repository and documentation is available here

PostGIS

PostGIS is arguably the most versatile implementation of the specifications of the Open Geospatial Consortium. We can see a large list of features in PostGIS that are rarely available in any other RDBMSs.

There are many users who have primarily opted to use PostgreSQL because of the features supported by PostGIS. In fact, all these features are not implemented as a single extension, but are instead delivered by a collection of extensions. This makes PostGIS one of the most complex extensions to build from source. Luckily, everything is available from the PGDG repository:

$ sudo yum install postgis24_10.x86_64

Once the postgis package is installed, we are able to create the extensions on our target database:

postgres=# CREATE EXTENSION postgis;
CREATE EXTENSION
postgres=# CREATE EXTENSION postgis_topology;
CREATE EXTENSION
postgres=# CREATE EXTENSION postgis_sfcgal;
CREATE EXTENSION
postgres=# CREATE EXTENSION fuzzystrmatch;
CREATE EXTENSION
postgres=# CREATE EXTENSION postgis_tiger_geocoder;
CREATE EXTENSION
postgres=# CREATE EXTENSION address_standardizer;
CREATE EXTENSION

Language Extensions : PL/Python, PL/Perl, PL/V8,PL/R etc.

Another powerful feature of PostgreSQL is its programming language support. You can code database functions/procedures in pretty much every popular language.

Thanks to the enormous number of libraries available, which include machine learning ones, and its vibrant community, Python has claimed the third spot amongst the most popular languages of choice according to the TIOBE Programming index. Your team’s skills and libraries remain valid for PostgreSQL server coding too! Teams that regularly code in JavaScript for Node.js or Angular can easily write PostgreSQL server code in PL/V8. All of the packages required are readily available from the PGDG repository.
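To give an idea of what server-side code in another language looks like, here is a minimal PL/Python sketch, closely following the example in the PostgreSQL documentation (it assumes the plpython3u package is installed on the server):

CREATE EXTENSION plpython3u;
CREATE FUNCTION pymax (a integer, b integer)
  RETURNS integer
AS $$
  # plain Python running inside the server process
  if a > b:
    return a
  return b
$$ LANGUAGE plpython3u;
SELECT pymax(10, 20);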

cstore_fdw

cstore_fdw is an open source columnar store extension for PostgreSQL. Columnar stores provide notable benefits for analytics use cases where data is loaded in batches. cstore_fdw’s columnar nature delivers performance by reading only the relevant data from disk. It may compress data by 6 to 10 times, reducing the space requirements for data archiving. The official repository and documentation is available here

HypoPG

HypoPG is an extension that adds support for hypothetical indexes – that is, indexes that can be tested without actually being created. This helps us answer questions such as “how would the execution plan change if there were an index on column X?”. Installation and setup instructions are part of its official documentation
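A minimal sketch of the typical workflow, based on the function names documented by the HypoPG project (the orders table and its customer_id column are just placeholders):

CREATE EXTENSION hypopg;
-- create a hypothetical index: nothing is built, no disk space is used
SELECT * FROM hypopg_create_index('CREATE INDEX ON orders (customer_id)');
-- EXPLAIN (without ANALYZE) will consider the hypothetical index
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;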

mongo_fdw

mongo_fdw presents collections from MongoDB as tables in PostgreSQL. This is a case where the NoSQL world meets the SQL world and the features combine. We will be covering this extension in a future blog post. The official repository is available here

tds_fdw

Another important FDW (foreign data wrapper) extension in the PostgreSQL world is tds_fdw. Both Microsoft SQL Server and Sybase use the TDS (Tabular Data Stream) format. This FDW allows PostgreSQL to use tables stored in a remote SQL Server or Sybase database as local tables. It makes use of the FreeTDS libraries.

orafce

As previously mentioned, there are a lot of migrations underway from Oracle to PostgreSQL. Oracle functions that are missing or incompatible in PostgreSQL are often painful for those migrating server code. The “orafce” project implements some of the functions from the Oracle database. The functionality was verified on Oracle 10g, and the module is useful for production work. Please refer to the list in its official documentation of the Oracle functions implemented in PostgreSQL

TimescaleDB

In this new world of IoT and connected devices, there is a growing need to store time-series data. TimescaleDB can convert PostgreSQL into a scalable time-series data store. The official site is available here with all relevant links.
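As a small, hedged sketch (assuming the timescaledb library has been added to shared_preload_libraries), a regular table is turned into a time-partitioned hypertable like this; the metrics table and its columns are placeholders:

CREATE EXTENSION timescaledb;
CREATE TABLE metrics (
  time        timestamptz NOT NULL,
  device_id   integer,
  temperature double precision
);
-- convert the plain table into a hypertable partitioned on the time column
SELECT create_hypertable('metrics', 'time');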

pg_bulkload

Is loading a large volume of data into a database in an efficient and fast way a challenge for you? If so, pg_bulkload may help you solve that problem. Official documentation is available here

pg_partman

PostgreSQL 10 introduced declarative partitioning. But creating new partitions and maintaining existing ones, including purging unwanted partitions, requires a good dose of manual effort. If you are looking to automate part of this maintenance you should have a look at what pg_partman offers. The repository with documentation is available here.
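For reference, this is roughly what the PostgreSQL 10 declarative partitioning that pg_partman helps maintain looks like (table and column names are illustrative only):

CREATE TABLE measurements (
  logdate date NOT NULL,
  value   numeric
) PARTITION BY RANGE (logdate);
-- each partition has to be created (and later dropped) explicitly,
-- which is the maintenance work pg_partman automates
CREATE TABLE measurements_2018_10 PARTITION OF measurements
  FOR VALUES FROM ('2018-10-01') TO ('2018-11-01');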

wal2json

PostgreSQL has logical replication features built in. Extra information is recorded in the WAL, which facilitates logical decoding. wal2json is a popular output plugin for logical decoding. It can be used for different purposes, including change data capture. In addition to wal2json, there are other output plugins: a concise list is available in the PostgreSQL wiki.
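A minimal sketch of how an output plugin such as wal2json is exercised through the built-in logical decoding SQL interface (it assumes wal_level = logical and that the wal2json plugin is installed on the server):

-- create a logical replication slot that decodes changes with wal2json
SELECT * FROM pg_create_logical_replication_slot('demo_slot', 'wal2json');
-- peek at the decoded changes as JSON without consuming them
SELECT * FROM pg_logical_slot_peek_changes('demo_slot', NULL, NULL);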

There are many more extensions that help us build an enterprise-grade PostgreSQL set up using open source solutions. Please feel free to comment and ask us if we know about one that satisfies your particular needs. Or, if there’s still time, sign up for our October webinar and ask us in person!

by Jobin Augustine at October 05, 2018 04:43 PM

MongoDB-Disable Chained Replication

MongoDB chained replication

In this blog post, we will learn what MongoDB chained replication is, why you might choose to disable it, and the steps you need to take to do so.

What is chain replication?

Chain Replication in MongoDB, as the name suggests, means that a secondary member is able to replicate from another secondary member instead of a primary.

Default settings

By default, chained replication is enabled in MongoDB. It helps to reduce the load on the primary, but it may lead to replication lag. When enabled, a secondary node selects its sync target based on ping time, choosing the closest node.

Reasons to disable chained replication

The main reason to disable chained replication is replication lag: that is, the delay between MongoDB writing an operation on the primary and replicating the same operation to a secondary.

In either case—chained replication enabled or disabled—replication works in the same way when the primary node fails: one of the secondary servers is promoted to primary. Therefore, the application’s reads and writes are not affected.

Steps to disable chained replication

1) Check the current status of chained replication in the replica set configuration under “settings”:

PRIMARY> rs.config().settings
{
"chainingAllowed" : true,
}

2) To disable chained replication, set “chainingAllowed” to false and then reconfigure the replica set to apply the change.

PRIMARY> cg = rs.config()
PRIMARY> cg.settings.chainingAllowed = false
false
PRIMARY> rs.reconfig(cg)

3) Check the current status of chained replication again, and it’s done.

PRIMARY> rs.config().settings
{
	"chainingAllowed" : false,
}

Can I override sync source target even after disabling chaining?

Yes, even after you have disabled chained replication, you can still override the sync source target, though only temporarily. That means it will be overridden until:

  • the mongod instance restarts
  • the connection between the secondary node and its sync source closes
  • Additionally, if chaining is enabled and the sync source falls more than 30 seconds behind another member, the SyncSourceResolver will choose another member with more recent oplogs to sync from.

Override sync source

The “replSetSyncFrom” parameter can be used. For example, the secondary node is syncing from host 192.168.103.100:27001 and we would like it to sync from 192.168.103.100:27003 instead.

1) Check for the current host it is syncing from:

PRIMARY> rs.status()
{
			"_id" : 1,
			"name" : "192.168.103.100:27002",
			"syncingTo" : "192.168.103.100:27001",
			"syncSourceHost" : "192.168.103.100:27001",
		},

2) Log in to that mongod and execute:

SECONDARY> db.adminCommand( { replSetSyncFrom: "192.168.103.100:27003" })

3) Check replica set status again

SECONDARY> rs.status()
{
			"_id" : 1,
			"name" : "192.168.103.100:27002",
			"syncingTo" : "192.168.103.100:27003",
			"syncSourceHost" : "192.168.103.100:27003",
		},

This is how we can override the sync source for testing, maintenance, or when the replica is not syncing from the required host.

I hope this blog helps you understand how to disable chained replication or override the sync source for a specific purpose. The preferred setting for the chainingAllowed parameter is true, as it reduces the load on the primary node and is also the default setting.

by Aayushi Mangal at October 05, 2018 10:53 AM

Oli Sennhauser

MariaDB/MySQL Training Dates for 2019

The last MariaDB and MySQL training dates for 2019 have now also been set and published.

With our three training partners in Essen, Köln and Berlin, FromDual currently offers a total of 12 public training courses on MariaDB and MySQL.

These are:


These training courses are held in German. Courses in English can also be offered on request.

What is left for 2018?

Five more training courses will be held in 2018. All of them are guaranteed to take place!

These are:


A few places are still available in these courses. Have you already booked your MariaDB/MySQL training for 2018?

If you have any questions about the MariaDB or MySQL training courses, the FromDual training team will be happy to help!

by Shinguz at October 05, 2018 08:54 AM

October 04, 2018

MariaDB Foundation

MariaDB 10.3.10 now available

The MariaDB Foundation is pleased to announce the availability of MariaDB 10.3.10, the latest stable release in the MariaDB 10.3 series. See the release notes and changelogs for details. Download MariaDB 10.3.10 Release Notes Changelog What is MariaDB 10.3? MariaDB APT and YUM Repository Configuration Generator Contributors to MariaDB 10.3.10 Alexander Barkov (MariaDB Corporation) Alexey […]

The post MariaDB 10.3.10 now available appeared first on MariaDB.org.

by Ian Gilfillan at October 04, 2018 03:31 PM

MariaDB AB

MariaDB Server 10.3.10 now available


The MariaDB project is pleased to announce the immediate availability of MariaDB Server 10.3.10. See the release notes and changelogs for details and visit mariadb.com/downloads to download.

 

Download MariaDB Server 10.3.10

Release Notes Changelog What is MariaDB 10.3?



by dbart at October 04, 2018 02:38 PM

Peter Zaitsev

The Importance of mysqlbinlog –version

Importance of MySQL binlog version

When deciding on your backup strategy, one of the key components for Point In Time Recovery (PITR) will be the binary logs. Thankfully, the mysqlbinlog command allows you to easily take binary log backups, including those that would otherwise be encrypted on disk using encrypt_binlog=ON.

When mysqlbinlog is used with --raw --read-from-remote-server --stop-never --verify-binlog-checksum it will retrieve binary logs from whichever master it is pointed to, and store them locally on disk in the same format as they were written on the master. Here is an example with the extra arguments that would normally be used:

/usr/bin/mysqlbinlog --raw --read-from-remote-server \
 --stop-never --connection-server-id=1234 \
 --verify-binlog-checksum \
 --host=localhost --port=3306 mysql-bin.000024

This would retrieve the localhost binary logs (starting from mysql-bin.000024) reporting as server_id 1234, verify the checksum and then write each of them to disk.

Changes to the mysqlbinlog source code are relatively infrequent, except for when developing for a new major version, so you may be fooled into thinking that the specific version that you use is not so important—a little like the client version. This is something that is more likely to vary when you are taking remote backups.

Here is the result from the 5.7 branch of mysql-server to show the history of commits by year:

$ git blame --line-porcelain client/mysqlbinlog.cc | egrep "^(committer-time|committer-tz)" | cut -f2 -d' ' | while read ct; do read ctz; date --date "Jan 1, 1970 00:00:00 ${ctz} + ${ct} seconds" --utc +%Y; done | sort -n | uniq -c
   105 2000
    52 2001
    52 2002
   103 2003
   390 2004
   113 2005
    58 2006
   129 2007
   595 2008
    53 2009
   349 2010
   151 2011
   382 2012
   191 2013
   308 2014
   404 2015
    27 2016
    42 2017
    15 2018

Since the first GA release of 5.7 (October 2015), there haven’t been too many bugs and so if you aren’t using new features then you may think that it is OK to keep using the same version as before:

$ git log --regexp-ignore-case --grep bug --since="2015-10-19" --oneline client/mysqlbinlog.cc
1ffd7965a5e Bug#27558169 BACKPORT TO 5.7 BUG #26826272: REMOVE GCC 8 WARNINGS [noclose]
17c92835bb3 Bug #24674276 MYSQLBINLOG -R --HEXDUMP CRASHES FOR INTVAR,                   USER_VAR, OR RAND EVENTS
052dbd7b079 BUG#26878022 MYSQLBINLOG: ASSERTION `(OLD_MH->M_KEY == KEY) ||              (OLD_MH->M_KEY == 0)' FAILED
543129a577c BUG#26878022 MYSQLBINLOG: ASSERTION `(OLD_MH->M_KEY == KEY) || (OLD_MH->M_KEY == 0)' FAILED
ba1a99c5cd7 Bug#26825211 BACKPORT FIX FOR #25643811 TO 5.7
1f0f4476b28 Bug#26117735: MYSQLBINLOG READ-FROM-REMOTE-SERVER NOT HONORING REWRITE_DB FILTERING
12508f21b63 Bug #24323288: MAIN.MYSQLBINLOG_DEBUG FAILS WITH A LEAKSANITIZER ERROR
e8e5ddbb707 Bug#24609402 MYSQLBINLOG --RAW DOES NOT GET ALL LATEST EVENTS
22eec68941f Bug#23540182:MYSQLBINLOG DOES NOT FREE THE EXISTING CONNECTION BEFORE OPENING NEW REMOTE ONE
567bb732bc0 Bug#22932576 MYSQL5.6 DOES NOT BUILD ON SOLARIS12
efc42d99469 Bug#22932576 MYSQL5.6 DOES NOT BUILD ON SOLARIS12
6772eb52d66 Bug#21697461 MEMORY LEAK IN MYSQLBINLOG

However, this is not always the case and some issues are more obvious than others! To help show this, here are a couple of the issues that you might happen to notice.

Warning: option ‘stop-never-slave-server-id’: unsigned value <xxxxxxxx> adjusted to <yyyyy>

The server_id that is used by a server in a replication topology should always be unique within the topology. One of the easy ways to ensure this is to use a conversion of the external IPv4 address to an integer, such as INET_ATON, which provides you with an unsigned integer.
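
For example, here is a quick illustration (using an arbitrary documentation address rather than a value from the original post):

mysql> SELECT INET_ATON('203.0.113.27') AS proposed_server_id;
-- returns 3405803803, which fits comfortably within the unsigned 32-bit range that server_id accepts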

The introduction of --connection-server-id (which deprecates --stop-never-slave-server-id) changes the behaviour here (for the better). Prior to this you may experience warnings where your server_id was cast to the equivalent of an UNSIGNED SMALLINT. This didn't seem to be a reported bug, just fixed as a by-product of the change.

ERROR: Could not find server version: Master reported unrecognized MySQL version ‘xxx’

When running mysqlbinlog, the version of MySQL is checked so that the event format is set correctly. Here is the code from MySQL 5.7:

switch (*version) {
 case '3':
   glob_description_event= new Format_description_log_event(1);
   break;
 case '4':
   glob_description_event= new Format_description_log_event(3);
   break;
 case '5':
   /*
     The server is soon going to send us its Format_description log
     event, unless it is a 5.0 server with 3.23 or 4.0 binlogs.
     So we first assume that this is 4.0 (which is enough to read the
     Format_desc event if one comes).
   */
   glob_description_event= new Format_description_log_event(3);
   break;
 default:
   glob_description_event= NULL;
   error("Could not find server version: "
         "Master reported unrecognized MySQL version '%s'.", version);
   goto err;
 }

This section of the code last changed in 2008, but of course there is another vendor that no longer uses a 5-prefixed-version number: MariaDB. With MariaDB, it is impossible to take a backup without using a MariaDB version of the program, as you are told that the version is unrecognised. The MariaDB source code contains a change to this section to resolve the issue when the version was bumped to 10:

83c02f32375b client/mysqlbinlog.cc (Michael Widenius    2012-05-31 22:39:11 +0300 1900) case 5:
83c02f32375b client/mysqlbinlog.cc (Michael Widenius    2012-05-31 22:39:11 +0300 1901) case 10:

Interestingly, MySQL 8.0 gets a little closer to not producing an error (although it still does), but finally sees off those rather old ancestral relatives:

 switch (*version) {
   case '5':
   case '8':
   case '9':
     /*
       The server is soon going to send us its Format_description log
       event.
     */
     glob_description_event = new Format_description_log_event;
     break;
   default:
     glob_description_event = NULL;
     error(
         "Could not find server version: "
         "Master reported unrecognized MySQL version '%s'.",
         version);
     goto err;
 }

These are somewhat trivial examples. In fact, you are more likely to suffer from more serious differences, perhaps ones that do not become immediately apparent, if you are not matching the mysqlbinlog version to that of the server producing the binary logs.

Sadly, it is not so easy to check the versions, as the reported version was seemingly left unloved for quite a while (Ver 3.4), so you should check the binary package versions instead (e.g. using Percona-Server-client-57-5.7.23 with Percona-Server-server-57-5.7.23). The good news is that MySQL 8.0 fixes it!

So reduce the risk and match your package versions!

by Ceri Williams at October 04, 2018 02:01 PM

Percona Live Europe 2018 Session Programme Published

PLE 2018 Full Agenda Announced

Offering over 110 conference sessions across Tuesday 6 and Wednesday 7 November, and a full tutorial day on Monday 5 November, we hope you'll find this fantastic line-up of talks for Percona Live Europe 2018 to be one of our best yet! Innovation in technology continues to arrive at an accelerated rate, and you'll find plenty to help you connect with the latest developments in open source database technologies at this acclaimed annual event.

Representatives from companies at the leading edge of our industry use the platform offered by Percona Live to showcase their latest developments and share plans for the future. If your career is dependent upon the use of open source database technologies you should not miss this conference!

Conference Session Schedule

Conference sessions will take place on Tuesday and Wednesday, November 6-7 and will feature more than 110 in-depth talks by industry experts. Conference session examples include:

  • Deep Dive on MySQL Databases on Amazon RDS – Chayan Biswas, AWS
  • MySQL 8.0 Performance: Scalability & Benchmarks – Dimitri Kravtchuk, Oracle
  • MySQL 8 New Features: Temptable engine – Pep Pla, Pythian
  • Artificial Intelligence Database Performance Tuning – Roel Van de Paar, Percona
  • pg_chameleon – MySQL to PostgreSQL replica made easy – Federico Campoli, Loxodata
  • Highway to Hell or Stairway to Cloud? – Alexander Kukushkin, Zalando
  • Zero to Serverless in 60 Seconds – Sandor Maurice, AWS
  • A Year in Google Cloud – Carmen Mason, Alan Mason, Vital Source Technologies
  • Advanced MySQL Data at Rest Encryption in Percona Server for MySQL – Iwo Panowicz, Percona, and Bartłomiej Oleś, Severalnines
  • Monitoring Kubernetes with Prometheus – Henri Dubois-Ferriere, Sysdig
  • How We Use and Improve Percona XtraBackup at Alibaba Cloud – Bo Wang, Alibaba Cloud
  • Shard 101 – Adamo Tonete, Percona
  • Top 10 Mistakes When Migrating From Oracle to PostgreSQL – Jim Mlodgenski, AWS
  • Explaining the Postgres Query Optimizer – Bruce Momjian, EnterpriseDB
  • MariaDB 10.3 Optimizer and Beyond – Vicentiu Ciorbaru, MariaDB Foundation
  • HA and Clustering Solution: ProxySQL as an Intelligent Router for Galera and Group Replication – René Cannaò, ProxySQL
  • MongoDB WiredTiger WriteConflicts – Paul Agombin, ObjectRocket
  • PostgreSQL Enterprise Features – Michael Banck, credativ GmbH
  • What’s New in MySQL 8.0 Security – Georgi Kodinov, Oracle
  • The MariaDB Foundation and Security – Finding and Fixing Vulnerabilities the Open Source Way – Otto Kekäläinen, MariaDB Foundation
  • ClickHouse 2018: How to Stop Waiting for Your Queries to Complete and Start Having Fun – Alexander Zaitsev, Altinity
  • Open Source Databases and Non-Volatile Memory – Frank Ober, Intel Memory Group
  • MyRocks Production Case Studies at Facebook – Yoshinori Matsunobu, Facebook
  • Need for Speed: Boosting Apache Cassandra’s Performance Using Netty – Dinesh Joshi, Apache Cassandra
  • Demystifying MySQL Replication Crash Safety – Jean-François Gagné, Messagebird

See the full list of sessions

Tutorial schedule

Tutorials will take place throughout the day on Monday, November 5, 2018. Tutorial session examples include:

  • Query Optimization with MySQL 8.0 and MariaDB 10.3: The Basics – Jaime Crespo, Wikimedia Foundation
  • ElasticSearch 101 – Antonios Giannopoulos, ObjectRocket
  • MySQL InnoDB Cluster in a Nutshell: The Saga Continues with 8.0 – Frédéric Descamps, Oracle
  • High Availability PostgreSQL and Kubernetes with Google Cloud – Alexis Guajardo, Google
  • Best Practices for High Availability – Alex Rubin and Alex Poritskiy, Percona

See the full list of tutorials.

Sponsors

We are grateful for the support of our sponsors:

  • Platinum – AWS
  • Silver – Altinity, PingCap
  • Start Up – Severalnines
  • Branding – Intel, Idera
  • Expo – Postgres EU

If you would like to join them, sponsorship opportunities for Percona Live Open Source Database Conference Europe 2018 are available. They offer the opportunity to interact with the DBAs, sysadmins, developers, CTOs, CEOs, business managers, technology evangelists, solution vendors and entrepreneurs who typically attend the event. Contact live@percona.com for sponsorship details.

Ready to register? What are you waiting for? Costs will only get higher!
Register now!

 

 

by Bronwyn Campbell at October 04, 2018 04:18 AM

October 03, 2018

Peter Zaitsev

Percona Blog Poll: How Do You Run Your Database in the Cloud?

Cloud database poll

Percona’s latest blog poll asks: how do you run your database in the cloud?

Join in!

Are you using a fully managed service or are you self-managing your databases in the cloud? And which provider are you relying on? Perhaps you’re using more than one. Don’t worry, you can tick multiple boxes, so please choose up to four answers. If you don’t see your solution listed, use the comments box on this blog to share your thoughts.

If you’d like to, you’re also welcome to leave a comment telling us about your choice—or maybe why you’re NOT planning on moving to a cloud solution in the near future. Likewise, if you want to share how you’ve found your cloud deployment so far, feel free to send a comment. Thanks!

Note: there is a poll embedded within this post; please visit the site to participate.

by Lorraine Pocklington, Community Manager at October 03, 2018 03:49 PM

Finding Table Differences on Nullable Columns Using MySQL Generated Columns

MySQL generated columns

Some time ago, a customer had a performance issue with an internal process. He was comparing, finding, and reporting the rows that were different between two tables. This is simple if you use a LEFT JOIN and an IS NULL comparison over the second table in the WHERE clause, but what if the column could be null? That is why he used UNION, GROUP BY and HAVING clauses, which resulted in poor performance.

The challenge was to be able to compare each row using a LEFT JOIN over NULL values.

The challenge in more detail

I’m not going to use the customer’s real table. Instead, I will be comparing two sysbench tables with the same structure:

CREATE TABLE `sbtest1` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `k` int(10) unsigned DEFAULT NULL,
  `c` char(120) DEFAULT NULL,
  `pad` char(60) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `k` (`k`,`c`,`pad`)
) ENGINE=InnoDB

It is slightly different from the original sysbench schema, as this version can hold NULL values. Both tables have the same number of rows. We are going to set one row to NULL on each table:

update sbtest1 set k=null limit 1;
update sbtest2 set k=null limit 1;

If we execute the comparison query, we get this result:

mysql> select "sbtest1",a.* from
    -> sbtest1 a left join
    -> sbtest2 b using (k,c,pad)
    -> where b.id is null union
    -> select "sbtest2",a.* from
    -> sbtest2 a left join
    -> sbtest1 b using (k,c,pad)
    -> where b.id is null;
+---------+------+------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------+
| sbtest1 | id   | k    | c                                                                                                                       | pad                                                         |
+---------+------+------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------+
| sbtest1 | 4462 | NULL | 64568100364-99474573987-46567807085-85185678273-10829479379-85901445105-43623848418-63872374080-59257878609-82802454375 | 07052127207-33716235481-22978181904-76695680520-07986095803 |
| sbtest2 | 4462 | NULL | 64568100364-99474573987-46567807085-85185678273-10829479379-85901445105-43623848418-63872374080-59257878609-82802454375 | 07052127207-33716235481-22978181904-76695680520-07986095803 |
+---------+------+------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------+
2 rows in set (3.00 sec)

As you can see, column k is NULL in both rows, yet the join failed to match them and each query reported its row as different. This is not new in MySQL, but it would be nice to have a way to sort this issue out.
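
The root cause is simply how SQL treats NULL in equality comparisons: a join condition evaluated over two NULLs is never true, so the rows cannot match. A quick illustration (not part of the original example), contrasting = with the NULL-safe <=> operator:

mysql> SELECT NULL = NULL, NULL <=> NULL;
-- returns NULL and 1: plain equality over NULLs never matches, while the NULL-safe operator does

The generated-column approach below essentially bakes this NULL-safe behaviour into a single indexed expression.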

Solution

The solution is based on GENERATED COLUMNS with a hash function (md5) and stored in a binary(16) column:

ALTER TABLE sbtest1
ADD COLUMN `real_id` binary(16) GENERATED ALWAYS AS (unhex(md5(concat(ifnull(`k`,'NULL'),ifnull(`c`,'NULL'),ifnull(`pad`,'NULL'))))) VIRTUAL,
ADD INDEX (real_id);
ALTER TABLE sbtest2
ADD COLUMN `real_id` binary(16) GENERATED ALWAYS AS (unhex(md5(concat(ifnull(`k`,'NULL'),ifnull(`c`,'NULL'),ifnull(`pad`,'NULL'))))) VIRTUAL,
ADD INDEX (real_id);
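
As a quick sanity check (a suggestion, not part of the original post), you can confirm that the rows holding NULLs now hash to identical, comparable values on both tables:

SELECT id, k, hex(real_id) FROM sbtest1 WHERE k IS NULL;
SELECT id, k, hex(real_id) FROM sbtest2 WHERE k IS NULL;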

Adding the index is also part of the solution. Now, let’s execute the query using the new column to join the tables:

mysql> select "sbtest1",a.k,a.c,a.pad from
    -> sbtest1 a left join
    -> sbtest2 b using (real_id)
    -> where b.id is null union
    -> select "sbtest2",a.k,a.c,a.pad from
    -> sbtest2 a left join
    -> sbtest1 b using (real_id)
    -> where b.id is null;
Empty set (2.31 sec)

We can see an improvement in the query performance—it now takes 2.31 sec whereas before it was 3.00 sec—and that the result is as expected. We could say that that’s all, and that no further improvement is possible. However, that is not true. Even though the query is running faster, it is possible to optimize it in this way:

mysql> select "sbtest1",a.k,a.c,a.pad
    -> from sbtest1 a
    -> where a.id in (select a.id
    ->   from sbtest1 a left join
    ->   sbtest2 b using (real_id)
    ->   where b.id is null) union
    -> select "sbtest2",a.k,a.c,a.pad
    -> from sbtest2 a
    -> where a.id in (select a.id
    ->   from sbtest2 a left join
    ->   sbtest1 b using (real_id)
    ->   where b.id is null);
Empty set (1.60 sec)

Why is this faster? The first query is performing two subqueries. Each subquery is very similar. Let’s check the explain plan:

mysql> explain select "sbtest1",a.k,a.c,a.pad from
    -> sbtest1 a left join
    -> sbtest2 b using (real_id)
    -> where b.id is null;
+----+-------------+-------+------------+------+---------------+---------+---------+------------------+--------+----------+--------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key     | key_len | ref              | rows   | filtered | Extra                                |
+----+-------------+-------+------------+------+---------------+---------+---------+------------------+--------+----------+--------------------------------------+
|  1 | SIMPLE      | a     | NULL       | ALL  | NULL          | NULL    | NULL    | NULL             | 315369 |   100.00 | NULL                                 |
|  1 | SIMPLE      | b     | NULL       | ref  | real_id       | real_id | 17      | sbtest.a.real_id |     27 |    10.00 | Using where; Not exists; Using index |
+----+-------------+-------+------------+------+---------------+---------+---------+------------------+--------+----------+--------------------------------------+

As you can see, it is performing a full table scan over the first table and using real_id to join the second table. The real_id is a generated column, so it needs to execute the function to get the value to join the second table. That means that it’s going to take time.

If we analyze the subquery of the second query:

mysql> explain select "sbtest1",a.k,a.c,a.pad
    -> from sbtest1 a
    -> where a.id in (select a.id
    ->   from sbtest1 a left join
    ->   sbtest2 b using (real_id)
    ->   where b.id is null);
+----+--------------+-------------+------------+--------+---------------+------------+---------+------------------+--------+----------+--------------------------------------+
| id | select_type  | table       | partitions | type   | possible_keys | key        | key_len | ref              | rows   | filtered | Extra                                |
+----+--------------+-------------+------------+--------+---------------+------------+---------+------------------+--------+----------+--------------------------------------+
|  1 | SIMPLE       | a           | NULL       | index  | PRIMARY       | k          | 187     | NULL             | 315369 |   100.00 | Using where; Using index             |
|  1 | SIMPLE       | <subquery2> | NULL       | eq_ref | <auto_key>    | <auto_key> | 4       | sbtest.a.id      |      1 |   100.00 | NULL                                 |
|  2 | MATERIALIZED | a           | NULL       | index  | PRIMARY       | real_id    | 17      | NULL             | 315369 |   100.00 | Using index                          |
|  2 | MATERIALIZED | b           | NULL       | ref    | real_id       | real_id    | 17      | sbtest.a.real_id |     27 |    10.00 | Using where; Not exists; Using index |
+----+--------------+-------------+------------+--------+---------------+------------+---------+------------------+--------+----------+--------------------------------------+

We can see that it is performing a full index scan over the first table, and that the generated column is never evaluated. That is how we go from an inconsistent result in 3.00 seconds, to a consistent result in 2.31 seconds, and finally to a performant query at 1.60 seconds.

Conclusions

This is not the first blog post that I’ve done about generated columns. I think that it is a useful feature for several scenarios where you need to improve performance. In this particular case, it’s also presenting a workaround to expected inconsistencies with LEFT JOINS with NULL values. It is also important to mention that this improved a process in a real world scenario.

by David Ducos at October 03, 2018 01:56 PM

Jean-Jerome Schmidt

New White Paper on State-of-the-Art Database Management: ClusterControl - The Guide

Today we’re happy to announce the availability of our first white paper on ClusterControl, the only management system you’ll ever need to automate and manage your open source database infrastructure!

Download ClusterControl - The Guide!

Most organizations have databases to manage, and experience the headaches that come with that: managing performance, monitoring uptime, automatically recovering from failures, scaling, backups, security and disaster recovery. Organizations build and buy numerous tools and utilities for that purpose.

ClusterControl differs from the usual approach of trying to bolt together performance monitoring, automatic failover and backup management tools by combining – in one product – everything you need to deploy and operate mission-critical databases in production. It automates the entire database environment, and ultimately delivers an agile, modern and highly available data platform based on open source.

All-in-one management software - the ClusterControl feature set:

Since the inception of Severalnines, we have made it our mission to provide market-leading solutions to help organisations achieve optimal efficiency and availability of their open source database infrastructures.

With ClusterControl, as it stands today, we are proud to say: mission accomplished!

Our flagship product is an integrated deployment, monitoring, and management automation system for open source databases, which provides holistic, real-time control of your database operations in an easy and intuitive experience, incorporating the best practices learned from thousands of customer deployments in a comprehensive system that helps you manage your databases safely and reliably.

Whether you’re a MySQL, MariaDB, PostgreSQL or MongoDB user (or a combination of these), ClusterControl has you covered.

Deploying, monitoring and managing highly available open source database clusters is not a small feat and requires either just as highly specialised database administration (DBA) skills … or professional tools and systems that non-DBA users can wield in order to build and maintain such systems, though these typically come with an equally high learning curve.

The idea and concept for ClusterControl was born out of that conundrum that most organisations face when it comes to running highly available database environments.

It is the only solution on the market today that provides that intuitive, easy to use system with the full set of tools required to manage such complex database environments end-to-end, whether one is a DBA or not.

The aim of this Guide is to make the case for comprehensive open source database management and the need for cluster management software, and to explain, in just as comprehensive a fashion, why ClusterControl is the only management system you will ever need to run highly available open source database infrastructures.

Download ClusterControl - The Guide!

by jj at October 03, 2018 12:11 PM

Oli Sennhauser

MariaDB/MySQL Environment MyEnv 2.0.1 has been released

FromDual has the pleasure to announce the release of the new version 2.0.1 of its popular MariaDB, Galera Cluster and MySQL multi-instance environment MyEnv.

The new MyEnv can be downloaded here. How to install MyEnv is described in the MyEnv Installation Guide.

In the inconceivable case that you find a bug in MyEnv, please report it to the FromDual bug tracker.

Any feedback, statements and testimonials are welcome as well! Please send them to feedback@fromdual.com.

Upgrade from 1.1.x to 2.0

Please look at the MyEnv 2.0.0 Release Notes.

Upgrade from 2.0.0 to 2.0.1

shell> cd ${HOME}/product
shell> tar xf /download/myenv-2.0.1.tar.gz
shell> rm -f myenv
shell> ln -s myenv-2.0.1 myenv

Plug-ins

If you are using plug-ins for showMyEnvStatus, create all the links in the new directory structure:

shell> cd ${HOME}/product/myenv
shell> ln -s ../../utl/oem_agent.php plg/showMyEnvStatus/

Upgrade of the instance directory structure

From MyEnv 1.0 to 2.0 the directory structure of instances has fundamentally changed. Nevertheless MyEnv 2.0 works fine with MyEnv 1.0 directory structures.

Changes in MyEnv 2.0.1

MyEnv

  • CloudLinux was added as supported distribution.
  • Introduced different brackets for up () and down [] MariaDB/MySQL Instances in up output.
  • Script setMyEnv.sh added to set environment for instance (e.g. via ssh).
  • MyEnv should not complain any more when default my.cnf with include/includedir directives is used.
  • MyEnv now complains about a missing instancedir configuration variable in myenv.conf. This could be a leftover from a 1.x to 2.y migration.
  • OpenSuSE Leap 42.3 support added.

MyEnv Installer

  • Instance name with dot '.' is not allowed any more.
  • basedir without bin/mysqld is stripped out from installer overview.

MyEnv Utilities

  • Utilities cluster_conflict.php, cluster_conflict.sh, galera_monitor.sh, haproxy_maint.sh, group_replication_monitor.sh for Galera and Group Replication Cluster added.

For subscriptions of commercial use of MyEnv please get in contact with us.

by Shinguz at October 03, 2018 07:06 AM

October 02, 2018

Peter Zaitsev

CRITICAL UPDATE for Percona XtraDB Cluster users: 5.7.23-31.31.2 Is Now Available

Percona XtraDB Cluster 5.7

To resolve a critical regression, Percona announces the release of Percona XtraDB Cluster 5.7.23-31.31.2 on October 2, 2018. Binaries are available from the downloads section or from our software repositories.

This release resolves a critical regression in Galera’s wsrep library and supersedes 5.7.23-31.31.

Percona XtraDB Cluster 5.7.23-31.31.2 is now the current release, based on the following:

All Percona software is open-source and free.

Fixed Bugs

  • #2254: A cluster conflict could cause a crash in Percona XtraDB Cluster 5.7.23 if autocommit=off.

Help us improve our software quality by reporting any bugs you encounter using our bug tracking system. As always, thanks for your continued support of Percona!

by Borys Belinsky at October 02, 2018 05:15 PM

Scaling PostgreSQL using Connection Poolers and Load Balancers for an Enterprise Grade environment

PostgreSQL Load balancing connection pooling

PostgreSQL® logo

In the previous blog posts in this series we’ve covered some of the essential aspects of an Enterprise-grade solution: security, high availability, and backups. Another important aspect is the scalability of the solution: as our application grows how do we accommodate an increase in traffic while maintaining the quality of the service (response time)? The answer to this question depends on the nature of the workload at play but it is often shaped around:

(a) improving its efficiency and
(b) increasing the resources available.

Why connection pooling ?

When it comes to improving the efficiency of a database workload, one of the first places we start looking at is the list of slow queries; if the most popular ones can be optimized to run faster then we can easily gain some overall performance back. Arguably, we may look next at the number and frequency of client connections: is the workload composed of a high number of very frequent but short-lived connections? Or are client connections of a more moderate number, and do they tend to stick around for longer?

If we consider the first scenario further–a high number of short lived connections–and that each connection spawns a new OS process, the server may hit a practical limit as to the number of transactions—or connections—it can manage per second, considering the hardware available and the workload being processed. Remember that PostgreSQL is process-based, as opposed to thread-based, and spawning a new process for each connection is itself an expensive operation in terms of resources, both CPU and memory.

A possible remedy for this would be the use of a connection pooler, acting as a mediator between the application and the database. The connection pooler keeps a number of connections permanently opened with the database, and receives and manages all incoming requests from clients itself, allowing them to temporarily use one of the connections it already has established with PostgreSQL. This removes the burden of creating a new process each time a client establishes a connection with PostgreSQL, and allows it to employ the resources that it would otherwise use for this into serving more requests (or completing them faster).

Rule of thumb?

A general rule of thumb that we often hear is that you may need a connection pooler once you reach around 350 concurrent connections. However, the actual threshold is highly dependent on your database traffic and server configuration: as we found out recently, you may need one much sooner.
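
If you are unsure where you stand, a quick way to gauge current connection pressure (a simple check, not from the original post) is to compare the number of backends against the configured limit:

-- run on the PostgreSQL server you are evaluating
SELECT count(*) AS current_connections,
       current_setting('max_connections') AS max_connections
FROM pg_stat_activity;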

You may implement connection pooling using your native application connection pooler (if there is one available) or through an external connection pooler such as PgBouncer or pgPool-II. For the solution we have built, which we demonstrate in our webinar of October 10, we have used PgBouncer as our connection pooler.

PgBouncer

PgBouncer is a lightweight (single-threaded, event-driven) connection pooler that has been widely used in PostgreSQL based environments. It “understands” the PostgreSQL connection protocol and has been a stable project for over a decade.

PgBouncer allows you to configure the pool of connections to operate in three distinct modes: session, statement and transaction. Unless you’ve a good reason to reserve a connection in the pool to a single user for the duration of its session, or are operating with single-statements exclusively, transaction mode is the one you should investigate.

A feature that is central to our enterprise-grade solution is that you can add multiple connection strings using unique alias names (referred to as database names). This allows greater flexibility when mediating connections with multiple database servers. We can then have an alias named “master_db” that will route connections to a master/primary server and another alias named “slave_db” that will route connections to a slave/standby server.
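
Once the aliases are defined, you can check how PgBouncer is distributing connections per alias through its admin console (reached by connecting to the special pgbouncer database; this assumes an admin user has been configured in pgbouncer.ini):

SHOW DATABASES;
SHOW POOLS;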

Scaling up

Once efficiency is taken care of, we can then start working on increasing the resources, or computing power, available to process database requests. Scaling vertically means, in short, upgrading the server: more and faster cores, memory, storage. It’s a simple approach, but one that reaches a practical limitation rather quickly. It is also not in line with other requirements of an enterprise grade solution, such as high availability. The alternative is scaling horizontally. As briefly introduced above, a common way for implementing horizontal scalability is to redirect reads to standby servers (replicas) with the help of a proxy, which can also act as a load balancer, such as HAProxy. We’ll be discussing these ideas further here, and showcase their integration in our webinar.

HAProxy 

HAProxy is a popular open source TCP/HTTP load balancer that can distribute the workload across multiple servers. It can be leveraged in a PostgreSQL replication cluster that has been built using streaming replication. When you build replication using the streaming replication method, standby replicas are open for reads. With the help of HAProxy you can efficiently utilize the computing power of all database servers, distributing read requests among the available replicas using algorithms such as Least Connection and Round Robin.

A combination of connection pooler and load balancer to scale PostgreSQL

The following diagram represents a simplified part of the architecture that composes the enterprise grade solution we’ve designed, where we employ PgBouncer and HAProxy to scale our PostgreSQL cluster:

Our PgBouncer contains two database (alias) names, one for redirecting writes to the master and another for balancing reads across standby replicas, as discussed above. Here is how the database section looks in the pgbouncer.ini, PgBouncer’s main configuration file:

[databases]
master = host=haproxy port=5002 dbname=postgres
slave = host=haproxy port=5003 dbname=postgres

Notice that both database entries redirect their connections to the HAProxy server, but each to a different port. HAProxy, in turn, is configured to route the connections according to the port on which they arrive. Taking the above PgBouncer config file as a reference, writes (master connections) are redirected to port 5002 and reads (slave connections) to port 5003. Here is how the HAProxy config file looks:

# Connections to port 5002
listen Master
    bind *:5002
    option tcp-check
    tcp-check send GET\ / HTTP/1.0\r\n
    tcp-check send HOST\r\n
    tcp-check send \r\n
    tcp-check expect string "role":\ "master"
    default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
    server pg0 pg0:5432 maxconn 100 check port 8008
    server pg1 pg1:5432 maxconn 100 check port 8008
    server pg2 pg2:5432 maxconn 100 check port 8008
# Connections to port 5003
listen Slaves
    bind *:5003
    option tcp-check
    tcp-check send GET\ / HTTP/1.0\r\n
    tcp-check send HOST\r\n
    tcp-check send \r\n
    tcp-check expect string "role":\ "replica"
    default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
    server pg0 pg0:5432 maxconn 100 check port 8008
    server pg1 pg1:5432 maxconn 100 check port 8008
    server pg2 pg2:5432 maxconn 100 check port 8008

As seen above:

  • connections to port 5002 are sent to server with role: “master”
  • connections to port 5003 are sent to server with role : “replica”

In a previous post, we discussed using Patroni in our high availability setup. HAProxy relies on Patroni to determine the role of the PostgreSQL server. Patroni is being used here for cluster management and automatic failover. By using Patroni’s REST API (on port 8008 in this scenario) we can obtain the role of a given PostgreSQL server. The example below shows this in practice, the IP addresses denoting the PostgreSQL servers in this setup:

$ curl -s 'http://192.168.50.10:8008' | python -c "import sys, json; print json.load(sys.stdin)['role']"
replica
$ curl -s 'http://192.168.50.20:8008' | python -c "import sys, json; print json.load(sys.stdin)['role']"
master
$ curl -s 'http://192.168.50.30:8008' | python -c "import sys, json; print json.load(sys.stdin)['role']"
replica

HAProxy can thus rely on Patroni’s REST API to redirect connections from the master alias in PgBouncer to a server with role master. Similarly, HAProxy uses server role information to redirect connections from a slave alias to one of the servers with role replica, using the appropriate load balancer algorithm.

This way, we ensure that the application uses the advantage of a connection pooler to leverage connections to the database, and also of the load balancer which distributes the read load to multiple database servers as configured.

There are many other open source connection poolers and load balancers available to build a similar setup. You can choose the one that best suits your environment—just make sure to test your custom solution appropriately before bringing it to production.

by Avinash Vallarapu at October 02, 2018 04:15 PM

Jean-Jerome Schmidt

Introducing Agent-Based Database Monitoring with ClusterControl 1.7

We are excited to announce the 1.7 release of ClusterControl - the only management system you’ll ever need to take control of your open source database infrastructure!

ClusterControl 1.7 introduces new exciting agent-based monitoring features for MySQL, Galera Cluster, PostgreSQL & ProxySQL, security and cloud scaling features ... and more!

Release Highlights

Monitoring & Alerting

  • Agent-based monitoring with Prometheus
  • New performance dashboards for MySQL, Galera Cluster, PostgreSQL & ProxySQL

Security & Compliance

  • Enable/disable Audit Logging on your MariaDB databases
  • Enable policy-based monitoring and logging of connection and query activity

Deployment & Scaling

  • Automatically launch cloud instances and add nodes to your cloud deployments

Additional Highlights

  • Support for MariaDB v10.3

View the ClusterControl ChangeLog for all the details!

ClusterControl
Single Console for Your Entire Database Infrastructure
Find out what else is new in ClusterControl

View Release Details and Resources

Release Details

Monitoring & Alerting

Agent-based monitoring with Prometheus

ClusterControl was originally designed to address modern, highly distributed database setups based on replication or clustering. It provides a systems view of all the components of a distributed cluster, including load balancers, and maintains a logical topology view of the cluster.

So far we’d gone the agentless monitoring route with ClusterControl, and although we love the simplicity of not having to install or manage agents on the monitored database hosts, an agent-based approach can provide higher resolution of monitoring data and has certain advantages in terms of security.

With that in mind, we’re happy to introduce agent-based monitoring as a new feature added in ClusterControl 1.7!

It makes use of Prometheus, a full monitoring and trending system that includes built-in and active scraping and storing of metrics based on time series data. One Prometheus server can be used to monitor multiple clusters. ClusterControl takes care of installing and maintaining Prometheus as well as exporters on the monitored hosts.

Users can now enable their database clusters to use Prometheus exporters to collect metrics on their nodes and hosts, thus avoiding excessive SSH activity for monitoring and metrics collection, and using SSH connectivity only for management operations.

Monitoring & Alerting

New performance dashboards for MySQL, Galera Cluster, PostgreSQL & ProxySQL

ClusterControl users now have access to a set of new dashboards that have Prometheus as the data source with its flexible query language and multi-dimensional data model, where time series data is identified by metric name and key/value pairs. This allows for greater accuracy and customization options while monitoring your database clusters.

The new dashboards include:

  • Cross Server Graphs
  • System Overview
  • MySQL Overview, Replication, Performance Schema & InnoDB Metrics
  • Galera Cluster Overview & Graphs
  • PostgreSQL Overview
  • ProxySQL Overview

Security & Compliance

Audit Log for MariaDB

Continuous auditing is an imperative task for monitoring your database environment. By auditing your database, you can achieve accountability for actions taken or content accessed. Moreover, the audit may include some critical system components, such as the ones associated with financial data to support a precise set of regulations like SOX, or the EU GDPR regulation. Usually, it is achieved by logging information about DB operations on the database to an external log file.

With ClusterControl 1.7 users can now enable a plugin that will log all of their MariaDB database connections or queries to a file for further review; it also introduces support for version 10.3 of MariaDB.
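
The plugin in question is presumably the MariaDB Audit Plugin (server_audit). ClusterControl drives the configuration for you, but as a rough sketch of what ends up being set on the server side (assuming the plugin library ships with your MariaDB build), it boils down to:

INSTALL SONAME 'server_audit';
SET GLOBAL server_audit_events = 'CONNECT,QUERY';
SET GLOBAL server_audit_logging = ON;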

Additional New Functionalities

View the ClusterControl ChangeLog for all the details!

Download ClusterControl today!

Happy Clustering!

by jj at October 02, 2018 02:41 PM

MariaDB Foundation

Hello World from the new MariaDB Foundation CEO: Arjen Lentz

It’s both my pleasure and honour to write this post.  Quite a few of you will know (of) me, but since that’s not necessarily the case I’ll try and make this a quick intro, as well as laying out some aspects of my thoughts and plans for the near future. Who is Arjen? A long […]

The post Hello World from the new MariaDB Foundation CEO: Arjen Lentz appeared first on MariaDB.org.

by Arjen Lentz at October 02, 2018 12:15 AM

October 01, 2018

Peter Zaitsev

ClickHouse: Two Years!

historical trend of ClickHouse popularity

Following my post from a year ago https://www.percona.com/blog/2017/07/06/clickhouse-one-year/, I wanted to review what happened in ClickHouse during this year.
There is indeed some interesting news to share.

1. ClickHouse in DB-Engines Ranking. It did not quite get into the top 100, but the gain from position 174 to 106 is still impressive. Its DB-Engines Ranking score tripled from 0.54 last September to 1.57 this September.

And indeed, in my conversations with customers and partners, the narrative has changed from: “ClickHouse, what is it?” to “We are using or considering ClickHouse for our analytics needs”.

2. ClickHouse changed their versioning schema. Unfortunately it changed from the unconventional …; 1.1.54390; 1.1.54394 naming structure to the still unconventional 18.6.0; 18.10.3; 18.12.13 naming structure, where “18” is the year of the release.

Now to the more interesting technical improvements.

3. Support of the more traditional JOIN syntax. Now if you join two tables you can use SELECT ... FROM tab1 ANY LEFT JOIN tab2 ON tab1_col=tab2_col .

So now, if we take a query from the workload described in https://www.percona.com/blog/2017/06/22/clickhouse-general-analytical-workload-based-star-schema-benchmark/, we can write this:

SELECT     C_REGION,     sum(LO_EXTENDEDPRICE * LO_DISCOUNT)
FROM lineorder ANY INNER JOIN customer
ON LO_CUSTKEY=C_CUSTKEY
WHERE (toYear(LO_ORDERDATE) = 1993) AND ((LO_DISCOUNT >= 1) AND (LO_DISCOUNT <= 3)) AND (LO_QUANTITY < 25)
GROUP BY C_REGION;

instead of the monstrous:

SELECT
    C_REGION,
    sum(LO_EXTENDEDPRICE * LO_DISCOUNT)
FROM lineorder
ANY INNER JOIN
(
    SELECT
        C_REGION,
        C_CUSTKEY AS LO_CUSTKEY
    FROM customer
) USING (LO_CUSTKEY)
WHERE (toYear(LO_ORDERDATE) = 1993) AND ((LO_DISCOUNT >= 1) AND (LO_DISCOUNT <= 3)) AND (LO_QUANTITY < 25)
GROUP BY C_REGION;

4. Support for DELETE and UPDATE operations. This has probably been the most requested feature since the first ClickHouse release.
ClickHouse uses an LSM-tree-like structure—MergeTree—which is not friendly to single-row operations. To underline this limitation, ClickHouse uses the ALTER TABLE UPDATE / ALTER TABLE DELETE syntax to make clear that these are executed as bulk operations, so please treat them as such. Updating or deleting rows in ClickHouse should be an exceptional operation, rather than a part of your day-to-day workload.

We can update a column like this: ALTER TABLE lineorder UPDATE LO_DISCOUNT = 5 WHERE LO_CUSTKEY = 199568
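
A delete works the same way, and because mutations are applied asynchronously in the background you can follow their progress in the system.mutations table (a short sketch, not from the original post):

ALTER TABLE lineorder DELETE WHERE LO_CUSTKEY = 199568;
SELECT command, is_done FROM system.mutations WHERE table = 'lineorder';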

5. ClickHouse added a feature which I call Dictionary Compression, but ClickHouse uses the name “LowCardinality”. It is still experimental, but I hope soon it will be production ready. Basically, it allows ClickHouse to internally replace long strings with a short list of enumerated values.

For example, consider the table from our example lineorder which contains 600037902 rows, but has only five different values for the column LO_ORDERPRIORITY:

SELECT DISTINCT LO_ORDERPRIORITY
FROM lineorder
┌─LO_ORDERPRIORITY─┐
│ 1-URGENT         │
│ 5-LOW            │
│ 4-NOT SPECIFIED  │
│ 2-HIGH           │
│ 3-MEDIUM         │
└──────────────────┘

So we can define our table as:

CREATE TABLE lineorder_dict (
        LO_ORDERKEY             UInt32,
        LO_LINENUMBER           UInt8,
        LO_CUSTKEY              UInt32,
        LO_PARTKEY              UInt32,
        LO_SUPPKEY              UInt32,
        LO_ORDERDATE            Date,
        LO_ORDERPRIORITY        LowCardinality(String),
        LO_SHIPPRIORITY         UInt8,
        LO_QUANTITY             UInt8,
        LO_EXTENDEDPRICE        UInt32,
        LO_ORDTOTALPRICE        UInt32,
        LO_DISCOUNT             UInt8,
        LO_REVENUE              UInt32,
        LO_SUPPLYCOST           UInt32,
        LO_TAX                  UInt8,
        LO_COMMITDATE           Date,
        LO_SHIPMODE             LowCardinality(String)
)Engine=MergeTree(LO_ORDERDATE,(LO_ORDERKEY,LO_LINENUMBER),8192);

How does this help? Firstly, it offers space savings. The table will take less space in storage, as it will use integer values instead of strings. And secondly, performance. The filtering operation will be executed faster.

For example: here’s a query against the table with LO_ORDERPRIORITY stored as String:

SELECT count(*)
FROM lineorder
WHERE LO_ORDERPRIORITY = '2-HIGH'
┌───count()─┐
│ 119995822 │
└───────────┘
1 rows in set. Elapsed: 0.859 sec. Processed 600.04 million rows, 10.44 GB (698.62 million rows/s., 12.16 GB/s.)

And now the same query against table with LO_ORDERPRIORITY as LowCardinality(String):

SELECT count(*)
FROM lineorder_dict
WHERE LO_ORDERPRIORITY = '2-HIGH'
┌───count()─┐
│ 119995822 │
└───────────┘
1 rows in set. Elapsed: 0.350 sec. Processed 600.04 million rows, 600.95 MB (1.71 billion rows/s., 1.72 GB/s.)

This is 0.859 sec vs 0.350 sec (for the LowCardinality case).

Unfortunately this feature is not optimized for all use cases, and actually in aggregation it performs slower.

An aggregation query against table with LO_ORDERPRIORITY as String:

SELECT DISTINCT LO_ORDERPRIORITY
FROM lineorder
┌─LO_ORDERPRIORITY─┐
│ 4-NOT SPECIFIED  │
│ 1-URGENT         │
│ 2-HIGH           │
│ 3-MEDIUM         │
│ 5-LOW            │
└──────────────────┘
5 rows in set. Elapsed: 1.200 sec. Processed 600.04 million rows, 10.44 GB (500.22 million rows/s., 8.70 GB/s.)

Versus an aggregation query against table with LO_ORDERPRIORITY as LowCardinality(String):

SELECT DISTINCT LO_ORDERPRIORITY
FROM lineorder_dict
┌─LO_ORDERPRIORITY─┐
│ 4-NOT SPECIFIED  │
│ 1-URGENT         │
│ 2-HIGH           │
│ 3-MEDIUM         │
│ 5-LOW            │
└──────────────────┘
5 rows in set. Elapsed: 2.334 sec. Processed 600.04 million rows, 600.95 MB (257.05 million rows/s., 257.45 MB/s.)

This is 1.200 sec vs 2.334 sec (for the LowCardinality case).

6. And the last feature I want to mention is the better support of Tableau Software: this required ODBC drivers. It may not seem significant, but Tableau is the number one software for data analysts, and by supporting this, ClickHouse will reach a much wider audience.

Summing up: ClickHouse has definitely become much more user friendly over the past year!

by Vadim Tkachenko at October 01, 2018 02:47 PM

How To Fix MySQL Replication After an Incompatible DDL Command

fix MySQL replication after incompatible DDL

MySQL supports replicating to a slave that is one release higher. This allows us to easily upgrade our MySQL setup to a new version, by promoting the slave and pointing the application to it. However, though unsupported, there are times when the MySQL version of the slave deployed is one release lower. In this scenario, if your application has been performing much better on an older version of MySQL, you would like to have a convenient option to downgrade. You can simply promote the slave to get the old performance back.

The MySQL manual says that ROW based replication can be used to replicate to a lower version, provided that no DDLs replicated are incompatible with the slave. One such incompatible command is ALTER USER, which is a new feature in MySQL 5.7 and not available in 5.6:

ALTER USER 'testuser'@'localhost' IDENTIFIED BY 'testuser';

Executing that command would break replication. Here is an example of a broken slave in non-GTID replication:

*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 127.0.0.1
                  Master_User: repl
                  Master_Port: 5723
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000002
          Read_Master_Log_Pos: 36915649
               Relay_Log_File: mysql_sandbox5641-relay-bin.000006
                Relay_Log_Pos: 36174552
        Relay_Master_Log_File: mysql-bin.000002
             Slave_IO_Running: Yes
            Slave_SQL_Running: No
*** redacted ***
                   Last_Errno: 1064
                   Last_Error: Error 'You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'IDENTIFIED WITH 'mysql_native_password' AS '*3A2EB9C80F7239A4DE3933AE266DB76A784' at line 1' on query. Default database: ''. Query: 'ALTER USER 'testuser'@'localhost' IDENTIFIED WITH 'mysql_native_password' AS '*3A2EB9C80F7239A4DE3933AE266DB76A7846BCB8''
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 36174373
              Relay_Log_Space: 36916179
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
*** redacted ***
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 1064
               Last_SQL_Error: Error 'You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'IDENTIFIED WITH 'mysql_native_password' AS '*3A2EB9C80F7239A4DE3933AE266DB76A784' at line 1' on query. Default database: ''. Query: 'ALTER USER 'testuser'@'localhost' IDENTIFIED WITH 'mysql_native_password' AS '*3A2EB9C80F7239A4DE3933AE266DB76A7846BCB8''
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 1
                  Master_UUID: 00005723-0000-0000-0000-000000005723
*** redacted ***
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp: 180918 22:03:40
*** redacted ***
                Auto_Position: 0
1 row in set (0.00 sec)

Skipping the statement does not resume replication:

mysql> STOP SLAVE;
Query OK, 0 rows affected (0.02 sec)
mysql> SET GLOBAL sql_slave_skip_counter=1;
Query OK, 0 rows affected (0.00 sec)
mysql> START SLAVE;
Query OK, 0 rows affected (0.01 sec)
mysql> SHOW SLAVE STATUS\G

Fixing non-GTID replication

When you check slave status, replication still isn’t fixed. To fix it, you must manually skip to the next binary log position. The current binary log (Relay_Master_Log_File) and position (Exec_Master_Log_Pos) executed are mysql-bin.000002 and 36174373 respectively. We can use mysqlbinlog on the master to determine the next position:

mysqlbinlog -v --base64-output=DECODE-ROWS --start-position=36174373 /ssd/sandboxes/msb_5_7_23/data/mysql-bin.000002 | head -n 30
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/;
/*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
DELIMITER /*!*/;
# at 36174373
#180918 22:03:40 server id 1  end_log_pos 36174438 CRC32 0xc7e1e553 	Anonymous_GTID	last_committed=19273	sequence_number=19277	rbr_only=no
SET @@SESSION.GTID_NEXT= 'ANONYMOUS'/*!*/;
# at 36174438
#180918 22:03:40 server id 1  end_log_pos 36174621 CRC32 0x2e5bb235 	Query	thread_id=563	exec_time=0	error_code=0
SET TIMESTAMP=1537279420/*!*/;
SET @@session.pseudo_thread_id=563/*!*/;
SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=0, @@session.unique_checks=1, @@session.autocommit=1/*!*/;
SET @@session.sql_mode=1436549152/*!*/;
SET @@session.auto_increment_increment=1, @@session.auto_increment_offset=1/*!*/;
/*!\C latin1 *//*!*/;
SET @@session.character_set_client=8,@@session.collation_connection=8,@@session.collation_server=8/*!*/;
SET @@session.lc_time_names=0/*!*/;
SET @@session.collation_database=DEFAULT/*!*/;
ALTER USER 'testuser'@'localhost' IDENTIFIED WITH 'mysql_native_password' AS '*3A2EB9C80F7239A4DE3933AE266DB76A7846BCB8'
/*!*/;
# at 36174621
#180918 22:03:40 server id 1  end_log_pos 36174686 CRC32 0x86756b3f 	Anonymous_GTID	last_committed=19275	sequence_number=19278	rbr_only=yes
/*!50718 SET TRANSACTION ISOLATION LEVEL READ COMMITTED*//*!*/;
SET @@SESSION.GTID_NEXT= 'ANONYMOUS'/*!*/;
# at 36174686
#180918 22:03:40 server id 1  end_log_pos 36174760 CRC32 0x30e663f9 	Query	thread_id=529	exec_time=0	error_code=0
SET TIMESTAMP=1537279420/*!*/;
BEGIN
/*!*/;
# at 36174760
#180918 22:03:40 server id 1  end_log_pos 36174819 CRC32 0x48054daf 	Table_map: `sbtest`.`sbtest1` mapped to number 226

Based on the output above, the next binary log position is 36174621. To fix the slave, run:

STOP SLAVE;
CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000002', MASTER_LOG_POS=36174621;
START SLAVE;

Verify if the slave threads are now running by executing SHOW SLAVE STATUS\G

Slave_IO_State: Waiting for master to send event
                  Master_Host: 127.0.0.1
                  Master_User: repl
                  Master_Port: 5723
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000002
          Read_Master_Log_Pos: 306841423
               Relay_Log_File: mysql_sandbox5641-relay-bin.000002
                Relay_Log_Pos: 190785290
        Relay_Master_Log_File: mysql-bin.000002
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
*** redacted ***
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 226959625
              Relay_Log_Space: 270667273
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
*** redacted ***
        Seconds_Behind_Master: 383
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 1
                  Master_UUID: 00005723-0000-0000-0000-000000005723
             Master_Info_File: /ssd/sandboxes/msb_5_6_41/data/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Opening tables
           Master_Retry_Count: 86400
*** redacted ***
                Auto_Position: 0

To make the slave consistent with the master, execute the compatible query on the slave.

SET SESSION sql_log_bin = 0;
GRANT USAGE ON *.* TO 'testuser'@'localhost' IDENTIFIED BY 'testuser';

Done.

GTID replication

For GTID replication, in addition to injecting an empty transaction for the offending statement, you’ll need to skip it by using the non-GTID solution provided above. Once replication is running again, flip it back to GTID.

Here’s an example of a broken GTID slave:

mysql> SHOW SLAVE STATUS\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 127.0.0.1
                  Master_User: repl
                  Master_Port: 5723
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000003
          Read_Master_Log_Pos: 14364967
               Relay_Log_File: mysql_sandbox5641-relay-bin.000002
                Relay_Log_Pos: 8630318
        Relay_Master_Log_File: mysql-bin.000003
             Slave_IO_Running: Yes
            Slave_SQL_Running: No
*** redacted ***
                   Last_Errno: 1064
                   Last_Error: Error 'You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'IDENTIFIED WITH 'mysql_native_password' AS '*3A2EB9C80F7239A4DE3933AE266DB76A784' at line 1' on query. Default database: ''. Query: 'ALTER USER 'testuser'@'localhost' IDENTIFIED WITH 'mysql_native_password' AS '*3A2EB9C80F7239A4DE3933AE266DB76A7846BCB8''
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 12468343
              Relay_Log_Space: 10527158
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
*** redacted ***
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 1064
               Last_SQL_Error: Error 'You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'IDENTIFIED WITH 'mysql_native_password' AS '*3A2EB9C80F7239A4DE3933AE266DB76A784' at line 1' on query. Default database: ''. Query: 'ALTER USER 'testuser'@'localhost' IDENTIFIED WITH 'mysql_native_password' AS '*3A2EB9C80F7239A4DE3933AE266DB76A7846BCB8''
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 1
                  Master_UUID: 00005723-0000-0000-0000-000000005723
             Master_Info_File: /ssd/sandboxes/msb_5_6_41/data/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State:
           Master_Retry_Count: 86400
                  Master_Bind:
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp: 180918 22:32:28
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set: 00005723-0000-0000-0000-000000005723:2280-8530
            Executed_Gtid_Set: 00005723-0000-0000-0000-000000005723:1-7403
                Auto_Position: 1
1 row in set (0.00 sec)
mysql> SHOW GLOBAL VARIABLES LIKE 'gtid_executed';
+---------------+---------------------------------------------+
| Variable_name | Value                                       |
+---------------+---------------------------------------------+
| gtid_executed | 00005723-0000-0000-0000-000000005723:1-7403 |
+---------------+---------------------------------------------+
1 row in set (0.00 sec)

Since the last position executed is 7403, you’ll need to create an empty transaction for the offending sequence 7404.

STOP SLAVE;
SET GTID_NEXT='00005723-0000-0000-0000-000000005723:7404';
BEGIN;
COMMIT;
SET GTID_NEXT=AUTOMATIC;
START SLAVE;

Note: If you have MTS enabled, you can also get the offending GTID coordinates from Last_SQL_Error of SHOW SLAVE STATUS\G

The next step is to find the next binary log position. The current binary log (Relay_Master_Log_File) and position (Exec_Master_Log_Pos) executed are mysql-bin.000003 and 12468343 respectively. We can again use mysqlbinlog on the master to determine the next position:

mysqlbinlog -v --base64-output=DECODE-ROWS --start-position=12468343 /ssd/sandboxes/msb_5_7_23/data/mysql-bin.000003 | head -n 30
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/;
/*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
DELIMITER /*!*/;
# at 12468343
#180918 22:32:19 server id 1  end_log_pos 12468408 CRC32 0x259ee085 	GTID	last_committed=7400	sequence_number=7404	rbr_only=no
SET @@SESSION.GTID_NEXT= '00005723-0000-0000-0000-000000005723:7404'/*!*/;
# at 12468408
#180918 22:32:19 server id 1  end_log_pos 12468591 CRC32 0xb349ad80 	Query	thread_id=142	exec_time=0	error_code=0
SET TIMESTAMP=1537281139/*!*/;
SET @@session.pseudo_thread_id=142/*!*/;
SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=0, @@session.unique_checks=1, @@session.autocommit=1/*!*/;
SET @@session.sql_mode=1436549152/*!*/;
SET @@session.auto_increment_increment=1, @@session.auto_increment_offset=1/*!*/;
/*!\C latin1 *//*!*/;
SET @@session.character_set_client=8,@@session.collation_connection=8,@@session.collation_server=8/*!*/;
SET @@session.lc_time_names=0/*!*/;
SET @@session.collation_database=DEFAULT/*!*/;
ALTER USER 'testuser'@'localhost' IDENTIFIED WITH 'mysql_native_password' AS '*3A2EB9C80F7239A4DE3933AE266DB76A7846BCB8'
/*!*/;
# at 12468591
#180918 22:32:19 server id 1  end_log_pos 12468656 CRC32 0xb2019f3f 	GTID	last_committed=7400	sequence_number=7405	rbr_only=yes
/*!50718 SET TRANSACTION ISOLATION LEVEL READ COMMITTED*//*!*/;
SET @@SESSION.GTID_NEXT= '00005723-0000-0000-0000-000000005723:7405'/*!*/;
# at 12468656
#180918 22:32:19 server id 1  end_log_pos 12468730 CRC32 0x76b5ea6c 	Query	thread_id=97	exec_time=0	error_code=0
SET TIMESTAMP=1537281139/*!*/;
BEGIN
/*!*/;
# at 12468730
#180918 22:32:19 server id 1  end_log_pos 12468789 CRC32 0x48f0ba6d 	Table_map: `sbtest`.`sbtest8` mapped to number 115

The next binary log position is 12468591. To fix the slave, run:

STOP SLAVE;
CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000003', MASTER_LOG_POS=12468591, MASTER_AUTO_POSITION=0;
START SLAVE;

Notice that I added MASTER_AUTO_POSITION=0 above to disable GTID replication for now. You can run SHOW SLAVE STATUS\G to determine that MySQL is running fine:

mysql> SHOW SLAVE STATUS\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 127.0.0.1
                  Master_User: repl
                  Master_Port: 5723
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000003
          Read_Master_Log_Pos: 446194575
               Relay_Log_File: mysql_sandbox5641-relay-bin.000002
                Relay_Log_Pos: 12704248
        Relay_Master_Log_File: mysql-bin.000003
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
*** redacted ***
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 25172522
              Relay_Log_Space: 433726939
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
*** redacted ***
        Seconds_Behind_Master: 2018
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 1
                  Master_UUID: 00005723-0000-0000-0000-000000005723
             Master_Info_File: /ssd/sandboxes/msb_5_6_41/data/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Reading event from the relay log
           Master_Retry_Count: 86400
*** redacted ***
           Retrieved_Gtid_Set: 00005723-0000-0000-0000-000000005723:7405-264930
            Executed_Gtid_Set: 00005723-0000-0000-0000-000000005723:1-14947
                Auto_Position: 0

Since it’s running fine, you can now revert to GTID replication:

STOP SLAVE;
CHANGE MASTER TO MASTER_AUTO_POSITION=1;
START SLAVE;

Finally, to make the slave consistent with the master, execute the compatible query on the slave.

SET SESSION sql_log_bin = 0;
GRANT USAGE ON *.* TO 'testuser'@'localhost' IDENTIFIED BY 'testuser';

Summary

In this article, I’ve shared how to fix replication when it breaks due to an incompatible command being replicated to the slave. In fact, I’ve only identified ALTER USER as an incompatible command for 5.6. If there are other incompatible commands, please share them in the comment section. Thanks in advance.

Update: 
I filed a bug at https://bugs.mysql.com/bug.php?id=92629 to verify whether the errors I’ve encountered here are a bug or undocumented behavior.

The post How To Fix MySQL Replication After an Incompatible DDL Command appeared first on Percona Database Performance Blog.

by Jaime Sicam at October 01, 2018 11:40 AM

September 28, 2018

Peter Zaitsev

High Availability for Enterprise-Grade PostgreSQL environments

High Availability PostgreSQL

High availability (HA) and database replication is a major topic of discussion for database technologists. There are a number of informed choices to be made to optimize PostgreSQL replication so that you achieve HA. In this post we introduce an overview of the topic, and cover some options available to achieve high availability in PostgreSQL. We’ll then focus on just one way to implement HA for PostgreSQL, using Patroni.

In our previous blog posts, we have discussed the features available to build a secured PostgreSQL environment and the tools available to help you set up a reliable backup strategy. The series of articles is designed to provide a flavor of how you might go about building an enterprise-grade PostgreSQL environment using open source tools. If you’d like to see this implemented, then please do check out our webinar presentation of October 10 – we think you might find it both useful and intriguing!

Replication in PostgreSQL

The first step towards achieving high availability is making sure you don’t rely on a single database server: your data should be replicated to at least one standby replica/slave. Database replication can be done using the two options available with PostgreSQL community software:

  1. Streaming replication
  2. Logical replication & logical decoding

When we set up streaming replication, a standby replica connects to the master (primary) and streams WAL records from it. Streaming replication is considered to be one of the safest and fastest methods for replication in PostgreSQL. A standby server becomes an exact replica of the primary with potentially minimal lag between primary and standby, even on very busy transactional servers. PostgreSQL supports both synchronous and asynchronous modes of streaming replication. Synchronous replication ensures that a client is given a success message only when the change is not only committed to the master but also successfully replicated on the standby server. As standby servers can accept read requests from clients, we can make more efficient use of our PostgreSQL setup by sparing the master from serving read requests and redirecting these to the replicas instead. You can read more about streaming replication in this blog post.
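
As a rough illustration, here is a minimal sketch of the parameters involved in setting up a synchronous streaming replica, assuming a PostgreSQL 10/11-style configuration; the host name, user, and standby name are placeholders rather than values from any specific setup:

# postgresql.conf on the primary (names below are placeholders)
wal_level = replica
max_wal_senders = 5
synchronous_standby_names = 'standby1'   # leave empty for asynchronous replication

# recovery.conf on the standby (PostgreSQL 11 and earlier)
standby_mode = 'on'
primary_conninfo = 'host=primary.example.com user=repl application_name=standby1'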

Logical replication in PostgreSQL allows users to perform a selective replication of a subset of the tables found in the master. While streaming replication is implemented in PostgreSQL at block level—where every database in the master gets replicated to the replica, which remains read-only—logical replication suits such unique scenarios where you need to perform replication of a selection of tables in a database and (optionally) allow direct writes to your slave. A slave configured with logical replication can also be configured to replicate from multiple masters. One situation where this is helpful is when you need to replicate data from several PostgreSQL databases to a single PostgreSQL server for reporting and data warehousing tasks.
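
As a small sketch of what selective replication looks like with the PostgreSQL 10+ built-in syntax (the table, database, and host names here are hypothetical):

-- On the publishing (master) server: publish only the tables you need
CREATE PUBLICATION reporting_pub FOR TABLE orders, customers;

-- On the subscribing server: matching table definitions must already exist there
CREATE SUBSCRIPTION reporting_sub
    CONNECTION 'host=primary.example.com dbname=salesdb user=repl'
    PUBLICATION reporting_pub;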

While it’s technically possible to employ standby servers configured with logical replication in an HA environment, this doesn’t fare well as a best practice. For such usage, a standby server should be able to take the place of another server “transparently” – the more they resemble the master the better. Logical replication opens the door for different data to be replicated to different servers, and this may break things.

Here is a list of built-in features available in PostgreSQL that are designed to help achieve high availability:

  • Streaming replication
  • Cascaded replication
  • Asynchronous standby
  • Synchronous standby
  • Warm standby
  • Hot standby
  • pg_rewind and pg_basebackup 

Automatic failover and an always-on strategy

There are many more open source solutions that can help us achieve high availability with PostgreSQL, especially during critical moments, when a master (primary server) becomes unavailable. The following is a list of a few such open source solutions built for PostgreSQL: 

  1. Patroni
  2. Stolon
  3. repmgr
  4. PostgreSQL Automatic Failover (PAF)
  5. pglookout
  6. pgPool-II

However, the HA solutions available for PostgreSQL are not just limited to the list above. We would be interested to hear what you have implemented as an HA solution in the comments section below.

In our upcoming webinar we are going to show you a PostgreSQL replication cluster built using Patroni and how it provides a seamless failover that is transparent to the application. 

Patroni

Patroni is a PostgreSQL cluster management template/framework which stores and talks to a distributed consensus key-value store and decides on the state of the cluster. It started as a fork of the Governor project. A Patroni PostgreSQL cluster is composed of many individual PostgreSQL instances running on bare metal, containers or virtual machines. In our setup we’ve used etcd for consensus management, which handles leader elections and decides the leader among a cluster of servers that are partitioned by network. This distributed consensus management can also be achieved by using other technologies, such as ZooKeeper and Consul. In the event of a failover, Patroni promotes the slave that has been assigned as leader by the etcd-like consensus key-value store. Note that when you set up asynchronous replication, you have the option to specify maximum_lag_on_failover to prevent Patroni from promoting a slave that is lagging by more than this value.
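
To give an idea of how these pieces fit together, here is a heavily trimmed sketch of a patroni.yml for a single node; the cluster name, host names, and values are placeholders and not taken from the webinar setup:

scope: demo_cluster                      # cluster name (placeholder)
name: node1
restapi:
  listen: 0.0.0.0:8008
  connect_address: node1.example.com:8008
etcd:
  host: etcd1.example.com:2379
bootstrap:
  dcs:
    ttl: 30
    loop_wait: 10
    maximum_lag_on_failover: 1048576     # bytes; a replica lagging more than this is not promoted
postgresql:
  listen: 0.0.0.0:5432
  connect_address: node1.example.com:5432
  data_dir: /var/lib/pgsql/data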

Here’s an architecture diagram of Patroni:

Key Benefits of Patroni:

  1. Continuous monitoring and automatic failover
  2. Manual/scheduled switchover with a single command
  3. Built-in automation for bringing back a failed node to cluster again.
  4. REST APIs for entire cluster configuration and further tooling.
  5. Provides infrastructure for transparent application failover
  6. Distributed consensus for every action and configuration.
  7. Integration with Linux watchdog for avoiding split-brain syndrome.

If you found this post useful…

You are sure to enjoy our webinar of October 10, where we demonstrate live how to build an enterprise-grade PostgreSQL environment with open source tools. If you make it to the live presentation, you will also have the chance to ask questions of the team.

In the next blog post of this series we’ll be covering the scalability of our solution and how to accommodate an increase in traffic while maintaining the quality of the service. We’re moving ever closer to an enterprise-grade environment with open source tools!

The post High Availability for Enterprise-Grade PostgreSQL environments appeared first on Percona Database Performance Blog.

by Jobin Augustine at September 28, 2018 05:09 PM

This Week in Data with Colin Charles #54: Percona Server for MySQL is Alpha

Colin Charles

Join Percona Chief Evangelist Colin Charles as he covers happenings, gives pointers and provides musings on the open source database community.

I consider this to be the biggest news for the week: Alpha Build of Percona Server for MySQL 8.0. Experiment with it in a Docker container. It is missing column compression with dictionary support, native partitioning for TokuDB and MyRocks (excited to see that this is coming!), and encryption key rotation and scrubbing. All in, this should be a fun release to try, test, and also to file bugs for!

Database paradigms are changing, and it is interesting to see Cloudflare introducing Workers KV, a key-value store that is eventually consistent and highly distributed (across their global network of 152+ data centers). You can have up to 1 billion keys per namespace, keys up to 2kB in size, values up to 64kB, and eventual global consistency within 10 seconds. Read more about the cost and other technical details too.

For some quick glossing, from a MySQL Federal Account Manager, comes Why MySQL is Harder to Sell Than Oracle (from someone who has done both). Valid concerns, and always interesting to hear the barriers MySQL faces even after 23 years in existence! For analytics, maybe this is where the likes of MariaDB ColumnStore or ClickHouse might come into play.

Lastly, for all of you asking me about when Percona Live Europe Frankfurt 2018 speaker acceptances and agendas are to be released, I am told by a good source that it will be announced early next week. So register already!

Releases

Link List

Upcoming Appearances

Feedback

I look forward to feedback/tips via Twitter @bytebot.

The post This Week in Data with Colin Charles #54: Percona Server for MySQL is Alpha appeared first on Percona Database Performance Blog.

by Colin Charles at September 28, 2018 03:22 PM

Scaling Percona Monitoring and Management (PMM)

PMM tested with 1000 nodes

Starting with PMM 1.13, PMM uses Prometheus 2 for metrics storage, which tends to be the heaviest resource consumer of CPU and RAM. With the Prometheus 2 performance improvements, PMM can scale to more than 1000 monitored nodes per instance in the default configuration. In this blog post we will look into PMM scaling and capacity planning: how to estimate the resources required, and what drives resource consumption.

PMM tested with 1000 nodes

We have now tested PMM with up to 1000 nodes, using a virtualized system with 128GB of memory, 24 virtual cores, and SSD storage. We found PMM scales pretty linearly with the available memory and CPU cores, and we believe that a higher number of nodes could be supported with more powerful hardware.

What drives resource usage in PMM ?

Depending on your system configuration and workload, a single node can generate very different loads on the PMM server. The main factors that impact the performance of PMM are:

  1. Number of samples (data points) injected into PMM per second
  2. Number of distinct time series they belong to (cardinality)
  3. Number of distinct query patterns your application uses
  4. Number of queries you have on PMM, through the user interface or the API, and their complexity

These specifically can be impacted by:

  • Software version – modern database software versions expose more metrics
  • Software configuration – some metrics are only exposed in certain configurations
  • Workload – a large number of database objects and high concurrency will increase both the number of samples ingested and their cardinality
  • Exporter configuration – disabling collectors can reduce the amount of data collected
  • Scrape frequency – controlled by METRICS_RESOLUTION (see the example below)
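
For example, the resolution can be passed to the PMM Server container at startup. This is only a sketch: the port and data-volume options are the usual defaults from the PMM documentation, not anything specific to this test.

docker run -d \
  -p 80:80 \
  --volumes-from pmm-data \
  --name pmm-server \
  -e METRICS_RESOLUTION=5s \
  percona/pmm-server:latest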

All these factors together may impact resource requirements by a factor of ten or more, so do your own testing to be sure. However, the numbers in this article should serve as good general guidance as a start point for your research.

On the system supporting 1000 instances we observed the following performance:

Performance PMM 1000 nodes load

As you can see, we have more than 2,000 scrapes/sec performed, providing almost two million samples/sec, and more than eight million active time series. These are the main numbers that define the load placed on Prometheus.

Capacity planning to scale PMM

Both CPU and memory are very important resources for PMM capacity planning. Memory is the more important of the two, as Prometheus 2 does not have good options for limiting memory consumption. If you do not have enough memory to handle your workload, then it will run out of memory and crash.

We recommend at least 2GB of memory for a production PMM Installation. A test installation with 1GB of memory is possible. However, it may not be able to monitor more than one or two nodes without running out of memory. With 2GB of memory you should be able to monitor at least five nodes without problem.

With powerful systems (8GB or more) you can have approximately eight systems per 1GB of memory, or about 15,000 samples ingested/sec per 1GB of memory.

To calculate the CPU usage resources required, allow for about 50 monitored systems per core (or 100K metrics/sec per CPU core).
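
As a rough worked example under these assumptions: a PMM server with 8 cores and 16GB of RAM could be expected to handle about 50 × 8 = 400 systems on the CPU side, but only about 8 × 16 = 128 systems on the memory side, so memory would be the limiting factor at roughly 128 monitored nodes. Treat such numbers only as a starting estimate and validate them against your own workload.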

One problem you’re likely to encounter if you’re running PMM with 100+ instances is the “Home Dashboard”. This becomes way too heavy with such a large number of servers. We plan to fix this issue in future releases of PMM, but for now you can work around it in two simple ways:

You can select a host, for example “pmm-server”, in your home dashboard and save it, before adding a large number of hosts to the system.

set home dashboard for PMM

Or you can make some other dashboard of your choice and set it as the home dashboard.

Summary

  • More than 1,000 monitored systems is possible per single PMM server
  • Your specific workload and configuration may significantly change the resources required
  • If deploying with 8GB or more, plan 50 systems per core, and eight systems per 1GB of RAM

The post Scaling Percona Monitoring and Management (PMM) appeared first on Percona Database Performance Blog.

by Peter Zaitsev at September 28, 2018 12:29 PM

September 27, 2018

Peter Zaitsev

Announcement: Alpha Build of Percona Server 8.0

Percona Server for MySQL

Alpha Build of Percona Server 8.0 released

An alpha version of Percona Server 8.0 is now available in the Percona experimental software repositories. This is a 64-bit release only. 

You may experiment with this alpha release by running it in a Docker container:

$ docker run -d -e MYSQL_ROOT_PASSWORD=password -p 3306:3306 perconalab/percona-server:8.0.12.alpha

When the container starts, connect to it as follows:

$ docker exec -ti $(docker ps | grep -F percona-server:8.0.12.alpha | awk '{print $1}') mysql -uroot -ppassword

Note that this release is not ready for use in any production environment.

Percona Server 8.0 alpha is available for the following platforms:

  • RHEL/Centos 6.x
  • RHEL/Centos 7.x
  • Ubuntu 16.04 Xenial
  • Ubuntu 18.04 Bionic
  • Debian 8 Jessie
  • Debian 9 Stretch

Note: The list of supported platforms may be different in the GA release.

Fixed Bugs:

  • PS-4814: TokuDB ‘fast’ replace into is incompatible with 8.0 row replication
  • PS-4834: The encrypted system tablespace has empty uuid

Other fixed bugs: PS-4788, PS-4631, PS-4736, PS-4818, PS-4755

Unfinished Features

The following features are work in progress and are not yet in a working state:

  • Column compression with Data Dictionaries
  • Native Partitioning for TokuDB and for MyRocks
  • Encryption
    • Key Rotation
    • Scrubbing

Known Issues

  • PS-4788: Setting log_slow_verbosity and enabling the slow_query_log could lead to a server crash
  • PS-4803: ALTER TABLE … ADD INDEX … LOCK crash | handle_fatal_signal (sig=11) in dd_table_has_instant_cols
  • PS-4896: handle_fatal_signal (sig=11) in THD::thread_id likely due to enabling innodb_print_lock_wait_timeout_info
  • PS-4820: PS crashes with keyring_vault encryption
  • PS-4796: 8.0 DD and atomic DDL breaks DROP DATABASE for engines that store files in database directory
  • PS-4898: Crash during PAM authentication plugin installation.
  • PS-1782: Optimizer chooses wrong plan when joining 2 tables
  • PS-4850: Toku hot backup plugin dumps tons of info to stdout with no way to disable it
  • PS-4797: rpl.rpl_master_errors failing, likely due to binlog encryption
  • PS-4800: Recovery of prepared XA transactions seems broken in 8.0
  • PS-4853: Installing audit_log plugin causes server to crash
  • PS-4855: Replace http with https in http://bugs.percona.com in server crash messages
  • PS-4857: Improve error message handling for compressed columns
  • PS-4895: Improve error message when encrypted system tablespace was started without keyring plugin
  • PS-3944: Single variable to control logging in QRT
  • PS-4705: crash on snapshot size check in RocksDB
  • PS-4885: Using ALTER … ROW_FORMAT=TOKUDB_QUICKLZ leads to InnoDB: Assertion failure: ha_innodb.cc:12198:m_form->s->row_type == m_create_info->row_type

The post Announcement: Alpha Build of Percona Server 8.0 appeared first on Percona Database Performance Blog.

by Borys Belinsky at September 27, 2018 06:42 PM

Percona Server for MongoDB 3.4.17-2.15 Is Now Available

Percona Server for MongoDB

Percona announces the release of Percona Server for MongoDB 3.4.17-2.15 on September 27, 2018. Download the latest version from the Percona website or the Percona Software Repositories.

Percona Server for MongoDB 3.4 is an enhanced, open source, and highly-scalable database that is a fully-compatible, drop-in replacement for MongoDB 3.4 Community Edition. It supports MongoDB 3.4 protocols and drivers.

Percona Server for MongoDB extends MongoDB Community Edition functionality by including the Percona Memory Engine and MongoRocks storage engines, as well as several enterprise-grade features:

Percona Server for MongoDB requires no changes to MongoDB applications or code.

This release is based on MongoDB 3.4.17. There are no additional improvements or new features on top of those upstream fixes.

The post Percona Server for MongoDB 3.4.17-2.15 Is Now Available appeared first on Percona Database Performance Blog.

by Borys Belinsky at September 27, 2018 01:58 PM

Automating MongoDB Log Rotation

MongoDB Log Rotation

In this blog post, we will look at how to do MongoDB® log rotation in the right—and simplest—way.

Log writing is important for any application to track history. But when the log file size grows larger, it can cause disk space issues. For database servers especially, it may cause performance issues as the database needs to write to a large file continuously. By scheduling regular log rotation, we can avoid such situations proactively and keep the log file size below a predetermined threshold.

MongoDB Log File

In MongoDB, the log is not rotated automatically, so we need to rotate it manually. Usually, the size of the MongoDB server log depends on the logging verbosity configured and on the slow operation logging settings. By default, operations taking more than 100ms, or whatever value is set for the slowOpThresholdMs parameter, are written to the MongoDB log file. Let’s see how to automate log rotation on Linux based servers.
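
For reference, these are the two mongod configuration settings that most directly influence how fast the log grows; the values shown are simply the defaults, not a recommendation:

systemLog:
  verbosity: 0              # higher levels log more detail and grow the file faster
operationProfiling:
  slowOpThresholdMs: 100    # operations slower than this (in ms) are written to the log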

Rotate MongoDB Log Methods

The following two methods could be used to rotate the log in MongoDB. The first uses the command shown below from within mongo shell:

> use admin
> db.adminCommand( { logRotate : 1 } )

The alternative is to use the SIGUSR1 signal to rotate the logs for a single process on Linux/Unix-based systems:

# kill -SIGUSR1 $(pidof mongod)

The behaviour of log rotation in MongoDB differs according to the value of the parameter logRotate which was introduced in version 3.0 (note that this should not be confused with the logRotate command that we’ve seen above). The two values are:

  • rename – renames the log file and creates a new file specified by the logpath parameter to write further logs
  • reopen – closes and reopens the log file following the typical Linux/Unix log rotation behavior. You also need to enable logAppend if you choose reopen. You should use reopen when using the Linux/Unix log rotate utility to avoid log loss.

In versions 2.6 and earlier, the default behavior when issuing the logRotate command is the same as when using rename, i.e. it renames the original log file from mongod.log to mongod.log.xxxx-xx-xxTxx-xx-xx format where x is filled with the rotation date time, and creates a new log file mongod.log to continue to write logs. You can see an example of this below:

mongodb-osx-x86_64-3.2.19/bin/mongod --fork \
  --dbpath=/usr/local/opt/mongo/data6 \
  --logpath=/usr/local/opt/mongo/data6/mongod.log \
  --logappend --logRotate rename
about to fork child process, waiting until server is ready for connections.
forked process: 59324
child process started successfully, parent exiting
ls -ltrh data6/mongod.log*
-rw-r--r-- 1 vinodhkrish admin 4.9K Sep 14 16:57 mongod.log
mongo
MongoDB shell version: 3.2.19
connecting to: test
>
> db.adminCommand( { logRotate : 1 } )
{ "ok" : 1 }
> exit
bye
ls -ltrh data6/mongod.log*
-rw-r--r-- 1 vinodhkrish admin 4.9K Sep 14 16:57 mongod.log.2018-09-14T11-29-54
-rw-r--r-- 1 vinodhkrish admin 1.0K Sep 14 16:59 mongod.log

Using the above method, we need to compress and move the rotated log file manually. You can of course write a script to automate this. But when using the logRotate=reopen option, the mongod.log is just closed and opened again. In this case, you need to use the command alongside Linux’s logrotate utility to avoid losing log entries during the log rotation operation. We will see more about this in the next section.

Automating MongoDB logRotate using the logrotate utility

I wasn’t a fan of this second method for a long time! But MongoDB log rotation seems to work well when using Linux/Unix’s logrotate tool. Now I prefer this approach, since it doesn’t need the complex script writing that’s needed for the first log rotation method described above. Let’s see in detail how to configure log rotation with Linux/Unix’s logrotate utility.

MongoDB 3.x versions

Start MongoDB with the following options:

systemLog:
  destination: file
  logAppend: true
  path: /var/log/mongodb/mongod.log
  logRotate: reopen
processManagement:
  pidFilePath: /var/run/mongodb/mongod.pid

As mentioned in the section about Linux’s logrotation utility, you need to create a separate config file /etc/logrotate.d/mongod.conf for MongoDB’s log file rotation. Add the content shown below into that config file:

/var/log/mongodb/mongod.log {
  daily
  size 100M
  rotate 10
  missingok
  compress
  delaycompress
  notifempty
  create 640 mongod mongod
  sharedscripts
  postrotate
    /bin/kill -SIGUSR1 `cat /var/run/mongodb/mongod.pid 2>/dev/null` >/dev/null 2>&1
  endscript
}

In this config file, we assume that the log path is set as /var/log/mongodb/mongod.log in the /etc/mongod.conf file, and instruct Linux’s logrotate utility to do the following:

  • Check the size, and start rotation if the log file is greater than 100M
  • Move the mongod.log file to mongod.log.1
  • Create a new mongod.log file with mongod permissions
  • Compress the files from mongod.log.2 but retain up to mongod.log.10 as per delaycompress and rotate 10
  • MongoDB continues to write to the old file mongod.log.1 (based on Linux’s inode) – remember that now there is no mongod.log file
  • In postrotate, it sends the kill -SIGUSR1 signal to the mongod process identified by the pid file, and thus mongod creates the mongod.log and starts writing to it. So make sure you have the pid file path set to the same value as pidFilePath in the /etc/mongod.conf file

Please test the logrotate manually using the created /etc/logrotate.d/mongod.conf file to make sure it is working as expected. Here’s how:

cd /var/log/mongodb/
ls -ltrh
total 4.0K
-rw-r-----. 1 mongod mongod 1.3K Sep 14 12:45 mongod.log
logrotate -f /etc/logrotate.d/mongod
ls -ltrh
total 8.0K
-rw-r-----. 1 mongod mongod 1.4K Sep 14 12:58 mongod.log.1
-rw-r-----. 1 mongod mongod 1.3K Sep 14 12:58 mongod.log
logrotate -f /etc/logrotate.d/mongod
ls -ltrh
total 12K
-rw-r-----. 1 mongod mongod 491 Sep 14 12:58 mongod.log.2.gz
-rw-r-----. 1 mongod mongod 1.4K Sep 14 12:58 mongod.log.1
-rw-r-----. 1 mongod mongod 1.3K Sep 14 12:58 mongod.log

 

Adaptations for MongoDB 2.x and earlier, or when using logRotate=rename

Since the introduction of logRotate parameter in MongoDB 3.0, the log rotate script needs an extra step when you are using logRotate=rename or when using <=2.x versions.

Start MongoDB with the following options (for version 2.4 and earlier):

logAppend=true
logpath=/var/log/mongodb/mongod.log
pidfilePath=/var/run/mongodb/mongod.pid

Start MongoDB with the following options in YAML format (the YAML config was introduced in version 2.6):

systemLog:
  destination: file
  logAppend: true
  path: /var/log/mongodb/mongod.log
processManagement:
  pidFilePath: /var/run/mongodb/mongod.pid

The config file /etc/logrotate.d/mongod.conf for MongoDB’s log file rotation should be set up like this:

/var/log/mongodb/mongod.log {
  daily
  size 100M
  rotate 10
  missingok
  compress
  delaycompress
  notifempty
  create 640 mongod mongod
  sharedscripts
  postrotate
    /bin/kill -SIGUSR1 `cat /var/run/mongodb/mongod.pid 2>/dev/null` >/dev/null 2>&1
    find /var/log/mongodb -type f -size 0 -regextype posix-awk \
-regex "^\/var\/log\/mongodb\/mongod\.log\.[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}-[0-9]{2}-[0-9]{2}$" \
-execdir rm {} \; >/dev/null 2>&1
  endscript
}

In this case, the logrotate utility behaves as follows:

  • Check for the size, and start rotation if the log file size exceeds 100M
  • Move mongod.log file to mongod.log.1
  • Create a new mongod.log file with mongod permissions
  • MongoDB continues to write to the old file, mongod.log.1
  • In postrotate, when the SIGUSR1 signal is sent, mongod rotates the log file. This includes renaming the new mongod.log file (0 bytes) created by logrotate to mongod.log.xxxx-xx-xxTxx-xx-xx format and creating a new mongod.log file to which, now, mongod starts writing the logs.
  • the Linux find command identifies mongod.log.xxxx-xx-xxTxx-xx-xx format files that are sized at 0 bytes, and these are removed

If you enjoyed this blog…

You might also benefit from this recorded webinar led by my colleague Tim Vaillancourt, MongoDB Backup and Recovery Field Guide, or perhaps the Percona Solution Brief Security for MongoDB.

References:
https://jira.mongodb.org/browse/SERVER-11087
https://jira.mongodb.org/browse/SERVER-14053
https://jira.mongodb.org/browse/SERVER-16821
https://docs.mongodb.com/v2.6/reference/configuration-options/
https://docs.mongodb.com/v2.4/reference/configuration-options/
https://docs.mongodb.com/manual/reference/configuration-options/

The post Automating MongoDB Log Rotation appeared first on Percona Database Performance Blog.

by Vinodh Krishnaswamy at September 27, 2018 12:32 PM

September 26, 2018

Peter Zaitsev

Percona XtraDB Cluster 5.7.23-31.31 Is Now Available

Percona XtraDB Cluster 5.7

This release has been superseded by 5.7.23-31.31.2 after a critical regression was found. Please update to the latest release.

Percona is glad to announce the release of Percona XtraDB Cluster 5.7.23-31.31 on September 26, 2018. Binaries are available from the downloads section or from our software repositories.

Percona XtraDB Cluster 5.7.23-31.31 is now the current release, based on the following:

Deprecated

The following variables are deprecated starting from this release:

This variable, which defines whether locking sessions should be converted to transactions, is deprecated in Percona XtraDB Cluster 5.7.23-31.31 because it is rarely used in practice.

Fixed Bugs

  • PXC-1017: Memcached access to InnoDB was not replicated by Galera.
  • PXC-2164: The SST script prevented SELinux from being enabled.
  • PXC-2155: wsrep_sst_xtrabackup-v2 did not delete all folders on cleanup.
  • PXC-2160: In some cases, the MySQL version was not detected correctly with the Xtrabackup-v2 method of SST (State Snapshot Transfer).
  • PXC-2199: When the DROP TRIGGER IF EXISTS statement was run for a non-existent trigger, the node GTID was incremented instead of the cluster GTID.
  • PXC-2209: The compression dictionary was not replicated in PXC.
  • PXC-2202: In some cases, a disconnected cluster node was not shut down.
  • PXC-2165: SST could fail if either wsrep_node_address or wsrep_sst_receive_address were not specified.
  • PXC-2213: NULL/VOID DDL transactions could commit in a wrong order.

Help us improve our software quality by reporting any bugs you encounter using our bug tracking system. As always, thanks for your continued support of Percona!

 

The post Percona XtraDB Cluster 5.7.23-31.31 Is Now Available appeared first on Percona Database Performance Blog.

by Borys Belinsky at September 26, 2018 03:40 PM

Jean-Jerome Schmidt

Webinar Replay - Monitoring on Steroids for MySQL, MariaDB, PostgreSQL and MongoDB

Thanks for joining us this week for our webinar on monitoring MySQL, MariaDB, PostgreSQL and MongoDB with freely available community tools and more specifically one: ClusterControl Community Edition. The replay and slides are now available to watch on our website.

Monitoring is essential for operations teams to ensure that databases are up and running. However, as databases are increasingly being deployed in distributed topologies based on replication or clustering, what does it mean to our monitoring infrastructure? Is it ok to monitor individual components of a database cluster, or do we need a more holistic systems approach? Can we rely on SELECT 1 as health check when determining whether a database is up or down? Do we need high-resolution time-series charts of status counters? Are there ways to predict problems before they actually become one?

In this webinar replay, we discuss how to effectively monitor distributed database clusters or replication setups. We look at different types of monitoring infrastructures, from on-prem to cloud and from agent-based to agentless. Then we dive into the different monitoring features available in the free ClusterControl Community Edition - from time-series charts of metrics, dashboards, and queries to performance advisors.

If you would like to centralize the monitoring of your open source databases and achieve this at zero cost, please watch this webinar.

Agenda

  • Requirements for monitoring distributed database systems
  • Cloud-based vs On-prem monitoring solutions
  • Agent-based vs Agentless monitoring
  • Deep-dive into ClusterControl Community Edition
    • Architecture
    • Metrics Collection
    • Trending
    • Dashboards
    • Queries
    • Performance Advisors
    • Other features available to Community users

Speaker

Bartlomiej Oles is a MySQL and Oracle DBA, with over 15 years’ experience in managing highly available production systems at IBM, Nordea Bank, Acxiom, Lufthansa, and other Fortune 500 companies. In the past five years, his focus has been on building and applying automation tools to manage multi-datacenter database environments.

We look forward to “seeing” you there!

by jj at September 26, 2018 02:54 PM

Peter Zaitsev

Scaling IO-Bound Workloads for MySQL in the Cloud – part 2

Rplot07-innodb-iops

This post is a followup to my previous article https://www.percona.com/blog/2018/08/29/scaling-io-bound-workloads-mysql-cloud/

In this instance, I want to show the data in different dimensions, primarily to answer questions around how throughput scales with increasing IOPS.

A recap: for the test I use Amazon instances and Amazon gp2 and io1 volumes. In addition to the original post, I also tested two gp2 volumes combined in software RAID0. I did this for the following reason: Amazon caps single gp2 volume throughput at 160MB/sec, and as we will see from the charts, this limits InnoDB performance.
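
For reference, combining two gp2 volumes into a software RAID0 device can be done along these lines; the device names and mount point are placeholders, not the exact commands used for the benchmark:

# assemble two attached EBS volumes into a single striped (RAID0) device
mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/xvdb /dev/xvdc
mkfs.xfs /dev/md0
mount /dev/md0 /var/lib/mysql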

Also, a reminder from the previous post: we can increase gp2 IOPS by increasing the volume size (up to the limit of 10,000 IOPS), and for io1 we can increase IOPS by paying for additional IOPS.

Scaling with InnoDB

So for the first result, let’s see how InnoDB scales with increasing IOPS.

There are a few interesting observations here: InnoDB scales linearly with additional IOPS, but it faces a throughput limit that Amazon applies to volumes.

So besides considering IOPS, we should take into account the maximum throughput of volumes.

In the second chart we compare InnoDB performance vs the cost of volumes:

It’s interesting to see here the slope for gp2 volumes is steeper than for io1 volumes. This means we can get a bigger increase in InnoDB performance per dollar using gp2 volumes, but only until we reach the IOPS and throughput limits that are applied to gp2 volumes.

Scaling with MyRocks

And here’s the similar chart but for MyRocks:

Here we can also see that MyRocks scales linearly, showing identical results on gp2 and io1 volumes. This means that running on gp2 will be cheaper. Also, there is no plateau in throughput, as we saw for InnoDB, which means that MyRocks uses less IO throughput.

And the chart for the cost of running MyRocks:

This chart also shows that it is cheaper to run on a gp2 volume, but only while it provides enough IOPS. I assume that using two gp2 volumes would allow me to double the throughput. (I did not run the test for MyRocks using two volumes.)

Conclusions

  • Both MyRocks and InnoDB can scale (linearly) with additional IOPS on gp2 and io1 Amazon volumes.
  • Take into account that IOPS is not the only factor to consider. There is also a throughput limit, which affects InnoDB results, so for further scaling you might need to use multiple volumes.

The post Scaling IO-Bound Workloads for MySQL in the Cloud – part 2 appeared first on Percona Database Performance Blog.

by Vadim Tkachenko at September 26, 2018 12:40 PM

September 25, 2018

Peter Zaitsev

PostgreSQL Backup Strategy for an Enterprise-Grade Environment

PostgreSQL enterprise-grade backup and recovery

In this post we cover the methods used to achieve an enterprise-grade backup strategy for a PostgreSQL cluster. In setting up our demonstration system, we use pg_basebackup and continuous archiving. The size of the database and your database environment (whether it is on colo or in the cloud) matter. It is very important to understand how we can ensure minimal or no data loss at the time of disasters using our preferred backup tools.

As discussed in the introductory blog post, in our webinar of October 10, 2018 we highlight important aspects an enterprise should consider for their PostgreSQL environments. In our last blog post we looked at security.

Backing up…

The ability to recover a database to a certain point in time is always the ultimate aim of a backup strategy. Without a recoverable backup, you have no backup! So there are always two parts to a backup strategy: backup AND recovery. The backup tool you use to achieve your aims may vary. Let’s discuss some of the options.

pg_basebackup

The community version of PostgreSQL comes with a trusted backup tool called pg_basebackup. This utility can help you take a consistent online binary backup that is very fast, can be used for point-in-time-recovery, and also for setting up slaves/replicas.
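
A typical invocation looks like the sketch below; the host, user, and target directory are placeholders. The options request a compressed tar-format backup with the WAL segments streamed alongside it and a progress report:

$ pg_basebackup -h primary.example.com -U replicator \
    -D /backups/base/2018-10-10 -Ft -z -Xs -P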

Archiving and continuous archiving

Postgres does not flush data pages to disk upon every transaction commit. To ensure that PostgreSQL achieves durability and endures crash recovery, changes are written to transactions logs (a.k.a. WALs, Write-Ahead Logs) stored on disk.

WALs in PostgreSQL are similar to transaction log files in the InnoDB storage engine for MySQL. However, WAL files are recycled according to the values set for the parameters wal_keep_segments and max_wal_size. Hence, if WAL files are not copied to a safe location—such as a backup server or another file system—it won’t be possible to achieve point-in-time-recovery (PITR).

In order to archive WAL segments to a safe location, the parameter archive_mode must be set to ‘ON’ and we must pass an appropriate shell command to the archive_command parameter. The shell command can be a copy (cp command in Linux) or a bash script that tells postgres where to archive its WAL files. Administrators commonly use an NFS share or SAN partition for this, mounted locally to facilitate the storage of these files. You may refer to the manual page PostgreSQL Archiving to learn more about archiving.
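
In postgresql.conf this typically amounts to something like the following sketch, where the archive directory stands in for your NFS share or backup mount:

wal_level = replica
archive_mode = on
archive_command = 'test ! -f /backup/wal_archive/%f && cp %p /backup/wal_archive/%f'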

Demonstration scenario

For our demonstration set up, and taking into account the size of the database and the number of transactions, we use the following backup strategy:

  1. Nightly pg_basebackup that runs every day at 01:00 am.
  2. Continuous archiving of WAL’s to a remote backup server that is redundant and avoids single point of failure.

Using the above strategy, we can ensure that we are storing full backups daily. At the same time, the changes/transactions are backed up using the archiving solution available with PostgreSQL. In this way, we can achieve recovery to any point-back-in-time, depending on our backup retention policies.
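
Scheduling the nightly run can be as simple as a cron entry that calls a small wrapper script around pg_basebackup; the script name and log path below are hypothetical:

# /etc/cron.d/pg-basebackup -- nightly base backup at 01:00, run as the postgres user
0 1 * * * postgres /usr/local/bin/nightly_basebackup.sh >> /var/log/pg_basebackup.log 2>&1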

Be sure to check out our webinar, Enterprise-grade PostgreSQL Built on Open Source Tools, to see this in practice. If you are reading this in time to join us on October 10th, you’ll have the chance to ask questions, too.

Other options

Depending on your database size, number of transactions and the environment, you may also consider using pgBackRest, Barman or WAL-g. These are stable open source backup solutions that have been contributed and are continuously maintained by community members. They help us to achieve one or more of the following features:

  1. Incremental backups
  2. Differential backups
  3. Features for building standby replicas
  4. Ability to stream backups to another server
  5. Streaming backups to AWS S3 or object store, removing the need to store backups locally or on locally mounted network shares before uploading to the cloud.

Here is a snippet of what is/isn’t possible with each of these backup tools today.

postgresl backup feature comparison

 

Sources:

pg_basebackup
pgBackRest
Barman
WAL-g

The post PostgreSQL Backup Strategy for an Enterprise-Grade Environment appeared first on Percona Database Performance Blog.

by Avinash Vallarapu at September 25, 2018 05:13 PM

MariaDB Foundation

MariaDB 10.2.18 and MariaDB Connector/Node.js 2.0.0 now available

The MariaDB Foundation is pleased to announce the availability of MariaDB 10.2.18, the latest stable release in the MariaDB 10.2 series, as well as MariaDB Connector/Node.js 2.0.0, the second alpha release of the new 100% JavaScript non-blocking MariaDB client for Node.js, compatible with Node.js 6+. See the release notes and changelogs for details. Download MariaDB […]

The post MariaDB 10.2.18 and MariaDB Connector/Node.js 2.0.0 now available appeared first on MariaDB.org.

by Ian Gilfillan at September 25, 2018 03:00 PM

Peter Zaitsev

Why Optimization derived_merge can Break Your Queries

MySQL optimizer bugs

Lately, I worked on several queries which started returning wrong results after upgrading MySQL Server to version 5.7. The reason for the failure was the derived merge optimization, which is one of the default optimizer_switch options. The issues were solved, though at the price of performance, when we turned it OFF. But, more importantly, we could not predict if any other query would start returning incorrect data, to allow us to fix the application before it was too late. Therefore I tried to find reasons why derived_merge can fail.
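
For reference, the optimization can be switched off like this; a session-level change is shown here, since a global change affects all new connections and, as noted, may cost performance:

mysql> SET SESSION optimizer_switch='derived_merge=off';
mysql> SELECT @@SESSION.optimizer_switch\G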

Analyzing the problem

In the first run, we turned the SQL mode ONLY_FULL_GROUP_BY on, and this removed most of the problematic queries. That said, a few of the queries that were working correctly with ONLY_FULL_GROUP_BY were still affected.
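
One common way to enable it without losing the flags already present in sql_mode is sketched below; check the current value first and adjust as needed for your environment:

mysql> SELECT @@SESSION.sql_mode;
mysql> SET SESSION sql_mode = CONCAT(@@SESSION.sql_mode, ',ONLY_FULL_GROUP_BY');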

A quick search in the MySQL bugs database gave me a not-so-short list of open bugs:

At first glance, the reported queries do not follow any pattern, and we still cannot quickly identify which would break and which would not.

Then I took a second look by running all of the provided test cases in my environment and found that for four bugs, the optimizer rewrote the query. For three of the bugs, it rewrote the query in both 5.7 and 8.0, and in one case it rewrote it in 8.0 only.

The remaining three buggy queries (Bug #85117, Bug #91418, Bug #91878) have things in common. Let’s first look at them:

  1. Bug #85117
    select
        temp.sel
    from
        table1 t1
        left join (
            select *,1 as sel from table1_use t1u where t1u.`table1id`=1
        ) temp on temp.table1id = t1.id
    order by t1.value
  2. Bug #91418
    select
        TST.UID ,TST.BID ,TST.THING_NAME ,TST.OTHER_IFO ,vw2.DIST_UID
    from
        TEST_SUB_PROBLEM TST
        join (
            select uuid() as DIST_UID, vw.*
            from (
                select DISTINCT BID, THING_NAME
                from TEST_SUB_PROBLEM
            ) vw
        ) vw2
    on vw2.BID = TST.BID;
  3. Bug #91878
    SELECT
        Virtual_Table.T_FP AS T_FP,
        (
            SELECT COUNT(Virtual_Table.T_FP)
            FROM t1 t
            WHERE t.f1 = Virtual_Table.T_FP
            AND Virtual_Table.T_FP = 731834939448428685
        ) AS Test_Value
    FROM (
        SELECT t.f1 AS T_FP, tv.f1 AS TV_FP
        FROM t1 AS t
        JOIN t2 AS tv
        ON t.f1 = tv.t1_f1
    ) AS Virtual_Table
    GROUP BY
        Virtual_Table.TV_FP
    HAVING
        Test_Value > 0;

Two of the queries use DISTINCT or GROUP BY, and one uses an ORDER BY clause. The cases do not have the same clause in common, which is what I’d expect to see, and so, surprisingly, these are not the cause of the failure. However, all three queries use generated values: a constant in the first one; UUID() and COUNT() functions in the second and third respectively. This similarity is something we need to investigate.

To find out why derived_merge might work incorrectly for these queries, we need to understand how this optimization works and why it was introduced.

The intent behind derived_merge

First I recommend checking the official MySQL User Reference Manual and MariaDB knowledge base. It is correct to use both manuals: even if low-level implementations vary, the high-level architecture and the purpose of this optimization are the same.

In short:

derived_merge is used for queries that have subqueries in the FROM clause, also called “derived tables”, and practically converts them into JOIN queries. This optimization allows avoiding unnecessary materialization (creating internal temporary tables to hold results). Virtually this is the same thing as a manual rewrite of a query with a subquery into a query that has JOIN clause(s) only. The only difference is that when we rewrite queries manually, we can compare the expected and actual result, then adjust the resulting query if needed. The MySQL optimizer has to do a correct rewrite at the first attempt. And sometimes this effort fails.

Let’s check why this happens for these particular queries, reported in the MySQL Bugs Database.

Case Study 1: a Query from Bug #85117

Original query

select
      temp.sel
from
    table1 t1
    left join (
         select *,1 as sel from table1_use t1u where t1u.`table1id`=1
    ) temp on temp.table1id = t1.id
order by t1.value

was rewritten to:

Note (Code 1003):
/* select#1 */
select 1 AS `sel`
    from
        `test`.`table1` `t1`
    left join
        (`test`.`table1_use` `t1u`)
    on(((`test`.`t1`.`id` = 1) and (`test`.`t1u`.`table1id` = 1)))
    where 1
    order by `test`.`t1`.`value`;

You can always find the query that the optimizer converts the original one to in the SHOW WARNINGS output following EXPLAIN [EXTENDED] for the query.
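
For example, for the query from this bug report the rewritten form can be retrieved as follows (in 5.7 a plain EXPLAIN already produces the extended information):

mysql> EXPLAIN select temp.sel from table1 t1
    -> left join (
    ->     select *,1 as sel from table1_use t1u where t1u.`table1id`=1
    -> ) temp on temp.table1id = t1.id
    -> order by t1.value;
mysql> SHOW WARNINGS\G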

In this case, the original query asks to return all rows from the table table1, but selects only the generated field from the subquery. The subquery selects the only row with table1id=1.

Avoiding derived merge optimization is practically the same as joining table table1 with a table with one row. You can see how it works in this code snippet:

mysql> create temporary table temp as select *,1 as sel from table1_use t1u where t1u.`table1id`=1;
Query OK, 1 row affected (0.00 sec)
Records: 1  Duplicates: 0  Warnings: 0
mysql> select * from temp;
+----+----------+------+-----+
| id | table1id | uid  | sel |
+----+----------+------+-----+
|  1 |        1 |   99 |   1 |
+----+----------+------+-----+
1 row in set (0.00 sec)
mysql> select temp.sel from table1 t1 left join temp on temp.table1id = t1.id order by t1.value;
+------+
| sel  |
+------+
|    1 |
| NULL |
| NULL |
+------+
3 rows in set (0.00 sec)

However, when the optimizer uses derived_merge optimization, it completely ignores the fact that the resulting table has one row, and that the calculated value would be either NULL or 1 depending on whether a row corresponding to table1 exists in the table. Whether it prints select 1 AS `sel` in the EXPLAIN output or uses select NULL AS `sel` does not change anything: both are wrong. The correct query without a subquery should look like:

mysql> select if(`test`.`t1u`.`table1id`, 1, NULL) AS `sel`
    -> from `test`.`table1` `t1`
    -> left join (`test`.`table1_use` `t1u`)
    -> on(((`test`.`t1`.`id` = 1) and (`test`.`t1u`.`table1id` = 1)))
    -> where 1
    -> order by `test`.`t1`.`value`;
+------+
| sel  |
+------+
|    1 |
| NULL |
| NULL |
+------+
3 rows in set (0.00 sec)

This report is the easiest of the bugs we will discuss in this post, and is also fixed in MariaDB.

Case Study 2: a Query from Bug #91418

mysql> select * from TEST_SUB_PROBLEM;
+-----+--------+------------+---------------------+
| UID | BID    | THING_NAME | OTHER_IFO           |
+-----+--------+------------+---------------------+
|   1 | thing1 | name1      | look a chicken      |
|   2 | thing1 | name1      | look an airplane    |
|   3 | thing2 | name2      | look a mouse        |
|   4 | thing3 | name3      | look a taperecorder |
|   5 | thing3 | name3      | look an explosion   |
|   6 | thing4 | name4      | look at the stars   |
+-----+--------+------------+---------------------+
6 rows in set (0.00 sec)
mysql> select
    ->     TST.UID ,TST.BID ,TST.THING_NAME ,TST.OTHER_IFO ,vw2.DIST_UID
    -> from
    ->     TEST_SUB_PROBLEM TST
    -> join (
    ->     select uuid() as DIST_UID, vw.*
    ->     from (
    ->         select DISTINCT BID, THING_NAME
    ->         from TEST_SUB_PROBLEM
    ->     ) vw
    -> ) vw2
    -> on vw2.BID = TST.BID;
+-----+--------+------------+---------------------+--------------------------------------+
| UID | BID    | THING_NAME | OTHER_IFO           | DIST_UID                             |
+-----+--------+------------+---------------------+--------------------------------------+
|   1 | thing1 | name1      | look a chicken      | e4c288fd-b29c-11e8-b0d7-0242673a86b2 |
|   2 | thing1 | name1      | look an airplane    | e4c28aef-b29c-11e8-b0d7-0242673a86b2 |
|   3 | thing2 | name2      | look a mouse        | e4c28c47-b29c-11e8-b0d7-0242673a86b2 |
|   4 | thing3 | name3      | look a taperecorder | e4c28d92-b29c-11e8-b0d7-0242673a86b2 |
|   5 | thing3 | name3      | look an explosion   | e4c28ed9-b29c-11e8-b0d7-0242673a86b2 |
|   6 | thing4 | name4      | look at the stars   | e4c29031-b29c-11e8-b0d7-0242673a86b2 |
+-----+--------+------------+---------------------+--------------------------------------+
6 rows in set (0.00 sec)

This query should create a unique DIST_UID for each unique BID name. But, instead, it generates a unique DIST_UID for each row.

First, let’s split the query into a couple of queries using temporary tables, to confirm our assumption that it was written correctly in the first place:

mysql> create temporary table vw as select DISTINCT BID, THING_NAME from TEST_SUB_PROBLEM;
Query OK, 4 rows affected (0.01 sec)
Records: 4  Duplicates: 0  Warnings: 0
mysql> select * from vw;
+--------+------------+
| BID    | THING_NAME |
+--------+------------+
| thing1 | name1      |
| thing2 | name2      |
| thing3 | name3      |
| thing4 | name4      |
+--------+------------+
4 rows in set (0.00 sec)
mysql> create temporary table vw2 as select uuid() as DIST_UID, vw.* from vw;
Query OK, 4 rows affected (0.01 sec)
Records: 4  Duplicates: 0  Warnings: 0
mysql> select * from vw2;
+--------------------------------------+--------+------------+
| DIST_UID                             | BID    | THING_NAME |
+--------------------------------------+--------+------------+
| eb155f0e-b29d-11e8-b0d7-0242673a86b2 | thing1 | name1      |
| eb158c05-b29d-11e8-b0d7-0242673a86b2 | thing2 | name2      |
| eb159b28-b29d-11e8-b0d7-0242673a86b2 | thing3 | name3      |
| eb15a916-b29d-11e8-b0d7-0242673a86b2 | thing4 | name4      |
+--------------------------------------+--------+------------+
4 rows in set (0.00 sec)
mysql> select
    -> TST.UID ,TST.BID ,TST.THING_NAME ,TST.OTHER_IFO ,vw2.DIST_UID
    -> from TEST_SUB_PROBLEM TST
    -> join vw2
    -> on vw2.BID = TST.BID;
+-----+--------+------------+---------------------+--------------------------------------+
| UID | BID    | THING_NAME | OTHER_IFO           | DIST_UID                             |
+-----+--------+------------+---------------------+--------------------------------------+
|   1 | thing1 | name1      | look a chicken      | eb155f0e-b29d-11e8-b0d7-0242673a86b2 |
|   2 | thing1 | name1      | look an airplane    | eb155f0e-b29d-11e8-b0d7-0242673a86b2 |
|   3 | thing2 | name2      | look a mouse        | eb158c05-b29d-11e8-b0d7-0242673a86b2 |
|   4 | thing3 | name3      | look a taperecorder | eb159b28-b29d-11e8-b0d7-0242673a86b2 |
|   5 | thing3 | name3      | look an explosion   | eb159b28-b29d-11e8-b0d7-0242673a86b2 |
|   6 | thing4 | name4      | look at the stars   | eb15a916-b29d-11e8-b0d7-0242673a86b2 |
+-----+--------+------------+---------------------+--------------------------------------+
6 rows in set (0.01 sec)
mysql> select distinct DIST_UID
    -> from (
    ->     select
    ->         TST.UID ,TST.BID ,TST.THING_NAME ,TST.OTHER_IFO ,vw2.DIST_UID
    ->     from TEST_SUB_PROBLEM TST
    ->     join vw2
    ->     on vw2.BID = TST.BID
    -> ) t;
+--------------------------------------+
| DIST_UID                             |
+--------------------------------------+
| eb155f0e-b29d-11e8-b0d7-0242673a86b2 |
| eb158c05-b29d-11e8-b0d7-0242673a86b2 |
| eb159b28-b29d-11e8-b0d7-0242673a86b2 |
| eb15a916-b29d-11e8-b0d7-0242673a86b2 |
+--------------------------------------+
4 rows in set (0.00 sec)

With temporary tables, we have precisely four unique DIST_UID values, unlike the six values that our original query returned.

Let’s check how the original query was rewritten:

Note (Code 1003):
/* select#1 */
select
    `test`.`TST`.`UID` AS `UID`,
    `test`.`TST`.`BID` AS `BID`,
    `test`.`TST`.`THING_NAME` AS `THING_NAME`,
    `test`.`TST`.`OTHER_IFO` AS `OTHER_IFO`,
    uuid() AS `DIST_UID`
from `test`.`TEST_SUB_PROBLEM` `TST`
join
    (/* select#3 */
    select
        distinct `test`.`TEST_SUB_PROBLEM`.`BID` AS `BID`,
        `test`.`TEST_SUB_PROBLEM`.`THING_NAME` AS `THING_NAME`
    from `test`.`TEST_SUB_PROBLEM`) `vw`
where (`vw`.`BID` = `test`.`TST`.`BID`)

You can see that the optimizer did not wholly remove the subquery here. Let’s run this modified query, and run a test with a temporary table one more time:

mysql> select
    ->     `test`.`TST`.`UID` AS `UID`,
    ->     `test`.`TST`.`BID` AS `BID`,
    ->     `test`.`TST`.`THING_NAME` AS `THING_NAME`,
    ->     `test`.`TST`.`OTHER_IFO` AS `OTHER_IFO`,
    ->     uuid() AS `DIST_UID`
    -> from
    ->     `test`.`TEST_SUB_PROBLEM` `TST`
    -> join
    -> (/* select#3 */
    ->     select
    ->         distinct `test`.`TEST_SUB_PROBLEM`.`BID` AS `BID`,
    ->         `test`.`TEST_SUB_PROBLEM`.`THING_NAME` AS `THING_NAME`
    ->     from
    ->         `test`.`TEST_SUB_PROBLEM`
    -> ) `vw`
    -> where (`vw`.`BID` = `test`.`TST`.`BID`)
    -> ;
+-----+--------+------------+---------------------+--------------------------------------+
| UID | BID    | THING_NAME | OTHER_IFO           | DIST_UID                             |
+-----+--------+------------+---------------------+--------------------------------------+
|   1 | thing1 | name1      | look a chicken      | 12c5f554-b29f-11e8-b0d7-0242673a86b2 |
|   2 | thing1 | name1      | look an airplane    | 12c5f73a-b29f-11e8-b0d7-0242673a86b2 |
|   3 | thing2 | name2      | look a mouse        | 12c5f894-b29f-11e8-b0d7-0242673a86b2 |
|   4 | thing3 | name3      | look a taperecorder | 12c5f9de-b29f-11e8-b0d7-0242673a86b2 |
|   5 | thing3 | name3      | look an explosion   | 12c5fb20-b29f-11e8-b0d7-0242673a86b2 |
|   6 | thing4 | name4      | look at the stars   | 12c5fc7d-b29f-11e8-b0d7-0242673a86b2 |
+-----+--------+------------+---------------------+--------------------------------------+
6 rows in set (0.01 sec)

This time the modified query returns the same result as the original one. Let’s manually replace the subquery with a temporary table and check whether that affects the result again.

mysql> create temporary table vw
    -> select
    ->     distinct `test`.`TEST_SUB_PROBLEM`.`BID` AS `BID`,
    ->     `test`.`TEST_SUB_PROBLEM`.`THING_NAME` AS `THING_NAME`
    -> from `test`.`TEST_SUB_PROBLEM`;
Query OK, 4 rows affected (0.01 sec)
Records: 4  Duplicates: 0  Warnings: 0
mysql> select * from vw;
+--------+------------+
| BID    | THING_NAME |
+--------+------------+
| thing1 | name1      |
| thing2 | name2      |
| thing3 | name3      |
| thing4 | name4      |
+--------+------------+
4 rows in set (0.00 sec)
mysql> select
    ->     `test`.`TST`.`UID` AS `UID`,
    ->     `test`.`TST`.`BID` AS `BID`,
    ->     `test`.`TST`.`THING_NAME` AS `THING_NAME`,
    ->     `test`.`TST`.`OTHER_IFO` AS `OTHER_IFO`,
    ->      uuid() AS `DIST_UID`
    -> from `test`.`TEST_SUB_PROBLEM` `TST`
    -> join vw where (`vw`.`BID` = `test`.`TST`.`BID`) ;
+-----+--------+------------+---------------------+--------------------------------------+
| UID | BID    | THING_NAME | OTHER_IFO           | DIST_UID                             |
+-----+--------+------------+---------------------+--------------------------------------+
|   1 | thing1 | name1      | look a chicken      | e11dbe61-b2a0-11e8-b0d7-0242673a86b2 |
|   2 | thing1 | name1      | look an airplane    | e11dc050-b2a0-11e8-b0d7-0242673a86b2 |
|   3 | thing2 | name2      | look a mouse        | e11dc1af-b2a0-11e8-b0d7-0242673a86b2 |
|   4 | thing3 | name3      | look a taperecorder | e11dc2be-b2a0-11e8-b0d7-0242673a86b2 |
|   5 | thing3 | name3      | look an explosion   | e11dc3a8-b2a0-11e8-b0d7-0242673a86b2 |
|   6 | thing4 | name4      | look at the stars   | e11dc4e9-b2a0-11e8-b0d7-0242673a86b2 |
+-----+--------+------------+---------------------+--------------------------------------+
6 rows in set (0.00 sec)

In this case, the temporary table contains the correct number of rows (4), but the outer query calculates a UUID value for all rows in the table TEST_SUB_PROBLEM. It does not take into account that the user initially asks for a unique UUID for each unique BID and not for each unique UID. Instead, it just moves the call of the UUID() function into the outer query, which creates a unique value for each row in the table TEST_SUB_PROBLEM, ignoring the fact that the temporary table contains only four rows. In this case, it would not be easy to build a single effective query that generates distinct UUID values for rows with different BIDs and the same UUID value for rows with the same BID.
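One possible workaround (a sketch of ours, not from the original post) is to split the work into two steps: materialize the distinct BID values with one UUID each, and then join that result back to the base table. Because the derived table below uses DISTINCT, it is materialized rather than merged, so uuid() should be evaluated once per distinct BID:

-- Hypothetical workaround: generate one UUID per distinct BID, then join back.
create temporary table bid_uuid
select d.BID, uuid() as DIST_UID
from (select distinct BID from TEST_SUB_PROBLEM) d;

select TST.UID, TST.BID, TST.THING_NAME, TST.OTHER_IFO, b.DIST_UID
from TEST_SUB_PROBLEM TST
join bid_uuid b on b.BID = TST.BID;

With this approach, every row sharing a BID should receive the same DIST_UID, at the cost of an extra statement and a temporary table.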

Case Study 3: a Query from Bug #91878

This query is supposed to calculate a number of rows based on complex conditions:

SELECT
Virtual_Table.T_FP AS T_FP,
(SELECT COUNT(Virtual_Table.T_FP) FROM t1 t WHERE t.f1 = Virtual_Table.T_FP AND Virtual_Table.T_FP = 731834939448428685) AS Test_Value
FROM
(SELECT t.f1 AS T_FP, tv.f1 AS TV_FP FROM t1 AS t JOIN t2 AS tv ON t.f1 = tv.t1_f1) AS Virtual_Table
GROUP BY Virtual_Table.TV_FP
HAVING Test_Value > 0;

However, it returns no rows when it should return 22 (check the bug report for the full test case).

mysql> SELECT Virtual_Table.T_FP AS T_FP,
    -> (
    ->     SELECT
    ->         COUNT(Virtual_Table.T_FP)
    ->     FROM t1 t
    ->     WHERE
    ->         t.f1 = Virtual_Table.T_FP
    ->     AND
    ->         Virtual_Table.T_FP = 731834939448428685
    -> ) AS Test_Value
    -> FROM (
    ->     SELECT
    ->         t.f1 AS T_FP, tv.f1 AS TV_FP
    ->     FROM t1 AS t
    ->     JOIN t2 AS tv
    ->     ON t.f1 = tv.t1_f1
    -> ) AS Virtual_Table
    -> GROUP BY Virtual_Table.TV_FP
    -> HAVING Test_Value > 0;
Empty set (1.28 sec)

To find out why this happens, let’s perform a temporary table check first.

mysql> create temporary table Virtual_Table SELECT t.f1 AS T_FP, tv.f1 AS TV_FP FROM t1 AS t JOIN t2 AS tv ON t.f1 = tv.t1_f1;
Query OK, 18722 rows affected (2.12 sec)
Records: 18722  Duplicates: 0  Warnings: 0
mysql> SELECT Virtual_Table.T_FP AS T_FP,
    -> (SELECT COUNT(Virtual_Table.T_FP) FROM t1 t
    -> WHERE t.f1 = Virtual_Table.T_FP AND Virtual_Table.T_FP = 731834939448428685) AS Test_Value
    -> FROM  Virtual_Table GROUP BY Virtual_Table.TV_FP HAVING Test_Value > 0;
+--------------------+------------+
| T_FP               | Test_Value |
+--------------------+------------+
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
+--------------------+------------+
22 rows in set (1.62 sec)

The rewritten query returned the correct result, as we expected.

To identify why the original query fails, let’s check how the optimizer rewrote it:

Note (Code 1003):
/* select#1 */
select
    `test`.`t`.`f1` AS `T_FP`,
    (/* select#2 */
        select
            count(`test`.`t`.`f1`)
        from
            `test`.`t1` `t`
        where
            (('731834939448428685' = 731834939448428685)
        and (`test`.`t`.`f1` = 731834939448428685))
    ) AS `Test_Value`
    from
        `test`.`t1` `t`
    join
        `test`.`t2` `tv`
    where
        (`test`.`tv`.`t1_f1` = `test`.`t`.`f1`)
    group by `test`.`tv`.`f1`
    having (`Test_Value` > 0)

Interestingly, when I ran this query on the original tables, it returned all 18722 rows that exist in table t2.

This output means that we cannot entirely rely on the EXPLAIN output. But we can still see the same symptoms:

  • The subquery uses a function to generate a value
  • The subquery in the FROM clause is converted into a JOIN, and its values are accessible to the outer query

We also see that the query has GROUP BY and HAVING clauses, thus adding a complication.

The query is almost correct, but in this case, the optimizer mixed aliases: it uses the same alias in the internal query as in the external one. If you change the alias from t to t2 in the subquery, the rewritten query starts returning correct results:

mysql> select
    ->     `test`.`t`.`f1` AS `T_FP`,
    -> (/* select#2 */
    ->     select
    ->         count(`test`.`t`.`f1`)
    ->     from
    ->         `test`.`t1` `t`
    ->     where (
    ->         ('731834939448428685' = 731834939448428685)
    ->     and
    ->         (`test`.`t`.`f1` = 731834939448428685)
    ->     )
    -> ) AS `Test_Value`
    -> from
    ->     `test`.`t1` `t`
    -> join
    ->     `test`.`t2` `tv`
    -> where
    ->     (`test`.`tv`.`t1_f1` = `test`.`t`.`f1`)
    -> group by `test`.`tv`.`f1`
    -> having (`Test_Value` > 0);
...
| 731834939454553991 |          1 |
| 731834939453739998 |          1 |
+--------------------+------------+
18722 rows in set (0.49 sec)
mysql> select
    ->     `test`.`t`.`f1` AS `T_FP`,
    -> (/* select#2 */
    ->     select
    ->         count(`test`.`t`.`f1`)
    ->     from
    ->         `test`.`t1` `t2`
    ->     where (
    ->         (t2.f1=t.f1)
    ->     and
    ->         (`test`.`t`.`f1` = 731834939448428685)
    ->     )
    -> ) AS `Test_Value`
    -> from
    ->     `test`.`t1` `t`
    -> join
    ->     `test`.`t2` `tv`
    -> where
    ->     (`test`.`tv`.`t1_f1` = `test`.`t`.`f1`)
    -> group by `test`.`tv`.`f1`
    -> having (`Test_Value` > 0);
+--------------------+------------+
| T_FP               | Test_Value |
+--------------------+------------+
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
+--------------------+------------+
22 rows in set (1.82 sec)

While the calculated value is not the reason why this query returns incorrect results, it is similar to the previous examples because the optimizer does not take into account that the value of `test`.`t`.`f1` in the outer query is not necessarily equal to 731834939448428685.

It is also interesting that neither Oracle nor PostgreSQL accept such a query; instead, they complain about improper use of the GROUP BY clause. Meanwhile, MySQL accepts this query even with SQL Mode set to ONLY_FULL_GROUP_BY. Reported as bug #92020.

Conclusion and recommendations

While derived_merge is a very effective optimization, it can rewrite queries destructively. Safety measures when using this optimization are:

  1. Make sure that you use the latest version of MySQL/Percona Server/MariaDB, which includes all of the recent bug fixes.
  2. Treat generated values in the subquery results, whether constants or values returned by functions, as a red flag.
  3. Relaxing SQL Mode ONLY_FULL_GROUP_BY is always dangerous and should not be combined with derived_merge.

As a last resort, you can consider manually rewriting queries to use JOINs, or turning the derived_merge optimization OFF.
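In MySQL, the optimization can be switched off through the optimizer_switch system variable (documented syntax shown below); keep in mind that disabling derived_merge globally may slow down other queries that benefit from merging:

SET SESSION optimizer_switch = 'derived_merge=off';
SET GLOBAL optimizer_switch = 'derived_merge=off';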

 

The post Why Optimization derived_merge can Break Your Queries appeared first on Percona Database Performance Blog.

by Sveta Smirnova at September 25, 2018 02:51 PM

Upcoming Webinar Thurs 9/27: Best Practices Using Indexes in MongoDB

Best Practices Using Indexes in MongoDB

Please join Percona’s Principal Architect Alex Rubin as he presents Best Practices Using Indexes in MongoDB on Thursday, September 27th at 11:00 AM PDT (UTC-7) / 2:00 PM EDT (UTC-4).

 

Indexes support the efficient execution of queries in MongoDB. Without indexes, MongoDB must perform a collection scan, i.e. scan every document in a collection, to select those documents that match the query statement.

In this webinar we’ll discuss the following:

  • How MongoDB uses indexes
  • What types of indexes exist in MongoDB
  • How to find slow queries
  • Which index to create to speed up your queries and how to verify that the index will be used
  • How to monitor and evaluate the performance of your indexes

Register for this webinar on Best Practices Using Indexes in MongoDB.

The post Upcoming Webinar Thurs 9/27: Best Practices Using Indexes in MongoDB appeared first on Percona Database Performance Blog.

by Alexander Rubin at September 25, 2018 01:41 PM

MariaDB AB

MariaDB Server 10.2.18 now available


The MariaDB project is pleased to announce the immediate availability of MariaDB Server 10.2.18 and MariaDB Connector/Node.js 2.0.0 Alpha. See the release notes and changelogs for details and visit mariadb.com/downloads to download.

 

Download MariaDB Server 10.2.18

Release Notes Changelog What is MariaDB 10.2?


Download MariaDB Connector/Node.js 2.0.0 Alpha

Release Notes Changelog About MariaDB Connector/Node.js



by dbart at September 25, 2018 12:59 PM

September 24, 2018

Peter Zaitsev

Backing up Percona Server for MySQL with keyring_vault plugin enabled

Percona XtraBackup with keyring_vault

To use Percona XtraBackup with the keyring_vault plugin enabled, you need to take some special measures to secure a working backup. This post addresses how to back up Percona Server for MySQL with the keyring_vault plugin enabled. We also run through the steps needed to restore the backup from the master to a slave.

This is the second of a two-part series on setting up Hashicorp Vault with Percona Server for MySQL with the keyring_vault plugin. First part is Using the keyring_vault plugin with Percona Server for MySQL 5.7.

Backing up from the master

First, you need to install the latest Percona XtraBackup 2.4 package. In this tutorial I used this version:

[root@mysql1 ~]# xtrabackup --version
xtrabackup: recognized server arguments: --datadir=/var/lib/mysql --log_bin=mysqld-bin --server-id=1
xtrabackup version 2.4.12 based on MySQL server 5.7.19 Linux (x86_64) (revision id: 170eb8c)

Create a transition key using any method you prefer. This transition key will be used by Percona XtraBackup to encrypt the keys of the files being backed up. Make sure to keep the transition key safe and do not lose it, or else the backup will be unrecoverable.

[root@mysql1 ~]# openssl rand -base64 24
NSu7kfUgcTTIY2ym7Qu6jnYOotOuMIeT

You can store the transition-key in Vault and retrieve it later:

[root@mysql1 ~]# # Store the transition-key to the Vault server
[root@mysql1 ~]# curl -H "Content-Type: application/json" -H "X-Vault-Token: be515093-b1a8-c799-b237-8e04ea90ad7a" --cacert "/etc/vault_ca/vault.pem" -X PUT -d '{"value": "NSu7kfUgcTTIY2ym7Qu6jnYOotOuMIeT"}' "https://192.168.0.114:8200/v1/secret/dc1/master/transition_key"
[root@mysql1 ~]# # Retrieve the transition-key from the Vault server
[root@mysql1 ~]# curl -s -H "X-Vault-Token: be515093-b1a8-c799-b237-8e04ea90ad7a" --cacert "/etc/vault_ca/vault.pem" -X GET "https://192.168.0.114:8200/v1/secret/dc1/master/transition_key" | jq .data.value
"NSu7kfUgcTTIY2ym7Qu6jnYOotOuMIeT"
[root@mysql1 ~]# # Delete the transition-key from the Vault server
[root@mysql1 ~]# curl -H "Content-Type: application/json" -H "X-Vault-Token: be515093-b1a8-c799-b237-8e04ea90ad7a" --cacert "/etc/vault_ca/vault.pem" -X DELETE "https://192.168.0.114:8200/v1/secret/dc1/master/transition_key"

We will stream the backup to the slave server using netcat. First, run this on the slave:

[root@mysql2 ~]# ncat -l 9999 | cat - > backup.xbstream

Then, on the master, I used --stream=xbstream, since streaming with --stream=tar fails (reported as PXB-1571). Run the xtrabackup command like this:

[root@mysql1 ~]# xtrabackup --stream=xbstream --backup --target-dir=backup/ --transition-key=NSu7kfUgcTTIY2ym7Qu6jnYOotOuMIeT | nc 192.168.0.117 9999
xtrabackup: recognized server arguments: --datadir=/var/lib/mysql --log_bin=mysqld-bin --server-id=1 --transition-key=*
xtrabackup: recognized client arguments: --datadir=/var/lib/mysql --log_bin=mysqld-bin --server-id=1 --transition-key=* --user=root --stream=xbstream --backup=1 --target-dir=backup/
180715 01:28:56  version_check Connecting to MySQL server with DSN 'dbi:mysql:;mysql_read_default_group=xtrabackup' as 'root'  (using password: NO).
180715 01:28:56  version_check Connected to MySQL server
180715 01:28:56  version_check Executing a version check against the server...
180715 01:28:56  version_check Done.
180715 01:28:56 Connecting to MySQL server host: localhost, user: root, password: not set, port: not set, socket: not set
Using server version 5.7.22-22-log
xtrabackup version 2.4.12 based on MySQL server 5.7.19 Linux (x86_64) (revision id: 170eb8c)
xtrabackup: uses posix_fadvise().
xtrabackup: cd to /var/lib/mysql
xtrabackup: open files limit requested 0, set to 65536
xtrabackup: using the following InnoDB configuration:
xtrabackup:   innodb_data_home_dir = .
xtrabackup:   innodb_data_file_path = ibdata1:12M:autoextend
xtrabackup:   innodb_log_group_home_dir = ./
xtrabackup:   innodb_log_files_in_group = 2
xtrabackup:   innodb_log_file_size = 50331648
InnoDB: Number of pools: 1
180715 01:28:56 Added plugin 'keyring_vault.so' to load list.
180715 01:28:56 >> log scanned up to (2616858)
xtrabackup: Generating a list of tablespaces
InnoDB: Allocated tablespace ID 2 for mysql/plugin, old maximum was 0
...
180715 01:28:58 Finished backing up non-InnoDB tables and files
180715 01:28:58 Executing FLUSH NO_WRITE_TO_BINLOG ENGINE LOGS...
xtrabackup: The latest check point (for incremental): '2616849'
xtrabackup: Stopping log copying thread.
.180715 01:28:58 >> log scanned up to (2616865)
180715 01:28:58 Executing UNLOCK TABLES
180715 01:28:58 All tables unlocked
180715 01:28:58 [00] Streaming ib_buffer_pool to
180715 01:28:58 [00]        ...done
180715 01:28:58 Backup created in directory '/root/backup/'
180715 01:28:58 [00] Streaming
180715 01:28:58 [00]        ...done
180715 01:28:58 [00] Streaming
180715 01:28:58 [00]        ...done
180715 01:28:58 Saving xtrabackup_keys.
xtrabackup: Transaction log of lsn (2616849) to (2616865) was copied.
Shutting down plugin 'keyring_vault'
180715 01:28:58 completed OK!

Restoring the backup on the Slave server

Extract the backup to a temporary location:

[root@mysql2 backup]# xbstream -x < ../backup.xbstream

Then prepare it with the following command. Notice that we are still using the same transition key we used when backing up the database on the master server.

[root@mysql2 ~]# xtrabackup --prepare --target-dir=backup/ --transition-key=NSu7kfUgcTTIY2ym7Qu6jnYOotOuMIeT
xtrabackup: recognized server arguments: --innodb_checksum_algorithm=crc32 --innodb_log_checksum_algorithm=strict_crc32 --innodb_data_file_path=ibdata1:12M:autoextend --innodb_log_files_in_group=2 --innodb_log_file_size=50331648 --innodb_fast_checksum=0 --innodb_page_size=16384 --innodb_log_block_size=512 --innodb_undo_directory=./ --innodb_undo_tablespaces=0 --server-id=1 --redo-log-version=1 --transition-key=*
xtrabackup: recognized client arguments: --innodb_checksum_algorithm=crc32 --innodb_log_checksum_algorithm=strict_crc32 --innodb_data_file_path=ibdata1:12M:autoextend --innodb_log_files_in_group=2 --innodb_log_file_size=50331648 --innodb_fast_checksum=0 --innodb_page_size=16384 --innodb_log_block_size=512 --innodb_undo_directory=./ --innodb_undo_tablespaces=0 --server-id=1 --redo-log-version=1 --transition-key=* --prepare=1 --target-dir=backup/
xtrabackup version 2.4.12 based on MySQL server 5.7.19 Linux (x86_64) (revision id: 170eb8c)
xtrabackup: cd to /root/backup/
xtrabackup: This target seems to be not prepared yet.
...
xtrabackup: starting shutdown with innodb_fast_shutdown = 1
InnoDB: FTS optimize thread exiting.
InnoDB: Starting shutdown...
InnoDB: Shutdown completed; log sequence number 2617384
180715 01:31:10 completed OK!

Configure keyring_vault.conf on slave

Create the keyring_vault.conf file with the following contents:

[root@mysql2 ~]# cat /var/lib/mysql-keyring/keyring_vault.conf
vault_url = https://192.168.0.114:8200
secret_mount_point = secret/dc1/slave
token = be515093-b1a8-c799-b237-8e04ea90ad7a
vault_ca = /etc/vault_ca/vault.pem

Notice that it uses the same token as the master server but has a different secret_mount_point. The same CA certificate will be used across all servers connecting to this Vault server.

Use the --copy-back option to finalize backup restoration

Next use the --copy-back option to copy the files from the temporary backup location to the mysql data directory on the slave. Observe that during this phase XtraBackup generates a new master key, stores it in the Vault server and re-encrypts tablespace headers using this key.

[root@mysql2 ~]# xtrabackup --copy-back --target-dir=backup/ --transition-key=NSu7kfUgcTTIY2ym7Qu6jnYOotOuMIeT --generate-new-master-key --keyring-vault-config=/var/lib/mysql-keyring/keyring_vault.conf
xtrabackup: recognized server arguments: --datadir=/var/lib/mysql --log_bin=mysqld-bin --server-id=2 --transition-key=* --generate-new-master-key=1
xtrabackup: recognized client arguments: --datadir=/var/lib/mysql --log_bin=mysqld-bin --server-id=2 --transition-key=* --generate-new-master-key=1 --copy-back=1 --target-dir=backup/
xtrabackup version 2.4.12 based on MySQL server 5.7.19 Linux (x86_64) (revision id: 170eb8c)
180715 01:32:28 Loading xtrabackup_keys.
180715 01:32:28 Loading xtrabackup_keys.
180715 01:32:29 Generated new master key with ID 'be1ba51c-87c0-11e8-ac1c-00163e79c097-2'.
...
180715 01:32:29 [01] Encrypting /var/lib/mysql/mysql/plugin.ibd tablespace header with new master key.
180715 01:32:29 [01] Copying ./mysql/servers.ibd to /var/lib/mysql/mysql/servers.ibd
180715 01:32:29 [01]        ...done
180715 01:32:29 [01] Encrypting /var/lib/mysql/mysql/servers.ibd tablespace header with new master key.
180715 01:32:29 [01] Copying ./mysql/help_topic.ibd to /var/lib/mysql/mysql/help_topic.ibd
180715 01:32:29 [01]        ...done
180715 01:32:29 [01] Encrypting /var/lib/mysql/mysql/help_topic.ibd tablespace header with new master key.
180715 01:32:29 [01] Copying ./mysql/help_category.ibd to /var/lib/mysql/mysql/help_category.ibd
180715 01:32:29 [01]        ...done
180715 01:32:29 [01] Encrypting /var/lib/mysql/mysql/help_category.ibd tablespace header with new master key.
180715 01:32:29 [01] Copying ./mysql/help_relation.ibd to /var/lib/mysql/mysql/help_relation.ibd
180715 01:32:29 [01]        ...done
...
180715 01:32:30 [01] Encrypting /var/lib/mysql/encryptedschema/t1.ibd tablespace header with new master key.
180715 01:32:30 [01] Copying ./encryptedschema/db.opt to /var/lib/mysql/encryptedschema/db.opt
180715 01:32:30 [01]        ...done
...
180715 01:32:31 [01] Copying ./xtrabackup_binlog_pos_innodb to /var/lib/mysql/xtrabackup_binlog_pos_innodb
180715 01:32:31 [01]        ...done
180715 01:32:31 [01] Copying ./xtrabackup_master_key_id to /var/lib/mysql/xtrabackup_master_key_id
180715 01:32:31 [01]        ...done
180715 01:32:31 [01] Copying ./ibtmp1 to /var/lib/mysql/ibtmp1
180715 01:32:31 [01]        ...done
Shutting down plugin 'keyring_vault'
180715 01:32:31 completed OK!

Once that’s done, change file/directory ownership to mysql.

[root@mysql2 ~]# chown -R mysql:mysql /var/lib/mysql/

Start the mysqld instance on the slave server, configured similarly to the master configuration from the first part of this series:

early-plugin-load="keyring_vault=keyring_vault.so"
loose-keyring_vault_config="/var/lib/mysql-keyring/keyring_vault.conf"
encrypt_binlog=ON
innodb_encrypt_online_alter_logs=ON
innodb_encrypt_tables=ON
innodb_temp_tablespace_encrypt=ON
master_verify_checksum=ON
binlog_checksum=CRC32
log_bin=mysqld-bin
binlog_format=ROW
server-id=2
log-slave-updates

[root@mysql2 ~]# systemctl status mysqld
● mysqld.service - MySQL Server
   Loaded: loaded (/usr/lib/systemd/system/mysqld.service; disabled; vendor preset: disabled)
   Active: active (running) since Sun 2018-07-15 01:32:59 UTC; 6h ago
     Docs: man:mysqld(8)
           http://dev.mysql.com/doc/refman/en/using-systemd.html
  Process: 1390 ExecStart=/usr/sbin/mysqld --daemonize --pid-file=/var/run/mysqld/mysqld.pid $MYSQLD_OPTS (code=exited, status=0/SUCCESS)
  Process: 1372 ExecStartPre=/usr/bin/mysqld_pre_systemd (code=exited, status=0/SUCCESS)
 Main PID: 1392 (mysqld)
   CGroup: /system.slice/mysqld.service
           └─1392 /usr/sbin/mysqld --daemonize --pid-file=/var/run/mysqld/mysqld.pid
Jul 15 01:32:58 mysql2 systemd[1]: Starting MySQL Server...
Jul 15 01:32:59 mysql2 systemd[1]: Started MySQL Server.

From here, you should be able to create the replication user on the master, and then configure slave replication based on the coordinates in the xtrabackup_binlog_info file. You can follow this section of the manual on starting replication.
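As a rough sketch only (not part of the original post), the final steps could look like the following. The replication user, password, and binary log coordinates are placeholders that you would replace with your own values and with the coordinates recorded in xtrabackup_binlog_info:

-- On the master: create a replication user (name, host and password are hypothetical)
CREATE USER 'repl'@'192.168.0.117' IDENTIFIED BY 'repl_password';
GRANT REPLICATION SLAVE ON *.* TO 'repl'@'192.168.0.117';

-- On the slave: point it at the master using the coordinates from xtrabackup_binlog_info
CHANGE MASTER TO
  MASTER_HOST='mysql1',
  MASTER_USER='repl',
  MASTER_PASSWORD='repl_password',
  MASTER_LOG_FILE='mysqld-bin.000001',  -- placeholder
  MASTER_LOG_POS=4;                     -- placeholder
START SLAVE;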

For further reference, please read the manual related to Encrypted InnoDB tablespace backups.

Is validating your security strategy a concern?

Do you need to demonstrate that the security strategy that you have implemented for your databases is sufficient and appropriate? Perhaps you could benefit from a professional database audit? It could provide the reassurance that your organization needs.

The post Backing up Percona Server for MySQL with keyring_vault plugin enabled appeared first on Percona Database Performance Blog.

by Jericho Rivera at September 24, 2018 12:34 PM

September 23, 2018

Valeriy Kravchuk

On MySQL Replication Bugs

While writing about problematic Oracle MySQL features previously, I concentrated mostly on InnoDB problems that I have to fight with really often, and deliberately left replication off even the preliminary list of features to study in detail for that blog post. First of all, I mostly work with MariaDB users now, and the implementation of many replication features in MariaDB is already notably different (and has its own list of known problems). But this also happened because (asynchronous) replication plays a key role in most MySQL environments and deserves a detailed study in a separate post.

It took me some time to prepare lists of recent bugs reported for MySQL replication (row-based and statement-based, GTIDs, multi-threaded and crash safe, hopefully) and then try to summarize my findings from them. Looks like I have something related to share today, finally.

Gondolas in Venice often look very similar to each other. They are all probably replicas of a few different original designs. But small differences exist. This should not happen to your data in a replication setup...
Let me start with a summary of problems I found and then proceed with the lists of bug reports I based my conclusions on:
  1. Minor problems with mysqlbinlog output, small regressions, missing useful features, and real bugs (that may lead to data differences on slaves or break replication), as well as hangs (including MTS deadlocks), crashes and performance problems that can affect production setups, do not seem to have received high priority over the last 2 years. This is true even for those reported in the public bugs database by famous MySQL Community members or by current and former Oracle MySQL engineers.
  2. GTIDs and replication crash safety features were supposed to make it easier to use replication in production. In reality there are some GTID-specific ways to break replication, and it is still not as crash safe as expected in many use cases.
  3. Multi-threaded slave (a.k.a. MTS or parallel replication) bugs, even serious and verified ones, have been ignored for way more than the last 2 years! The feature is not complete and may not help as much as one might expect.
  4. The manual is still wrong or missing when describing some details of the current replication implementation.
  5. Nobody cares much about debug assertions found by Percona QA. We may have some bombs in the replication code that may explode in a weird way at any time.
  6. It seems group replication bugs often get proper attention fast, but nobody in Oracle has enough time to deal with asynchronous replication bugs properly. Some of them hang around without even proper verification for months...
Now let's get into the details. Here is a list of some recent row-based replication-specific bugs and feature requests:
  • Bug #91822 - "incorrect datatype in RBR event when column is NULL and not explicit in query". The column missing in the INSERT statement is given the datatype of the column immediately preceding it. This bug was reported by Ernie Souhrada back in July.
  • Bug #89339 - "RBR slave index choosing logic is too simple". As Fungo Wang pointed out based on a source code review, the choice is rule-based and does not consider data distribution at all. The slave uses the primary key if it exists, otherwise tries a unique key that does not allow NULLs (if any), then tries other keys, and finally falls back to a full table scan if there are no indexes. When there is no primary or unique key on columns not allowing NULL values, no statistics are taken into account when choosing between indexes and a full table scan, and this easily leads to bad plans. Surely, one has to have primary keys in all tables, especially those involved in replication, but it would be great if the slave code were a bit smarter in corner cases (as MariaDB's is).
    By the way, setting slave_rows_search_algorithms to HASH_SCAN may not help, as it may be too slow in practice. See Bug #84887 by Dennis Gao for more details and an example of perf and gdb usage for the analysis.
  • Bug #89297 - "please allow number to varchar slave_type_conversions if target is large enough". Nice feature request by Shane Bester.
  • Bug #88791 - "Binary log for generated column contains new value as WHERE clause, not old value". This bug was reported by MC Brown. Surely, mysqlbinlog output for row-based events is not supposed to be 100% valid SQL for copy-paste, but it should not be misleading.
  • Bug #88595 - "Row-based master-master replication broken by add column or drop column". This is more a documentation issue. It's strange to see this limitation not documented here.
  • Bug #88223 - "Replication with no tmpdir space and InnoDB as tmp_storage_engine can break". It's really weird to see this bug reported by Sveta Smirnova hanging in "Analyzing" status for almost a year!
  • Bug #88057 - "Intermediary slave does not log master changes with binlog_rows_query_log_events". Yet another bug report from Sveta, this time "Verified". Row query log events for UPDATEs and DELETEs are NOT recorded in the binary log by the intermediate masters...
  • Bug #85292 - "unreachable statement in function process_event with event type QUERY_EVENT". Shane Bester found a part of the code that probably should be rewritten. No further public attention to this report for 1.5 years...
  • Bug #84287 - "row inserts, statement updates, persistent stats lead to table scans+ lag slaves". Yet another great finding by Shane Bester. As often happens, another problematic feature I truly hate, InnoDB's persistent statistics, is also involved in this case, leading to bad performance on slave.
  • Bug #83431 - "RBR slave's "BEGIN" binlog query event doesn't carry a session database name". It was reported almost 2 years ago by Teng Li.
  • Bug #83418 - "When binlog_row_image is MINIMAL, some updates queries logs redundant columns". This was reported by Abhinav Sharma also almost two years ago.
  • Bug #83090 - "AUTO_INCREMENT not updated when using INSERT ON DUPLICATE KEY UPDATE". This bug was reported by Monty Solomon two years ago. See also his related Bug #83030 - "AUTO_INCREMENT not updated when using REPLACE INTO".
  • Bug #80821 - "Replication breaks if multi-table DELETE is used in conjunction with Foreign Key". I do not understand why this perfect (regression?) bug report by Philip Stoev, with a clear MTR test case was immediately verified and then just hangs around for 2.5 years...
  • Bug #80770 - "Compression of row-based event". Great feature contributed by Davi Arnaut.
  • Bug #79831 - "Unexpected error message on crash-safe slave with max_relay_log_size set". Let me conclude this list with a (probably) 5.6-specific bug report by yours truly...
I have to stop at this point, as other RBR-related bug reports have been known for much longer and have probably already been forgotten for good by MySQL development. They hardly have a chance of being fixed any time soon.

Let's continue with replication bugs that are not closely related to row-based binlog format:
  • Bug #92398 - "point in time recovery using mysqlbinlog broken with temporary table -> errors". Shane Bester found yet another case when statement-based binlog format may lead to problems for point in time recovery.
  • Bug #92252 - "Mysql generates gtid gaps because of option slave-skip-errors". This bug was reported by Pin Lin. We get gaps in GTID values on the slave when errors are skipped. If the binary log files on the master that include the transactions which hit errors listed in slave-skip-errors (and were therefore skipped on the slave) have been purged, the slave can't reconnect to the master and reports error 1236. Features that were supposed to make replication more robust lead to more breaks in replication.
  • Bug #92132 - "secure-file-priv breaks LOAD DATA INFILE replication in statement mode on 5.7.23". This regression bug was reported by Nicolai Plum. Recent MySQL 5.7 and 8.0 versions are affected.
  • Bug #92109 - "Please make replication crash safe with GTID and less durable setting (bis)." Jean-François Gagné keeps looking for options to make replication really crash safe in all practically important cases.
  • Bug #92073 - "int change_master(): Assertion `mi->rli->inited' failed." Roel Van de Paar continues his QA efforts in the hope of making MySQL better one day and reports all debug assertions in public. In this case we seem to have a kind of regression in 8.0 vs 5.7. Check also these bug reports of his: Bug #91980 - "Assertion `head->variables.gtid_next.type == ANONYMOUS_GTID' failed." and Bug #90946 - "Assertion `get_state() >= 0' failed on RESET MASTER while tx_read_only". Debug assertions are in the code for a reason, usually, so in the latter bug report Roel recently asked for an update...
  • Bug #91766 - "Log of replication position when running CHANGE MASTER TO is wrong.". Jean-François Gagné found yet another annoying detail - one would expect the error log to have data about the SQL Thread position, but it has data about the IO Thread position.
  • Bug #91744 - "START SLAVE UNTIL going further than it should." Now, this bug reported by Jean-François Gagné is really serious. Still, it has been hanging around as "Open" since July 20...
  • Bug #91633 - "Replication failure (errno 1399) on update in XA tx after deadlock". XA is yet another MySQL feature that seems doomed to stay half-baked... Expect more on this in a separate blog post one day. The bug that causes the replication break was reported by Lukas Sydorowski.
  • The last but not least bug in this list (which is actually long if we try to check all active bugs reported over the last 2 years) that I want to include is Bug #90448 - "INSERT IGNORE not ignored on SQL thread", by Simon Mudd. Restarting the slave helps, and it seems MySQL 8.0.x is not affected. See also his Bug #89876 - "mysqladmin flush-hosts is not safe in GTID mode".
I have to stop at some point. In conclusion, let me review a list of (less) recent bugs specifically in the multi-threaded slave (MTS) implementation:
  • Bug #91477 - "Slave writes less when connected intermediary master with blackhole engine". Just 2 threads are used in this case as found by Sveta Smirnova.
  • Bug #90342 - "MySQL parallel replication automatically restarts". Nobody cares about this crashing bug for 5+ months. It's still "Open".
  • Bug #89247 - "Deadlock with MTS when slave_preserve_commit_order = ON." Jean-François Gagné reported this 9 months ago, the bug was verified, and based on the latest comments and pt-pmp outputs the hang really happens for many users. Still, nobody cares in any obvious way.
  • Bug #89229 - "FLUSH PRIVILEGES may cause MTS deadlock". This bug was reported by Libing Song 9 months ago.
  • Bug #87796 - "Commit_order_manager can't terminate MTS worker properly when deadlock happens". This bug was reported by Seunguck Lee a year ago!
  • Bug #85747 - "Documentation on MTS statistics". Yet another case of missing manual reported by Juan Arruti:
    "I didn't find information in MySQL reference manual about MTS statistics, I had to look into source code in order to get more information about them."
  • Bug #85142 - "reducing MTS checkpointing causes high IO load". The bug reported by Trey Raymond has been "Verified" for 1.5 years. Check also his Bug #81386 - "MTS hangs on file related *_EVENT forced checkpoint" and Bug #81385 - "relay log info not updated after using sql_after_mts_gaps", both "Open" since May 2016!
  • Bug #73066 - "Replication stall with multi-threaded replication". Ovais Tariq reported this bug more than 4 years ago, back when we both still worked at Percona. Nobody cares to fix it.
  • Bug #71495 - "Logging of MTS breakage is insufficient." Simon Mudd, one of the Oracle customers who use MTS in production, reported this practical problem more than 4.5 years ago. No fix, and no comments from Oracle engineers for more than 3 years.
I have to stop at this stage, as checking more (and older) replication-related bugs makes me even more sad than I planned or want to be. We see fixes in every release, but the backlog seems huge. Dear Oracle MySQL engineers, please review the public MySQL replication bug reports collected over the last 4-5 years and spend some more time and resources on processing and fixing them! Otherwise this Acqua alta of bugs will make this great legacy feature obsolete one sad day.

I had not planned to study semi-sync replication bugs in this post (maybe some day). If you are interested in more details about this feature, crash safety of replication, parallel replication and replication in general, I suggest you check J-F Gagné's MySQL Blog.

by Valeriy Kravchuk (noreply@blogger.com) at September 23, 2018 05:31 PM

September 21, 2018

Peter Zaitsev

Securing PostgreSQL as an Enterprise-Grade Environment

PostgreSQL enterprise-grade security

In this post, we review how you can build an enhanced and secure PostgreSQL database environment using community software. We look at the features that are available in PostgreSQL that, when implemented, provide improved security.

As discussed in the introductory blog post of this series, in our webinar of October 10, 2018 we highlight important aspects an enterprise should consider for their PostgreSQL environments. This series of blogs addressing particular aspects of the enterprise-grade postgres environment complements the webinar. This post addresses security.

Authentication Layer

Client connections to PostgreSQL Server using host based authentication

PostgreSQL uses a host based authentication file (pg_hba.conf) to authorize incoming connections. This file contains entries with a combination of 5 categories: type, database, user, address, and method. A client is allowed to connect to a database only when the combination of username, database and the hostname of the client matches an entry in the pg_hba.conf file.

Consider the following entry in pg_hba.conf file :

# TYPE DATABASE USER ADDRESS METHOD
host percona pguser 192.168.0.14/32 md5

This entry says that connections from server 192.168.0.14 are only allowed from user pguser and only to the database percona. The method md5 forces password authentication.

The order of the entries in the pg_hba.conf file matters. If you have an entry that rejects connections from a given server followed by another that allows connections from it, the first entry in the order is considered. So, in this case, the connection is rejected.

This is the first layer of protection in authentication. If this criterion is not satisfied by this Access Control List (ACL), PostgreSQL will discard the request without even considering server authentication.

Server Authentication

Historically, PostgreSQL has used an MD5 digest as the default password hash. The problem with pure MD5 hashing is that this function will always return the same hash for a given password, which makes an MD5 digest more susceptible to password cracking. Newer versions of PostgreSQL implement SCRAM authentication (Salted Challenge Response Authentication Mechanism), which stores passwords as salted and iterated hashes to strengthen PostgreSQL against offline attacks. SCRAM-SHA-256 support was introduced in PostgreSQL 10. What matters most in terms of “enterprise-grade” security is that PostgreSQL supports industry-standard authentication methods out of the box, like SSL certificates, PAM/LDAP, Kerberos, etc.
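As a small illustration (ours, not part of the original post), on PostgreSQL 10 or later you can make newly set passwords use SCRAM hashing and then require that method in pg_hba.conf; the user name and password below are placeholders:

-- Store passwords set from now on as SCRAM-SHA-256 hashes (PostgreSQL 10+)
SET password_encryption = 'scram-sha-256';
CREATE USER app_user WITH PASSWORD 'str0ng_pass';
-- In pg_hba.conf, use the scram-sha-256 method instead of md5 for this user.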

Authorization Layer

User management through roles and privileges

It is always recommended to implement segregation of users through roles and privileges. There may be several user accounts in your PostgreSQL server. Only a few of them may be application accounts while the rest are developers or admin accounts. In such cases, PostgreSQL allows you to create multiple roles. Those can be assigned with a set of privileges. Thus, instead of managing user privileges individually, standard roles can be maintained and the appropriate role from the list can be assigned to a user. Through roles, database access can be standardized, which helps in user management and avoids granting too much or too little privilege to a given user.

For example, we might have six roles:

app_read_write
app_read_only
dev_read_write
dev_read_only
admin_read_write
admin_read_only

Now, if you need to create a new dev user who should only have read access, grant them the appropriate role, such as dev_read_only:

GRANT dev_read_only to avi_im_developer;
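For completeness, here is a minimal sketch (not from the original post) of how such a role could be defined; the schema name app is an assumption:

CREATE ROLE dev_read_only NOLOGIN;
GRANT USAGE ON SCHEMA app TO dev_read_only;
GRANT SELECT ON ALL TABLES IN SCHEMA app TO dev_read_only;
-- Cover tables created in the future as well
ALTER DEFAULT PRIVILEGES IN SCHEMA app GRANT SELECT ON TABLES TO dev_read_only;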

Row level Security

Starting with version 9.5, PostgreSQL implements row level security, which can limit access to only a subset of records/rows in a table. Usually a user is granted a mix of SELECT, INSERT, DELETE and UPDATE privileges on a given table, which allows access to all records in the table. Through row level security, however, such privileges can be restricted to a subset of the records by means of a policy, which in turn can be  assigned to a role.

In the next example, we create an employee table and two manager accounts. We then enable row level security on the table and create a policy that allows the managers to only view/modify their own subordinates’ records:

CREATE TABLE scott.employee (id INT, first_name VARCHAR(20), last_name VARCHAR(20), manager VARCHAR(20));
INSERT INTO scott.employee VALUES (1,'avinash','vallarapu','carina');
INSERT INTO scott.employee VALUES (2,'jobin','augustine','stuart');
INSERT INTO scott.employee VALUES (3,'fernando','laudares','carina');
CREATE USER carina WITH ENCRYPTED PASSWORD 'carina';
CREATE USER stuart WITH ENCRYPTED PASSWORD 'stuart';
CREATE ROLE managers;
GRANT managers TO carina, stuart;
GRANT SELECT, INSERT, UPDATE, DELETE ON scott.employee TO managers;
GRANT USAGE ON SCHEMA scott TO managers;
ALTER TABLE scott.employee ENABLE ROW LEVEL SECURITY;
CREATE POLICY employee_managers ON scott.employee TO managers USING (manager = current_user);

In the log we can see that only certain records are visible to each manager:

$ psql -d percona -U carina
psql (10.5)
Type "help" for help.
percona=> select * from scott.employee ;
id | first_name | last_name | manager
----+------------+-----------+---------
 1 | avinash    | vallarapu | carina
 3 | fernando   | laudares | carina
(2 rows)
$ psql -d percona -U stuart
psql (10.5)
Type "help" for help.
percona=> select * from scott.employee ;
id | first_name | last_name | manager
----+------------+-----------+---------
 2 | jobin      | augustine | stuart
(1 row)

You can read more about row level security in the manual page.

Data Security

1. Encryption of data over the wire using SSL

PostgreSQL allows you to use SSL to enable encryption of data in motion. In addition, you may enable certificate-based authentication to ensure that the communication is happening between trusted parties. SSL is implemented by OpenSSL, and thus it requires the OpenSSL package to be installed on your PostgreSQL server and PostgreSQL to be built with --with-openssl support.

The following entry in a pg_hba.conf file says that connections to any database and from any user are allowed from server 192.168.0.13 as long as the communication is encrypted over SSL. Also, the connection is only established when a valid client certificate is provided:

# TYPE DATABASE USER ADDRESS METHOD
hostssl all all 192.168.0.13/32 md5

Optionally, you may also use Client Certificate Authentication using the following method:

# TYPE DATABASE USER ADDRESS METHOD
hostssl all all 192.168.0.13/32 cert clientcert=1
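To confirm which client sessions are actually encrypted, you can query the pg_stat_ssl view (available since PostgreSQL 9.5). This quick check is our suggestion rather than part of the original post:

SELECT sa.usename, sa.client_addr, ssl.ssl, ssl.version, ssl.cipher
FROM pg_stat_ssl ssl
JOIN pg_stat_activity sa ON sa.pid = ssl.pid;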

2. Encryption at Rest – pgcrypto

The pgcrypto module provides cryptographic functions for PostgreSQL, allowing certain fields to be stored encrypted. pgcrypto implements PGP encryption, which is part of the OpenPGP (RFC 4880) standard. It supports both symmetric-key and public-key encryption. Besides the advanced features offered by PGP for encryption, pgcrypto also offers functions for running simple encryption based on ciphers. These functions only run a cipher over data.
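A brief, hedged example of column-level encryption with pgcrypto's symmetric-key PGP functions follows; the table, column and passphrase are hypothetical, and in practice the passphrase should come from the application rather than being hard-coded:

CREATE EXTENSION IF NOT EXISTS pgcrypto;
CREATE TABLE customer_secrets (id serial PRIMARY KEY, ssn bytea);
-- Encrypt on write...
INSERT INTO customer_secrets (ssn)
VALUES (pgp_sym_encrypt('123-45-6789', 'my_passphrase'));
-- ...and decrypt on read
SELECT id, pgp_sym_decrypt(ssn, 'my_passphrase') AS ssn FROM customer_secrets;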

Accounting and Auditing

Logging in PostgreSQL

PostgreSQL allows you to log either all of the statements or only a few, based on parameter settings. You can log all DDLs, all DMLs, or any statement running for more than a certain duration to the log file when logging_collector is enabled. To avoid write overload on the data directory, you may also move your log_directory to a different location. Here are a few important parameters you should review when logging activities in your PostgreSQL server (a short example of setting them follows the list):

log_connections
log_disconnections
log_lock_waits
log_statement
log_min_duration_statement
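For instance, these settings can be adjusted without hand-editing postgresql.conf; the values below are purely illustrative:

ALTER SYSTEM SET log_connections = 'on';
ALTER SYSTEM SET log_disconnections = 'on';
ALTER SYSTEM SET log_lock_waits = 'on';
ALTER SYSTEM SET log_min_duration_statement = '1000';  -- statements slower than 1 second
SELECT pg_reload_conf();  -- note: logging_collector itself only changes after a server restart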

Please note that detailed logging takes additional disk space and may impose an important overhead in terms of write IO depending on the activity in your PostgreSQL server. You should be careful when enabling logging and should only do so after understanding the overhead and performance degradation it may cause to your workload.

Auditing – pgaudit and set_user

Some essential auditing features in PostgreSQL are implemented as extensions, which can be enabled at will on highly secured environments with regulatory requirements.

pgaudit helps to audit the activities happening in the database. If any unauthorized user has intentionally obfuscated the DDL or DML, the statement the user has passed and the sub-statement that was actually executed in the database will be logged in the PostgreSQL log file.

set_user provides a method of privilege escalation. If properly implemented, it provides the highest level of auditing, which allows the monitoring of even SUPERUSER actions.

You can read more about pgaudit here.
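As a rough sketch (not from the original post), enabling pgaudit typically looks like the following; it assumes the package is installed and that 'pgaudit' has already been added to shared_preload_libraries, which requires a server restart:

CREATE EXTENSION pgaudit;
ALTER SYSTEM SET pgaudit.log = 'ddl, write';  -- audit DDL plus INSERT/UPDATE/DELETE
SELECT pg_reload_conf();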

Security Bug Fixes

The PostgreSQL Global Development Group (PGDG) takes security bugs seriously. Any security vulnerabilities can be reported directly to security@postgresql.org. The list of security issues fixed for all the supported PostgreSQL versions can be found here. Security fixes to PostgreSQL are made available through minor version upgrades, which is the main reason why it is advised to always keep PostgreSQL servers upgraded to the latest minor version.

If you liked this post…

Please join Percona’s PostgreSQL Support Technical Lead,  Avinash Vallarapu; Senior Support Engineer, Fernando Laudares; and Senior Support Engineer, Jobin Augustine, on Wednesday, October 10, 2018 at 7:00 AM PDT (UTC-7) / 10:00 AM EDT (UTC-4), as they demonstrate an enterprise-grade PostgreSQL® environment built using a combination of open source tools and extensions.

Register Now

The post Securing PostgreSQL as an Enterprise-Grade Environment appeared first on Percona Database Performance Blog.

by Avinash Vallarapu at September 21, 2018 01:50 PM

End-of-Life Reminder for Percona Server for MongoDB 3.2

Percona Server for MongoDB

As noted on our Software and Platform Lifecycle page, Percona Server for MongoDB 3.2 will be end of life on September 30, 2018. According to the Percona Services Lifecycle Policy, customers with an active support contract with Percona will continue receiving support; however, end-of-life software receives no new builds, bug fixes, security fixes, or features.

Each software package Percona produces has a defined migration pathway for upgrading to a newer version. If you are currently running Percona Server for MongoDB 3.2, we encourage you to upgrade to version 3.4 or newer. Please reach out to us if you need assistance in migrating to a newer release; Percona is happy to help.

The post End-of-Life Reminder for Percona Server for MongoDB 3.2 appeared first on Percona Database Performance Blog.

by Jeff Sandstrom at September 21, 2018 12:45 PM

This Week in Data with Colin Charles 53: It’s MariaDB Week PLUS Percona Live Europe Update

Colin Charles

Join Percona Chief Evangelist Colin Charles as he covers happenings, gives pointers and provides musings on the open source database community.

This week is clearly what I’d call a “MariaDB week” — plenty of announcements coming from MariaDB Corporation and MariaDB Foundation.

It started with Alibaba Cloud and MariaDB Announce the Launch of ApsaraDB RDS for MariaDB TX, which makes Alibaba Cloud the first public cloud to offer the enterprise offering of MariaDB, MariaDB TX 3.0. As of this announcement it is not yet available to roll out from the interface, but I expect it will be soon. Exciting, as you can already get MariaDB Server on Amazon RDS for MariaDB, and you can join the preview waitlist for Azure.

MariaDB Corporation has received more funding from ServiceNow Ventures in the Series C round, and has gained a new board member in Pat Casey. ServiceNow is a user of MariaDB, and “ServiceNow’s platform runs on up to 85,000 MariaDB databases that serve more than 25 billion queries per hour.” There was an excellent keynote session at M|18 about how ServiceNow uses MariaDB. The Register refers to this as “protecting ServiceNow’s toolchain”.

For good measure, MariaDB acquired Clustrix as well. This is the second acquisition after MammothDB earlier in the year. It is worth reading the TechCrunch take on this. Clustrix, a Y Combinator company, has been around since 2006 and raised $72 million. The price of the acquisition was not announced. For a bit of behind the scenes chatter from ex-employee shareholders, Hacker News delivers.

From a MariaDB Foundation standpoint, we see Otto Kekäläinen, the MariaDB Foundation CEO, stepping down. Thanks for all your hard work, Otto! And maybe you missed it, but not long ago, Percona Became a Bronze Sponsor of MariaDB Foundation.

Speaking of conferences, the tutorial schedule and a sneak peek of sessions for Percona Live Europe Frankfurt have been announced. In addition, the Call for Papers – 2019 Annual MariaDB User Conference closes October 31, 2018.

Releases

Link List

Upcoming Appearances

Feedback

I look forward to feedback/tips via e-mail at colin.charles@percona.com or on Twitter @bytebot.

 

The post This Week in Data with Colin Charles 53: It’s MariaDB Week PLUS Percona Live Europe Update appeared first on Percona Database Performance Blog.

by Colin Charles at September 21, 2018 10:44 AM

Jean-Jerome Schmidt

Become a ClusterControl DBA: Operational Reports for MySQL, MariaDB, PostgreSQL & MongoDB

The majority of DBAs perform health checks on their databases every now and then. Usually, this happens on a daily or weekly basis. We previously discussed why such checks are important and what they should include.

To make sure your systems are in a good shape, you’d need to go through quite a lot of information - host statistics, MySQL statistics, workload statistics, state of backups, database packages, logs and so forth. Such data should be available in every properly monitored environment, although sometimes it is scattered across multiple locations - you may have one tool to monitor MySQL state, another tool to collect system statistics, maybe a set of scripts, e.g., to check the state of your backups. This makes health checks much more time-consuming than they should be - the DBA has to put together the different pieces to understand the state of the system.

Integrated tools like ClusterControl have an advantage that all of the bits are located in the same place (or in the same application). It still does not mean they are located next to each other - they may be located in different sections of the UI and a DBA may have to spend some time clicking through the UI to reach all the interesting data.

The whole idea behind creating Operational Reports is to put all of the most important data into a single document, which can be quickly reviewed to get an understanding of the state of the databases.

Operational Reports are available from the Side Menu -> Operational Reports:

Once you go there, you’ll be presented with a list of reports created manually or automatically, based on a predefined schedule:

If you want to create a new report manually, you’ll use the 'Create' option. Pick the type of report, cluster name (for per-cluster report), email recipients (optional - if you want the report to be delivered to you), and you’re pretty much done:

The reports can also be scheduled to be created on a regular basis:

At this time, 5 types of reports are available:

  • Availability report - All clusters.
  • Backup report - All clusters.
  • Schema change report - MySQL/MariaDB-based cluster only.
  • Daily system report - Per cluster.
  • Package upgrade report - Per cluster.

Availability Report

The availability report focuses on, well, availability. It includes three sections. First, the availability summary:

You can see information about availability statistics of your databases, the cluster type, total uptime and downtime, current state of the cluster and when that state last changed.

Another section gives more details on availability for every cluster. The screenshot below shows only one of the database clusters:

We can see when a node switched state and what the transition was. It’s a nice place to check if there were any recent problems with the cluster. Similar data is shown in the third section of this report, where you can go through the history of changes in cluster state.

Backup Report

The second type of report covers backups of all clusters. It contains two sections - backup summary and backup details - where the former basically gives you a short summary of when the last backup was created, whether it completed successfully or failed, the backup verification status, the success rate and the retention period:

ClusterControl also provides examples of backup policy if it finds any of the monitored database clusters running without a scheduled backup or a delayed slave configured. Next are the backup details:

You can also check the list of backups executed on the cluster, with their state, type and size, within the specified interval. This is as close as you can get to being certain that backups work correctly without running a full recovery test. We definitely recommend that such tests are performed every now and then. The good news is that ClusterControl supports MySQL-based restoration and verification on a standalone host under Backup -> Restore Backup.

Daily System Report

This type of report contains detailed information about a particular cluster. It starts with a summary of different alerts which are related to the cluster:

Next section is about the state of the nodes that are part of the cluster:

You have a list of the nodes in the cluster, their type, role (master or slave), status of the node, uptime and the OS.

Another section of the report is the backup summary, the same as we discussed above. The next one presents a summary of the top queries in the cluster:

Finally, we see a “Node status overview” in which you’ll be presented with graphs related to OS and MySQL metrics for each node.

As you can see, we have graphs covering all aspects of the load on the host - CPU, memory, network, disk, CPU load and disk free. This is enough to get an idea of whether anything weird happened recently. You can also see some details about the MySQL workload - how many queries were executed, of which type, and how the data was accessed (via which handler). This, on the other hand, should be enough to spot most of the issues on the MySQL side. What you want to look at are all the spikes and dips that you haven’t seen in the past. Maybe a new query has been added to the mix and, as a result, handler_read_rnd_next skyrocketed? Maybe there was an increase in CPU load, and a high number of connections might point to increased load on MySQL, but also to some kind of contention. An unexpected pattern might be good to investigate, so you know what is going on.

Package Upgrade Report

This report gives a summary of packages available for upgrade by the repository manager on the monitored hosts. For accurate reporting, ensure you always use stable and trusted repositories on every host. In some undesirable situations, a monitored host could be configured with an outdated repository after an upgrade (e.g., every MariaDB major version uses a different repository), an incomplete internal repository (e.g., partially mirrored from upstream) or a bleeding-edge repository (commonly for unstable nightly-build packages).
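
If you want to verify this by hand, standard package manager commands show which repositories a host is actually using; for example (pick the one matching your distribution):

$ yum repolist enabled     # RHEL/CentOS-based hosts
$ apt-cache policy         # Debian/Ubuntu-based hosts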

The first section is the upgrade summary:

It summarizes the total number of packages available for upgrade, as well as the related managed services for the cluster, like the load balancer, virtual IP address and arbitrator. Next, ClusterControl provides a detailed package list, grouped by package type for every host:

This report lists the available versions and can greatly help us plan our maintenance windows efficiently. Critical upgrades like security and database packages can be prioritized over non-critical upgrades, which can be consolidated into other, lower-priority maintenance windows.

Schema Change Report

This report compares table structure changes in the selected MySQL/MariaDB databases that happened between two generated reports. In older MySQL/MariaDB versions, DDL is a non-atomic operation (pre 8.0) and requires a full table copy (pre 5.6 for most operations) - blocking other transactions until it completes. Schema changes can become a huge pain once your tables hold a significant amount of data, and must be carefully planned, especially in a clustered setup. In a multi-tiered development environment, we have seen many cases where developers silently modify the table structure, resulting in significant impact on query performance.

In order for ClusterControl to produce an accurate report, special options must be configured inside the CMON configuration file for the respective cluster:

  • schema_change_detection_address - Checks are executed using SHOW TABLES/SHOW CREATE TABLE to determine if the schema has changed. The checks run against the address specified, which is of the format HOSTNAME:PORT. The schema_change_detection_databases option must also be set. A differential of a changed table is created (using diff).
  • schema_change_detection_databases - Comma separated list of databases to monitor for schema changes. If empty, no checks are made.

In this example, we would like to monitor schema changes for the databases "myapp" and "sbtest" on our MariaDB Cluster with cluster ID 27. Pick one of the database nodes as the value of schema_change_detection_address. For MySQL replication, this should be the master host, or any slave host that holds the databases (in case partial replication is active). Then, inside /etc/cmon.d/cmon_27.cnf, add the following two lines:

schema_change_detection_address=10.0.0.30:3306
schema_change_detection_databases=myapp,sbtest

Restart CMON service to load the change:

$ systemctl restart cmon

For the very first report, ClusterControl only returns the result of metadata collection, similar to below:

With the first report as the baseline, the subsequent reports will return the output we are expecting:

Take note that only new or changed tables are printed in the report. The first report only collects metadata for comparison in the subsequent rounds, thus we have to run it at least twice to see the difference.

With this report, you can now gather footprints of the database structure and understand how your database has evolved over time.

Final Thoughts

Operational reports are a comprehensive way to understand the state of your database infrastructure. They are built for both operational and managerial staff, and can be very useful in analysing your database operations. The reports can be generated in-place or delivered to you via email, which makes things convenient if you have a reporting silo.

We’d love to hear your feedback on anything else you’d like to have included in the report, what’s missing and what is not needed.

by ashraf at September 21, 2018 08:18 AM

September 20, 2018

Peter Zaitsev

Prometheus 2 Times Series Storage Performance Analyses

cpu saturation and max core usage

Prometheus 2 time series database (TSDB) is an amazing piece of engineering, offering a dramatic improvement compared to “v2” storage in Prometheus 1 in terms of ingest performance, query performance and resource use efficiency. As we’ve been adopting Prometheus 2 in Percona Monitoring and Management (PMM), I had a chance to look into the performance of Prometheus 2 TSDB. This blog post details my observations.

Understanding the typical Prometheus workload

For someone who has spent their career working with general purpose databases, the typical workload of Prometheus is quite interesting. The ingest rate tends to remain very stable: typically, devices you monitor will send approximately the same amount of metrics all the time, and infrastructure tends to change relatively slowly.

Queries to the data can come from multiple sources. Some of them, such as alerting, tend to be very stable and predictable too. Others, such as users exploring data, can be spiky, though it is not common for this to be the largest part of the load.

The Benchmark

In my assessment, I focused on handling an ingest workload. I deployed Prometheus 2.3.2 compiled with Go 1.10.1 (as part of PMM 1.14) on Linode using this StackScript. For maximally realistic load generation, I spun up multiple MySQL nodes running some real workloads (Sysbench TPC-C Test), with each emulating 10 nodes running MySQL and Linux using this StackScript.

The observations below are based on a Linode instance with eight virtual cores and 32GB of memory, running 20 load-driving instances simulating the monitoring of 200 MySQL instances. Or, in Prometheus terms: some 800 targets, 440 scrapes/sec, 380K samples ingested per second, and 1.7M active time series.

Design Observations

The conventional approach of traditional databases, and the approach that Prometheus 1.x used, is to limit the amount of memory. If this amount of memory is not enough to handle the load, you will have high latency and some queries (or scrapes) will fail. Prometheus 2 memory usage instead is configured by

storage.tsdb.min-block-duration
   which determines how long samples will be stored in memory before they are flushed (the default being 2h). How much memory it requires will depend on the number of time series, the number of labels you have, and your scrape frequency in addition to the raw ingest rate. On disk, Prometheus tends to use about three bytes per sample. Memory requirements, though, will be significantly higher.
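
As a rough back-of-envelope sketch of the on-disk side, using the figures from this test (an illustration, not a measurement):

$ echo "380000 * 3 * 86400 / 1024^3" | bc -l    # ~380K samples/sec at ~3 bytes/sample
# comes out to roughly 92 GiB of block data per day, before retention kicks in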

While the configuration knob exists to change the head block size, tuning this by users is discouraged. So you’re limited to providing Prometheus 2 with as much memory as it needs for your workload.

If there is not enough memory for Prometheus to handle your ingest rate, then it will crash with an out-of-memory error message or will be killed by the OOM killer.
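
If you suspect the latter, the kernel log will usually confirm it; for example (a quick sketch):

$ dmesg -T | grep -i -E 'out of memory|killed process'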

Adding more swap space as a “backup” in case Prometheus runs out of RAM does not seem to work as using swap space causes a dramatic memory usage explosion. I suspect swapping does not play well with Go garbage collection.

Another interesting design choice is aligning block flushes to specific times, rather than to time since start:

head block Prometheus 2

As you can see from this graph, flushes happen every two hours, on the clock. If you change min-block-duration to 1h, these flushes will happen every hour at 30 minutes past the hour.
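
For illustration only (as noted above, tuning this is discouraged), the block duration is controlled by a startup flag, which might be passed like this; the config path is an assumption, and depending on the Prometheus build the flag may be hidden from --help:

$ prometheus --config.file=/etc/prometheus.yml --storage.tsdb.min-block-duration=1h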

(If you want to see this and other graphs for your Prometheus installation, you can use this dashboard. It has been designed for PMM but can work for any Prometheus installation with minor adjustments.)

While the active block—called the head block—is kept in memory, blocks containing older data are accessed through

mmap()
 This eliminates the need to configure a cache separately, but it also means you need to allocate plenty of memory for the OS cache if you want to query data older than what fits in the head block.

It also means the virtual memory you will see Prometheus 2 using will get very high: do not let it worry you.

Prometheus process memory usage

Another interesting design choice is the WAL configuration. As you can see in the storage documentation, Prometheus protects against data loss during a crash by keeping a write-ahead log (WAL). The exact durability guarantees, though, are not clearly described. As of Prometheus 2.3.2, Prometheus flushes the WAL every 10 seconds, and this value is not user configurable.

Compactions

Prometheus TSDB is designed somewhat similarly to LSM storage engines - the head block is flushed to disk periodically, while at the same time, compactions that merge a few blocks together are performed to avoid the need to scan too many blocks for queries.

Here is the number of data blocks I observed on my system after a 24h workload:

active data blocks

If you want more details about storage, you can check out the meta.json file which has additional information about the blocks you have, and how they came about.

{
       "ulid": "01CPZDPD1D9R019JS87TPV5MPE",
       "minTime": 1536472800000,
       "maxTime": 1536494400000,
       "stats": {
               "numSamples": 8292128378,
               "numSeries": 1673622,
               "numChunks": 69528220
       },
       "compaction": {
               "level": 2,
               "sources": [
                       "01CPYRY9MS465Y5ETM3SXFBV7X",
                       "01CPYZT0WRJ1JB1P0DP80VY5KJ",
                       "01CPZ6NR4Q3PDP3E57HEH760XS"
               ],
               "parents": [
                       {
                               "ulid": "01CPYRY9MS465Y5ETM3SXFBV7X",
                               "minTime": 1536472800000,
                               "maxTime": 1536480000000
                       },
                       {
                               "ulid": "01CPYZT0WRJ1JB1P0DP80VY5KJ",
                               "minTime": 1536480000000,
                               "maxTime": 1536487200000
                       },
                       {
                               "ulid": "01CPZ6NR4Q3PDP3E57HEH760XS",
                               "minTime": 1536487200000,
                               "maxTime": 1536494400000
                       }
               ]
       },
       "version": 1
}

Compactions in Prometheus are triggered at the time the head block is flushed, and several compactions may be performed at these intervals:

Prometheus 2 compactions

Compactions do not seem to be throttled in any way, causing huge spikes of disk IO usage when they run:

spike in io activity for compactions

And a spike in CPU usage:

spike in CPU usage during compactions

This, of course, can negatively impact system performance. This is also one of the big questions for LSM engines: how to run compactions to maintain great query performance without causing too much overhead.

Memory utilization as it relates to the compaction process is also interesting:

Memory utilization during compaction process

We can see that after compaction a lot of memory changes from “Cached” to “Free”, meaning potentially valuable data is washed out from memory. I wonder if

fadvise()
 or other techniques to minimize data washout from cache are in use, or if this is caused by the fact that the blocks which were cached are destroyed by the compaction process.

Crash Recovery

Crash recovery from the log file takes time, though it is reasonable. For an ingest rate of about 1 million samples/sec, I observed some 25 minutes of recovery time on SSD storage:

level=info ts=2018-09-13T13:38:14.09650965Z caller=main.go:222 msg="Starting Prometheus" version="(version=2.3.2, branch=v2.3.2, revision=71af5e29e815795e9dd14742ee7725682fa14b7b)"
level=info ts=2018-09-13T13:38:14.096599879Z caller=main.go:223 build_context="(go=go1.10.1, user=Jenkins, date=20180725-08:58:13OURCE)"
level=info ts=2018-09-13T13:38:14.096624109Z caller=main.go:224 host_details="(Linux 4.15.0-32-generic #35-Ubuntu SMP Fri Aug 10 17:58:07 UTC 2018 x86_64 1bee9e9b78cf (none))"
level=info ts=2018-09-13T13:38:14.096641396Z caller=main.go:225 fd_limits="(soft=1048576, hard=1048576)"
level=info ts=2018-09-13T13:38:14.097715256Z caller=web.go:415 component=web msg="Start listening for connections" address=:9090
level=info ts=2018-09-13T13:38:14.097400393Z caller=main.go:533 msg="Starting TSDB ..."
level=info ts=2018-09-13T13:38:14.098718401Z caller=repair.go:39 component=tsdb msg="found healthy block" mint=1536530400000 maxt=1536537600000 ulid=01CQ0FW3ME8Q5W2AN5F9CB7R0R
level=info ts=2018-09-13T13:38:14.100315658Z caller=web.go:467 component=web msg="router prefix" prefix=/prometheus
level=info ts=2018-09-13T13:38:14.101793727Z caller=repair.go:39 component=tsdb msg="found healthy block" mint=1536732000000 maxt=1536753600000 ulid=01CQ78486TNX5QZTBF049PQHSM
level=info ts=2018-09-13T13:38:14.102267346Z caller=repair.go:39 component=tsdb msg="found healthy block" mint=1536537600000 maxt=1536732000000 ulid=01CQ78DE7HSQK0C0F5AZ46YGF0
level=info ts=2018-09-13T13:38:14.102660295Z caller=repair.go:39 component=tsdb msg="found healthy block" mint=1536775200000 maxt=1536782400000 ulid=01CQ7SAT4RM21Y0PT5GNSS146Q
level=info ts=2018-09-13T13:38:14.103075885Z caller=repair.go:39 component=tsdb msg="found healthy block" mint=1536753600000 maxt=1536775200000 ulid=01CQ7SV8WJ3C2W5S3RTAHC2GHB
level=error ts=2018-09-13T14:05:18.208469169Z caller=wal.go:275 component=tsdb msg="WAL corruption detected; truncating" err="unexpected CRC32 checksum d0465484, want 0" file=/opt/prometheus/data/.prom2-data/wal/007357 pos=15504363
level=info ts=2018-09-13T14:05:19.471459777Z caller=main.go:543 msg="TSDB started"
level=info ts=2018-09-13T14:05:19.471604598Z caller=main.go:603 msg="Loading configuration file" filename=/etc/prometheus.yml
level=info ts=2018-09-13T14:05:19.499156711Z caller=main.go:629 msg="Completed loading of configuration file" filename=/etc/prometheus.yml
level=info ts=2018-09-13T14:05:19.499228186Z caller=main.go:502 msg="Server is ready to receive web requests."

The problem I observed with recovery is that it is very memory intensive. While the server may be capable of handling the normal load with memory to spare, if it crashes it may never be able to recover due to running out of memory. The only solution I found for this is to disable scraping, let it perform crash recovery, and then restart the server with scraping enabled.
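
One way to apply that workaround looks roughly like this (a sketch; paths are assumptions, and the exact steps depend on how Prometheus is managed on your system):

$ cp /etc/prometheus.yml /etc/prometheus.yml.orig
# temporarily remove (or empty) the scrape_configs section in /etc/prometheus.yml,
# start Prometheus and wait for the "Server is ready to receive web requests." log line,
# then restore the original file and reload the configuration with scraping enabled:
$ cp /etc/prometheus.yml.orig /etc/prometheus.yml
$ kill -HUP $(pidof prometheus)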

Warmup

Another behavior to keep in mind is the need for warmup - a lower performance/higher resource usage ratio immediately after start. In some, but not all, starts I can observe significantly higher initial CPU and memory usage.

cpu usage during warmup

memory usage during warmup

The gaps in the memory utilization graph show that Prometheus is not initially able to perform all the scrapes configured, and as such some data is lost.

I have not profiled what exactly causes this extensive CPU and memory consumption. I suspect it may happen when new time series entries are created in the head block, at a high rate.

CPU Usage Spikes

Besides compaction—which is quite heavy on disk IO—I can also observe significant CPU spikes about every 2 minutes. These are longer with a higher ingest ratio. They seem to be caused by Go garbage collection: during these spikes, at least some CPU cores are completely saturated.

cpu usage spikes maybe during Go Garbage collection

cpu saturation and max core usage

These spikes are not just cosmetic. It looks like when these spikes happen, the Prometheus internal /metrics endpoint becomes unresponsive, thus producing data gaps during the exact time that the spikes occur:

Prometheus 2 process memory usage

We can also see the Prometheus Exporter hitting a one second timeout:

scrape time by job

We can observe this correlates with garbage collection:

garbage collection in Prometheus processing

Conclusion

Prometheus 2 TSDB offers impressive performance, being able to handle a cardinality of millions of time series, and also to handle hundreds of thousands of samples ingested per second on rather modest hardware. CPU and disk IO usage are both very impressive. I got up to 200K metrics/sec per used CPU core!

For capacity planning purposes you need to ensure that you have plenty of memory available, and it needs to be real RAM. The actual amount of memory I observed was about 5GB per 100K samples/sec of ingest rate, which, with additional space for the OS cache, makes it 8GB or so.
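
Applying that rule of thumb to the ~380K samples/sec workload described earlier (a back-of-envelope sketch, not a sizing formula):

$ echo "380000 / 100000 * 5" | bc -l    # ≈ 19GB of RAM for Prometheus itself
$ echo "380000 / 100000 * 8" | bc -l    # ≈ 30GB once OS cache headroom is included,
                                        # which lines up with the 32GB test instance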

There is work that remains to be done to avoid CPU and IO usage spikes, though this is not unexpected considering how young Prometheus 2 TSDB is - if we look at InnoDB, TokuDB, RocksDB and WiredTiger, all of them had similar problems in their initial releases.

The post Prometheus 2 Times Series Storage Performance Analyses appeared first on Percona Database Performance Blog.

by Peter Zaitsev at September 20, 2018 04:07 PM

ProxySQL 1.4.10 and Updated proxysql-admin Tool Now in the Percona Repository


ProxySQL 1.4.10, released by ProxySQL, is now available for download in the Percona Repository along with an updated version of Percona’s proxysql-admin tool.

ProxySQL is a high-performance proxy, currently for MySQL and its forks (like Percona Server for MySQL and MariaDB). It acts as an intermediary for client requests seeking resources from the database. René Cannaò created ProxySQL for DBAs as a means of solving complex replication topology issues.

The ProxySQL 1.4.10 source and binary packages available at https://percona.com/downloads/proxysql include ProxySQL Admin – a tool, developed by Percona to configure Percona XtraDB Cluster nodes into ProxySQL. Docker images for release 1.4.10 are available as well: https://hub.docker.com/r/percona/proxysql/. You can download the original ProxySQL from https://github.com/sysown/proxysql/releases.

Improvements

  • PSQLADM-12: Implemented the writer-is-reader option in proxysql-admin. This is now a text option: ‘always’, ‘never’, and ‘ondemand’
  • PSQLADM-64: Added the option --sync-multi-cluster-users, which uses the same function as --sync-users but will not delete users on ProxySQL that don’t exist on MySQL
  • PSQLADM-90: Added testsuites for host priority/slave/loadbal/writer-is-reader features
  • Additional debugging support
    An additional --debug flag on scripts prints more output. All SQL calls are now logged if debugging is enabled.

Tool Enhancements

  • proxysql-status
    proxysql-status now reads the credentials from the proxysql-admin.cnf file. It is possible to look only at certain tables (--files, --main, --monitor, --runtime, --stats). Also added the ability to filter based on the table name (--table)
  • tests directory
    The proxysql-admin-testsuite.sh script can now be used to create test clusters (
    proxysql-admin-testsuite.sh <workdir> --no-test --cluster-one-only
     , this option will create a 3-node PXC cluster with 1 async slave and will also start proxysql). Also added regression test suites.
  • tools directory
    Added extra tools that can be used for debugging (mysql_exec, proxysql_exec, enable_scheduler, and run_galera_checker).

Bug Fixes

  • PSQLADM-73: proxysql-admin did not check that the monitor user had been configured on the PXC nodes.
  • PSQLADM-82: the without-check-monitor-user option did check the monitor user (even if it was enabled). This option has been replaced with use-existing-monitor-password.
  • PSQLADM-83: proxysql_galera-checker could hang if there was no scheduler entry.
  • PSQLADM-87: in some cases, proxysql_galera_checker was not moving a node to OFFLINE_SOFT if pxc_maint_mode was set to “maintenance”
  • PSQLADM-88: proxysql_node_monitor was searching among all nodes, not just the read hostgroup.
  • PSQLADM-91: Nodes in the priority list were not being picked.
  • PSQLADM-93: If mode=’loadbal’, then the read_hostgroup setting was used from the config file, rather than being set to -1.
  • PSQLADM-96: CentOS used /usr/share/proxysql rather than /var/lib/proxysql
  • PSQLADM-98: In some cases, checking the PXC node status could stall (this call now uses a TIMEOUT)

ProxySQL is available under the open source GPLv3 license.

The post ProxySQL 1.4.10 and Updated proxysql-admin Tool Now in the Percona Repository appeared first on Percona Database Performance Blog.

by Dmitriy Kostiuk at September 20, 2018 11:06 AM

MyRocks Disk Full Edge Case

problem in MyRocks

RocksDB engine—and its MySQL implementation MyRocks—is a very good alternative engine for MySQL. It has proven to be very efficient and stable for many workloads, including those of large scale. However, it is still a relative newborn in the MySQL ecosystem, and has only a small fraction of the adoption rate of InnoDB. That means it is not so well tested in all possible edge cases, and may have many unreported bugs. One known bug is discussed here: if you are a MyRocks user, it’s important that you are aware of the possibility of lost data in the specific circumstances described below.

In writing this article, I want to broadcast a wider warning, as the problem I found is pretty serious and could lead to a very unpleasant situation.

The problem is related to a not-so-edge case after all - disk full. The result can be extremely bad, though. After printing some errors, the RocksDB engine continues to tell clients that consecutive writes are accepted, while they are, in fact, lost! In addition, depending on the workload and on the moment at which the disk ran out of space, in a worst case scenario we may completely lose access to tables that use the RocksDB engine.

Let me show how bad the situation could be, using a basic sandbox and sysbench test example. I have a sandbox with MyRocks installed:

mysql [localhost] {msandbox} ((none)) > show engines\G
*************************** 1. row ***************************
Engine: ROCKSDB
Support: YES
Comment: RocksDB storage engine
Transactions: YES
XA: YES
Savepoints: YES
mysql [localhost] {msandbox} ((none)) > select @@version,@@version_comment;
+-----------+----------------------------------------------------+
| @@version | @@version_comment |
+-----------+----------------------------------------------------+
| 5.7.22-22 | Percona Server (GPL), Release 22, Revision f62d93c |
+-----------+----------------------------------------------------+
1 row in set (0.00 sec)

Once my test system did not have much free disk space remaining, I ran this simple sysbench prepare command:

$ sysbench /usr/share/sysbench/oltp_insert.lua --mysql_storage_engine=rocksdb --table-size=1000000 --tables=4 --mysql-db=db2 --mysql-user=root --mysql-password=msandbox --mysql-socket=/tmp/mysql_sandbox5723.sock --threads=4 --time=200 --report-interval=1 --events=0 --db-driver=mysql prepare
sysbench 1.0.15 (using bundled LuaJIT 2.1.0-beta2)
Initializing worker threads...
Creating table 'sbtest2'...
Creating table 'sbtest3'...
Creating table 'sbtest1'...
Creating table 'sbtest4'...
Inserting 1000000 records into 'sbtest2'
Inserting 1000000 records into 'sbtest1'
Inserting 1000000 records into 'sbtest3'
Inserting 1000000 records into 'sbtest4'
Creating a secondary index on 'sbtest4'...
FATAL: mysql_drv_query() returned error 1105 ([./.rocksdb/db2.sbtest4_k_4_4_0.bulk_load.tmp] bulk load error: IO error: No space left on deviceWhile appending to file: ./.rocksdb/000101.sst: No space left on device) for query 'CREATE INDEX k_4 ON sbtest4(k)'
FATAL: `sysbench.cmdline.call_command' function failed: /usr/share/sysbench/oltp_common.lua:238: SQL error, errno = 1105, state = 'HY000': [./.rocksdb/db2.sbtest4_k_4_4_0.bulk_load.tmp] bulk load error: IO error: No space left on deviceWhile appending to file: ./.rocksdb/000101.sst: No space left on device
Creating a secondary index on 'sbtest2'...
Creating a secondary index on 'sbtest1'...
Creating a secondary index on 'sbtest3'...
FATAL: mysql_drv_query() returned error 1105 ([./.rocksdb/db2.sbtest2_k_2_5_0.bulk_load.tmp] bulk load error: IO error: No space left on deviceWhile appending to file: ./.rocksdb/000101.sst: No space left on device) for query 'CREATE INDEX k_2 ON sbtest2(k)'
FATAL: `sysbench.cmdline.call_command' function failed: /usr/share/sysbench/oltp_common.lua:238: SQL error, errno = 1105, state = 'HY000': [./.rocksdb/db2.sbtest2_k_2_5_0.bulk_load.tmp] bulk load error: IO error: No space left on deviceWhile appending to file: ./.rocksdb/000101.sst: No space left on device
FATAL: mysql_drv_query() returned error 1105 ([./.rocksdb/db2.sbtest3_k_3_6_0.bulk_load.tmp] bulk load error: IO error: No space left on deviceWhile appending to file: ./.rocksdb/000101.sst: No space left on device) for query 'CREATE INDEX k_3 ON sbtest3(k)'
FATAL: `sysbench.cmdline.call_command' function failed: /usr/share/sysbench/oltp_common.lua:238: SQL error, errno = 1105, state = 'HY000': [./.rocksdb/db2.sbtest3_k_3_6_0.bulk_load.tmp] bulk load error: IO error: No space left on deviceWhile appending to file: ./.rocksdb/000101.sst: No space left on device
FATAL: mysql_drv_query() returned error 1105 ([./.rocksdb/db2.sbtest1_k_1_7_0.bulk_load.tmp] bulk load error: IO error: No space left on deviceWhile appending to file: ./.rocksdb/000101.sst: No space left on device) for query 'CREATE INDEX k_1 ON sbtest1(k)'
FATAL: `sysbench.cmdline.call_command' function failed: /usr/share/sysbench/oltp_common.lua:238: SQL error, errno = 1105, state = 'HY000': [./.rocksdb/db2.sbtest1_k_1_7_0.bulk_load.tmp] bulk load error: IO error: No space left on deviceWhile appending to file: ./.rocksdb/000101.sst: No space left on device

This resulted in the following errors being printed to the error log:

2018-08-10T11:01:35.379611Z 5 [Note] RocksDB: Begin index creation (0,260)
2018-08-10T11:01:40.105596Z 3 [Note] RocksDB: Begin index creation (0,261)
2018-08-10T11:01:40.838902Z 4 [Note] RocksDB: Begin index creation (0,262)
2018-08-10T11:01:40.924093Z 6 [Note] RocksDB: Begin index creation (0,263)
2018-08-10T11:03:14.868958Z 0 [ERROR] RocksDB: Error detected in background, Status Code: 5, Status: IO error: No space left on deviceWhile appending to file: ./.rocksdb/000101.sst: No space left on device
2018-08-10T11:03:14.868986Z 0 [ERROR] RocksDB: BackgroundErrorReason: 0
2018-08-10T11:03:14.869003Z 0 [ERROR] LibRocksDB:[/mnt/workspace/percona-server-5.7-binaries-release-rocks/label_exp/min-stretch-x64/percona-server-5.7.22-22/storage/rocksdb/rocksdb/db/db_impl_compaction_flush.cc:1293] Waiting after background flush error: IO error: No space left on deviceWhile appending to file: ./.rocksdb/000101.sst: No space left on deviceAccumulated background error counts: 1
2018-08-10T11:03:15.430064Z 0 [ERROR] LibRocksDB:[/mnt/workspace/percona-server-5.7-binaries-release-rocks/label_exp/min-stretch-x64/percona-server-5.7.22-22/storage/rocksdb/rocksdb/db/db_impl_compaction_flush.cc:1373] Waiting after background compaction error: IO error: No space left on deviceWhile appending to file: ./.rocksdb/000102.sst: No space left on device, Accumulated background error counts: 2
2018-08-10T11:03:18.538866Z 9 [Note] RocksDB: Begin index creation (0,268)
2018-08-10T11:03:19.897797Z 9 [ERROR] Error finishing bulk load.
2018-08-10T11:03:23.306957Z 10 [Note] RocksDB: Begin index creation (0,269)
2018-08-10T11:03:23.509448Z 8 [Note] RocksDB: Begin index creation (0,270)
2018-08-10T11:03:23.539705Z 7 [Note] RocksDB: Begin index creation (0,271)
2018-08-10T11:03:24.536092Z 10 [ERROR] Error finishing bulk load.
2018-08-10T11:03:24.659233Z 7 [ERROR] Error finishing bulk load.
2018-08-10T11:03:24.736380Z 8 [ERROR] Error finishing bulk load.

OK, so I resized my data partition online (no MySQL service restart) and gave it more free disk space. And here is what happened next. I tried to see if MySQL/MyRocks works now:

mysql [localhost] {msandbox} (db1) > select count(*) from sbtest1;
+----------+
| count(*) |
+----------+
| 1000000 |
+----------+
1 row in set (0.33 sec)
mysql [localhost] {msandbox} (db1) > insert into sbtest1 set k=1;
Query OK, 1 row affected (0.01 sec)
mysql [localhost] {msandbox} (db1) > select count(*) from sbtest1;
+----------+
| count(*) |
+----------+
| 1000000 |
+----------+
1 row in set (0.32 sec)
mysql [localhost] {msandbox} (db1) > delete from db1.sbtest2 where id<100;
Query OK, 99 rows affected (0.01 sec)
mysql [localhost] {msandbox} (db1) > select count(*) from db1.sbtest2;
+----------+
| count(*) |
+----------+
| 1000000 |
+----------+
1 row in set (0.32 sec)

That’s the worst part – there’s no error returned to the client, and yet my new writes are basically gone!

The good thing is we can still read from the tables, so mysqldump or mydumper is the way to go now, BEFORE you attempt to restart the service:

mysql [localhost] {msandbox} (db1) > select id from db1.sbtest2 where id<10;
+----+
| id |
+----+
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
| 6 |
| 7 |
| 8 |
| 9 |
+----+
9 rows in set (0.00 sec)
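
Since reads still work, a dump taken at this point might look like this (a sketch; the socket, credentials and output path are assumptions based on the sandbox above):

$ mysqldump --socket=/tmp/mysql_sandbox5723.sock -u root -pmsandbox --databases db1 db2 > /tmp/rocksdb_rescue.sql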

When I later try to restart the service, it crashes on shutdown. In some cases, the database is able to recover and offer access to the old RocksDB data (though the new data, written above, are still lost).

In other cases, though, your RocksDB won’t load after restart:

mysql [localhost] {msandbox} (db1) > show create table sbtest1\G
ERROR 1286 (42000): Unknown storage engine 'ROCKSDB'

and using this option:

rocksdb_wal_recovery_mode = 3

…helps to recover the engine to a usable state.

Ultimately, the worst case scenario is that you end up with the MyRocks engine dead and unable to recover, no matter what options you try. We can see this in the error log when I try to start the service:

2018-08-10T09:33:18.756868Z 0 [Note] RocksDB: 2 column families found
2018-08-10T09:33:18.756993Z 0 [Note] RocksDB: Column Families at start:
2018-08-10T09:33:18.757016Z 0 [Note] cf=default
2018-08-10T09:33:18.757022Z 0 [Note] write_buffer_size=67108864
2018-08-10T09:33:18.757027Z 0 [Note] target_file_size_base=67108864
2018-08-10T09:33:18.757043Z 0 [Note] cf=__system__
2018-08-10T09:33:18.757048Z 0 [Note] write_buffer_size=67108864
2018-08-10T09:33:18.757052Z 0 [Note] target_file_size_base=67108864
2018-08-10T09:33:18.811671Z 0 [ERROR] RocksDB: Could not get index information for Index Number (0,282), table db3.sbtest4
2018-08-10T09:33:18.811741Z 0 [ERROR] RocksDB: Failed to initialize DDL manager.
2018-08-10T09:33:18.811795Z 0 [ERROR] Plugin 'ROCKSDB' init function returned error.
2018-08-10T09:33:18.811812Z 0 [ERROR] Plugin 'ROCKSDB' registration as a STORAGE ENGINE failed.

Unfortunately, we were not able to find any workaround for this ultimate worst case situation, and so recovering data—including all other RocksDB tables not touched during the incident—may be very hard.

The problem, which is obviously a bug, is reported here: https://jira.percona.com/browse/PS-4706.

Fortunately, the engineers at Facebook—the original RocksDB founder—are aware of the issue and have already released a simple patch. This allows the engine simply to crash instead of allowing lost writes. While this is not a perfect solution, it at least protects you from losing data. Related commit details can be found here: https://github.com/facebook/mysql-5.6/commit/4a7d3fb8f96c96c2be10d61a7796ccac7610a5d6.

At Percona, we accepted that fix, and the latest Percona Server for MySQL 5.7.23 already has it incorporated: https://github.com/percona/percona-server/commit/d0d5635573ebc4f157e36a989d14f7ee6f87c410

So, with the upgraded version, we instead will see this kind of error log event:

2018-09-19T08:12:14.496412Z 0 [ERROR] RocksDB: Error detected in background, Status Code: 5, Status: IO error: No space left on deviceWhile appending to file: ./.rocksdb/000104.sst: No space left on device
2018-09-19T08:12:14.496435Z 0 [ERROR] RocksDB: BackgroundErrorReason: 1
2018-09-19T08:12:14.496451Z 0 [ERROR] LibRocksDB:[/mnt/workspace/percona-server-5.7-binaries-release-rocks-new/label_exp/min-stretch-x64/test/percona-server-5.7.23-23/storage/rocksdb/rocksdb/db/db_impl_compactio
n_flush.cc:1472] Waiting after background compaction error: IO error: No space left on deviceWhile appending to file: ./.rocksdb/000104.sst: No space left on device, Accumulated background error counts: 1
2018-09-19T08:12:14.511910Z 12 [ERROR] RocksDB: failed to write to WAL, Status Code: 5, Status: IO error: No space left on deviceWhile appending to file: ./.rocksdb/000104.sst: No space left on device
2018-09-19T08:12:14.511935Z 12 [ERROR] MyRocks: aborting on WAL write error.
08:12:14 UTC - mysqld got signal 6 ;

Therefore, all MyRocks users are advised to upgrade ASAP, and if that’s not possible, you should at least double-check your disk space monitoring and alerting.
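
As a very small illustration of the latter, even a cron-driven one-liner is better than nothing (a sketch; the data directory path and the 90% threshold are assumptions):

$ df -P /var/lib/mysql | awk 'NR==2 && $5+0 > 90 {print "MyRocks data directory is", $5, "full"}'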

The post MyRocks Disk Full Edge Case appeared first on Percona Database Performance Blog.

by Przemysław Malkowski at September 20, 2018 10:00 AM

September 19, 2018

MariaDB Foundation

MariaDB Foundation CEO steps down

The time has come for me, Otto Kekäläinen, after serving as the CEO of the MariaDB Foundation since January 2015, to step down. The board has already quietly been searching for a new CEO for some months, but no contract has been signed as of yet, so if you are interested in the position, please […]

The post MariaDB Foundation CEO steps down appeared first on MariaDB.org.

by Otto Kekäläinen at September 19, 2018 12:23 PM

Jean-Jerome Schmidt

How to Monitor Multiple MySQL Instances Running on the Same Machine - ClusterControl Tips & Tricks

Requires ClusterControl 1.6 or later. Applies to MySQL based instances/clusters.

On some occasions, you might want to run multiple instances of MySQL on a single machine. You might want to give different users access to their own MySQL servers that they manage themselves, or you might want to test a new MySQL release while keeping an existing production setup undisturbed.

It is possible to use a different MySQL server binary per instance, or use the same binary for multiple instances (or a combination of the two approaches). For example, you might run a server from MySQL 5.6 and one from MySQL 5.7, to see how the different versions handle a certain workload. Or you might run multiple instances of the latest MySQL version, each managing a different set of databases.

Whether or not you use distinct server binaries, each instance that you run must be configured with unique values for several operating parameters. This eliminates the potential for conflict between instances. You can use MySQL Sandbox to create multiple MySQL instances. Or you can use mysqld_multi available in MySQL to start or stop any number of separate mysqld processes running on different TCP/IP ports and UNIX sockets.
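
For example, with [mysqld1] and [mysqld2] option groups defined in the MySQL configuration file, mysqld_multi can drive the instances like this (a sketch; the group numbers are assumptions):

$ mysqld_multi report       # show which of the defined instances are running
$ mysqld_multi start 1,2    # start the instances defined by the [mysqld1] and [mysqld2] groups
$ mysqld_multi stop 2       # stop only the second instance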

In this blog post, we’ll show you how to configure ClusterControl to monitor multiple MySQL instances running on one host.

ClusterControl Limitation

At the time of writing, ClusterControl does not support monitoring of multiple instances on one host per cluster/server group. It assumes the following best practices:

  • Only one MySQL instance per host (physical server or virtual machine).
  • MySQL data redundancy should be configured on N+1 servers.
  • All MySQL instances are running with uniform configuration across the cluster/server group, e.g., listening port, error log, datadir, basedir, socket are identical.

With regards to the points mentioned above, ClusterControl assumes that in a cluster/server group:

  • MySQL instances are configured uniformly across a cluster; same port, the same location of logs, base/data directory and other critical configurations.
  • It monitors, manages and deploys only one MySQL instance per host.
  • MySQL client must be installed on the host and available on the executable path for the corresponding OS user.
  • MySQL is bound to an IP address reachable by the ClusterControl node.
  • It keeps monitoring the host statistics, e.g., CPU/RAM/disk/network, for each MySQL instance individually. In an environment with multiple instances per host, you should expect redundant host statistics since it monitors the same host multiple times.

With the above assumptions, the following ClusterControl features do not work for a host with multiple instances:

Backup - Percona Xtrabackup does not support multiple instances per host and mysqldump executed by ClusterControl only connects to the default socket.

Process management - ClusterControl uses the standard ‘pgrep -f mysqld_safe’ to check if MySQL is running on that host. With multiple MySQL instances, this is a false positive approach. As such, automatic recovery for node/cluster won’t work.
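
You can see why on a multi-instance host such as the sandbox built later in this post:

$ pgrep -af mysqld_safe    # prints one line per running instance, so a bare match
                           # cannot tell which particular instance has gone down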

Configuration management - ClusterControl provisions the standard MySQL configuration directory. It usually resides under /etc/ and /etc/mysql.

Workaround

Monitoring multiple MySQL instances on a machine is still possible with ClusterControl with a simple workaround. Each MySQL instance must be treated as a single entity per server group.

In this example, we have 3 MySQL instances on a single host created with MySQL Sandbox:

ClusterControl monitoring multiple instances on same host
ClusterControl monitoring multiple instances on same host

We created our MySQL instances using the following commands:

$ su - sandbox
$ make_multiple_sandbox mysql-5.7.23-linux-glibc2.12-x86_64.tar.gz

By default, MySQL Sandbox creates MySQL instances that listen on 127.0.0.1. It is necessary to configure each node appropriately to make them listen on all available IP addresses. Here is the summary of our MySQL instances on the host:

[sandbox@master multi_msb_mysql-5_7_23]$ cat default_connection.json 
{
"node1":  
    {
        "host":     "master",
        "port":     "15024",
        "socket":   "/tmp/mysql_sandbox15024.sock",
        "username": "msandbox@127.%",
        "password": "msandbox"
    }
,
"node2":  
    {
        "host":     "master",
        "port":     "15025",
        "socket":   "/tmp/mysql_sandbox15025.sock",
        "username": "msandbox@127.%",
        "password": "msandbox"
    }
,
"node3":  
    {
        "host":     "master",
        "port":     "15026",
        "socket":   "/tmp/mysql_sandbox15026.sock",
        "username": "msandbox@127.%",
        "password": "msandbox"
    }
}

The next step is to modify the configuration of the newly created instances. Go to my.cnf for each of them and comment out (hash) the bind_address variable:

[sandbox@master multi_msb_mysql-5_7_23]$ ps -ef | grep mysqld_safe
sandbox  13086     1  0 08:58 pts/0    00:00:00 /bin/sh bin/mysqld_safe --defaults-file=/home/sandbox/sandboxes/multi_msb_mysql-5_7_23/node1/my.sandbox.cnf
sandbox  13805     1  0 08:58 pts/0    00:00:00 /bin/sh bin/mysqld_safe --defaults-file=/home/sandbox/sandboxes/multi_msb_mysql-5_7_23/node2/my.sandbox.cnf
sandbox  14065     1  0 08:58 pts/0    00:00:00 /bin/sh bin/mysqld_safe --defaults-file=/home/sandbox/sandboxes/multi_msb_mysql-5_7_23/node3/my.sandbox.cnf
[sandbox@master multi_msb_mysql-5_7_23]$ vi my.cnf
#bind_address = 127.0.0.1

Then install the MySQL client on your master node and restart all instances using the restart_all script.

[sandbox@master multi_msb_mysql-5_7_23]$ yum install mysql
[sandbox@master multi_msb_mysql-5_7_23]$ ./restart_all  
# executing "stop" on /home/sandbox/sandboxes/multi_msb_mysql-5_7_23
executing "stop" on node 1
executing "stop" on node 2
executing "stop" on node 3
# executing "start" on /home/sandbox/sandboxes/multi_msb_mysql-5_7_23
executing "start" on node 1
. sandbox server started
executing "start" on node 2
. sandbox server started
executing "start" on node 3
. sandbox server started

From ClusterControl, we need to perform an ‘Import’ for each instance, as we need to isolate them in different groups to make this work.

ClusterControl import existing server
ClusterControl import existing server

For node1, enter the following information in ClusterControl > Import:

ClusterControl import existing server
ClusterControl import existing server

Make sure to put the proper ports (different for each instance) and host (the same for all instances).

You can monitor the progress by clicking on the Activity/Jobs icon in the top menu.

ClusterControl import existing server details
ClusterControl import existing server details

You will see node1 in the UI once ClusterControl finishes the job. Repeat the same steps to add another two nodes with port 15025 and 15026. You should see something like the below once they are added:

ClusterControl Dashboard
ClusterControl Dashboard

There you go. We just added our existing MySQL instances into ClusterControl for monitoring. Happy monitoring!

PS.: To get started with ClusterControl, click here!

by Bart Oles at September 19, 2018 11:17 AM

Peter Zaitsev

Tutorial Schedule for Percona Live Europe 2018 Is Live

Percona Live Europe tutorials and sneak peek

Percona has revealed the line-up of in-depth tutorials for the Percona Live Europe 2018 Open Source Database Conference, taking place November 5–7, 2018 at the Radisson Blu Hotel in Frankfurt, Germany. Secure your spot now with Advanced Registration prices. Be sure to buy your tickets soon, as ticket prices will only head up, not down! Sponsorship opportunities for the conference are still available.

Percona Live Europe 2018 Open Source Database Conference is the premier open source database event. Our theme this year is “Connect. Accelerate. Innovate.”  Percona Live is the place to learn about how open source database technology can power your applications, improve your websites and solve your critical database issues.

Monday, November 5: Tutorial Day

Tutorials take place throughout the day on Monday, November 5. Tutorials are three hours long and provide practical, in-depth knowledge exchange on critical open source database issues. The line-up includes:

  • Query Optimization with MySQL 8.0 and MariaDB 10.3: The Basics with Jaime Crespo – Wikimedia Foundation
  • Hands on ProxySQL with René Cannaò – ProxySQL
  • ElasticSearch 101 with Antonios Giannopoulos – ObjectRocket
  • MySQL Performance Schema in Action: the Complete Tutorial with Sveta Smirnova and Alexander Rubin – Percona
  • MySQL InnoDB Cluster in a Nutshell : The Saga Continues with 8.0 with Frédéric Descamps – Oracle
  • Introduction to PostgreSQL for MySQL and Oracle DBAs with Avinash Vallarapu – Percona
  • InnoDB Architecture and Optimization with Peter Zaitsev – Percona
  • MongoDB: Replica Sets and Sharded Clusters with Adamo Tonete – Percona
  • High Availability PostgreSQL and Kubernetes with Google Cloud presented by Alexis Guajardo – Google
  • Open Source Database Performance Optimization and Monitoring with PMM with Michael Coburn, Vinicius Grippa, and Avinash Vallarapu – Percona
  • Percona XtraDB Cluster Tutorial presented by Tibor Köröcz – Percona
  • Mastering PostgreSQL Administration with Bruce Momjian – EnterpriseDB

and a sneak peek at some of the sessions

Of course, we have a stellar line-up of talks, too! Here’s a tantalising glimpse of just some of the talks you could MISS if you don’t head to Frankfurt in November.

Tuesday 6th November

  • Percona Server 8.0 – Laurynas Biveinis – Percona
  • MySQL 8.0 Performance: Scalability & Benchmarks – Dimitri Kravtchuk – Oracle
  • MySQL Group Replication : the magic explained – Frédéric Descamps – Oracle
  • Explaining the Postgres Query Optimizer – Bruce Momjian – EnterpriseDB
  • TLS for MySQL at large scale – Jaime Crespo – Wikimedia Foundation
  • BlaBlaCar – 100% Containers Powered Carpooling – Maxime Fouilleul – BlaBlaCar
  • A Year in Google Cloud – Carmen Mason, Alan Mason – VitalSource Technologies
  • Demystifying MySQL Replication Crash Safety – Jean-François Gagné
  • MongoDB Shard 101 – Adamo Tonete – Percona
  • Highway to Hell or Stairway to Cloud? – Alexander Kukushkin – Zalando
  • MongoDB administration cool tips – Gabriel Ciciliani – Pythian

Wednesday 7th November

  • PostgreSQL Enterprise Features – Michael Banck – credativ GmbH
  • Open Source Databases and Non-Volatile Memory – Frank Ober – Intel Memory Group
  • MariaDB 10.3 Optimizer and beyond – Vicentiu Ciorbaru – MariaDB Foundation
  • MariaDB system-versioned tables – Federico Razzoli – PayProp

A shout out to our fantastic Conference Committee who have been working hard to review the tutorial and talk submissions: we had over 200! Thank you!

The Radisson Blu Hotel, Frankfurt

Percona Live 2018 Open Source Database Conference will be held at the Radisson Blu Hotel, Frankfurt Franklinstraße 65, 60486 Frankfurt am Main, Germany

The Radisson Blu enjoys an enviable location in the Bockenheim District, just off the A66 motorway – only one kilometer from Messe Frankfurt, one of the world’s largest exhibition complexes. They’re also just 10 minutes from the city center, and Frankfurt International Airport (FRA) is a quick 15-minute drive away.

Book your hotel using Percona’s special room block rate, available only until September 20.

Sponsorships

Sponsorship opportunities for Percona Live Europe 2018 Open Source Database Conference are available and offer the opportunity to interact with the DBAs, sysadmins, developers, CTOs, CEOs, business managers, technology evangelists, solution vendors, and entrepreneurs who typically attend the event. Contact events@percona.com for sponsorship details.

The post Tutorial Schedule for Percona Live Europe 2018 Is Live appeared first on Percona Database Performance Blog.

by Bronwyn Campbell at September 19, 2018 07:53 AM

September 18, 2018

Peter Zaitsev

Percona Server for MongoDB 3.2.21-3.12 Is Now Available

Percona Server for MongoDB

Percona announces the release of Percona Server for MongoDB 3.2.21-3.12 on September 18, 2018. Download the latest version from the Percona web site or the Percona Software Repositories.

Percona Server for MongoDB is an enhanced, open source, and highly-scalable database that is a fully-compatible, drop-in replacement for MongoDB 3.2 Community Edition. It supports MongoDB 3.2 protocols and drivers.

Percona Server for MongoDB extends MongoDB Community Edition functionality by including the Percona Memory Engine, as well as several enterprise-grade features. It requires no changes to MongoDB applications or code.

This release updates the MongoDB portion of the Percona Server for MongoDB code base to 3.2.21.

The Percona Server for MongoDB 3.2.21-3.12 release notes are available in the official documentation.

The post Percona Server for MongoDB 3.2.21-3.12 Is Now Available appeared first on Percona Database Performance Blog.

by Dmitriy Kostiuk at September 18, 2018 06:28 PM

Percona XtraDB Cluster 5.6.41-28.28 Is Now Available


Percona announces the release of Percona XtraDB Cluster 5.6.41-28.28 (PXC) on September 18, 2018. Binaries are available from the downloads section or our software repositories.

Percona XtraDB Cluster 5.6.41-28.28 is now the current release, based on the following:

Fixed Bugs

  • PXC-1017: Memcached API is now disabled if node is acting as a cluster node, because InnoDB Memcached access is not replicated by Galera.
  • PXC-2164: SST script compatibility with SELinux was improved by forcing it to look only for the port associated with the said process.
  • PXC-2155: Temporary folders created during SST execution are now deleted on cleanup.
  • PXC-2199: TOI replication protocol was fixed to prevent unexpected GTID generation caused by the  DROP TRIGGER IF EXISTS statement logged by MySQL as a successful one due to its IF EXISTS clause.

Help us improve our software quality by reporting any bugs you encounter using our bug tracking system. As always, thanks for your continued support of Percona!

The post Percona XtraDB Cluster 5.6.41-28.28 Is Now Available appeared first on Percona Database Performance Blog.

by Dmitriy Kostiuk at September 18, 2018 06:11 PM

Kentoku SHIBA

Jean-Jerome Schmidt

Custom Graphs to Monitor your MySQL, MariaDB, MongoDB and PostgreSQL Systems - ClusterControl Tips & Tricks

Graphs are important, as they are your window onto your monitored systems. ClusterControl comes with a predefined set of graphs for you to analyze; these are built on top of the metric sampling done by the controller. They are designed to give you, at first glance, as much information as possible about the state of your database cluster. You might have your own set of metrics you’d like to monitor though. Therefore, ClusterControl allows you to customize the graphs available in the cluster overview section and in the Nodes -> DB Performance tab. Multiple metrics can be overlaid on the same graph.

Cluster Overview tab

Let’s take a look at the cluster overview - it shows the most important information aggregated under different tabs.

Cluster Overview Graphs
Cluster Overview Graphs

You can see graphs like “Cluster Load” and “Galera - Flow Ctrl” along with couple of others. If this is not enough for you, you can click on “Dash Settings” and then pick “Create Board” option. From there, you can also manage existing graphs - you can edit a graph by double-clicking on it, you can also delete it from the tab list.

Dashboard Settings
Dashboard Settings

When you decide to create a new graph, you’ll be presented with an option to pick metrics that you’d like to monitor. Let’s assume we are interested in monitoring temporary objects - tables, files and tables on disk. We just need to pick all three metrics we want to follow and add them to our new graph.

New Board 1
New Board 1

Next, pick a name for the new graph and pick a scale. Most of the time you want the scale to be linear, but in some rare cases, like when you mix metrics containing large and small values, you may want to use a logarithmic scale instead.

New Board 2
New Board 2

Finally, you can pick if your template should be presented as a default graph. If you tick this option, this is the graph you will see by default when you enter the “Overview” tab.

Once we save the new graph, you can enjoy the result:

New Board 3
New Board 3

Node Overview tab

In addition to the graphs on our cluster, we can also use this functionality on each of our nodes independently. If we go to the “Nodes” section of the cluster and select one of them, we can see an overview of it, with operating system metrics:

Node Overview Graphs
Node Overview Graphs

As we can see, we have eight graphs with information about CPU usage, Network usage, Disk space, RAM usage, Disk utilization, Disk IOPS, Swap space and Network errors, which we can use as a starting point for troubleshooting on our nodes.

DB Performance tab

When you take a look at a node and then go to the DB Performance tab, you’ll be presented with a default set of eight different metrics. You can change them or add new ones. To do that, you need to use the “Choose Graph” button:

DB Performance Graphs
DB Performance Graphs

You’ll be presented with a new window, that allows you to configure the layout and the metrics graphed.

DB Performance Graphs Settings
DB Performance Graphs Settings

Here you can pick the layout - two or three columns of graphs - and the number of graphs, up to 20. Then, you may want to modify which metrics you’d like to see plotted - use the drop-down dialog boxes to pick whatever metric you’d like to add. Once you are ready, save the graphs and enjoy your new metrics.

We can also use the Operational Reports feature of ClusterControl, where we obtain the graphs and the report of our cluster and nodes in an HTML report that can be accessed through the ClusterControl UI, or scheduled to be sent by email periodically.

These graphs help us to have a complete picture of the state and behavior of our databases.

by Sebastian Insausti at September 18, 2018 01:26 PM

Peter Zaitsev

Monitoring Processes with Percona Monitoring and Management

Memory utilization during compaction process

A few months ago I wrote a blog post on How to Capture Per Process Metrics in PMM. Since that time, Nick Cabatoff has made a lot of improvements to Process Exporter and I’ve improved the Grafana Dashboard to match.

I will not go through installation instructions; they are well covered in the original blog post. This post covers features available in release 0.4.0. Here are a few new features you might find of interest:

Used Memory

Memory usage in Linux is complicated. You can look at resident memory, which shows how much space is used in RAM. However, if you have a substantial part of a process swapped out because of memory pressure, you would not see it. You can also look at virtual memory, but it will include a lot of address space which was allocated and never mapped either to RAM or to swap space. Especially for processes written in Go, the difference can be extreme. Let’s look at the process exporter itself: it uses 20MB of resident memory but over 2GB of virtual memory.
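
You can see the same distinction with plain ps; for example (a sketch, and the binary name may differ in your installation):

$ ps -o pid,rss,vsz,comm -C process-exporter    # RSS (resident) vs VSZ (virtual), in KiB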

top processes by resident memory

top processes by virtual memory

Meet the Used Memory dashboard, which shows the sum of resident memory used by the process and swap space used:

used memory dashboard

There is a dashboard to see processes by swap space used as well, so you can see if some processes that you expect to be resident are swapped out.

Processes by Disk IO

processes by disk io

Processes by Disk IO is another graph which I often find very helpful. It is most useful for catching the unusual suspects, when the process causing the IO is not totally obvious.

Context Switches

Context switches, as shown by VMSTAT, are often seen as an indication of contention. With per-process contention stats you can see which of the processes are having those context switches.

top processes by voluntary context switches

Note: while a large number of context switches can be a cause of high contention, some applications and workloads are just designed that way. You are better off looking at the change in the number of context switches, rather than at the raw number.

CPU and Disk IO Saturation

As Brendan Gregg tells us, utilization and saturation are not the same. While CPU usage and Disk IO usage graphs show us resource utilization by different processes, they do not show saturation.

top running processes graph

For example, if you have four CPU cores then you can’t get more than four CPU cores used by any process, whether there are four or four hundred concurrent threads trying to run.

While being rather volatile as gauge metrics, top running processes and top processes waiting on IO are good metrics to understand which processes are prone to saturation.

These graphs roughly provide a per-process breakdown of the “r” and “b” VMSTAT columns.
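
For reference, the system-wide counterparts come straight from vmstat:

$ vmstat 1 5    # "r" counts runnable processes, "b" those blocked waiting on IO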

Kernel Waits

Finally, you can see which kernel function (WCHAN) the process is sleeping on, which can be very helpful for assessing processes which are not using a lot of CPU, but are not making much progress either.

I find this graph most useful if you pick the single process in the dashboard picker:

kernel waits for sysbench

In this graph we can see sysbench has most threads sleeping in

unix_stream_read_generic
 which corresponds to reading the response from MySQL over the UNIX socket – exactly what you would expect!

Summary

If you ever need to understand what different processes are doing in your system, then Nick’s Process Exporter is a fantastic tool to have. It takes just a few minutes to get it added into your PMM installation.

If you enjoyed this post…

You might also like my pre-recorded webinar MySQL troubleshooting and performance optimization with PMM.

The post Monitoring Processes with Percona Monitoring and Management appeared first on Percona Database Performance Blog.

by Peter Zaitsev at September 18, 2018 11:39 AM

Upcoming Webinar Wed 9/19: Percona Server for MongoDB vs MongoDB Enterprise

MongoDB Enterprise Advanced

Please join Percona’s Senior Support Engineer, Adamo Tonete, as he presents Percona Server for MongoDB vs MongoDB Enterprise on Wednesday, September 19th, 2018, at 12:30 PM PDT (UTC-7) / 3:30 PM EDT (UTC-4).

 

In this webinar we will evaluate MongoDB Community, MongoDB Enterprise Advanced, and Percona Server for MongoDB side-by-side to better inform the decision making process. Percona Server for MongoDB features enterprise-grade functionalities and runs on tools like Percona Monitoring and Management to monitor MongoDB. Even more, it is 100% free and open source. It is also worth mentioning that MongoDB Enterprise Advanced comes with encryption and LDAP authorization. Accordingly, we will walk through those features and much more.

Register for this webinar to learn more about Percona Server for MongoDB vs MongoDB Enterprise.

The post Upcoming Webinar Wed 9/19: Percona Server for MongoDB vs MongoDB Enterprise appeared first on Percona Database Performance Blog.

by Adamo Tonete at September 18, 2018 12:08 AM

September 17, 2018

Peter Zaitsev

Percona Server for MongoDB 3.6.7-1.5 Is Now Available

Percona Server for MongoDB

Percona announces the release of Percona Server for MongoDB 3.6.7-1.5 on September 17, 2018. Download the latest version from the Percona web site or the Percona Software Repositories.

Percona Server for MongoDB is an enhanced, open source, and highly-scalable database that is a fully-compatible, drop-in replacement for MongoDB 3.6 Community Edition. It supports MongoDB 3.6 protocols and drivers.

Percona Server for MongoDB extends MongoDB Community Edition functionality by including the Percona Memory Engine, as well as several enterprise-grade features. Percona Server for MongoDB requires no changes to MongoDB applications or code.

This release is based on MongoDB 3.6.7 and does not include any additional changes.

The Percona Server for MongoDB 3.6.7-1.5 release notes are available in the official documentation.

The post Percona Server for MongoDB 3.6.7-1.5 Is Now Available appeared first on Percona Database Performance Blog.

by Dmitriy Kostiuk at September 17, 2018 06:28 PM

Jean-Jerome Schmidt

Become a ClusterControl DBA: Performance and Health Monitoring

In the previous two blog posts we covered deploying the four types of clustering/replication (MySQL/Galera, MySQL Replication, MongoDB & PostgreSQL) and managing/monitoring your existing databases and clusters. After reading those first two posts, you were able to add your 20 existing replication setups to ClusterControl, expand them, and deploy two new Galera clusters while doing a ton of other things. Or maybe you deployed MongoDB and/or PostgreSQL systems. So now, how do you keep them healthy?

That’s exactly what this blog post is about: how to leverage ClusterControl performance monitoring and advisors functionality to keep your MySQL, MongoDB and/or PostgreSQL databases and clusters healthy. So how is this done in ClusterControl?

Database Cluster List

The most important information can already be found in the cluster list: as long as there are no alarms and no hosts are shown to be down, everything is functioning fine. An alarm is raised if a certain condition is met, e.g. a host is swapping, and it brings to your attention an issue you should investigate. That means alarms are raised not only during an outage, but also to allow you to manage your databases proactively.

Suppose you log into ClusterControl and see a cluster listing like this; you will definitely have something to investigate: one node is down in the Galera cluster, for example, and every cluster has various alarms:

Once you click on one of the alarms, you will go to a detailed page on all alarms of the cluster. The alarm details will explain the issue and in most cases also advise the action to resolve the issue.

You can set up your own alarms by creating custom expressions, but that has been deprecated in favor of our new Developer Studio, which allows you to write custom JavaScript and execute it as Advisors. We will get back to this topic later in this post.

Cluster Overview - Dashboards

When opening up the cluster overview, we can immediately see the most important performance metrics for the cluster in the tabs. This overview may differ per cluster type as, for instance, Galera has different performance metrics to watch than traditional MySQL, PostgreSQL or MongoDB.

Both the default overview and the pre-selected tabs are customizable. By clicking on Overview -> Dash Settings you are given a dialogue that allows you to define the dashboard:

By pressing the plus sign you can add and define your own metrics to graph on the dashboard. In our case we will define a new dashboard featuring the Galera-specific send and receive queue averages:

This new dashboard should give us good insight into the average queue length of our Galera cluster.

Once you have pressed save, the new dashboard will become available for this cluster:

Similarly, you can do this for PostgreSQL as well; for example, we can monitor the shared blocks hit versus blocks read:

So as you can see, it is relatively easy to customize your own (default) dashboard.

Cluster Overview - Query Monitor

The Query Monitor tab is available for both MySQL and PostgreSQL based setups and consists of three dashboards: Top Queries, Running Queries and Query Outliers.

In the Running Queries dashboard, you will find all queries that are currently running. This is basically the equivalent of the SHOW FULL PROCESSLIST statement in MySQL.

Top Queries and Query Outliers both rely on input from the slow query log or Performance Schema. Using Performance Schema is always recommended, and it will be used automatically if enabled. Otherwise, ClusterControl will use the MySQL slow query log to capture the running queries. To prevent ClusterControl from being too intrusive and the slow query log from growing too large, ClusterControl samples the slow query log by turning it on and off. By default this loop captures for 1 second, and long_query_time is set to 0.5 seconds. If you wish to change these settings for your cluster, you can do so via Settings -> Query Monitor.
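
If you are curious what that sampling amounts to on the MySQL side, it is roughly the following; this is an illustrative sketch, not ClusterControl's exact commands, with the 0.5 second threshold matching the default mentioned above.

# Turn the slow query log on with a low threshold...
mysql -e "SET GLOBAL slow_query_log = ON; SET GLOBAL long_query_time = 0.5;"
# ...capture queries for about a second, then switch the log back off
mysql -e "SET GLOBAL slow_query_log = OFF;"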

Top Queries will, as the name says, show the top queries that were sampled. You can sort them on various columns: for instance, frequency, average execution time, total execution time or standard deviation:

You can get more details about a query by selecting it; this will present the query execution plan (if available) and optimization hints/advisories. Query Outliers is similar to Top Queries but also allows you to filter the queries per host and compare them over time.

ClusterControl
Single Console for Your Entire Database Infrastructure
Find out what else is new in ClusterControl

Cluster Overview - Operations

Similar to the PostgreSQL and MySQL systems, MongoDB clusters have an Operations overview, which is the counterpart of MySQL's Running Queries. This overview is similar to issuing the db.currentOp() command within MongoDB.
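
For reference, the same information can be pulled by hand from the legacy mongo shell; a minimal sketch, assuming the shell is installed on the node.

# Dump the operations currently in progress on the MongoDB node
mongo --quiet --eval 'printjson(db.currentOp())'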

Cluster Overview - Performance

MySQL/Galera

The performance tab is probably the best place to find the overall performance and health of your clusters. For MySQL and Galera it consists of an Overview page, the Advisors, status/variables overviews, the Schema Analyzer and the Transaction log.

The Overview page will give you a graph overview of the most important metrics in your cluster. This is, obviously, different per cluster type. Eight metrics have been set by default, but you can easily set your own - up to 20 graphs if needed:

The Advisors are one of the key features of ClusterControl: they are scripted checks that can be run on demand. The advisors can evaluate almost any fact known about the host and/or cluster, give their opinion on the health of the host and/or cluster, and even offer advice on how to resolve issues or improve your hosts!

The best part is yet to come: you can create your own checks in the Developer Studio (ClusterControl -> Manage -> Developer Studio), run them on a regular interval and use them again in the Advisors section. We blogged about this new feature earlier this year.

We will skip the status/variables overview of MySQL and Galera as this is useful for reference but not for this blog post: it is good enough that you know it is here.

Now suppose your database is growing but you want to know how fast it grew in the past week. You can actually keep track of the growth of both data and index sizes from right within ClusterControl:

And next to the total growth on disk it can also report back the top 25 largest schemas.

Another important feature is the Schema Analyzer within ClusterControl:

ClusterControl will analyze your schemas and look for redundant indexes, MyISAM tables and tables without a primary key. Of course, it is entirely up to you whether to keep a table without a primary key, as some application might have created it this way, but at least it is great to get the advice here for free. The Schema Analyzer even recommends the necessary ALTER statement to fix the problem.

PostgreSQL

For PostgreSQL the Advisors, DB Status and DB Variables can be found here:

MongoDB

For MongoDB the Mongo Stats and performance overview can be found under the Performance tab. The Mongo Stats is an overview of the output of mongostat and the Performance overview gives a good graphical overview of the MongoDB opcounters:

Final Thoughts

We showed you how to keep an eye on the most important monitoring and health-checking features of ClusterControl. Obviously this is only the beginning of the journey, as we will soon start another blog series about the Developer Studio capabilities and how you can make the most of your own checks. Also keep in mind that our support for MongoDB and PostgreSQL is not as extensive as our MySQL toolset, but we are continuously improving on this.

You may ask yourself why we have skipped over the performance monitoring and health checks of HAProxy, ProxySQL and MaxScale. We did that deliberately as the blog series covered only deployments of clusters up till now and not the deployment of HA components. So that’s the subject we'll cover next time.

by ashraf at September 17, 2018 12:14 PM

Peter Zaitsev

Using the keyring_vault Plugin with Percona Server for MySQL 5.7

keyring_vault store database encryption keys

This is the first of a two-part series on using the keyring_vault plugin with Percona Server for MySQL 5.7. The second part, Backing up Percona Server for MySQL with keyring_vault plugin enabled, walks through how to use Percona XtraBackup to back up from this instance, restore to another server, and set it up as a slave with the keyring_vault plugin.

What is the keyring_vault plugin?

keyring_vault is a plugin that allows the database to interface with a HashiCorp Vault server to store and secure encryption keys. The Vault server then acts as a centralized encryption key management solution, which is critical for security and for compliance with various security standards.

Configuring Vault

Create SSL certificates to be used by Vault. You can use the sample ssl.conf template below to generate the necessary files.

[root@vault1 ~]# cat /etc/sslkeys/ssl.conf
[req]
distinguished_name = req_distinguished_name
x509_extensions = v3_req
prompt = no
[req_distinguished_name]
C = US
ST = NC
L =  R
O = Percona
CN = *
[v3_req]
subjectKeyIdentifier = hash
authorityKeyIdentifier = keyid,issuer
basicConstraints = CA:TRUE
subjectAltName = @alt_names
[alt_names]
IP = 192.168.0.114

Then run the two commands below to generate the cert and key files and the certificate chain:

$ openssl req -config ssl.conf -x509 -days 365 -batch -nodes -newkey rsa:2048 -keyout vault.key -out vault.crt
$ cat vault.key vault.crt > vault.pem

Once the SSL certificates are created, start Vault with the sample configuration below. Take note that you should follow the suggested best practices when deploying Vault in production; this example is just enough to get a simple working setup.

[root@vault1 ~]# cat /etc/vault.hcl
listener "tcp" {
address = "192.168.0.114:8200"
tls_cert_file="/etc/sslkeys/vault.crt"
tls_key_file="/etc/sslkeys/vault.key"
}
storage "file" {
path = "/var/lib/vault"
}

Assuming Vault started up fine and you are able to unseal Vault, the next step is to create the policy file. For more details on initializing and unsealing Vault, please read the manual here.
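
For completeness, initializing and unsealing a brand-new Vault typically looks like the sketch below. The operator subcommands apply to recent Vault versions (older releases used vault init / vault unseal), and a single key share is only sensible for a lab setup like this one.

export VAULT_ADDR="https://192.168.0.114:8200"
export VAULT_CACERT="/etc/sslkeys/vault.pem"

# Initialize Vault (prints the unseal key and initial root token), then unseal it
vault operator init -key-shares=1 -key-threshold=1
vault operator unseal <unseal-key-from-init-output>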

[root@vault1 ~]# cat /etc/vault/policy/dc1.hcl
path "secret/*" {
capabilities = ["list"]
}
path "secret/dc1/*" {
capabilities = ["create", "read", "delete", "update", "list"]
}

Create a Vault policy named dc1-secrets using the dc1.hcl file like this:

[root@vault1 ~]# vault policy write dc1-secrets /etc/vault/policy/dc1.hcl -ca-cert=/etc/sslkeys/vault.pem
Success! Uploaded policy: dc1-secrets

Next, create a token associated with the newly created policy:

[root@vault1 ~]# vault token create -policy=dc1-secrets -ca-cert=/etc/sslkeys/vault.pem > dc1-token
[root@vault1 ~]# cat dc1-token
Key                  Value
---                  -----
token                be515093-b1a8-c799-b237-8e04ea90ad7a
token_accessor       4c1ba5c5-3fed-e9bb-d230-5bf1392e2d7e
token_duration       8760h
token_renewable      true
token_policies       ["dc1-secrets" "default"]
identity_policies    []
policies             ["dc1-secrets" "default"]

Setting up MySQL

The following instructions should work starting from Percona Server for MySQL 5.7.20-18 and through later versions.

Configure my.cnf with the following variables:

early-plugin-load="keyring_vault=keyring_vault.so"
loose-keyring_vault_config="/var/lib/mysql-keyring/keyring_vault.conf"
encrypt_binlog=ON
innodb_encrypt_online_alter_logs=ON
innodb_encrypt_tables=ON
innodb_temp_tablespace_encrypt=ON
master_verify_checksum=ON
binlog_checksum=CRC32
log_bin=mysqld-bin
binlog_format=ROW
server-id=1
log-slave-updates

Create the keyring_vault.conf file in the path above with the following contents:

[root@mysql1 ~]# cat /var/lib/mysql-keyring/keyring_vault.conf
vault_url = https://192.168.0.114:8200
secret_mount_point = secret/dc1/master
token = be515093-b1a8-c799-b237-8e04ea90ad7a
vault_ca = /etc/vault_ca/vault.pem

Here we are using the vault.pem file generated by combining the vault.crt and vault.key files. Observe that our secret_mount_point is secret/dc1/master. We want to make sure that this mount point is unique across all servers; this is in fact advised in the manual here.

Ensure that the CA certificate is owned by the mysql user:

[root@mysql1 ~]# ls -la /etc/vault_ca/
total 24
drwxr-xr-x  2 mysql mysql   41 Jul 14 11:39 .
drwxr-xr-x 63 root  root  4096 Jul 14 13:17 ..
-rw-------  1 mysql mysql 1139 Jul 14 11:39 vault.pem

Initialize the MySQL data directory on the Master:

[root@mysql1 ~]# mysqld --initialize-insecure --datadir=/var/lib/mysql --user=mysql

For production systems we do not recommend using the --initialize-insecure option; it is used here just to skip additional steps in this tutorial.

Finally, start the mysqld instance and then test the setup by creating an encrypted table.

[root@mysql1 ~]# systemctl status mysqld
● mysqld.service - MySQL Server
Loaded: loaded (/usr/lib/systemd/system/mysqld.service; disabled; vendor preset: disabled)
Active: active (running) since Sat 2018-07-14 23:53:16 UTC; 2s ago
Docs: man:mysqld(8)
http://dev.mysql.com/doc/refman/en/using-systemd.html
Process: 1401 ExecStart=/usr/sbin/mysqld --daemonize --pid-file=/var/run/mysqld/mysqld.pid $MYSQLD_OPTS (code=exited, status=0/SUCCESS)
Process: 1383 ExecStartPre=/usr/bin/mysqld_pre_systemd (code=exited, status=0/SUCCESS)
Main PID: 1403 (mysqld)
CGroup: /system.slice/mysqld.service
└─1403 /usr/sbin/mysqld --daemonize --pid-file=/var/run/mysqld/mysqld.pid
Jul 14 23:53:16 mysql1 systemd[1]: Starting MySQL Server...
Jul 14 23:53:16 mysql1 systemd[1]: Started MySQL Server.

At this point you should have a Percona Server for MySQL instance with tablespace encryption using Vault.
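
A quick way to confirm the setup end to end is to create a table with the ENCRYPTION option and check that it is reported back; the database and table names below are just our own example.

mysql -e "CREATE DATABASE IF NOT EXISTS enctest;
          CREATE TABLE enctest.t1 (id INT PRIMARY KEY) ENCRYPTION='Y';
          SELECT TABLE_SCHEMA, TABLE_NAME, CREATE_OPTIONS
          FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA='enctest';"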

Researching database security?

You might also enjoy this pre-recorded webinar securing your database servers from external attacks presented by my colleague Colin Charles.

The post Using the keyring_vault Plugin with Percona Server for MySQL 5.7 appeared first on Percona Database Performance Blog.

by Jericho Rivera at September 17, 2018 12:02 PM

September 14, 2018

Peter Zaitsev

This Week in Data With Colin Charles 52: London MySQL Meetup

Colin Charles

Join Percona Chief Evangelist Colin Charles as he covers happenings, gives pointers and provides musings on the open source database community.

This week wraps up the London MySQL meetup, where there were four presentations for an intimate yet diverse crowd. We saw representation from Oracle MySQL, MariaDB Corporation, Pythian, and Percona. Long-time organizer Ivan Zoratti has also handed off the baton to Maria Luisa Raviol, and going forward she will ensure meetups happen at least once per quarter. It was a real pleasure to see MySQL Community Manager Dave Stokes at the event, too.

A new book to read: AWS System Administration: Best Practices for Sysadmins in the Amazon Cloud. There is coverage of RDS, from the standpoint of an example application stack as well as backups.

A most interesting tweet from the Chief Marketing Officer of MongoDB, Meagen Eisenberg, of an ad on a billboard: friends don’t let friends use relational databases.

Releases

Link List

Industry Updates

  • Catalyst IT Australia acquires Open Query – Arjen Lentz is a long-time MySQL community member, and he ran his company for the last 11 years pre-acquisition. Congratulations!
  • Elastic files for IPO – the financials are solid: “Our revenue was $159.9 million and $88.2 million in fiscal 2018 and 2017, respectively, representing year-over-year growth of 81% for fiscal 2018.” The filing is worth reading.
  • New Cloud Unicorn: PagerDuty Scores $1.3 Billion Valuation In $90 Million Round – total raised now $173 million, valuing the company at $1.3 billion. Some of their competitors have been purchased recently, VictorOps by Splunk for $120 million and OpsGenie by Atlassian for $295 million. There are not many independents left in this space beyond PagerDuty and xMatters (who recently picked up Series D financing, total raised $96.5 million).
  • PingCap raises $50m in Series C funding. They are behind TiDB and TiKV. Raised a total of $72 million, and this is a significant increase from the $15m Series B raise in June 2017!
  • Tague Griffith has departed Redis Labs where he was Head of Developer Advocacy and is now at Google.
  • Manyi Lu who has been in the MySQL world for a very long time, leading much of the changes in the MySQL optimizer, who was most recently Director of Software Development at Oracle has departed to be a Senior Director at Alicloud.

Upcoming Appearances

Feedback

I look forward to feedback/tips via e-mail at colin.charles@percona.com or on Twitter @bytebot.

The post This Week in Data With Colin Charles 52: London MySQL Meetup appeared first on Percona Database Performance Blog.

by Colin Charles at September 14, 2018 10:17 PM

PostgreSQL Webinar Wed Oct 10th – Enterprise-Grade PostgreSQL: Built on Open Source Tools

Enterprise PostgreSQL built with open source tools

Please join Percona’s PostgreSQL Support Technical Lead, Avinash Vallarapu; Senior Support Engineer, Fernando Laudares; and Senior Support Engineer, Jobin Augustine, on Wednesday, October 10th, 2018 at 7:00 AM PDT (UTC-7) / 10:00 AM EDT (UTC-4), as they demonstrate an enterprise-grade PostgreSQL® environment built using a combination of open source tools and extensions.

“We built our application on top of PostgreSQL. It works great but only now that we took it to the market and it became a hit we realize how much it relies on the database. How can we “harden” PostgreSQL? How can we make the solution we built around PostgreSQL enterprise-grade?”

“I migrated from a proprietary database software to PostgreSQL. I am curious to know whether I can get the same features I used to have in the proprietary database software.”

You’ll find the answer to these questions and more in a series of blog posts we will be publishing on this topic, which will be followed by a live demo we planned for our webinar on October 10th, 2018.

The market coined the term “enterprise grade” or “enterprise ready” to differentiate products and service offerings for licensed database software. For example: there may be a standard database software or an entry-level package that delivers the core functionality and basic features. Likewise, there may be an enterprise version, a more advanced package which goes beyond the essentials to include features and tools indispensable for running critical solutions in production. With such a differentiation found in commercial software, we may wonder whether a solution built on top of an open source database like PostgreSQL can satisfy all the enterprise requirements.

It starts with building a secured PostgreSQL environment, tuning the database for the production workload, building a high availability strategy that avoids single-point-of-failures, scaling PostgreSQL using connection poolers to avoid excessive usage of server resources, and finally load balancing the reads between master and all the available standby servers aka replicas to effectively use the computing power of all the database servers.

The operational aspect of maintaining an enterprise grade PostgreSQL database also includes the methods to configure a backup strategy that helps us achieve point-in-time-recovery as needed, detailed logging and monitoring PostgreSQL along with a real-time analysis of the database performance and finally maintaining the database health with optimal performance, such as making sure vacuuming is working as it should and at the right times.

“Can we build such an enterprise grade solution that satisfies all the above requirements around PostgreSQL with open source softwares only?”

Yes, we can. During the 20+ years PostgreSQL has been around, the open source community has created all sorts of complementary extensions and tools that can be used to build an enterprise-grade solution with Postgres.

We’ll be following this post with a series of posts covering each piece of such a solution, culminating in the webinar that will take place on October 10th. During the webinar we’ll showcase the full project. Here’s the list of topics we are going to consider while building our enterprise-grade PostgreSQL server.

  1. Securing your PostgreSQL database cluster
  2. High Availability
  3. Preparing a Backup strategy and the tools available to achieve it
  4. Scaling PostgreSQL using connection poolers and load balancers
  5. Tools/extensions available for your daily DBA life and detailed logging in PostgreSQL
  6. Monitoring your PostgreSQL and real-time analysis

Join us to see it in action!

The post PostgreSQL Webinar Wed Oct 10th – Enterprise-Grade PostgreSQL: Built on Open Source Tools appeared first on Percona Database Performance Blog.

by Avinash Vallarapu at September 14, 2018 05:23 PM

Encryption of the InnoDB System Tablespace and Parallel Doublewrite Buffer

encryption of InnoDB tablespace parallel doublewrite buffer

In my last post I compared the data-at-rest encryption features available for MySQL and MariaDB. As noted at the time, some of the features available for Percona Server for MySQL were in development, and the latest version (5.7.23) sees two of them released as ALPHA quality.

Encrypting the InnoDB system tablespace

The first of the new features is InnoDB system tablespace encryption via innodb_sys_tablespace_encrypt, which would provide encryption of the following data:

  • the change buffer, which caches changes to secondary index pages as a result of DML operations for pages that are not in the InnoDB buffer pool
  • the undo logs, if they have not been configured to be stored in separate undo tablespaces
  • data from any tables that exist in the main tablespace, which occurs when innodb_file_per_table is disabled

There are some related changes on the horizon that would allow this to be applied to an existing instance. However, for now this is only available for new instances, as it can only be applied during bootstrap. This means that using it with an existing cluster would require a logical restore of your data. I should restate that this is an ALPHA feature and not production-ready.
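
As a rough sketch of what that bootstrap might look like, assuming a keyring plugin such as keyring_vault is configured and loaded early; check the exact option spelling against the Percona Server documentation before relying on it.

mysqld --initialize-insecure --user=mysql --datadir=/var/lib/mysql \
       --early-plugin-load=keyring_vault=keyring_vault.so \
       --keyring_vault_config=/var/lib/mysql-keyring/keyring_vault.conf \
       --innodb_sys_tablespace_encrypt=ON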

There are some extra points to note about this new variable:

  • an instance with an encrypted tablespace cannot be downgraded to use a version prior to 5.7.23, due to the inability to read the tablespace
  • as noted, it is not currently possible to convert the tablespace between encrypted and unencrypted states, or vice versa
  • the key for the system tablespace can be manually rotated using ALTER INSTANCE ROTATE INNODB MASTER KEY as per any other tablespace

Encrypting the parallel doublewrite buffer

To complement the encryption of the system tablespace, it is also possible to encrypt the parallel doublewrite buffer using innodb_parallel_dblwr_encrypt, a feature unique to Percona Server for MySQL. This means that any data for an encrypted tablespace is also written only in encrypted form to the parallel doublewrite buffer; unencrypted tablespace data remains in plaintext. Unlike innodb_sys_tablespace_encrypt, you are able to set innodb_parallel_dblwr_encrypt dynamically on an existing instance.
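
For example, turning it on at runtime and reviewing the related settings could look like this; a minimal sketch run against a 5.7.23-23 instance with a keyring plugin loaded.

# Enable doublewrite buffer encryption dynamically and list the encryption-related variables
mysql -e "SET GLOBAL innodb_parallel_dblwr_encrypt = ON;
          SHOW GLOBAL VARIABLES LIKE 'innodb%encrypt%';"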

There are more encryption features planned–or already in development–for Percona Server for MySQL so watch this space!

The post Encryption of the InnoDB System Tablespace and Parallel Doublewrite Buffer appeared first on Percona Database Performance Blog.

by Ceri Williams at September 14, 2018 11:29 AM

September 13, 2018

MariaDB AB

Call for Papers – 2019 Annual MariaDB User Conference


The annual MariaDB user conference will be taking place February 25-27, 2019, at New York City's Conrad Hotel. For the first time, the event will span three days, which means more room for new topics. We want to hear your ideas! 

Submitting a proposal is easy. Come up with your presentation title and abstract, then submit it here by October 31, 2018. 

Topics and Tracks

Are you an innovator? Have you been in the trenches with MariaDB? If so, we invite you to share your knowledge and experience with the global MariaDB community. You’re just who we’re looking for!

Submit your best ideas involving: 

  • How you're using MariaDB in the real world

  • Challenges you've faced and lessons you've learned

  • Best practices and expert strategies

And let us know which speaking track you want to contribute to:

Real World: Adopting and embracing MariaDB in the enterprise to solve business challenges and drive innovation

Operations: Gaining operational agility with cloud/container infrastructure, automated provisioning and transparent routing, sharding and failover

Development: Building modern applications with advanced SQL (e.g., temporal queries) and flexible schemas (e.g., hybrid relational/JSON data models)

Analytics: Driving faster time-to-insight with ad-hoc, interactive analytics on near real-time data, and scaling to support 100s of billions of rows

Bonus – Free 3-Day Conference Pass for Speakers

In addition to a presentation in front of MariaDB's global community, if your speaking proposal is accepted (you'll be notified by mid-December), you'll receive a free three-day pass to the show, including access to two deep-dive workshops. 

Submit your speaking proposal by October 31, 2018.

We look forward to seeing you in New York February 25-27, 2019!

 



by Shane Johnson at September 13, 2018 08:42 PM

Peter Zaitsev

Percona Toolkit 3.0.12 Is Now Available

percona toolkit

Percona announces the release of Percona Toolkit 3.0.12 on September 13, 2018.

Percona Toolkit is a collection of advanced open source command-line tools, developed and used by the Percona technical staff, that are engineered to perform a variety of MySQL®, MongoDB® and system tasks that are too difficult or complex to perform manually. With over 1,000,000 downloads, Percona Toolkit supports Percona Server for MySQL, MySQL®, MariaDB®, Percona Server for MongoDB and MongoDB.

Percona Toolkit, like all Percona software, is free and open source. You can download packages from the website or install from official repositories.

This release includes the following changes:

Fixed bugs:

 

  • PT-1611: pt-archiver failed to output UTF-8 characters.
  • PT-1603: pt-table-sync incorrectly calculated chunk boundaries in case of unsorted ENUM fields in indexes.
  • PT-1574: pt-online-schema-change failed on tables with a nullable unique key and a row with NULL values.
  • PT-1572: ENUM fields usage in keys was improved, resulting in higher speed for expressions with sorted ENUM items.
  • PT-1422: pt-mysql-summary could hang when NULL values appear in the processlist Time column.

Documentation change:

  • PT-1321: The required MySQL privileges were detailed in the pt-online-schema-change documentation.

Help us improve our software quality by reporting any bugs you encounter using our bug tracking system.

The post Percona Toolkit 3.0.12 Is Now Available appeared first on Percona Database Performance Blog.

by Dmitriy Kostiuk at September 13, 2018 06:29 PM

Analyzing Amazon Aurora Slow Logs with pt-query-digest

Amazon Aurora MySQL slow query logs with pt-query-digest slow

In this blog post we shall discuss how you can analyze slow query logs from Amazon Aurora for MySQL (referred to as Amazon Aurora in the rest of this blog). The tools and techniques explained here apply to the other MySQL-compatible services available under Amazon Aurora. However, we’ll focus especially on analyzing slow logs from Amazon Aurora version 2 (MySQL 5.7 compatible) using pt-query-digest. We believe there is a bug in Aurora whereby it logs really big numbers for query execution and lock times for otherwise really fast queries.

So, the main steps we need are:

  1. Enable slow query logging on your Amazon Aurora DB parameter group, apply the change when appropriate.
  2. Download the slow log(s) that match the time period you are interested in investigating, and optionally concatenate them.
  3. Run pt-query-digest on the downloaded logs and check the results.

Enable slow query logging

For our testing we decided to capture all the SELECT queries that were hitting our Amazon Aurora instance, mainly because we had a sysbench OLTP read-only workload which wouldn’t really generate many slow queries. An easy way to do this is to enable slow query logging and set long_query_time to 0. To achieve that, we created a new DB parameter group and applied it to our test Aurora instance with the following three parameters set as below:

slow_query_log=1
long_query_time=0
min_examined_row_limit=0

Once you have the above configuration applied to Amazon RDS, you will be able to see slow query logs being created in the Amazon RDS console.
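
You can also list the available slow log files from the AWS CLI before deciding which one to download; the instance identifier below is the one used throughout this post.

aws rds describe-db-log-files \
    --db-instance-identifier perconasupport \
    --filename-contains slowquery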

Download the log file

You can download the log file of your choice using either the Amazon RDS console OR you can use the following AWS CLI command to achieve the same:

$ aws rds download-db-log-file-portion --db-instance-identifier perconasupport  --starting-token 0 --output text --log-file-name slowquery/mysql-slowquery.log.2018-09-03.09 > mysql-slowquery.log.2018-09-03.09

Depending on the size of the chosen log file, the above command will take some time to complete the download.
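
If the period you are investigating spans several hourly files, you can download them in a loop and concatenate the result before feeding it to pt-query-digest; a small sketch following the same file naming as above.

for h in 08 09 10; do
  aws rds download-db-log-file-portion --db-instance-identifier perconasupport \
      --starting-token 0 --output text \
      --log-file-name "slowquery/mysql-slowquery.log.2018-09-03.$h" \
      > "mysql-slowquery.log.2018-09-03.$h"
done
cat mysql-slowquery.log.2018-09-03.* > mysql-slowquery.log.combined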

Run pt-query-digest on the log file

Once the file has been downloaded, you can analyze it using the following pt-query-digest command.

$ pt-query-digest --group-by fingerprint --order-by Query_time:sum mysql-slowquery.log.2018-09-03.09

On our Aurora test slow log file, the initial results didn’t look right, so we had to apply a workaround. Here is the header of the initial results from pt-query-digest:

# 456.2s user time, 2.5s system time, 43.80M rss, 141.48M vsz
# Current date: Tue Sep 4 15:54:21 2018
# Hostname: aahmed-GL503VD
# Files: mysql-slowquery.log.2018-09-03.09
# Overall: 5.13M total, 60 unique, 1.43k QPS, 507.43Gx concurrency _______
# Time range: 2018-09-03T08:00:04 to 2018-09-03T09:00:03
# Attribute total min max avg 95% stddev median
# ============ ======= ======= ======= ======= ======= ======= =======
# Exec time 1826227663297288s 1us 18446744073710s 355917782s 761us 80127878922s 93us
# Lock time 1401952549601936s 0 18446744073710s 273229812s 44us 70205933577s 23us
# Rows sent 94.71M 0 100 19.35 97.36 37.62 0.99
# Rows examine 216.26M 0 300 44.19 299.03 84.74 0.99
# Query size 196.24M 5 1.24k 40.08 72.65 18.90 36.69
# Profile
# Rank Query ID Response time Calls R/Call
# ==== ====================== =========================== ======= ========
# 1 0xE81D0B3DB4FB31BC5... 1346612317380813.0000 73.7% 3194111 421592210.5966 18... SELECT sbtest?
# 2 0x9934EF6887CC7A638... 147573952589685.0625 8.1% 319381 462062403.8051 18... SELECT sbtest?
# 3 0x8D589AFA4DFAEEED8... 110680464442264.1094 6.1% 319411 346514254.1812 18... BEGIN
# 4 0xFF7C69F51BBD3A736... 92233720368565.1875 5.1% 319388 288782673.0139 18... SELECT sbtest?
# 5 0xFFFCA4D67EA0A7888... 73786976294861.9844 4.0% 321238 229695665.8143 18... COMMIT
# MISC 0xMISC 55340232221335.8281 3.0% 657509 84166501.4796 0.0 <43 ITEMS>

What’s wrong with the above results is that the total query Exec time and Lock time are impossibly large numbers. Digging deeper into the logs revealed a problem with the slow logs themselves: some queries had very large values logged for Query_time and Lock_time. In our case, of the 5.13 million queries in the log file, only 111 had the anomaly. Even so, it was enough to skew the results.

# Time: 2018-09-03T08:41:47.363522Z
--
SELECT c FROM sbtest1 WHERE id=24278;
# Time: 2018-09-03T08:41:49.363224Z
# User@Host: perconasupport[perconasupport] @ [172.30.2.111] Id: 20869
# Query_time: 18446744073709.550781 Lock_time: 18446744073709.550781 Rows_sent: 1 Rows_examined: 1
SET timestamp=1535964109;
SELECT c FROM sbtest2 WHERE id=989322;
# Time: 2018-09-03T08:41:49.363296Z
--
BEGIN;
# Time: 2018-09-03T08:41:53.362947Z
# User@Host: perconasupport[perconasupport] @ [172.30.2.111] Id: 20873
# Query_time: 18446744073709.550781 Lock_time: 18446744073709.550781 Rows_sent: 1 Rows_examined: 1
SET timestamp=1535964113;
SELECT c FROM sbtest1 WHERE id=246889;
# Time: 2018-09-03T08:41:53.363003Z

Incorrect logging

The above two queries are, in fact, really fast, but for some reason the execution and lock times are wrongly logged in the slow query log. Since the number of such query log records is statistically negligible compared to the total number of queries, we decided to ask pt-query-digest to ignore them using the command line parameter --attribute-value-limit. The default value of this parameter is 0. We decided to increase it to 2^32, making pt-query-digest ignore the large numbers from the slow query log. So, the pt-query-digest command became:

$ pt-query-digest --group-by fingerprint --order-by Query_time:sum --attribute-value-limit=4294967296 mysql-slowquery.log.2018-09-03.09

This caused the 111 queries with the bad log times to be ignored, and the results looked good. In our case, the ignored queries were bad variants of queries for which good versions existed. You can tell this because the number of unique queries remained the same after the bad variants were ignored. However, this may not always hold true, and you should expect to lose some fidelity, especially if you are analyzing a smaller slow log.

# 441s user time, 450ms system time, 38.19M rss, 111.76M vsz
# Current date: Tue Sep 4 16:23:33 2018
# Hostname: aahmed-GL503VD
# Files: mysql-slowquery.log.2018-09-03.09
# Overall: 5.13M total, 60 unique, 1.43k QPS, 0.30x concurrency __________
# Time range: 2018-09-03T08:00:04 to 2018-09-03T09:00:03
# Attribute total min max avg 95% stddev median
# ============ ======= ======= ======= ======= ======= ======= =======
# Exec time 1096s 1us 198ms 213us 761us 431us 93us
# Lock time 180s 0 103ms 34us 44us 161us 23us
# Rows sent 94.71M 0 100 19.35 97.36 37.62 0.99
# Rows examine 216.26M 0 300 44.19 299.03 84.74 0.99
# Query size 196.24M 5 1.24k 40.08 72.65 18.90 36.69
# Profile
# Rank Query ID Response time Calls R/Call V/M Ite
# ==== =========================== ============== ======= ====== ===== ===
# 1 0xE81D0B3DB4FB31BC558CAE... 400.1469 36.5% 3194111 0.0001 0.00 SELECT sbtest?
# 2 0xF0C5AE75A52E847D737F39... 161.4065 14.7% 319453 0.0005 0.00 SELECT sbtest?
# 3 0xFFFCA4D67EA0A788813031... 155.8740 14.2% 321238 0.0005 0.00 COMMIT
# 4 0x8D589AFA4DFAEEED85FFF5... 107.9827 9.9% 319411 0.0003 0.00 BEGIN
# 5 0x9934EF6887CC7A6384D1DE... 94.1002 8.6% 319381 0.0003 0.00 SELECT sbtest?
# 6 0xFF7C69F51BBD3A736EEB1B... 79.9279 7.3% 319388 0.0003 0.00 SELECT sbtest?
# 7 0xA729E7889F57828D3821AE... 75.3969 6.9% 319398 0.0002 0.00 SELECT sbtest?
# MISC 0xMISC 21.1212 1.9% 18658 0.0011 0.0 <41 ITEMS>
# Query 1: 1.27k QPS, 0.16x concurrency, ID 0xE81D0B3DB4FB31BC558CAEF5F387E929 at byte 358647353
# Scores: V/M = 0.00
# Time range: 2018-09-03T08:00:04 to 2018-09-03T08:42:00
# Attribute pct total min max avg 95% stddev median
# ============ === ======= ======= ======= ======= ======= ======= =======
# Count 62 3194111
# Exec time 36 400s 10us 198ms 125us 332us 300us 80us
# Lock time 74 134s 0 26ms 42us 49us 154us 27us
# Rows sent 3 3.01M 0 1 0.99 0.99 0.11 0.99
# Rows examine 1 3.01M 0 1 0.99 0.99 0.11 0.99
# Query size 57 112.37M 32 38 36.89 36.69 0.53 36.69
# String:
# Databases perconasupport
# Hosts 172.30.2.111
# Users perconasupport
# Query_time distribution
# 1us
# 10us ################################################################
# 100us ##############
# 1ms #
# 10ms #
# 100ms #
# 1s

That number looks familiar

The really big number 18446744073709.550781 seemed to ring a bell. A quick web search revealed that it could be a regression of an old bug in MySQL’s code. The following bugs were found to have the same value reported for query exec time and query lock time (see the quick arithmetic check after the list).

  1. https://bugs.mysql.com/bug.php?id=59757
  2. https://bugs.mysql.com/bug.php?id=63524
  3. https://bugs.mysql.com/bug.php?id=35396
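
As an aside of ours (not from the bug reports themselves), the value is exactly 2^64 microseconds expressed in seconds, which is consistent with an unsigned 64-bit timer value wrapping around:

# 2^64 microseconds converted to seconds matches the bogus value to the digit
echo 'scale=6; 2^64 / 10^6' | bc
# 18446744073709.551616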

Once slow logs were enabled, we used this sysbench command to generate the workload for the Amazon Aurora instance. You might like to try it yourself. Please note that this used sysbench version 1.0.14.

$ sysbench --db-driver=mysql --mysql-user=perconasupport --mysql-host=perconasupport-1234567.cgmobiazycdv.eu-west-1.rds.amazonaws.com --mysql-password=XXXXXXX  --mysql-db=perconasupport --range_size=100 --table_size=1000000 --tables=2 --threads=6 --events=0 --time=600 --rand-type=uniform /usr/share/sysbench/oltp_read_only.lua run

If you are an Amazon Aurora user, have you found any problems analyzing slow query logs? You are welcome to use the comments section, below, to let me know.

Percona Toolkit

pt-query-digest is part of Percona Toolkit, a collection of advanced open source command-line tools, developed and used by the Percona technical staff. Percona Toolkit is open source and free to download and use.

The post Analyzing Amazon Aurora Slow Logs with pt-query-digest appeared first on Percona Database Performance Blog.

by Abbas Ahmed at September 13, 2018 01:17 PM

MariaDB Foundation

Improved MariaDB performance using Shannon Systems PCIe and atomic writes

We recently had a chance to test how the Shannon Direct-IO PCIe Flash G3i devices improves the performance of MariaDB. Anybody who cares about I/O performance has noticed that PCIe drives have become readily available in recent years and that they are significantly faster than normal SSD drives connected to the traditional SATA port (not to […]

The post Improved MariaDB performance using Shannon Systems PCIe and atomic writes appeared first on MariaDB.org.

by Otto Kekäläinen at September 13, 2018 05:30 AM

September 12, 2018

MariaDB AB

What Does It Take to Achieve Freedom from Oracle? 5 Factors That Made One Company Take the Leap


Financial Network, Inc. (FNI) was looking to evolve its credit strategy and loan-origination services after more than 30 years in business – but the "astronomical" fees they were paying to use Oracle Database were keeping them from it. They needed to move on, and chose MariaDB to power their growth. Let’s take a look at how (and why) they did it.

 

“We were doing business, we were making money, but we didn’t have money to put back into the business – to make the business thrive, to expand our marketing, to go after new customers, as well as to be competitive with other companies that fill the same little niche that FNI does. MariaDB has allowed us to do that.”
—William Wood, Director of Database Architecture, FNI

 

Here are five main factors that sealed FNI's decision to escape Oracle in favor of MariaDB:

1. ROI. As the quote above shows, FNI needed significantly greater ROI from its database solution in order to grow its business. Wood explains, "When we looked at the price to migrate to MariaDB from a licensing and vendor cost standpoint, it didn't compare. Oracle is out of the ballpark."

2. Scalability. Oracle's onerous $47,500-per-processor pricing model had FNI's hands tied. Wood explains, “We couldn’t afford to upgrade hardware because going from a quad-core processor to the latest and greatest that has 96 cores ... can you imagine the cost of that? Moving to MariaDB, the scalability is there.” Wood sums up MariaDB's enterprise open source model and per-server licensing fees this way: "It's simple. It just makes sense. It's a good way to do business."

3. Encryption of data at rest. "That was the clincher for us," says Wood. Getting a similar feature from Oracle required a pricy add-on, whereas advanced enterprise security is built into MariaDB TX, at no extra cost.

4. Truly supportive support. "You look at these big companies like Oracle and [IBM's] Informix. It's astronomical just to get that license, and then once you're licensed, then you're hit every year after for support," laments Wood. "MariaDB beats them out of the water. The support is great. They're responsive in a very timely manner. I can't say enough good about them."

5. Migration assistance. "You can greatly shorten your migration from any other database to MariaDB by leveraging onsite training – training from a developer standpoint, training from a database administrator standpoint – and we did that, and it helped us to be successful," Wood explains. And now migration is even easier: shortly after FNI took its first MariaDB server live in production, MariaDB TX 3.0 debuted, introducing built-in Oracle-compatibility features.

 

Learn more about FNI's journey from Oracle to MariaDB:



by MariaDB Team at September 12, 2018 06:30 PM

Peter Zaitsev

Percona Server for MySQL 5.7.23-23 Is Now Available

Percona Server for MySQL 5.7

Percona announces the release of Percona Server for MySQL 5.7.23-23 on September 12, 2018 (downloads are available here and from the Percona Software Repositories). This release merges changes of MySQL 5.7.23, including all the bug fixes in it. Percona Server for MySQL 5.7.23-23 is now the current GA release in the 5.7 series. All of Percona’s software is open-source and free.

New Features
  • The max_binlog_files variable is deprecated, and the binlog_space_limit variable should be used instead of it. The behavior of binlog_space_limit is consistent with the variable relay-log-space-limit used for relay logs; both variables have the same semantics. For more information, see #275.
  • Starting with 5.7.23-23, it is possible to encrypt all data in the InnoDB system tablespace and in the parallel double write buffer. This feature is considered ALPHA quality. A new variable innodb_sys_tablespace_encrypt is introduced to encrypt the system tablespace. The encryption of the parallel double write buffer file is controlled by the variable innodb_parallel_dblwr_encrypt. Both variables are OFF by default. For more information, see #3822.
  • Changing rocksdb_update_cf_options returns any warnings and errors to the client instead of printing them to the server error log. For more information, see #4258.
  • rocksdb_number_stat_computers and rocksdb_rate_limit_delay_millis variables have been removed. For more information, see #4780.
  • A number of new variables were introduced for MyRocks: rocksdb_rows_filtered to show the number of rows filtered out for TTL in MyRocks tables, rocksdb_bulk_load_allow_sk to allow adding secondary keys using the bulk loading feature, rocksdb_error_on_suboptimal_collation toggling warning or error in case of an index creation on a char field where the table has a sub-optimal collation, rocksdb_stats_recalc_rate specifying the number of indexes to recalculate per second, rocksdb_commit_time_batch_for_recovery toggler of writing the commit time write batch into the database, and rocksdb_write_policy specifying when two-phase commit data are actually written into the database.
Bugs Fixed
  • The statement SELECT...ORDER BY produced inconsistent results with the euckr charset or euckr_bin collation. Bug fixed #4513 (upstream #91091).
  • InnoDB statistics could incorrectly report zeros in the slow query log. Bug fixed #3828.
  • With the FIPS mode enabled and performance_schema=off, the instance crashed when running the CREATE VIEW command. Bug fixed #3840.
  • The soft limit of the core file size was set incorrectly starting with PS 5.7.21-20. Bug fixed #4479.
  • The option innodb-optimize-keys could fail when a dumped table has two columns such that the name of one of them contains the other as a prefix and is defined with the AUTO_INCREMENT attribute. Bug fixed #4524.
  • When innodb_temp_tablespace_encrypt was set to ON the CREATE TABLE command could ignore the value of the ENCRYPTION option. Bug fixed #4565.
  • If FLUSH STATUS was run from a different session, a statement could be counted twice in GLOBAL STATUS. Bug fixed #4570 (upstream #91541).
  • In some cases, it was not possible to set the flush_caches variable on systems that use systemd. Bug fixed #3796.
  • A message in the MyRocks log file did not clearly inform whether fast CRC32 was supported. Bug fixed #3988.
  • mysqld could not be started on Ubuntu if the database recovery had taken longer than ten minutes. Bug fixed #4546 (upstream #91423).
  • The ALTER TABLE command was slow when the number of dirty pages was high. Bug fixed #3702.
  • Setting the global variable version_suffix to NULL could lead to a server crash. Bug fixed #4785.
Other Bugs Fixed
  • #4620 “Enable encryption of temporary tablespace from foreground thread”
  • #4727 “intrinsic temp table behaviour shouldn’t depend on innodb_encrypt_tables”
  • #4046 “Ship assert failure: ‘res == 0’ (bulk loader)”
  • #3851 “Percona Ver 5.6.39-83.1 Failing assertion: sym_node->table != NULL”
  • #4533 “audit_log MTR tests should refer to include files without parent directories”
  • #4619 “main.flush_read_lock fails with timeout in wait_condition.inc.”
  • #4561 “Read after free at Binlog_crypt_data::load_latest_binlog_key()”
  • #4587 “ROCKSDB_INCLUDE_RFR macro in wrong file”

Find the release notes for Percona Server for MySQL 5.7.23-23 in our online documentation. Report bugs in the Jira bug tracker.

The post Percona Server for MySQL 5.7.23-23 Is Now Available appeared first on Percona Database Performance Blog.

by Borys Belinsky at September 12, 2018 06:01 PM

Jean-Jerome Schmidt

How to Deploy a Production-Ready MySQL or MariaDB Galera Cluster using ClusterControl

Deploying a database cluster is not rocket science - there are many how-to’s on how to do that. But how do you know what you just deployed is production-ready? Manual deployments can also be tedious and repetitive. Depending on the number of nodes in the cluster, the deployment steps may be time-consuming and error-prone. Configuration management tools like Puppet, Chef and Ansible are popular in deploying infrastructure, but for stateful database clusters, you need to perform significant scripting to handle deployment of the whole database HA stack. Moreover, the chosen template/module/cookbook/role has to be meticulously tested before you can trust it as part of your infrastructure automation. Version changes require the scripts to be updated and tested again.

The good news is that ClusterControl automates deployment of the entire stack - and for free as well! We’ve deployed thousands of production clusters, and take a number of precautions to ensure they are production-ready. Different topologies are supported, from master-slave replication to Galera, NDB and InnoDB cluster, with different database proxies on top.

A high availability stack, deployed through ClusterControl, consists of three layers:

  • Database layer (e.g., Galera Cluster)
  • Reverse proxy layer (e.g., HAProxy or ProxySQL)
  • Keepalived layer, which, with use of Virtual IP, ensures high availability of the proxy layer

In this blog, we are going to show you how to deploy a production-grade Galera Cluster complete with load balancers for high availability setup. The complete setup consists of 6 hosts:

  • 1 host - ClusterControl (deployment, monitoring, management server)
  • 3 hosts - MySQL Galera Cluster
  • 2 hosts - Reverse proxies acting as load balancers in front of the cluster.

The following diagram illustrates our end result once deployment is complete:

Prerequisites

ClusterControl must reside on an independent node which is not part of the cluster. Download ClusterControl, and the page will generate a license unique to you and show the steps to install ClusterControl:

$ wget -O install-cc https://severalnines.com/scripts/install-cc
$ chmod +x install-cc
$ ./install-cc # as root or sudo user

Follow the instructions, where you will be guided through setting up the MySQL server, the MySQL root password on the ClusterControl node, the cmon password for ClusterControl usage and so on. You should see the following lines once the installation has completed:

Determining network interfaces. This may take a couple of minutes. Do NOT press any key.
Public/external IP => http://{public_IP}/clustercontrol
Installation successful. If you want to uninstall ClusterControl then run install-cc --uninstall.

Then, on the ClusterControl server, generate an SSH key which we will use to set up passwordless SSH later on. You can use any user in the system, but it must be able to perform super-user operations (sudoer). In this example, we picked the root user:

$ whoami
root
$ ssh-keygen -t rsa

Set up passwordless SSH to all nodes that you would like to monitor/manage via ClusterControl. In this case, we will set this up on all nodes in the stack (including ClusterControl node itself). On ClusterControl node, run the following commands and specify the root password when prompted:

$ ssh-copy-id root@192.168.55.160 # clustercontrol
$ ssh-copy-id root@192.168.55.161 # galera1
$ ssh-copy-id root@192.168.55.162 # galera2
$ ssh-copy-id root@192.168.55.163 # galera3
$ ssh-copy-id root@192.168.55.181 # proxy1
$ ssh-copy-id root@192.168.55.182 # proxy2

You can then verify if it's working by running the following command on ClusterControl node:

$ ssh root@192.168.55.161 "ls /root"

Make sure you are able to see the result of the command above without the need to enter password.

Deploying the Cluster

ClusterControl supports all vendors for Galera Cluster (Codership, Percona and MariaDB). There are some minor differences which may influence your decision for choosing the vendor. If you would like to learn about the differences between them, check out our previous blog post - Galera Cluster Comparison - Codership vs Percona vs MariaDB.

For production deployment, a three-node Galera Cluster is the minimum you should have. You can always scale it out later once the cluster is deployed, manually or via ClusterControl. We’ll open our ClusterControl UI at https://192.168.55.160/clustercontrol and create the first admin user. Then, go to the top menu and click Deploy -> MySQL Galera and you will be presented with the following dialog:

There are two steps; the first one is "General & SSH Settings". Here we need to configure the SSH user that ClusterControl should use to connect to the database nodes, together with the path to the SSH key (as generated under the Prerequisites section) as well as the SSH port of the database nodes. ClusterControl presumes all database nodes are configured with the same SSH user, key and port. Next, give the cluster a name; in this case we will use "MySQL Galera Cluster 5.7". This value can be changed later on. Then select the options to instruct ClusterControl to install the required software, disable the firewall and also disable the security enhancement module on the particular Linux distribution. All of these are recommended to be enabled to maximize the chance of a successful deployment.

Click Continue and you will be presented with the following dialog:

In the next step, we need to configure the database servers - vendor, version, datadir, port, etc - which are pretty self-explanatory. "Configuration Template" is the template filename under /usr/share/cmon/templates of the ClusterControl node. "Repository" is how ClusterControl should configure the repository on the database node. By default, it will use the vendor repository and install the latest version provided by the repository. However, in some cases, the user might have a pre-existing repository mirrored from the original repository due to security policy restriction. Nevertheless, ClusterControl supports most of them, as described in the user guide, under Repository.

Lastly, add the IP address or hostname (must be a valid FQDN) of the database nodes. You will see a green tick icon on the left of the node, indicating ClusterControl was able to connect to the node via passwordless SSH. You are now good to go. Click Deploy to start the deployment. This may take 15 to 20 minutes to complete. You can monitor the deployment progress under Activity (top menu) -> Jobs -> Create Cluster:

Once the deployment completed, at this point, our architecture can be illustrated as below:

Deploying the Load Balancers

In Galera Cluster, all nodes are equal - each node holds the same role and same dataset. Therefore, there is no failover within the cluster if a node fails. Only the application side requires failover, to skip the inoperational nodes while the cluster is partitioned. Therefore, it's highly recommended to place load balancers on top of a Galera Cluster to:

  • Unify the multiple database endpoints to a single endpoint (load balancer host or virtual IP address as the endpoint).
  • Balance the database connections between the backend database servers.
  • Perform health checks and only forward the database connections to healthy nodes.
  • Redirect/rewrite/block offending (badly written) queries before they hit the database servers.

There are three main choices of reverse proxies for Galera Cluster - HAProxy, MariaDB MaxScale or ProxySQL - all can be installed and configured automatically by ClusterControl. In this deployment, we picked ProxySQL because it checks all the above plus it understands the MySQL protocol of the backend servers.

In this architecture, we want to use two ProxySQL servers to eliminate any single point of failure (SPOF) in the database tier; they will be tied together using a floating virtual IP address. We’ll explain this in the next section. One node will act as the active proxy and the other one as hot-standby. Whichever node holds the virtual IP address at a given time is the active node.

To deploy the first ProxySQL server, simply go to the cluster action menu (right-side of the summary bar) and click on Add Load Balancer -> ProxySQL -> Deploy ProxySQL and you will see the following:

Again, most of the fields are self-explanatory. In the "Database User" section, keep in mind that ProxySQL acts as a gateway through which your application connects to the database. The application authenticates against ProxySQL, therefore you have to add all of the users from all the backend MySQL nodes, along with their passwords, into ProxySQL. From ClusterControl, you can either create a new user to be used by the application - deciding on its name, password, which databases it may access and which MySQL privileges it will have. Such a user will be created on both the MySQL and ProxySQL sides. The second option, more suitable for existing infrastructures, is to use existing database users. In that case you need to pass the username and password, and such a user will be created only on ProxySQL.

In the last section, "Implicit Transactions", you tell ClusterControl how your application starts transactions. If transactions are started with SET autocommit=0, ClusterControl will configure ProxySQL to send all of the traffic to the master. If you use BEGIN or START TRANSACTION instead, ClusterControl will configure a read/write split in the query rules. Either way, this ensures ProxySQL handles transactions correctly. If you have no idea how your application does this, pick the latter.
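For reference, the read/write split itself is expressed as ProxySQL query rules. The sketch below shows the general shape of such rules; the rule ids and the writer/reader hostgroup ids (10/20) are illustrative assumptions rather than exactly what ClusterControl generates:

mysql -u admin -padmin -h 127.0.0.1 -P 6032 -e "
  -- SELECT ... FOR UPDATE takes locks, so it must go to the writer hostgroup (assumed id 10)
  INSERT INTO mysql_query_rules (rule_id, active, match_digest, destination_hostgroup, apply)
  VALUES (100, 1, '^SELECT .* FOR UPDATE', 10, 1);
  -- all other SELECTs can be served by the reader hostgroup (assumed id 20)
  INSERT INTO mysql_query_rules (rule_id, active, match_digest, destination_hostgroup, apply)
  VALUES (200, 1, '^SELECT', 20, 1);
  LOAD MYSQL QUERY RULES TO RUNTIME;
  SAVE MYSQL QUERY RULES TO DISK;
"

Queries that match no rule fall back to the user's default hostgroup, which is how writes end up on the master.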

Repeat the same configuration for the second ProxySQL node, except for the "Server Address" value, which is 192.168.55.182. Once done, both nodes will be listed under the "Nodes" tab -> ProxySQL, where you can monitor and manage them directly from the UI:

At this point, our architecture is now looking like this:

If you would like to learn more about ProxySQL, do check out this tutorial - Database Load Balancing for MySQL and MariaDB with ProxySQL - Tutorial.

Deploying the Virtual IP Address

The final part is the virtual IP address. Without it, our load balancers (reverse proxies) would be the weak link, as each would be a single point of failure - unless the application has the ability to automatically redirect failed database connections to another load balancer. Either way, it's good practice to unify them both behind a virtual IP address and simplify the connection endpoint to the database layer.

From ClusterControl UI -> Add Load Balancer -> Keepalived -> Deploy Keepalived and select the two ProxySQL hosts that we have deployed:

Specify the virtual IP address and the network interface to bind it to. The network interface must exist on both ProxySQL nodes. Once deployed, you should see the following green checks in the summary bar of the cluster:

At this point, our architecture can be illustrated as below:

Our database cluster is now ready for production usage. You can import your existing database into it or create a fresh new database. You can use the Schemas and Users Management feature if the trial license hasn't expired.

To understand how ClusterControl configures Keepalived, check out this blog post, How ClusterControl Configures Virtual IP and What to Expect During Failover.
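If you ever want to confirm which ProxySQL node currently holds the virtual IP address, a quick check on each node is enough. This sketch assumes the example values used above (virtual IP 192.168.55.180 bound to eth0); substitute the interface you selected during deployment:

# the active node is the one that lists the virtual IP on the chosen interface
ip addr show eth0 | grep 192.168.55.180

# Keepalived also logs its MASTER/BACKUP state transitions
# (/var/log/syslog on Debian/Ubuntu, /var/log/messages on RHEL/CentOS)
grep -i keepalived /var/log/syslog | tail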

Connecting to the Database Cluster

From the application and client standpoint, connections should go to 192.168.55.180 on port 6033, which is the virtual IP address floating on top of the load balancers. For example, the WordPress database configuration would look something like this:

/** The name of the database for WordPress */
define( 'DB_NAME', 'wp_myblog' );

/** MySQL database username */
define( 'DB_USER', 'wp_myblog' );

/** MySQL database password */
define( 'DB_PASSWORD', 'mysecr3t' );

/** MySQL hostname - virtual IP address with ProxySQL load-balanced port */
define( 'DB_HOST', '192.168.55.180:6033' );

If you would like to access the database cluster directly, bypassing the load balancer, you can just connect to port 3306 of the database hosts. This is usually required by the DBA staff for administration, management, and troubleshooting. With ClusterControl, most of these operations can be performed directly from the user interface.
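As a quick sanity check, both paths can be exercised with the mysql command line client. The database node address below (192.168.55.171) is a hypothetical example, since the node IPs are not listed above:

# through the load balancer: virtual IP on the ProxySQL port
mysql -u wp_myblog -p -h 192.168.55.180 -P 6033 wp_myblog

# directly to one database node, bypassing ProxySQL (hypothetical node address)
mysql -u root -p -h 192.168.55.171 -P 3306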

Final Thoughts

As shown above, deploying a database cluster is no longer a difficult task. Once deployed, there is a full suite of free monitoring features, as well as commercial features for backup management, failover/recovery and more. Fast deployment of different cluster/replication topologies can be useful when evaluating high availability database solutions and how they fit your particular environment.

by ashraf at September 12, 2018 09:54 AM

Peter Zaitsev

Announcement: Experimental Build of Percona XtraBackup 8.0

Percona XtraBackup 8.0

Experimental Build of Percona XtraBackup 8.0 released

An experimental alpha version of Percona XtraBackup 8.0.1 is now available in the Percona experimental software repositories.

A few things to note about this release:

  • We removed the deprecated innobackupex in this release
  • Due to the new MySQL redo log and data dictionary formats, the Percona XtraBackup 8.0.x versions will only be compatible with MySQL 8.0.x and the upcoming Percona Server for MySQL 8.0.x
  • For experimental migrations from earlier database server versions, you will need to back up and restore using XtraBackup 2.4 and then run mysql_upgrade from MySQL 8.0.x (sketched below)
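A rough sketch of that migration path follows; the backup directory, datadir location and credentials handling are illustrative assumptions only:

# on the source server (e.g. MySQL/Percona Server 5.7), using Percona XtraBackup 2.4
# (MySQL connection credentials are assumed to come from ~/.my.cnf)
xtrabackup --backup --target-dir=/data/backup
xtrabackup --prepare --target-dir=/data/backup

# restore into the stopped MySQL 8.0 instance's empty datadir, then fix ownership
xtrabackup --copy-back --target-dir=/data/backup
chown -R mysql:mysql /var/lib/mysql

# start MySQL 8.0 and run the upgrade step
mysql_upgrade -u root -p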

PXB 8.0.1 alpha is available for the following platforms:

  • RHEL/CentOS 6.x
  • RHEL/CentOS 7.x
  • Ubuntu 14.04 Trusty*
  • Ubuntu 16.04 Xenial
  • Ubuntu 18.04 Bionic
  • Debian 8 Jessie*
  • Debian 9 Stretch

Information on how to configure the Percona repositories for apt and yum systems and access the Percona experimental software is here.

* We might drop these platforms before GA release.

The post Announcement: Experimental Build of Percona XtraBackup 8.0 appeared first on Percona Database Performance Blog.

by David Bennett at September 12, 2018 07:25 AM

September 11, 2018

Peter Zaitsev

How To Deploy PMM on Linode With StackScripts

rebuild from a StackScript

In my previous blog post, I showed how to deploy Percona Monitoring and Management (PMM) on Linode manually. It is pretty simple, but with a little coding it can be done even more easily using StackScripts.

Here’s how:

1. Click on "Add a Linode" and pick the Linode type you want to deploy.

2. Click on the deployed Linode and then click on the "Rebuild" link.

Rebuild the Linode

3. Click on Deploy Using StackScripts

Deploy using StackScripts

4. On the resulting page search for “PMM” and pick PMMServer from PerconaLab.

Choose PMMServer from PerconaLab

5. Provide the host name for the new Linode, pick the root password, and click on "Rebuild".

6. Boot the server.

boot the server

7. You're done. Wait about 5 minutes for the installation to complete, then you can see the PMM interface by going to the Linode's IP address.

view PMM on Linode IP

If you think that deploying with StackScripts through the UI is not much less hassle than doing it manually, you're right. The real benefit comes with using the Linode API for deployment.

There are multiple ways to access this API, though for basic scripting I prefer the linode-cli tool, which uses the Linode API from the command line.

With linode-cli, you can deploy your PMM Server on Linode using this one-liner:

linode-cli linodes create --label pmm-test --root_pass MyRootPassword123 --stackscript_id 338458 --stackscript_data '{"hostname": "pmm-test"}'
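And if you need more than one instance, the same call wraps naturally into a small shell loop. The label/hostname pattern below is just an illustration; the flags are exactly the ones from the one-liner above:

# hypothetical example: deploy three PMM servers from the same StackScript
for i in 1 2 3; do
  linode-cli linodes create --label "pmm-test-$i" \
    --root_pass MyRootPassword123 \
    --stackscript_id 338458 \
    --stackscript_data "{\"hostname\": \"pmm-test-$i\"}"
done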

Summary

As you can see, with Linode StackScripts you can get going with Percona Monitoring and Management on Linode in no time, especially if you choose to use the Linode API.

You might also like:

Here’s an overview from the Percona Monitoring and Management manual on deploying PMM. If you are new to PMM and would like to know more, you will find lots of resources on this site including my webinar MySQL Troubleshooting and Performance Optimization with PMM.

The post How To Deploy PMM on Linode With StackScripts appeared first on Percona Database Performance Blog.

by Peter Zaitsev at September 11, 2018 02:02 PM

Open Query Pty Ltd

Press Release 2018-09-11: Open Query acquired by Catalyst IT Australia Pty Limited

We are pleased to announce that Open Query, a Queensland-based provider of MySQL, MariaDB and related services which has just celebrated its 11th anniversary, has been acquired by Catalyst IT Australia.

Founded in New Zealand in 1997, Catalyst is an experienced and respected Open Source integrator.  Catalyst is looking forward to the opportunity to work with the current Open Query clients as well as with new prospects. Catalyst offers a broad suite of Enterprise services, including support and custom development for Drupal, SilverStripe CMS, Moodle, Samba and other software, as well as fully managed hosting on AWS and other platforms.

“Catalyst’s core values are very much aligned with those of Open Query, which is why we are particularly pleased with this outcome”, notes Arjen Lentz, Founder and Exec.Director of Open Query.

Catalyst IT Australia has offices in Sydney, Melbourne and Brisbane.

Contacts

For Open Query Pty Ltd

Arjen Lentz, Exec.Director
https://openquery.com.au

For Catalyst IT Australia Pty Limited

Andrew Boag, Managing Director
https://www.catalyst-au.net/
Phone (02) 8203 9777

by Arjen Lentz at September 11, 2018 05:00 AM