Planet MariaDB

September 21, 2017

Peter Zaitsev

Percona Live Europe Featured Talks: Modern sysbench – Teaching an Old Dog New Tricks with Alexey Kopytov

Percona Live Europe 2017

Welcome to another post in our series of interview blogs for the upcoming Percona Live Europe 2017 in Dublin. This series highlights a number of talks that will be at the conference and gives a short preview of what attendees can expect to learn from the presenter.

This blog post is with Alexey Kopytov, software developer and maintainer of sysbench. His talk is Modern sysbench: Teaching an Old Dog New Tricks. His presentation covers the new features provided by recent sysbench releases and explains how they can be used to create complex benchmark scenarios and collect performance metrics with a simple Lua API. It will also include a live demo of some of the new sysbench features.

In our conversation, we discussed benchmarking your database environment:

Percona: How did you get into database technology? What do you love about it?

Alexey: It was 2003, and I was working as a software developer for a boring company providing hosted VoIP solutions. I was a big fan of the free and open source software philosophy, which was way less popular back then than it is today. I contributed to a number of open source projects in my free time, but I also had a dream of developing open source software as part of my paid job. This looked completely unrealistic at the time, until I came across a job posting on a Russian IT forum about a Swedish company called MySQL AB looking for software developers to work remotely on MySQL! That sounded like my dream job, so I applied.

I knew very little about database internals at the time, so looking back I was giving terrible answers during my job interviews. Nevertheless, I joined the High Performance Group at MySQL AB after a few months, and that has defined my professional life for many years.

I love database technology because it presents the toughest challenges in software development. Most problems and solutions related to ever-evolving hardware, scalability and data processing requirements are discovered first by people from the database world.

Percona: Your talk is called “Modern sysbench: Teaching an Old Dog New Tricks”. What is sysbench used for generally, why is it important and how have you used it in your career? 

Alexey: sysbench was an internal project that I took over as soon as I joined MySQL AB. We used it to troubleshoot customer issues, find performance bottlenecks in MySQL and evaluate new features. Of course it was an open source project, so over the years we’ve got many people from the MySQL community using sysbench for all kinds of performance research like testing new hardware, identifying performance-related issues and comparing MySQL configurations, versions and forks.

Percona: What are some of the important new developments in the latest release?

Alexey: This year sysbench got a major upgrade in terms of features and performance to meet the modern world of many-core CPUs, powerful storage devices and distributed database systems capable of processing millions of transactions per second. Some feature highlights from the latest release include a simplified command-line interface, a revamped API that allows creating more complex benchmark scenarios with less code, new performance metrics, customizable reports and more!

Percona: What do you want attendees to take away from your session? Why should they attend?

Alexey: sysbench is quite popular, but most people rarely use it for more than the few bundled OLTP-style benchmarks. I’d like to explain its full potential, especially the possibilities provided by the new features. I want people to use it to create their own benchmarks, not necessarily related to MySQL, and hopefully find sysbench useful in areas that I have not even envisioned myself.

Percona: What are you most looking forward to at Percona Live Europe 2017?

Alexey: For me, Percona Live conferences have always been the place where I can feel the pulse of the technology and learn from the smartest people in the industry. This is especially true now that Percona Live provides talks on diverse topics from communities and database management technologies other than MySQL, which makes it an even greater event for sharing ideas, solutions and expertise.

Want to find out more about Alexey, sysbench and database benchmarking? Register for Percona Live Europe 2017, and see his talk Modern sysbench: Teaching an Old Dog New Tricks. Register now to get the best price! Use discount code SeeMeSpeakPLE17 to get 10% off your registration.

Percona Live Open Source Database Conference Europe 2017 in Dublin is the premier European open source event for the data performance ecosystem. It is the place to be for the open source community as well as businesses that thrive in the MySQL, MariaDB, MongoDB, time series database, cloud, big data and Internet of Things (IoT) marketplaces. Attendees include DBAs, sysadmins, developers, architects, CTOs, CEOs, and vendors from around the world.

The Percona Live Open Source Database Conference Europe will be September 25-27, 2017 at the Radisson Blu Royal Hotel, Dublin.

by Dave Avery at September 21, 2017 04:40 PM

Oli Sennhauser

Galera Load Balancer the underestimated wallflower

There are some pretty sophisticated load balancers for Galera Cluster setups on the market (ProxySQL, MaxScale, HAProxy, ...). They have many different exotic features. You can do nearly everything with them. But this comes at the cost of complexity. None of them is simple any more.

A widely underestimated load balancer solution for Galera Cluster setups is the Galera Load Balancer from Codership. It is a simple load balancer that serves all of our daily needs when it comes to Galera Cluster. Unfortunately, this product is not much promoted by the software vendor itself.

Installation of Galera Load Balancer

The trouble starts with the installation: there are no ready-made packages, so you have to compile Galera Load Balancer yourself. FromDual provides some compiled packages and can help you build and install it.

You can get the Galera Load Balancer sources from GitHub. The build is straightforward:

shell> git clone https://github.com/codership/glb
shell> cd glb/
shell> ./bootstrap.sh
shell> ./configure
shell> make
shell> make install

If you prefer a binary tar ball as I do, you can run the following commands instead of make install:

shell> TARGET=glb-1.0.1-linux-$(uname -m)
shell> mkdir -p ${TARGET}/sbin ${TARGET}/lib ${TARGET}/share/glb
shell> cp src/glbd ${TARGET}/sbin/
shell> cp src/.libs/libglb.a src/.libs/libglb.so* ${TARGET}/lib/
shell> cp files/* ${TARGET}/share/glb/
shell> cp README NEWS COPYING CONTRIBUTORS.txt CONTRIBUTOR_AGREEMENT.txt ChangeLog BUGS AUTHORS ${TARGET}/
shell> tar czf ${TARGET}.tar.gz ${TARGET}
shell> rm -rf ${TARGET}
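
If you prefer the tar ball, it still needs to be unpacked somewhere. Here is a minimal sketch that matches the product/glb path used in the start command below; the directory layout is my own convention, not something glb prescribes:

shell> mkdir -p product
shell> tar xzf ${TARGET}.tar.gz -C product/
shell> ln -s ${TARGET} product/glb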

Configuration of Galera Load Balancer

The Galera Load Balancer is configured in a file called glbd, which must be located under /etc/sysconfig (Red Hat and its derivatives) or /etc/default (Debian and its derivatives). I did not find any option to tell Galera Load Balancer to look for a configuration file somewhere else.

The Galera Load Balancer parameters are documented here.

Starting and Stopping Galera Load Balancer

This means that I have to specify all my parameters on the command line:

product/glb/sbin/glbd --threads 8 --max_conn 500 \
  --round --fifo /home/mysql/run/glbd.fifo --control 127.0.0.1:3333 \
  127.0.0.1:3306 \
  192.168.1.1:3306:1 192.168.1.2:3306:2 192.168.1.3:3306:1

An equivalent configuration file would look as follows:

#
# /etc/sysconfig/glbd.cfg
#
LISTEN_ADDR="127.0.0.1:3306"
CONTROL_ADDR="127.0.0.1:3333"
CONTROL_FIFO="/home/mysql/run/glbd.fifo"
THREADS="8"
MAX_CONN="500"
DEFAULT_TARGETS="192.168.1.1:3306:1 192.168.1.2:3306:2 192.168.1.3:3306:1"
OTHER_OPTIONS="--round"
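
Because glbd itself does not read this file, a small wrapper script can source it and build the command line. The following is a minimal sketch using the variable names from the example above; it is not part of the glb distribution:

#!/bin/bash
# glbd_start.sh - source the config file and start glbd with its values
. /etc/sysconfig/glbd.cfg

# OTHER_OPTIONS and DEFAULT_TARGETS are left unquoted on purpose so they word-split
exec product/glb/sbin/glbd \
  --threads "${THREADS}" --max_conn "${MAX_CONN}" \
  --fifo "${CONTROL_FIFO}" --control "${CONTROL_ADDR}" \
  ${OTHER_OPTIONS} \
  "${LISTEN_ADDR}" ${DEFAULT_TARGETS}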

Stopping Galera Load Balancer is simple:

killall glbd

Galera Load Balancer operations

Besides starting and stopping Galera Load Balancer, you will also want to look into it. This can be done with the following two commands:

echo getinfo | nc -q 1 127.0.0.1 3333
echo getstats | nc -q 1 127.0.0.1 3333

Or, if you want a more top/vmstat-like view:

watch -n 1 "echo getstats | nc -q 1 127.0.0.1 3333"

watch -n 1 -d "echo getinfo | nc -q 1 127.0.0.1 3333"

More interesting are operations like draining and undraining a Galera Cluster node in the Galera Load Balancer. To drain a Galera Cluster node, for example for maintenance (a kernel upgrade, say), you can run the following command:

echo "192.168.1.2:3306:0" | nc 127.0.0.1 3333

To undrain the node again, it works like this:

echo "192.168.1.2:3306:2" | nc 127.0.0.1 3333

Unfortunately, Galera Load Balancer does not remember the original weight (:2), so you have to specify it again when undraining.

If you want to remove a node from or add a node to the Galera Load Balancer, it works as follows:

echo "192.168.1.2:3306:-1" | nc 127.0.0.1 3333

echo "192.168.1.2:3306:1" | nc 127.0.0.1 3333

You can find further Galera Load Balancer operational tasks in the documentation.

by Shinguz at September 21, 2017 01:25 PM

September 20, 2017

Peter Zaitsev

sysbench Histograms: A Helpful Feature Often Overlooked

Sysbench Histograms

In this blog post, I will demonstrate how to run and use sysbench histograms.

One of the features of sysbench that I often see overlooked (and rarely used) is its ability to produce detailed query response time histograms in addition to computing percentile numbers. Looking at histograms together with throughput or latency over time provides many additional insights into query performance.

Here is how you get detailed sysbench histograms and performance over time:

sysbench --rand-type=uniform --report-interval=1 --percentile=99 --time=300 --histogram --mysql-password=sbtest oltp_point_select --table_size=400000000 run

There are a few command line options to consider (a fuller prepare/run/cleanup sketch follows the list):

  • report-interval=1 prints the current performance measurements every second, which helps you see whether performance is uniform, or whether you have stalls or otherwise high variance
  • percentile=99 computes the 99th percentile response time rather than the 95th percentile (the default); I like looking at the 99th percentile as it is a better measure of performance
  • histogram=on produces a histogram at the end of the run (as shown below)
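
For completeness, here is a minimal prepare/run/cleanup sequence around the command above. The host, credentials, database name and the much smaller table size are illustrative assumptions; adjust them to your environment:

sysbench --mysql-host=127.0.0.1 --mysql-user=sbtest --mysql-password=sbtest \
  --mysql-db=sbtest oltp_point_select --table_size=1000000 prepare
sysbench --mysql-host=127.0.0.1 --mysql-user=sbtest --mysql-password=sbtest \
  --mysql-db=sbtest --rand-type=uniform --report-interval=1 --percentile=99 \
  --time=300 --histogram oltp_point_select --table_size=1000000 run
sysbench --mysql-host=127.0.0.1 --mysql-user=sbtest --mysql-password=sbtest \
  --mysql-db=sbtest oltp_point_select cleanup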

The first thing to note about this histogram is that it is exponential. This means the width of the buckets changes with higher values. It starts with 0.001 ms (one microsecond) and gradually grows. This design is used so that sysbench can deal with workloads with requests that take small fractions of milliseconds, as well as accommodate requests that take many seconds (or minutes).

Next, we learn some very interesting things about the typical request response time distribution for databases. You might think that this distribution would be close to some “academic” distribution, such as the normal distribution. In reality, what we often observe is something of a “camelback” distribution (not a real term) – and our “camel” can have more than two humps (especially for simple requests such as the single primary key lookup shown here).

Why do request response times tend to have this distribution? It is because requests can take multiple paths inside the database. For example, certain requests might get responses from the MySQL Query Cache (which will result in the first hump). A second hump might come from resolving lookups using the InnoDB Adaptive Hash Index. A third hump might come from finding all the data in memory (rather than through the Adaptive Hash Index). Finally, another hump might coalesce around the time (or times) it takes to execute requests that require disk IO.

You also will likely see some long-tail data that highlights the fact that MySQL and Linux are not hard, real-time systems. As an example, this very simple run with a single thread (and thus no contention) has an outlier at around 18ms. Most of the requests are served within 0.2ms or less.

As you add contention, row-level locking, group commit and other issues, you are likely to see even more complicated diagrams – which can often show you something unexpected:

Latency histogram (values are in milliseconds)
      value  ------------- distribution ------------- count
      0.050 |                                         1
      0.051 |                                         2
      0.052 |                                         2
      0.053 |                                         54
      0.053 |                                         79
      0.054 |                                         164
      0.055 |                                         883
      0.056 |*                                        1963
      0.057 |*                                        2691
      0.059 |**                                       4047
      0.060 |****                                     9480
      0.061 |******                                   15234
      0.062 |********                                 20723
      0.063 |********                                 20708
      0.064 |**********                               26770
      0.065 |*************                            35928
      0.066 |*************                            34520
      0.068 |************                             32247
      0.069 |************                             31693
      0.070 |***************                          41682
      0.071 |**************                           37862
      0.073 |********                                 22691
      0.074 |******                                   15907
      0.075 |****                                     10509
      0.077 |***                                      7853
      0.078 |****                                     9880
      0.079 |****                                     10853
      0.081 |***                                      9243
      0.082 |***                                      9280
      0.084 |***                                      8947
      0.085 |***                                      7869
      0.087 |***                                      8129
      0.089 |***                                      9073
      0.090 |***                                      8364
      0.092 |***                                      6781
      0.093 |**                                       4672
      0.095 |*                                        3356
      0.097 |*                                        2512
      0.099 |*                                        2177
      0.100 |*                                        1784
      0.102 |*                                        1398
      0.104 |                                         1082
      0.106 |                                         810
      0.108 |                                         742
      0.110 |                                         511
      0.112 |                                         422
      0.114 |                                         330
      0.116 |                                         259
      0.118 |                                         203
      0.120 |                                         165
      0.122 |                                         126
      0.125 |                                         108
      0.127 |                                         87
      0.129 |                                         83
      0.132 |                                         55
      0.134 |                                         42
      0.136 |                                         45
      0.139 |                                         41
      0.141 |                                         149
      0.144 |                                         456
      0.147 |                                         848
      0.149 |*                                        2128
      0.152 |**                                       4586
      0.155 |***                                      7592
      0.158 |*****                                    13685
      0.160 |*********                                24958
      0.163 |*****************                        44558
      0.166 |*****************************            78332
      0.169 |*************************************    98616
      0.172 |**************************************** 107664
      0.176 |**************************************** 107154
      0.179 |****************************             75272
      0.182 |******************                       49645
      0.185 |****************                         42793
      0.189 |*****************                        44649
      0.192 |****************                         44329
      0.196 |******************                       48460
      0.199 |*****************                        44769
      0.203 |**********************                   58578
      0.206 |***********************                  61373
      0.210 |**********************                   58758
      0.214 |******************                       48012
      0.218 |*************                            34533
      0.222 |**************                           36517
      0.226 |*************                            34645
      0.230 |***********                              28694
      0.234 |*******                                  17560
      0.238 |*****                                    12920
      0.243 |****                                     10911
      0.247 |***                                      9208
      0.252 |****                                     10556
      0.256 |***                                      7561
      0.261 |**                                       5047
      0.266 |*                                        3757
      0.270 |*                                        3584
      0.275 |*                                        2951
      0.280 |*                                        2078
      0.285 |*                                        2161
      0.291 |*                                        1747
      0.296 |*                                        1954
      0.301 |*                                        2878
      0.307 |*                                        2810
      0.312 |*                                        1967
      0.318 |*                                        1619
      0.324 |*                                        1409
      0.330 |                                         1205
      0.336 |                                         1193
      0.342 |                                         1151
      0.348 |                                         989
      0.354 |                                         985
      0.361 |                                         799
      0.367 |                                         671
      0.374 |                                         566
      0.381 |                                         537
      0.388 |                                         351
      0.395 |                                         276
      0.402 |                                         214
      0.409 |                                         143
      0.417 |                                         80
      0.424 |                                         85
      0.432 |                                         54
      0.440 |                                         41
      0.448 |                                         29
      0.456 |                                         16
      0.464 |                                         15
      0.473 |                                         11
      0.481 |                                         4
      0.490 |                                         9
      0.499 |                                         4
      0.508 |                                         3
      0.517 |                                         4
      0.527 |                                         4
      0.536 |                                         2
      0.546 |                                         4
      0.556 |                                         4
      0.566 |                                         4
      0.587 |                                         1
      0.597 |                                         1
      0.608 |                                         5
      0.619 |                                         3
      0.630 |                                         2
      0.654 |                                         2
      0.665 |                                         5
      0.677 |                                         26
      0.690 |                                         298
      0.702 |                                         924
      0.715 |*                                        1493
      0.728 |                                         1027
      0.741 |                                         1112
      0.755 |                                         1127
      0.768 |                                         796
      0.782 |                                         574
      0.797 |                                         445
      0.811 |                                         415
      0.826 |                                         296
      0.841 |                                         245
      0.856 |                                         202
      0.872 |                                         210
      0.888 |                                         168
      0.904 |                                         217
      0.920 |                                         163
      0.937 |                                         157
      0.954 |                                         204
      0.971 |                                         155
      0.989 |                                         158
      1.007 |                                         137
      1.025 |                                         94
      1.044 |                                         79
      1.063 |                                         52
      1.082 |                                         36
      1.102 |                                         25
      1.122 |                                         25
      1.142 |                                         16
      1.163 |                                         8
      1.184 |                                         5
      1.205 |                                         7
      1.227 |                                         2
      1.250 |                                         4
      1.272 |                                         3
      1.295 |                                         3
      1.319 |                                         2
      1.343 |                                         2
      1.367 |                                         1
      1.417 |                                         2
      1.791 |                                         1
      1.996 |                                         2
      2.106 |                                         2
      2.184 |                                         1
      2.264 |                                         1
      2.347 |                                         2
      2.389 |                                         1
      2.433 |                                         1
      2.477 |                                         1
      2.568 |                                         2
      2.615 |                                         1
      2.710 |                                         1
      2.810 |                                         1
      2.861 |                                         1
      3.187 |                                         1
      3.488 |                                         1
      3.816 |                                         1
      4.028 |                                         1
      6.913 |                                         1
      7.565 |                                         1
      8.130 |                                         1
     17.954 |                                         1

I hope you give sysbench histograms a try, and see what you can discover!

by Peter Zaitsev at September 20, 2017 08:11 PM

Percona XtraDB Cluster 5.6.37-26.21 is Now Available


Percona announces the release of Percona XtraDB Cluster 5.6.37-26.21 on September 20, 2017. Binaries are available from the downloads section or from our software repositories.

Percona XtraDB Cluster 5.6.37-26.21 is now the current release, based on the following:

All Percona software is open-source and free.

Improvements

  • PXC-851: Added version compatibility check during SST with XtraBackup:
    • If donor is 5.6 and joiner is 5.7: A warning is printed to perform mysql_upgrade.
    • If donor is 5.7 and joiner is 5.6: An error is printed and SST is rejected.

Fixed Bugs

  • PXC-825: Fixed script for SST with XtraBackup (wsrep_sst_xtrabackup-v2) to include the --defaults-group-suffix when logging to syslog. For more information, see #1559498.
  • PXC-827: Fixed handling of different binlog names between donor and joiner nodes when GTID is enabled. For more information, see #1690398.
  • PXC-830: Rejected the RESET MASTER operation when wsrep provider is enabled and gtid_mode is set to ON. For more information, see #1249284.
  • PXC-833: Fixed connection failure handling during SST by making the donor retry connection to joiner every second for a maximum of 30 retries. For more information, see #1696273.
  • PXC-841: Added check to avoid replication of DDL if sql_log_bin is disabled. For more information, see #1706820.
  • PXC-853: Fixed cluster recovery by enabling wsrep_ready whenever nodes become PRIMARY.
  • PXC-862: Fixed script for SST with XtraBackup (wsrep_sst_xtrabackup-v2) to use the ssl-dhparams value from the configuration file.

Help us improve our software quality by reporting any bugs you encounter using our bug tracking system. As always, thanks for your continued support of Percona!

by Alexey Zhebel at September 20, 2017 06:03 PM

Oli Sennhauser

Find evil developer habits with log_queries_not_using_indexes

Recently, just out of curiosity, I switched on the MariaDB slow query logging flag log_queries_not_using_indexes on one of our customers' systems:

mariadb> SHOW GLOBAL VARIABLES LIKE 'log_quer%';
+-------------------------------+-------+
| Variable_name                 | Value |
+-------------------------------+-------+
| log_queries_not_using_indexes | OFF   |
+-------------------------------+-------+

mariadb> SET GLOBAL log_queries_not_using_indexes = ON;
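
A SET GLOBAL change is lost on restart. To keep the setting permanently, it can also go into the server configuration. The following is a minimal sketch assuming a Debian/Ubuntu-style include directory; the file name is my own choice:

shell> cat << "_EOF" >> /etc/mysql/mariadb.conf.d/99-slow-log.cnf
[mysqld]
slow_query_log                = 1
long_query_time               = 2
log_queries_not_using_indexes = 1
_EOF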

A tail -f on the MariaDB Slow Query Log caused huge flickering on my screen: about 5 times per second I saw the following statement sequence in the Slow Query Log:

# User@Host: app_admin[app_admin] @  [192.168.1.42]  Id: 580195
# Query_time: 0.091731  Lock_time: 0.000028 Rows_sent: 273185 Rows_examined: 273185
SELECT LAST_INSERT_ID() FROM `placeholder`;
# Query_time: 0.002858  Lock_time: 0.000043 Rows_sent: 6856 Rows_examined: 6856
SELECT LAST_INSERT_ID() FROM `data`;

So at least 5 x 95 ms (5 x (92 + 3) = 475 ms) out of every 1000 ms (48%) were spent in these two statements, which run quite fast but do not use an index (long_query_time was set to 2 seconds).

So I estimate that this load job could be sped up by at least a factor of 2 by using the LAST_INSERT_ID() function correctly, not even considering the possible reduction in network traffic (throughput and response time).

To show the problem I made a little test case:

mariadb> INSERT INTO test VALUES (NULL, 'Some data', NULL);

mariadb> SELECT LAST_INSERT_ID() from test;

+------------------+
| LAST_INSERT_ID() |
+------------------+
|          1376221 |
...
|          1376221 |
+------------------+
1048577 rows in set (0.27 sec)

The response time of this query grows linearly with the amount of data as long as it fits into memory, and it will explode as soon as the table no longer fits into memory. In addition, the network traffic would be reduced by about 8 Mbyte per second (1 million rows x BIGINT UNSIGNED (64-bit = 8 bytes) + some header per row?), which is 6-8% of the bandwidth of a 1 Gbit network link.

shell> ifconfig lo | grep bytes
          RX bytes:2001930826 (2.0 GB)  TX bytes:2001930826 (2.0 GB)
shell> ifconfig lo | grep bytes
          RX bytes:2027289745 (2.0 GB)  TX bytes:2027289745 (2.0 GB)

The correct way of doing the query would be:

mariadb> SELECT LAST_INSERT_ID();
+------------------+
| last_insert_id() |
+------------------+
|          1376221 |
+------------------+
1 row in set (0.00 sec)

The response time is below 10 ms.

So why is the first query taking so long and consuming so many resources? To answer this question, the MariaDB optimizer can tell us more with the Query Execution Plan (QEP):

mariadb> EXPLAIN SELECT LAST_INSERT_ID() FROM test;
+------+-------------+-------+-------+---------------+---------+---------+------+---------+-------------+
| id   | select_type | table | type  | possible_keys | key     | key_len | ref  | rows    | Extra       |
+------+-------------+-------+-------+---------------+---------+---------+------+---------+-------------+
|    1 | SIMPLE      | test  | index | NULL          | PRIMARY | 4       | NULL | 1048577 | Using index |
+------+-------------+-------+-------+---------------+---------+---------+------+---------+-------------+

mariadb> EXPLAIN FORMAT=JSON SELECT LAST_INSERT_ID() FROM test;
{
  "query_block": {
    "select_id": 1,
    "table": {
      "table_name": "test",
      "access_type": "index",
      "key": "PRIMARY",
      "key_length": "4",
      "used_key_parts": ["id"],
      "rows": 1048577,
      "filtered": 100,
      "using_index": true
    }
  }
}

The database does a Full Index Scan (FIS; others call it an Index Fast Full Scan (IFFS)) on the Primary Key (column id).

The Query Execution Plan of the second query looks as follows:

mariadb> EXPLAIN SELECT LAST_INSERT_ID();
+------+-------------+-------+------+---------------+------+---------+------+------+----------------+
| id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra          |
+------+-------------+-------+------+---------------+------+---------+------+------+----------------+
|    1 | SIMPLE      | NULL  | NULL | NULL          | NULL | NULL    | NULL | NULL | No tables used |
+------+-------------+-------+------+---------------+------+---------+------+------+----------------+

mariadb> EXPLAIN FORMAT=JSON SELECT LAST_INSERT_ID();
{
  "query_block": {
    "select_id": 1,
    "table": {
      "message": "No tables used"
    }
  }
}

by Shinguz at September 20, 2017 02:00 PM

September 19, 2017

Peter Zaitsev

Percona Live Europe Featured Talks: Automatic Database Management System Tuning Through Large-Scale Machine Learning with Dana Van Aken

Percona Live Europe 2017

Welcome to another post in our series of interview blogs for the upcoming Percona Live Europe 2017 in Dublin. This series highlights a number of talks that will be at the conference and gives a short preview of what attendees can expect to learn from the presenter.

This blog post is with Dana Van Aken, a Ph.D. student in Computer Science at Carnegie Mellon University. Her talk is titled Automatic Database Management System Tuning Through Large-Scale Machine Learning. DBMSs are difficult to manage because they have hundreds of configuration “knobs” that control factors such as the amount of memory to use for caches and how often to write data to storage. Organizations often hire experts to help with tuning activities, but experts are prohibitively expensive for many. In this talk, Dana will present OtterTune, a new tool that can automatically find good settings for a DBMS’s configuration knobs. In our conversation, we discussed how machine learning helps DBAs manage DBMSs:

Percona: How did you get into database technology? What do you love about it?

Dana: I got involved with research as an undergrad and ended up working on a systems project with a few Ph.D. students. It turned out to be a fantastic experience and is what convinced me to go for my Ph.D. I visited potential universities and chatted with many faculty members. I met with my current advisor at Carnegie Mellon University, Andy Pavlo, for a half hour and left his office excited about databases and the research problems he was interested in. Three years later, I’m even more excited about databases and the progress we’ve made in developing smarter auto-tuning techniques.

Percona: You’re presenting a session called “Automatic Database Management System Tuning Through Large-Scale Machine Learning”. How does automation make DBAs life easier in a DBMS production environment?

Dana: The role of the DBA is becoming more challenging due to the advent of new technologies and increasing scalability requirements of data-intensive applications. Many DBAs are constantly having to adjust their responsibilities to manage more database servers or support new platforms to meet an organization’s needs as they change over time. Automation is critical for reducing the DBA’s workload to a manageable size so that they can focus on higher-value tasks. Many organizations now automate at least some of the repetitive tasks that were once DBA responsibilities: several have adopted public/private cloud-based services whereas others have built their own automated solutions internally.

The problem is that the tasks that have now become the biggest time sinks for DBAs are much harder to automate. For example, DBMSs have dozens of configuration options. Tuning them is an essential but tedious task for DBAs, because it’s a trial and error approach even for experts. What makes this task even more time-consuming is that the best configuration for one DBMS may not be the best for another. It depends on the application’s workload and the server’s hardware. Given this, successfully automating DBMS tuning is a big win for DBAs since it would streamline common configuration tasks and give DBAs more time to deal with other issues. This is why we’re working hard to develop smarter tuning techniques that are mature and practical enough to be used in a production environment.

Percona: What do you want attendees to take away from your session? Why should they attend?

Dana: I’ll be presenting OtterTune, a new tool that we’re developing at Carnegie Mellon University that can automatically find good settings for a DBMS’s configuration knobs. I’ll first discuss the practical aspects and limitations of the tool. Then I’ll move on to our machine learning (ML) pipeline. All of the ML algorithms that we use are popular techniques that have both practical and theoretical work backing their effectiveness. I’ll discuss each algorithm in our pipeline using concrete examples from MySQL to give better intuition about what we are doing. I will also go over the outputs from each stage (e.g., the configuration parameters that the algorithm finds to be the most impactful on performance). I will then talk about lessons I learned along the way, and finally wrap up with some exciting performance results that show how OtterTune’s configurations compared to those created by top-notch DBAs!

My talk will be accessible to a general audience. You do not need a machine learning background to understand our research.

Percona: What are you most looking forward to at Percona Live Europe 2017?

Dana: This is my first Percona Live conference, and I’m excited about attending. I’m looking forward to talking with other developers and DBAs about the projects they’re working on and the challenges they’re facing and getting feedback on OtterTune and our ideas.

Want to find out more about Dana and machine learning for DBMS management? Register for Percona Live Europe 2017, and see her talk Automatic Database Management System Tuning Through Large-Scale Machine Learning. Register now to get the best price! Use discount code SeeMeSpeakPLE17 to get 10% off your registration.

Percona Live Open Source Database Conference Europe 2017 in Dublin is the premier European open source event for the data performance ecosystem. It is the place to be for the open source community as well as businesses that thrive in the MySQL, MariaDB, MongoDB, time series database, cloud, big data and Internet of Things (IoT) marketplaces. Attendees include DBAs, sysadmins, developers, architects, CTOs, CEOs, and vendors from around the world.

The Percona Live Open Source Database Conference Europe will be September 25-27, 2017 at the Radisson Blu Royal Hotel, Dublin.

by Dave Avery at September 19, 2017 11:09 PM

ProxySQL Improves MySQL SSL Connections

In this blog post, we’ll look at how ProxySQL improves MySQL SSL connection performance.

When deploying MySQL with SSL, the main concern is that the initial handshake causes significant overhead if you are not using connection pools (i.e., mysqlnd-mux with PHP, mysql.connector.pooling in Python, etc.). Closing and making new connections over and over can greatly impact your total query response time. A customer and colleague recently educated me that although you can improve SSL encryption/decryption performance with the AES-NI hardware extension on modern Intel processors, the actual overhead when creating SSL connections comes from the handshake, when multiple roundtrips between the server and client are needed.

With ProxySQL’s support for SSL on its backend connections and connection pooling, we can have it sit in front of any application, on the same server (illustrated below):

ProxySQL

With this setup, ProxySQL runs on the same server as the application and is connected to MySQL through a local socket, so MySQL data does not need to go over an unsecured TCP stream.
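
Backend SSL in ProxySQL is enabled per server through its admin interface. The following is a minimal sketch, assuming ProxySQL's default admin credentials on port 6032 and the backend host used in the test script below; it is not a complete ProxySQL configuration:

mysql -u admin -padmin -h 127.0.0.1 -P 6032 -e "
  INSERT INTO mysql_servers (hostgroup_id, hostname, port, use_ssl)
  VALUES (0, '192.168.56.110', 3306, 1);
  LOAD MYSQL SERVERS TO RUNTIME;
  SAVE MYSQL SERVERS TO DISK;"

On the frontend side, ProxySQL also listens on a local socket (typically /tmp/proxysql.sock), which is what the PHP test below connects to.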

To quickly verify how this performs, I used a PHP script that simply creates 10k connections in a single thread as fast as it can:

<?php
$i = 10000;
$user = 'percona';
$pass = 'percona';
while($i>=0) {
	$mysqli = mysqli_init();
	// Use SSL
	//$link = mysqli_real_connect($mysqli, "192.168.56.110", $user, $pass, "", 3306, "", MYSQL_CLIENT_SSL)
	// No SSL
	//$link = mysqli_real_connect($mysqli, "192.168.56.110", $user, $pass, "", 3306 )
	// OpenVPN
	//$link = mysqli_real_connect($mysqli, "10.8.99.1",      $user, $pass, "", 3306 )
	// ProxySQL
	$link = mysqli_real_connect($mysqli, "localhost",      $user, $pass, "", 6033, "/tmp/proxysql.sock")
		or die(mysqli_connect_error());
	$info = mysqli_get_host_info($mysqli);
	$i--;
	mysqli_close($mysqli);
	unset($mysqli);
}
?>

Direct connection to MySQL, no SSL:

[root@ad ~]# time php php-test.php
real 0m20.417s
user 0m0.201s
sys 0m3.396s

Direct connection to MySQL with SSL:

[root@ad ~]# time php php-test.php
real	1m19.922s
user	0m29.933s
sys	0m9.550s

Direct connection to MySQL, no SSL, with OpenVPN tunnel:

[root@ad ~]# time php php-test.php
real 0m15.161s
user 0m0.493s
sys 0m0.803s

Now, using ProxySQL via the local socket file:

[root@ad ~]# time php php-test.php
real	0m2.791s
user	0m0.402s
sys	0m0.436s

Below is a graph of these numbers:

ProxySQL

As you can see, the performance overhead of SSL connections versus no SSL is about 400% – pretty bad for some workloads.

Connections through OpenVPN are also faster than direct MySQL connections without SSL. While this is interesting, the OpenVPN server needs to be deployed on another server, separate from the MySQL server and application. This approach allows the application servers and MySQL servers (including replica/cluster nodes) to communicate on the same secured network, but creates a single point of failure. Alternatively, deploying OpenVPN on the MySQL server adds yet another piece to your high availability layer, and it gets quite complicated when a new master is promoted. In short, OpenVPN adds many additional moving parts.

The beauty of ProxySQL is that you can just run it on all application servers, and it works fine if you simply point it to a VIP that directs it to the correct MySQL server (master), or use the replication group feature to identify the authoritative master.

Lastly, it is important to note that these tests were done on CentOS 7.3 with OpenSSL 1.0.1e, Percona Server for MySQL 5.7.19, ProxySQL 1.4.1, PHP 5.4 and OpenVPN 2.4.3.

Happy ProxySQLing!

by Jervin Real at September 19, 2017 06:47 PM

Serge Frezefond

Probability perspective on MySQL Group replication and Galera Cluster

Comparing Oracle MySQL Group Replication and Galera Cluster from a probability perspective seems quite interesting.

At commit time, both use a group certification process that requires network round trips. The time required for these network round trips is what mainly determines the cost of a transaction. Let us try to compute an estimate of the [...]

by Serge at September 19, 2017 05:05 PM

September 18, 2017

MariaDB AB

Announcing MariaDB ColumnStore 1.1.0 Beta


We are happy to announce that today we are releasing 1.1.0 beta software of MariaDB ColumnStore, the high performance, columnar storage engine for large scale analytics on MariaDB. Beta is an important time in our release and we encourage you to download this release today! Please note that we do not recommend running beta releases in production.

MariaDB ColumnStore 1.1 introduces the following key new capabilities:

  • Streaming and Customized Analytics

  • Improved Operational Resiliency:

  • Data Types:

    • TEXT and BLOB: Now you can store unstructured data columns requiring larger than 64KB size as TEXT or BLOB data type.

  • MariaDB Server 10.2 compatibility:

    • MariaDB ColumnStore 1.1.0 is based on MariaDB Server 10.2.8.

    • The Window functions have been re-implemented with MariaDB Server 10.2.8 code.

    • MariaDB Server Audit Plugin Integration: Now queries sent to MariaDB ColumnStore can be audited in the same way as all other storage engines.

    • Non-recursive Common Table Expressions are now supported.

  • Performance: Several improvements in string handling and memory utilization.

The release notes for MariaDB ColumnStore 1.1.0, including the list of bugs fixed, can be found here. MariaDB ColumnStore documentation can be found in our Documentation Library.

Try out MariaDB ColumnStore 1.1.0 Beta software and share your feedback. For any questions on the new features of MariaDB ColumnStore 1.1, please email me at dipti.joshi@mariadb.com.


by Dipti Joshi at September 18, 2017 08:47 PM

Peter Zaitsev

Webinar Tuesday, September 19, 2017: A Percona Support Engineer Walkthrough for pt-stalk

Join Percona's Principal Support Engineer, Marcos Albe, as he presents A Percona Support Engineer Walkthrough for pt-stalk on Tuesday, September 19, 2017, at 10:00 am PDT / 1:00 pm EDT (UTC-7).

As a support engineer, I get dozens of pt-stalk captures from our customers containing samples of iostat, vmstat, top, ps, SHOW ENGINE INNODB STATUS, SHOW PROCESSLIST and a multitude of other diagnostics outputs.

These are the tools of the trade for performance and troubleshooting, and we must learn to digest these outputs in an effective and systematic way. This allows us to provide high-quality service to a large volume of customers.

In this presentation, I will share the knowledge we’ve gained working with this data, and how to apply it to your database environment. We will learn how to set up pt-stalk, capture data, write plugins to trigger collection and capture custom data, look at our systematic approach, and learn what data to read first and how to unwind the tangled threads of pt-stalk.

By the end of this presentation, you will have expert knowledge on how to capture diagnostic metrics at the right time and a generic approach to digest the captured data. This allows you to diagnose and solve many of the problems common to MySQL setups.

Register for the webinar here.

Marcos Albe, Principal Technical Services Engineer

Marcos Albe has been doing web development for over ten years, providing solutions for various media and technology companies of different sizes. He is now a member of the Percona Support Team. Born and raised in the city of Montevideo, Uruguay, he became passionate about computers at the age of 11, when he got a 25Mhz i386-SX. Ten years later, he became one of the pioneers in telecommuting in Uruguay while leading the IT efforts for the second largest newspaper in the country.

by Dave Avery at September 18, 2017 07:02 PM

Percona Live Europe Featured Talks: Debugging with Logs (and Other Events) Featuring Charity Majors

Percona Live Europe 2017

Welcome to another post in our series of interview blogs for the upcoming Percona Live Europe 2017 in Dublin. This series highlights a number of talks that will be at the conference and gives a short preview of what attendees can expect to learn from the presenter.

This blog post is with Charity Majors, CEO/Cofounder of Honeycomb. Her talk is Debugging with Logs (and Other Events). Her presentation covers some of the lessons every engineer should know (and often learns the hard way): why good logging solutions are so expensive, why treating your logs as strings can be costly and dangerous, how logs can impact code efficiency and add/fix/change race conditions in your code. In our conversation, we discussed debugging your database environment:

Percona: How did you get into database technology? What do you love about it?

Charity: Oh dear, I don’t. I hate databases. Data is the scariest, hardest part of computing. The stakes are highest and the mistakes the most permanent. Data is where you can kill any company with the smallest number of errors. That’s why I always end up in charge of the databases – I just don’t trust anybody else with the power. (Also, I’m an adrenaline junkie who gets off on high stakes. I could gamble or I could do databases, and I know too much math to gamble.) Literally, nobody loves databases. If they tell you anything different, they are either lying to you or they’re nowhere near production.

I got into databases from operations. I’ve been on call since I was 17, over half my life. I am really stubborn, have an inflated sense of my own importance and like solving problems, so operations was a natural fit. I started diving on the databases grenades when I worked at Linden Lab and MySQL was repeatedly killing us. It seemed impossible, so I volunteered to own it. I’ve been doing that ever since.

Percona: You’re presenting a session called “Debugging with Logs (and Other Events)”. What is the importance of logs for databases and DBAs?

Charity: I mean, it’s not really about logs. I might change my title. It’s about understanding WTF is going on. Logs are one way of extracting events in a format that humans can understand. My startup is all about “what’s happening right now; what’s just happened?” Which is something we are pretty terrible at as an industry. Databases are just another big complex piece of software, and the only reason we have DBAs is because the tooling has historically been so bad that you had to specialize in this piece of software as your entire career.

The tooling is getting better. With the right tools, you don’t have to skulk around like a detective trying to model and predict what might be happening, as though it were a living organism. You can simply sum up the lock time being held, and show what actor is holding it. It’s extremely important that we move away from random samples and pre-aggregated metrics, toward dynamic sampling and rich events. That’s the only way you will ever truly understand what is happening under the hood in your database. That’s part of what my company was built to do.

Percona: How can logging be used in debugging to track down database issues? Can logging affect performance?

Charity: Of course logging can affect performance. For any high traffic website, you should really capture your logs (events) by streaming tcpdump over the wire. Most people know how to do only one thing with db logs: look for slow queries. But those slow queries can be actively misleading! A classic example is when somebody says “this query is getting slow” and they look at source control and the query hasn’t been modified in years. The query is getting slower either because the data volume is growing (or data shape is changing), or because reads can yield but writes can’t, and the write volume has grown to the point where reads are spending all their time waiting on the lock.

Yep, most db logs are terrible.

Percona: What do you want attendees to take away from your session? Why should they attend?

Charity: Lots of cynicism. Everything in computers is terrible, but especially so with data. Everything is a tradeoff, all you can hope to do is be aware of the tradeoffs you are making, and what costs you are incurring whenever you solve a given problem. Also, I hope people come away trembling at the thought of adding any more strings of logs to production. Structure your logs, people! Grep is not the answer to every single question! It’s 2017, nearly 2018, and unstructured logs do not belong anywhere near production.

Percona: What are you most looking forward to at Percona Live Europe 2017?

Charity: My coauthor Laine and I are going to be signing copies of our book Database Reliability Engineering and giving a short keynote on the changes in our field. I love the db community, miss seeing Mark Callaghan and all my friends from the MongoDB and MySQL world, and cannot wait to laugh at them while they cry into their whiskey about locks or concurrency or other similar nonsense. Yay!

Want to find out more about Charity and database debugging? Register for Percona Live Europe 2017, and see her talk Debugging with Logs (and Other Events). Register now to get the best price! Use discount code SeeMeSpeakPLE17 to get 10% off your registration.

Percona Live Open Source Database Conference Europe 2017 in Dublin is the premier European open source event for the data performance ecosystem. It is the place to be for the open source community as well as businesses that thrive in the MySQL, MariaDB, MongoDB, time series database, cloud, big data and Internet of Things (IoT) marketplaces. Attendees include DBAs, sysadmins, developers, architects, CTOs, CEOs, and vendors from around the world.

The Percona Live Open Source Database Conference Europe will be September 25-27, 2017 at the Radisson Blu Royal Hotel, Dublin.

by Dave Avery at September 18, 2017 06:25 PM

September 15, 2017

Peter Zaitsev

Percona Blog Poll Results: What Database Engine Are You Using to Store Time Series Data?

Time Series Data

In this blog post, we talk about the results of Percona’s time series database poll “What Database Engine Are You Using to Store Time Series Data?”

Time series data is some of the most actionable data available when it comes to analyzing trends and making predictions. Simply put, time series data is data that is indexed not just by value, but by time as well – allowing you to view value changes over time as they occur. Obvious uses include the stock market, web traffic, user behavior, etc.

With the increasing number of smart devices in the Internet of Things (IoT), being able to track data over time is more and more important. With time series data, you can measure and make predictions on things like energy consumption, pH values, water consumption, data from environment-aware machines like smart cars, etc. The sensors used in IoT devices and systems generate huge amounts of time-series data.

A couple of months back, we ran a poll on what time series databases were being used by the community. We wanted to quickly report on the results from that poll.

First the results:

Note: There is a poll embedded within this post. Please visit the site to participate in this post's poll.

Here are some thoughts:

  • The fact that this blog started as a place exclusively for MySQL information probably explains why we skewed high with MySQL respondents – still that doesn’t mean it doesn’t reflect reality.
  • Elastic seems the most common after that, possibly to tie in with MySQL use.
  • InfluxDB is next in popularity. This suggests that Paul Dix’s chosen business model is “AOK” so to speak. It is unclear if people use the open source version, or outgrow it and switch to the commercial stuff.
  • We lumped together “general purpose NoSQL engine”, but in some cases examples like Cassandra are targeted at time series. Notice that KairosDB, which is built on top of Cassandra itself, is not as popular in our survey.
  • Prometheus is the canonical “not a time series database”, but still used as one. I have a feeling alongside Graphite, this is monitoring related.
  • ClickHouse is a new database and it is surprising that it ranks so highly for time series. It was also relatively unknown outside of its home country, Russia, but now we are seeing it used at places like CloudFlare and more.

Thanks for participating in the poll. We’re still running a poll on operating systems, so don’t forget to register your responses. We’ll report on that poll soon, with a new one on the way shortly.

by Colin Charles at September 15, 2017 11:55 PM

The MySQL High Availability Landscape in 2017 (the Babies)

MySQL High Availability

This post is the third of a series focusing on the MySQL high availability solutions available in 2017.

The first post looked at the elders, the technologies that have been around for more than ten years. The second post talked about the adults, the more recent and mature technologies. In this post, we will look at the emerging MySQL high availability solutions. The “baby” MySQL high availability solutions I chose for the blog are group replication, proxies and distributed storage.

Group replication

Group replication is the Oracle response to Galera. The term “InnoDB cluster” means a cluster using group replication. The goal is to offer similar functionality, especially the almost-synchronous replication.

At first glance, the group replication implementation appears to be rather elegant. The basis is the GTID replication mode. The nodes of an InnoDB cluster share a single UUID sequence. To control the replication lag, Oracle added a flow control layer. While Galera requires unanimity, group replication only requires a majority. The majority protocol in use is derived from Paxos. A majority protocol makes the cluster more resilient to a slow node.

Like Galera, when you add flow control you need queues. Group replication has two queues: one for the certification process and one for the appliers. What is interesting in the Oracle approach is the presence of a throttling mechanism. When flow control is requested by a node, instead of halting the processing of new transactions as Galera does, the rate of transactions is throttled. That can help to meet strict timing SLAs.

Because the group replication logic is fairly similar to Galera's, it suffers from the same limitations: large transactions, latency and hot rows. Group replication is recent. The first GA version is 5.7.17, from December 2016. It is natural then that it has a number of sharp edges. I won't expand too much here, but if you are interested, read here and here. I am confident that over time group replication will get more polished. Some automation, like the Galera SST process, would also be welcome.
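
If you want to experiment with it, bringing a group up is short once the prerequisites are in place. A minimal sketch, assuming MySQL 5.7 with the group replication plugin loaded and the GTID, row-based binlog and recovery-channel settings already configured in my.cnf:

# On the first node only: bootstrap the group
mysql -e "SET GLOBAL group_replication_bootstrap_group = ON;
          START GROUP_REPLICATION;
          SET GLOBAL group_replication_bootstrap_group = OFF;"

# On each additional node: simply join the existing group
mysql -e "START GROUP_REPLICATION;"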

Given how recent the technology is, I know of no Percona customer using group replication in production.

Proxies

Intelligent proxies can be viewed as another type of upcoming MySQL high availability solution. It is not strictly MySQL. In fact, this solution is more of a mix of other solutions.

The principle is simple: you connect to a proxy, and the proxy directs you to a valid MySQL server. The proxy has to monitor the states of the back-end servers, and maybe even perform actions on them. Of course, the proxy layer must not become a single point of failure. There should be more than one proxy host for basic HA. If more than one proxy is used at the same time, they’ll have to agree on the state of the back-end servers. For example, on a cluster using MySQL async replication, if the proxies are not sending the write traffic to the same host, things will quickly become messy.

There are few ways of achieving this. The simplest solution is an active-passive setup where only one proxy is active at a given time. You’ll need some kind of logic to determine if the proxy host is available or not. Typical choices will use tools like keepalived or Pacemaker.

A second option is to have the proxies agree on a deterministic way of identifying a writer node. For example, with a Galera-based cluster, the sane back-end node with the lowest wsrep_local_index could be the writer node (see the sketch below).
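As a rough sketch, the check each proxy could run against every back-end node boils down to a few standard Galera status variables (shown here only to illustrate the idea, not as a complete health check):

-- run against each back-end node; a "sane" candidate is ready and synced
show status like 'wsrep_ready';                -- expect ON
show status like 'wsrep_local_state_comment';  -- expect Synced
show status like 'wsrep_local_index';          -- pick the node with the lowest value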

Finally, the proxies could talk to each other and coordinate. Such an approach is promising. It could allow a single proxy to perform the monitoring and inform its peers of the results. It would allow also coordinated actions on the cluster when a failure is detected.

Currently, there are a few options in terms of proxies:

  • ProxySQL: An open-source proxy that understands the MySQL protocol and can do R/W splitting, query caching, sharding, SQL firewalling, etc. A new alpha-level feature, mirroring, targets the inter-proxy communication need.
  • MaxScale: No longer fully open-source (BSL), but understands the MySQL protocol. Can do R/W splitting, sharding, binlog serving, SQL firewalling, etc.
  • MySQL Router: MySQL Router is an open-source proxy developed by Oracle for InnoDB Cluster (Group replication). It understands the MySQL protocol and also supports the new X protocol. It can do R/W splitting.
  • HAProxy: HAProxy is a popular open-source TCP-level proxy. It doesn't understand the MySQL protocol, so it needs helper scripts, responding to HTTP-type requests, to figure out the nodes' health.

In addition to these open source proxies, there are two well-known commercial proxy-like solutions: Tungsten and ScaleArc. Both of these technologies are mature and are not really "babies" in terms of age and traction. On top of these, there are also numerous hardware-based load balancer solutions.

The importance of proxies in MySQL high availability has led Percona to include ProxySQL in the latest releases of Percona XtraDB Cluster. In collaboration with the ProxySQL maintainer, René Cannaò, features have been added to make ProxySQL aware of the Percona XtraDB Cluster state.

Proxies are already often deployed in MySQL high availability solutions, but they frequently only do load-balancing type work. We are starting to see deployments that use proxies for more advanced things, like read/write splitting and sharding.

Distributed storage

Replication setup using distributed storage

 

This MySQL high availability solution is a project I am interested in. It is fair to say it is more a “fetus” than a real “baby,” since I know nobody using it in production. You can see this solution as a shared storage approach on steroids.

The simplest solution requires a three-node Ceph cluster. The nodes also run MySQL and the datadir is a Ceph RBD block device. Data in Ceph is automatically replicated to multiple hosts. This built-in data replication is an important component of the solution. Also, Ceph RBD supports snapshots and clones. A clone is a copy of the whole data set that consumes only the data that changed (delta) in terms of storage. Our three MySQL servers will thus not use three full copies of the dataset, but only one full copy and two deltas. As time passes, the deltas grow. When they are too large, we can simply generate new snapshots and clones and be back to day one. The generation of a new snapshot and clone takes a few seconds, and doesn’t require stopping MySQL.

The obvious use case for the distributed storage approach is a read-intensive workload on a very large dataset. The setup can handle a lot of writes. The higher the write load, the more frequently there will be a snapshot refresh. Keep in mind that refreshing a snapshot of a 10 TB data set takes barely more time than for a 1 GB data set.

For that purpose, I wrote an SST script for Percona XtraDB Cluster that works with Ceph. I blogged about it here. I also wrote a Ceph snapshot/clone backup script that can provision a slave from a master snapshot. I’ll blog about how to use this Ceph backup script in the near future.

Going further with distributed storage, multiple MySQL instances could use the same data pages. Ceph would be used as a distributed object store for InnoDB pages. This would allow building an open-source, Aurora-like database. Coupled with Galera or group replication, you could have a highly available MySQL cluster sharing a single copy of the dataset.

I started to modify MySQL, actually Percona Server for MySQL 5.7, to add support for Ceph/Rados (Rados is the object store protocol of Ceph). There is still a lot of effort needed to make it work. My primary job is not development, so progress is slow. My work can be found here. The source compiles well, but MySQL doesn't fully start. I need to debug where things are going wrong.

Adding a feature to MySQL like that is an awesome way to learn the internals of MySQL. I would really appreciate any help if you are interested in this project.

Conclusion

Over the three articles in this series, we have covered the 2017 landscape of MySQL high availability solutions. The first article focused on the old timers, "the elders": replication, shared storage and NDB. The second article dealt with the solutions that are more recent and have good traction: Galera and RDS Aurora. This final article looked at what could possibly be coming next in terms of MySQL high availability solutions.

The main goal of this series is to help you plan the deployment of MySQL in a highly available way. I hope it provides hints and pointers toward better and more efficient solutions.

by Yves Trudeau at September 15, 2017 11:44 PM

This Week in Data with Colin Charles #6: Open Source Summit and Percona Live Europe

Colin Charles

Join Percona Chief Evangelist Colin Charles as he covers happenings, gives pointers and provides musings on the open source database community.

What a long, packed week! Spent most of it at Open Source Summit North America, while still enjoying the myriad phone calls and meetings you have as a Perconian. In addition to two talks, I also gave a webinar this week on the differences between MySQL and MariaDB (I’ll post a blog Q&A in the near future).

Percona Live Europe Dublin

Have you registered for Percona Live Europe Dublin? If not, what's keeping you from doing so?

In addition, I think it’s definitely worth registering for the community dinner. You can hang out with other like-minded folks, and see the lightning talks (we may announce more as time gets closer).

See what the MySQL Team will speak about at Percona Live Dublin. You’ll notice that a few of the releases I mention below have Percona Live Europe talks associated with them.

Releases

Link List

Feedback

On a somber note, former Perconian and all round great community member, Jaakko Pesonen passed away. Shlomi Noach commented online: Remembering Jaakko Pesonen.

I look forward to feedback/tips via e-mail at colin.charles@percona.com or on Twitter @bytebot.

by Colin Charles at September 15, 2017 05:25 PM

MariaDB AB

Data Modeling Machine Learning Datasets: Using MariaDB ColumnStore


Valuable corporate data is often stored in traditional data warehouses, and it can be an important source for predictive analytics. As a data scientist, getting your hands on this data can be problematic: without an SQL skill set, extracting data warehouse data can be close to impossible. This blog provides insights into querying MariaDB ColumnStore for machine learning datasets.

This post discusses a solution (recipe) for accessing data warehouse fact and dimension tables and turning them into a traditional machine learning "dataset". The dataset is what machine learning (ML) algorithms, processing and data exploration work on. It leverages all the benefits of an analytic data engine such as MariaDB ColumnStore, and then feeds that data into machine learning algorithms for advanced analytical processing.

A typical data warehouse schema has:


  • dimension tables containing categorization of people, products, place and time – generally modeled as one table per object.
  • fact table containing measurements, metrics and facts of a business process.

An ERP data warehouse will have an order or order_line fact table(s) recording each order and its associated order line items. The product and store dimension tables are related via foreign keys.

A data analyst can then calculate various metrics such as sum of orders per store, or sum of orders per category, or sum of orders per day/week/month. 
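For illustration, a metric such as "sum of sales per store" is a straightforward join between the fact and dimension tables. The table and column names below are borrowed from the dataset query later in this post, so treat this as a sketch rather than a literal schema:

select stores.name as store_name,
       sum(order_lines.amount) as total_sales
from order_lines
join orders on order_lines.order_id = orders.id
join stores on orders.store_id = stores.id
group by stores.name;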

While a normalized database model like the above makes sense for a data analyst, it makes ad-hoc queries significantly harder for data scientists, who balk at complex joins and window functions.

Let's investigate an approach to improve the usability of the data warehouse for machine learning.


Approaches for data scientists to engage with data

Static reports at the end of the month are one thing, and they are fine for a data analyst working with a data warehouse. But to truly unlock the value of data warehouses for machine learning, the warehouse needs to be available for more than simple end-of-month reporting and querying. It needs to be available to data engineers and scientists for advanced analytics and insights.

Here are three approaches, requiring varying levels of SQL query skills:

  1. Denormalization – create a view (materialized for performance) that consolidates the data warehouse into an analytical dataset. Data scientists can then explore the data without complex joins and window functions.
  2. Dataset modeling – use a SQL query to represent a semantic model of the relationships between the warehouse tables.
  3. Query filtering – use a SQL stored procedure to implement pluggable WHERE clauses to enable query filtering of warehouse data (see the sketch after this list).
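As an illustration of the third approach, here is a minimal, hedged sketch of a stored procedure with a pluggable WHERE clause. It assumes a consolidated dataset table (called ml_dataset here) like the one built later in this post, and the column names in the example call are made up. Because it builds SQL dynamically, it is only suitable for trusted internal use:

delimiter //
create procedure filter_dataset(in p_where varchar(1024))
begin
  -- build the statement dynamically so callers can plug in their own filter
  set @q = concat('select * from ml_dataset where ', p_where);
  prepare stmt from @q;
  execute stmt;
  deallocate prepare stmt;
end //
delimiter ;

-- usage example (hypothetical column names):
call filter_dataset("store_name = 'Dublin' and date >= '2017-07-01'");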

Resulting ML dataset


The biggest win is usability. The data warehouse has been merged into a single table (dataset) for analytic processing. Data has been gathered from the data warehouse in a form that can be run through a machine learning algorithm.

Let the classifications and regressions begin!

Dataset creation (SQL query)

To create a dataset (table) like the above, join the fact and dimension tables into one single table. Refresh this table periodically, depending on analytic requirements; nightly works for most cases (a scheduling sketch follows). Any changes to the dimension tables, like a store name, will be captured on the next refresh.
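One way to schedule that refresh is the event scheduler. This is only a sketch, under the assumption that the dataset query below is wrapped in a view called ml_dataset_view and loaded into a ColumnStore table called ml_dataset (both names are examples):

delimiter //
create event refresh_ml_dataset
  on schedule every 1 day
  starts current_date + interval 26 hour   -- roughly 02:00 tomorrow
do
begin
  -- ml_dataset is the consolidated table, ml_dataset_view wraps the dataset query below
  truncate table ml_dataset;
  insert into ml_dataset select * from ml_dataset_view;
end //
delimiter ;

set global event_scheduler = on;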

Dataset query:

with
    products as ( select * from products_table ),
    stores as ( select * from stores_table ),
    orders as ( select * from orders_table ),
    order_lines as ( select * from order_lines_table ),
    joined as (
        select
            order_lines.order_id,
            order_lines.amount,
            order_lines.units,
            orders.date,
            products.name as product_name,
            products.other_data as product_other_data,
            stores.name as store_name,
            stores.other_data as store_other_data
        from order_lines
        left outer join orders on order_lines.order_id = orders.id
        left outer join products on orders.product_id = products.id
        left outer join stores on orders.store_id = stores.id
    )
select * from joined

Dataset as an SQL table or view

A database view creates a pseudo-table and from the perspective of a select statement, it appears exactly as a regular table. In MariaDB, views are created with the create view statement:

create view view_name as
    select column_1, column_2, ...
    from table_name;

The view is now available to be queried with a select statement. As this is not a real table, you cannot delete or update it. The underlying query is run every time you query the view.  If you want to store the result of the underlying query – you’d just have to use the materialized keyword:

create materialized view view_name as
    select column_1, column_2, ...
    from table_name;

You now control the refresh schedule of the view, and it can be refreshed at your convenience:

refresh materialized view view_name;

Creating a dataset from a data warehouse is just one use case for MariaDB ColumnStore as an advanced analytic datastore. Here are a few other use cases for preparing data for machine learning modeling and predictive analytics datasets.

Prepare Your Data for Modeling

  • Data Profiling – Quickly summarize the shape of your dataset to avoid bias or missing information before you start building your model. Missing data, zero values, text, and a visual distribution of the data are visualized automatically upon data ingestion.
  • Summary Statistics – Visualize your data with summary statistics to get the mean, standard deviation, min, max, cardinality, quantile and a preview of the data set.
  • Aggregate, Filter, Bin, and Derive Columns – Build unique views with window functions, filtering, binning, and derived columns.
  • Slice, Log Transform, and Anonymize – Normalize, anonymize, and partition to get your data into the right shape for modeling.
  • Variable Creation – Highly customizable variable value creation to hone in on the key data characteristics to model.
  • Training and Validation Sampling Datasets – Design a random or stratified sampling plan to generate data sets for model training and scoring.

Get started today. Download MariaDB ColumnStore, a core component of the MariaDB AX subscription.



by jmclaurin at September 15, 2017 02:49 AM

September 14, 2017

Peter Zaitsev

Lock, Stock and MySQL Backups: Data Guaranteed Webinar Follow Up Questions

MySQL Backups

Hello again! On August 16, we delivered a webinar on MySQL backups. As always, we had a number of interesting questions. Some of them we answered during the webinar, but we'd like to share some of them here in writing.

What is the best way to maintain daily full backups, but selective restores omitting certain archive tables?

There are several ways this can be done, listed below (though not necessarily limited to the following):

  1. Using logical dumps (i.e., mydumper, mysqlpump, mysqldump). This allows you to dump per table and thus be able to selectively restore.
  2. Back up the important tables and the archive tables separately, which allows you to restore them separately as well. This is a better approach in my opinion, since if the archive tables do not change often you can back up only what has changed. This gives you more flexibility in backup size and speed. This approach is possible as long as consistency or inter-dependence between the archive tables and the other tables isn't required.
  3. Filesystem- or XtraBackup-based backups are another option. However, the restore process means you need to restore the full backup and discard what you do not need. This is especially important if your archive tables use InnoDB (where metadata is stored in the main tablespace).

Can you recommend a good script on github for mysqlbinlog backup?

This is a shameless plug, but I would recommend the tool I wrote, called pyxbackup. At the time it was written, binary log streaming with 5.6 was fairly new, so there weren't many tools we could find or adapt that would closely integrate with backups. Hence writing one from scratch.

mysqlbinlog can stream binary logs to a remote server. Isn't simply copying the binlogs to the remote location just as effective, especially if done frequently using a cron job that runs rsync?

True, though be aware of a few differences:

  1. rsync may not capture data that would have been flushed to disk from the filesystem cache.
  2. In case the source crashes, you could lose the last binary log(s) between the last rsync and the crash.

How is it possible to create a backup using xtrabackup, compressed directly to a volume with low capacity, considering that the --apply-log step is needed?

In the context of this question, we cannot stream backups for compression and do the apply-log phase at the same time. The backup needs to be complete for the apply-log phase to start. Hence compress, decompress, then apply-log. Make sure enough disk space is available for the dataset size, plus your backups if you want to be able to test your backups with apply-log.

How can you keep connection credentials secure for automated backup?

  1. Tools like xtrabackup, mysqldump, mydumper and mysqlpump have options to pass client defaults file. You can store credentials in those files that are restricted to only a few users on the system (including the backup user).
  2. Aside from the first item, most of the tools also support login paths if you do not want your credentials in a plain text file. It is not completely secure, as credentials from login paths can still be decoded.
  3. Another way we’ve seen is to store the credentials on a vault or similar medium, and use query tools that would return the username or password. For example, if you run xtrabackup on bash:
    xtrabackup --password=$(/usr/bin/vault-query mysql-password) --backup

    Of course, how you secure the account that can run the vault query command is another topic for discussion. 🙂

I missed the name of your github repo. Also for mysqlbinlog parsing? (same question)

See above, and for an example of mysqlbinlog parsing library: https://github.com/Yelp/ybinlogp

Which one is faster between mydumper and 5.7 mysqlpump?

This is an interesting question, though it belongs to the "It Depends" category. 🙂 First, we have not benchmarked these two tools head to head. Second, with their different approaches, one may be faster for a specific use case while the other is faster for a different one. For example, with the different lock granularity support in mydumper, it could be faster on InnoDB with highly concurrent workloads.

If we wanted to migrate a 2.5TB database over a VPN connection, which backup and restore method would you recommend? The method would need to be resilient. This would be for migrating an on premise db to a MySQL RDS instance at AWS.

Again, there could be a number of ways this might be achieved, but one we frequently go with is:

  1. Setup an EC2 instance that would replicate from the original source.
  2. Once the replication is caught up, stop replication, do a parallel dump of the data per table.
  3. Import the data to RDS per table where you can monitor progress and failure, and retry each table if necessary (hint: mydumper can also chunk)
  4. Once complete, configure RDS to replicate from EC2 until it has caught up on its data.

Bonus: if you are migrating to Aurora, did you know you can use an XtraBackup-based backup directly?

What about if I have 1TB of data to backup and restore to a new server, how much time does it take, can we restore/stream at the same time while taking a backup?

Assuming you have direct access to the source server, XtraBackup is an excellent option here. Back up from the source and stream to the new server. Once complete, prepare the backup on the new server and it should be ready for use. These instructions are mostly for provisioning new slaves, but most of the steps apply for the same outcome.

Is mydumper your product, and how fast will it take to backup a few millions of data?

No, mydumper is not official Percona software. Percona contributes to this software as it both benefits our customers and the community.

Will it lock my table during the process? How to restore the mydumper?

By default, the table will be locked. However, this is highly configurable. For example, if you are using a version of Percona Server for MySQL that supports Backup Locks, the lock time is significantly reduced. Additionally, depending on the backup requirements, you can skip locks altogether.

Mydumper comes with a complementary tool called myloader that does the opposite. It restores the resulting dumps into the destination server in parallel.

Thank you again for attending the webinar. If you were not able to make it, you can still watch the recording and the slides here.

By the way, if you are attending Percona Live in Europe, Marcelo’s talk on continuous backup is an excellent follow-up to this webinar!

by Jervin Real at September 14, 2017 10:35 PM

Percona Live Europe Featured Talks: Monitoring Open Source Databases with Icinga with Bernd Erk

Percona Live Europe 2017

Welcome to another post in our series of interview blogs for the upcoming Percona Live Europe 2017 in Dublin. This series highlights a number of talks that will be at the conference and gives a short preview of what attendees can expect to learn from the presenter.

This blog post is with Bernd Erk, CEO of Icinga. His talk is titled Monitoring Open Source Databases with Icinga. Icinga is a popular open source successor of Nagios that checks hosts and services, and notifies you of their statuses. But you also need metrics for performance and growth to deal with your scaling needs. Adding conditional behaviors and configuration in Icinga is not just intuitive, but also intelligently adaptive at runtime. In our conversation, we discussed how to intelligently monitor open source databases:

Percona: How did you get into database technology? What do you love about it?

Bernd: I started a position as a junior systems engineer in a large German mail order company. They were totally committed to Oracle databases and the tool stack around it. As Linux gained more and more attention, we became aware of MySQL very early and were fascinated by the simplicity of installation and administration. There were of course so many things Oracle had in those days that MySQL didn’t have, but most of our uses also didn’t require those extra (and of course expensive) features.

Percona: You’re presenting a session called “Monitoring Open Source Databases with Icinga”. Why is monitoring databases important, and what sort of things need to be monitored?

Bernd: Usually databases are a very important part of an IT infrastructure, and need to be online 24/7. I also had the personal experience of database downtime putting a lot of pressure on both the organization in general and the team in charge. Since most open source databases provide very good interfaces, it is not so hard to figure out if they are up and running. Like in many monitoring arenas, knowing what to monitor is the important information.

In addition to the basic local and remote availability checks, monitoring database replication is very important. We often see environments where the standby slave is outdated by years, or not able to keep up with the incoming load. From there, you can go into database and application metrics to learn more about performance and I/O behavior.
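As a sketch, the raw replication check that monitoring plugins typically wrap looks something like the following; these are standard MySQL replication status fields, and which ones you alert on is up to you:

SHOW SLAVE STATUS\G
-- fields worth alerting on:
--   Slave_IO_Running / Slave_SQL_Running : both should be "Yes"
--   Seconds_Behind_Master                : replication lag (NULL when broken)
--   Last_IO_Error / Last_SQL_Error       : the reason a thread stopped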

Percona: Why are you using Icinga specifically? What value does it provide above other monitoring solutions?

Bernd: I've been involved with Icinga from the beginning, so it is my number one choice in open source monitoring. In my opinion, the great advantage of Icinga 2 is that it keeps the simplicity of legacy systems like Nagios (or Icinga 1) while also supporting complex environments (such as application-based clustering). There is also the live configuration of the Icinga 2 monitoring core through our REST API. With all the supported tools for metrics, logs and management around it, Icinga 2 is for me the best match for open source monitoring.

Percona: What do you want attendees to take away from your session? Why should they attend?

Bernd: Attendees will get a short overview on Icinga 2, and why it is different to Nagios (Icinga 1). I will also guide them through practical monitoring examples and show implemented checks in a live demo. After my talk, they should be able to adapt and extend on-premise or cloud monitoring with Icinga 2 using the default open source plugins.

Percona: What are you most looking forward to at Percona Live Europe 2017?

Bernd: Getting together with the great database community in all aspects, and going to Dublin (to be honest). I have never been there, and so it is my first time.

Want to find out more about Bernd and database monitoring? Register for Percona Live Europe 2017, and see his talk Monitoring Open Source Databases with Icinga. Register now to get the best price! Use discount code SeeMeSpeakPLE17 to get 10% off your registration.

Percona Live Open Source Database Conference Europe 2017 in Dublin is the premier European open source event for the data performance ecosystem. It is the place to be for the open source community as well as businesses that thrive in the MySQL, MariaDB, MongoDB, time series database, cloud, big data and Internet of Things (IoT) marketplaces. Attendees include DBAs, sysadmins, developers, architects, CTOs, CEOs, and vendors from around the world.

The Percona Live Open Source Database Conference Europe will be September 25-27, 2017 at the Radisson Blu Royal Hotel, Dublin.

by Dave Avery at September 14, 2017 10:31 PM

Percona Server for MongoDB 3.4.7-1.8 is Now Available

Percona Server for MongoDB 3.4

Percona announces the release of Percona Server for MongoDB 3.4.7-1.8 on September 14, 2017. Download the latest version from the Percona web site or the Percona Software Repositories.

Percona Server for MongoDB is an enhanced, open source, fully compatible, highly-scalable, zero-maintenance downtime database supporting the MongoDB v3.4 protocol and drivers. It extends MongoDB with Percona Memory Engine and MongoRocks storage engine, as well as several enterprise-grade features:

Percona Server for MongoDB requires no changes to MongoDB applications or code.

This release is based on MongoDB 3.4.7 and includes the following additional change:

  • Added packages for Debian 9 (“stretch”)

by Alexey Zhebel at September 14, 2017 06:08 PM

Shlomi Noach

gh-ost 1.0.42 released: JSON support, optimizations

gh-ost 1.0.42 is released and available for download.

JSON

MySQL 5.7's JSON data type is now supported.

There is a soft limitation: your JSON column may not be part of your PRIMARY KEY. Currently, this isn't even supported by MySQL anyhow.

Performance

Two noteworthy changes are:

  • Client side prepared statements reduce network traffic and round trips to the server.
  • Range query iteration avoids creating temporary tables and filesorting.

We're not running benchmarks at this time to observe performance gains.

5.7

More tests validating 5.7 compatibility (at this time GitHub runs MySQL 5.7 in production).

Ongoing

Many other changes are included.

We are grateful for all community feedback in the form of open Issues, Pull Requests and questions!

gh-ost is authored by GitHub. It is free and open source and is available under the MIT license.

Speaking

In two weeks time, Jonah Berquist will present gh-ost: Triggerless, Painless, Trusted Online Schema Migrations at Percona Live, Dublin.

Tom Krouper and I will present MySQL Infrastructure Testing Automation at GitHub, where, among other things, we describe how we test gh-ost in production.

by shlomi at September 14, 2017 07:26 AM

September 13, 2017

Peter Zaitsev

Percona Live Europe Featured Talks: Visualize Your Data with Grafana Featuring Daniel Lee

Percona Live Europe 2017

Welcome to another post in our series of interview blogs for the upcoming Percona Live Europe 2017 in Dublin. This series highlights a number of talks that will be at the conference and gives a short preview of what attendees can expect to learn from the presenter.

This blog post is with Daniel Lee, a software developer at Grafana. His tutorial is Visualize Your Data With Grafana. This presentation teaches you how to create dashboards and graphs in Grafana and how to use them to gain insight into the behavior of your systems. In our conversation, we discussed how data visualization could benefit your database environment:

Percona: How did you get into database technology? What do you love about it?

Daniel: I’m a developer and my first job was working on a transport logistics system, which was mostly composed of Stored Procedures in SQL Server 2000. Today, I would not build a system with all the logic in Stored Procedures – but that database knowledge is the foundation that I built everything else on. Databases and their data flows will always be the core of most interesting systems. More recently, I have switched from Windows to working with MariaDB on Linux. Grafana Labs uses Percona Server for MySQL for most of our internal applications (worldPing and Hosted Grafana). Working with Grafana also means working with time series databases like Graphite, which is also very interesting.

I enjoy working with data as it is one of the ways to learn how users use a system. Design decisions are theories until you have data to either back them up or disprove them.

Percona: You're presenting a session called "Visualize Your Data With Grafana". How does monitoring make a DBA's life easier, and how do graphs make this information easier for DBAs to apply?

Daniel: Good monitoring provides top-level metrics (throughput, number of errors, performance) for alerting, and other lower-level metrics to allow you to dig into the details and quickly diagnose and resolve an outage. Monitoring also helps you find any constraints (for example, finding bottlenecks for query performance: CPU, row locks, disk, buffer pool size, etc.). Performance monitoring allows you to see trends and lets you know when it is time to scale out or purchase more hardware.

Monitoring can also be used to communicate with business people. It is a way of translating lots of different system metrics into a measurable user experience. Visualizing your data with graphs is a very good way to communicate that information, both within your team and with your business stakeholders. Building dashboards with the metrics that are important to you rather than just the standard checklists (CPU, disk, network etc.) allows you to measure the user experience for your application and to see long-term trends.

Percona: Why Grafana? What does Grafana do better than other monitoring solutions?

Daniel: Grafana is the de facto standard in open source for visualizing time series data. It comes with tons of different ways to visualize your data (graphs, heat maps, gauges). Each data source comes with its own custom query editor that simplifies writing complex queries, and it is easy to create dynamic dashboards that look great on a TV.

Being open source, it can be connected to any data source/database, which makes it easy to unify different data sources in the same dashboard (for example, Prometheus or Graphite data combined with MySQL data). This also means your data is not subject to vendor lock-in like it is in other solutions. Grafana has a large and very active community that creates plugins and dashboards that extend Grafana into lots of niches, as well as providing ways to quickly get started with whatever you want to monitor.

Percona: What do you want attendees to take away from your session? Why should they attend?

Daniel: I want them to know that you can make the invisible visible, and with that knowledge start to make better decisions based on data. I hope that my session helps someone take the first step to being more proactive in their monitoring by showing them what can be done with Grafana and other tools in the monitoring space.

In my session, I will give an overview of monitoring and metrics, followed by an intro to Grafana. I plan to show how to monitor MySQL and finish off with a quick look at the new MySQL data source for Grafana.

Percona: What are you most looking forward to at Percona Live Europe 2017?

Daniel: Firstly, it is always great to have an excuse to visit Ireland (I’m an Irishman living in Sweden). I’m also looking forward to getting feedback from the community on Grafana’s new MySQL data source plugin, as well as just talking to people and hearing about their experiences with database monitoring.

Want to find out more about Daniel and data visualization? Register for Percona Live Europe 2017, and see their talk Visualize Your Data With Grafana. Register now to get the best price! Use discount code SeeMeSpeakPLE17 to get 10% off your registration.

Percona Live Open Source Database Conference Europe 2017 in Dublin is the premier European open source event for the data performance ecosystem. It is the place to be for the open source community as well as businesses that thrive in the MySQL, MariaDB, MongoDB, time series database, cloud, big data and Internet of Things (IoT) marketplaces. Attendees include DBAs, sysadmins, developers, architects, CTOs, CEOs, and vendors from around the world.

The Percona Live Open Source Database Conference Europe will be September 25-27, 2017 at the Radisson Blu Royal Hotel, Dublin.

by Dave Avery at September 13, 2017 04:47 PM

Massive Parallel Log Processing with ClickHouse

ClickHouse

In this blog, I’ll look at how to use ClickHouse for parallel log processing.

Percona is known primarily for our expertise in MySQL and MongoDB (at this time), but neither is quite suitable for heavy analytical workloads. There is a need to analyze data sets, and a very popular task is crunching log files. Below I'll show how ClickHouse can be used to efficiently perform this task. ClickHouse is attractive because it has multi-core parallel query processing, and it can even execute a single query using multiple CPUs in the background.

I am going to check how ClickHouse utilizes multiple CPU cores and threads. I will use a server with two sockets, equipped with “Intel(R) Xeon(R) CPU E5-2683 v3 @ 2.00GHz” in each. That gives a total of 28 CPU cores / 56 CPU threads.

To analyze workload, I’ll use an Apache log file from one of Percona’s servers. The log has 1.56 billion rows, and uncompressed it takes 274G. When inserted into ClickHouse, the table on disk takes 9G.

How do we insert the data into ClickHouse? There are a lot of scripts to transform Apache log format to CSV, which ClickHouse can accept. As a base, I used this one:

https://gist.github.com/sepehr/fff4d777509fa7834531

and you can find my modification here:

https://github.com/vadimtk/clickhouse-misc/blob/master/apachelog-to-csv.pl

The ClickHouse table definition:

CREATE TABLE default.apachelog (
  remote_host String,
  user String,
  access_date Date,
  access_time DateTime,
  timezone String,
  request_method String,
  request_uri String,
  status UInt32,
  bytes UInt32,
  referer String,
  user_agent String
) ENGINE = MergeTree(access_date, remote_host, 8192)

To test how ClickHouse scales on multiple CPU cores/threads, I will execute the same query by allocating from 1 to 56 CPU threads for ClickHouse processes. This can be done as:

ps -eLo cmd,tid | grep clickhouse-server | perl -pe 's/.* (\d+)$/\1/' | xargs -n 1 taskset -cp 0-$i

where $i is (N CPUs-1).

We must also take into account that not all queries are equal. Some are easier to execute in parallel than others. So I will test three different queries. In the end, we can’t get around Amdahl’s Law!

The first query should be easy to execute in parallel:

select extract(request_uri,'(\w+)$') p,sum(bytes) sm,count(*) c from apachelog group by p order by c desc limit 100

Speedup:

CPUs Time, sec Speedup to 1 CPU
1 823.646 1
2 413.832 1.990291
3 274.548 3.000007
4 205.961 3.999039
5 164.997 4.991885
6 137.455 5.992114
7 118.079 6.975381
8 103.015 7.995399
9 92.01 8.951701
10 82.853 9.941052
11 75.334 10.93326
12 69.23 11.89724
13 63.848 12.90011
14 59.388 13.8689
15 55.433 14.85841
16 52.158 15.79136
17 49.054 16.7906
18 46.331 17.77743
19 43.985 18.72561
20 41.795 19.70681
21 39.763 20.71388
22 38.031 21.65723
23 36.347 22.66063
24 34.917 23.58868
25 33.626 24.49432
26 32.42 25.40549
27 31.21 26.39045
28 30.135 27.33187
29 29.947 27.50346
30 29.709 27.72379
31 29.283 28.1271
32 28.979 28.42217
33 28.807 28.59187
34 28.477 28.9232
35 28.146 29.26334
36 27.921 29.49916
37 27.613 29.8282
38 27.366 30.09742
39 27.06 30.43777
40 26.817 30.71358
41 26.644 30.913
42 26.394 31.2058
43 26.215 31.41888
44 25.994 31.686
45 25.762 31.97135
46 25.554 32.23159
47 25.243 32.62869
48 25.102 32.81197
49 24.946 33.01716
50 24.668 33.38925
51 24.537 33.56751
52 24.278 33.92561
53 24.035 34.26861
54 23.839 34.55036
55 23.734 34.70321
56 23.587 34.91949

 

It’s much more interesting to chart these results:

From the chart, we can see that the query scales linearly up to 28 cores. After that, it continues to scale up to 56 threads (but with a lesser slope). I think this is related to the CPU architecture (remember we have 28 physical cores and 56 CPU “threads”). Let’s look at the results again. With one available CPU, the query took 823.6 sec to execute. With all available CPUs, it took 23.6 sec. So the total speedup is 34.9 times.

But let’s consider a query that allows a lesser degree of parallelism. For example, this one:

select access_date c2, count(distinct request_uri) cnt from apachelog group by c2 order by c2 limit 300

This query uses aggregation that counts unique URIs, which I am sure limits the counting process to a single shared structure. So some part of the execution is limited to a single process. I won’t show the full results for all 1 to 56 CPUs, but for one CPU the execution time is 177.715 sec, and for 56 CPUs the execution time is 11.564 sec. The total speedup is 15.4 times.

The speedup chart looks like this:

As we suspected, this query allows less parallelism. What about even heavier queries? Let’s consider this one:

SELECT y, request_uri, cnt FROM (SELECT access_date y, request_uri, count(*) AS cnt FROM apachelog GROUP BY y, request_uri ORDER BY y ASC ) ORDER BY y,cnt DESC LIMIT 1 BY y

In that query, we build a derived table (to resolve the subquery) and I expect it will limit the parallelism even further. And it does: with one CPU the query takes 183.063 sec to execute. With 56 CPUs it takes 28.572 sec. So the speedup is only 6.4 times.

The chart is:

Conclusions

ClickHouse can capably utilize multiple CPU cores available on the server, and query execution is not limited by a single CPU (like in MySQL). The degree of parallelism is defined by the complexity of the query, and in the best case scenario, we see linear scalability with the number of CPU cores. For scaling across multiple servers, you can see my previous post:

https://www.percona.com/blog/2017/06/22/clickhouse-general-analytical-workload-based-star-schema-benchmark/

However, if query execution is serial, it limits the speedup (as described in Amdahl’s Law).

Our example was a 1.5 billion record Apache log, and we saw that ClickHouse can execute complex analytical queries on it within tens of seconds.

by Vadim Tkachenko at September 13, 2017 08:17 AM

Shlomi Noach

Speaking at Percona Live Dublin: keynote, orchestrator tutorial, MySQL testing automation

I'm looking forward to a busy Percona Live Dublin conference, delivering three talks. Chronologically, these are:

  • Practical orchestrator tutorial
    Attend this 3 hour tutorial for a thorough overview on orchestrator: what, why, how to configure, best advice, deployments, failovers, security, high availability, common operations, ...
    We will of course discuss the new orchestrator/raft setup and share our experience running it in production.
    The tutorial will allow for general questions from the audience and open discussions.
  • Why Open Sourcing Our Database Tooling was the Smart Decision
    What it says. A 10 minute journey advocating for open sourcing infrastructure.
  • MySQL Infrastructure Testing Automation at GitHub
    Co-presenting with Tom Krouper, we share how & why we run infrastructure tests in and near production that give us trust in many of our ongoing, ever-changing operations. Essentially this is "why you should feel OK trusting us with your data".

See you there!

by shlomi at September 13, 2017 05:20 AM

September 12, 2017

Peter Zaitsev

cscope: Searching Code Efficiently

cscope

In this post, we will discuss how to search code with the help of cscope. Let’s begin by checking its description and capabilities (quoting directly from http://cscope.sourceforge.net/):

Cscope is a developer’s tool for browsing source code.

  • Allows searching code for:
    • all references to a symbol
    • global definitions
    • functions called by a function
    • functions calling a function
    • text string
    • regular expression pattern
    • a file
    • files including a file
  • Curses based (text screen)
  • An information database is generated for faster searches and later reference
  • The fuzzy parser supports C, but is flexible enough to be useful for C++ and Java, and for use as a generalized ‘grep database’ (use it to browse large text documents!)

Of course, developers aren’t the only ones browsing the code (as implied by the tool’s description). In the Support team, we find ourselves having to check code many times. This tool is a great aid in doing so. As you can imagine already, this tool can replace find and grep -R "<keyword(s)>" *, and will even add more functionality! Not only this, but our searches run faster (since they are indexed).

The main focus of this post is to explore cscope’s searching capabilities regarding code, but note that you can also use it for text searches that aren’t linked to function names or symbols (supporting regular expressions) and for file searches. This also means that even if the tool doesn’t recognize a function name, you can still use the text search as a fallback.

There is an online manual page, for quick reference:

http://cscope.sourceforge.net/cscope_man_page.html

To install it under RHEL/CentOS, simply issue:

shell> yum install cscope

You can use cscope with MySQL, Percona Server for MySQL or MariaDB code alike. In my case, I had a VM with Percona Server for MySQL 5.7.18 already available, so I’ve used that for demonstration purposes.

We should first get the source code for the exact version we are working with, and build the cscope database (used by the tool to perform searches):

shell> wget https://www.percona.com/downloads/Percona-Server-LATEST/Percona-Server-5.7.18-15/source/tarball/percona-server-5.7.18-15.tar.gz
shell> tar xzf percona-server-5.7.18-15.tar.gz
shell> cd percona-server-5.7.18-15
shell> cscope -bR

-b will build the database only, without accessing the CLI; -R will recursively build the symbol database from the directory where it's executed, down. We can also add -q for faster symbol lookups, at the expense of a larger database (we'll check how much larger below).

Now that we have built the cscope database, we will see a new file created: cscope.out. If we used -q, we will also see: cscope.in.out and cscope.po.out. Their sizes depend on the size of the codebase in question. Here are the sizes before and after building the cscope database (with -q):

shell> du -d 1 -h ..
615M ../percona-server-5.7.18-15
shell> cscope -bqR
shell> du -h cscope.*
8.2M cscope.in.out
69M cscope.out
103M cscope.po.out
shell> du -d 1 -h ..
794M ../percona-server-5.7.18-15

This gives around a 30% increase in size when using -q, and around a 10% increase without it. Your mileage may vary: be aware of this if you are using it on a test server with many different versions, or if the project size is considerably larger. It shouldn't be much of a problem, but it's something to take into account.

Ok, enough preamble already, let’s see it in action! To access the CLI, we can use cscope -d.

A picture is worth a thousand words. The following output corresponds to searching for the MAX_MAX_ALLOWED_PACKET symbol:

cscope

If there are multiple potential matches, the tool lists them for our review. If there is only one match, it will automatically open the file, with the cursor at the appropriate position. To check a match, either select it with the arrow keys and hit enter, or use the number/letter listed. When you are done and need to get back to cscope to continue checking other matches, simply exit the text editor (which can be defined by using CSCOPE_EDITOR). To get back to the main menu to modify the search, press CTRL-f. To exit the tool press CTRL-d. Lastly, CTRL-c toggles case insensitive mode on and off.

To show how the tool displays searches with many hits, let’s search for functions that call printf:

cscope

We can now see that letters are also used to list options, and that we can hit space to page down for more matches (from a total of 4508).

Lastly, as mentioned before if everything else fails and you are not able to find the function or symbol you need (due to limitations or bugs), you can use the “Find this text string” and “Find this egrep pattern” functionality.

I hope this brief tour of cscope has been useful, and helps you get started using it. Note that you can use it for other projects, and it can be handy if you need to dive into the Linux kernel too.

Addendum

For even more power, you can read this vim tutorial (http://cscope.sourceforge.net/cscope_vim_tutorial.html), or set up ctags (http://ctags.sourceforge.net/) along with cscope.

by Agustín at September 12, 2017 05:58 PM

Upcoming Webinar September 14, 2017: Supercharge Your Analytics with ClickHouse

ClickHouse

Join Percona's CTO Vadim Tkachenko (@VadimTk) and Altinity's Co-Founder, Alexander Zaitsev, as they present Supercharge Your Analytics with ClickHouse on Thursday, September 14, 2017, at 10:00 am PDT / 1:00 pm EDT (UTC-7).

 

ClickHouse is a real-time analytical database system. Even though it's only celebrating one year as open source software, it has already proved itself ready for serious workloads.

We will talk about ClickHouse in general, some of its internals and why it is so fast. ClickHouse works in conjunction with MySQL – traditionally weak for analytical workloads – and this presentation demonstrates how to make the two systems work together.

There will also be an in-person presentation on How to Build Analytics for 100bn Logs a Month with ClickHouse at the meetup Wednesday, September 13, 2017. RSVP here.

Alexander Zaitsev will also be speaking at Percona Live Europe 2017 on Building Multi-Petabyte Data Warehouses with ClickHouse on Wednesday, September 27 at 11:30 am. Use the promo code “SeeMeSpeakPLE17” for 15% off.

Alexander Zaitsev
Altinity’s Co-Founder
Alexander is a co-founder of Altinity. He has 20 years of engineering and engineering management experience at several international companies. Alexander is an expert in the design and implementation of high-scale analytics systems. He designed and deployed petabyte-scale data warehouses, including one of the earliest ClickHouse deployments outside of Yandex.

Vadim Tkachenko
CTO
Vadim Tkachenko co-founded Percona in 2006 and serves as its Chief Technology Officer. Vadim leads Percona Labs, which focuses on technology research and performance evaluations of Percona's and third-party products. Percona Labs designs no-gimmick tests of hardware, filesystems, storage engines, and databases that surpass the standard performance and functionality scenario benchmarks. Vadim's expertise in LAMP performance and multi-threaded programming helps optimize MySQL and InnoDB internals to take full advantage of modern hardware. Oracle Corporation and its predecessors have incorporated Vadim's source code patches into the mainstream MySQL and InnoDB products.

He also co-authored the book High Performance MySQL: Optimization, Backups, and Replication 3rd Edition. Previously, he founded a web development company in his native Ukraine and spent two years in the High Performance Group within the official MySQL support team. Vadim received a BS in Economics and an MS in computer science from the National Technical University of Ukraine. He now lives in California with his wife and two children.

by Emily Ikuta at September 12, 2017 04:03 PM

Jean-Jerome Schmidt

ClusterControl in the Cloud - All Our Resources

While many of our customers utilize ClusterControl on-premise to automate and manage their open source databases, several are deploying ClusterControl alongside their applications in the cloud. Utilizing the cloud allows your business and applications to benefit from the cost-savings and flexibility that come with cloud computing. In addition you don’t have to worry about purchasing, maintaining and upgrading equipment.

Along the same lines, ClusterControl offers a suite of database automation and management functions to give you full control of your database infrastructure. With it you can deploy, manage, monitor and scale your databases, securely and with ease through our point-and-click interface.

As the load on your application increases, your cloud environment can be expanded to provide more computing power to handle that load. In much the same way, ClusterControl utilizes state-of-the-art database, caching, and load balancing technologies that enable you to scale out the load on your databases and spread it evenly across nodes.

These performance benefits are just some of the many reasons to leverage ClusterControl to manage your open source database instances in the cloud. From advanced monitoring to backups and automatic failover, ClusterControl is your true end-to-end database management solution.

Below you will find some of our top resources to help you get your databases up-and-running in the cloud…

ClusterControl
Single Console for Your Entire Database Infrastructure
Find out what else is new in ClusterControl

AWS Marketplace

ClusterControl on the AWS Marketplace

Want to install ClusterControl directly onto your AWS EC2 instance? Check us out on the Amazon Marketplace.
(New version coming soon!)

Install Today

Top Blogs

Migrating MySQL database from Amazon RDS to DigitalOcean

This blog post describes the migration process from EC2 instance to a DigitalOcean droplet

Read More

MySQL in the Cloud - Online Migration from Amazon RDS to EC2 Instance (PART ONE)

RDS for MySQL is easy to get started. It's a convenient way to deploy and use MySQL, without having to worry about any operational overhead. The tradeoff though is reduced control.

Read More

MySQL in the Cloud - Online Migration from Amazon RDS to Your Own Server (PART TWO)

It's challenging to move data out of RDS for MySQL. We will show you how to do the actual migration of data to your own server, and redirect your applications to the new database without downtime.

Read More

MySQL in the Cloud - Pros and Cons of Amazon RDS

Moving your data into a public cloud service is a big decision. All the major cloud vendors offer cloud database services, with Amazon RDS for MySQL being probably the most popular. In this blog, we’ll have a close look at what it is, how it works, and compare its pros and cons.

Read More

About Cloud Lock-in and Open Source Databases

Severalnines CEO Vinay Joosery discusses key considerations to take when choosing cloud providers to host and manage mission critical data; and thus avoid cloud lock-in.

Read More

Infrastructure Automation - Deploying ClusterControl and MySQL-based systems on AWS using Ansible

This blog post has the latest updates to our ClusterControl Ansible Role. It now supports automatic deployment of MySQL-based systems (MySQL Replication, Galera Cluster, NDB Cluster).

Read More

Leveraging AWS tools to speed up management of Galera Cluster on Amazon Cloud

We previously covered basic tuning and configuration best practices for MySQL Galera Cluster on AWS. In this blog post, we'll go over some AWS features/tools that you may find useful when managing Galera on Amazon Cloud. This won't be a detailed how-to guide, as each tool described below would warrant its own blog post. But this should be a good overview of how you can use the AWS tools at your disposal.

Read More

5 Performance tips for running Galera Cluster for MySQL or MariaDB on AWS Cloud

Amazon Web Services is one of the most popular cloud environments. Galera Cluster is one of the most popular MySQL clustering solutions. This is exactly why you’ll see many Galera clusters running on EC2 instances. In this blog post, we’ll go over five performance tips that you need to take under consideration while deploying and running Galera Cluster on EC2.

Read More

How to change AWS instance sizes for your Galera Cluster and optimize performance

Running your database cluster on AWS is a great way to adapt to changing workloads by adding/removing instances, or by scaling up/down each instance. At Severalnines, we talk much more about scale-out than scale up, but there are cases where you might want to scale up an instance instead of scaling out.

Read More

by Severalnines at September 12, 2017 09:59 AM

Shlomi Noach

orchestrator 3.0.2 GA released: raft consensus, SQLite

orchestrator 3.0.2 GA is released and available for download (see also packagecloud repository).

3.0.2 is the first stable release in the 3.0* series, introducing (recap from 3.0 pre-release announcement):

orchestrator/raft

Raft is a consensus protocol, supporting leader election and consensus across a distributed system.  In an orchestrator/raft setup orchestrator nodes talk to each other via raft protocol, form consensus and elect a leader. Each orchestrator node has its own dedicated backend database. The backend databases do not speak to each other; only the orchestrator nodes speak to each other.

No MySQL replication setup is needed; the backend DBs act as standalone servers. In fact, the backend server doesn't have to be MySQL: SQLite is supported. orchestrator now ships with SQLite embedded, so no external dependency is needed.

For details, please refer to the documentation:

SQLite

Suggested and requested by many, is to remove orchestrator's own dependency on a MySQL backend. orchestrator now supports a SQLite backend.

SQLite is a transactional, relational, embedded database, and as of 3.0 it is embedded within orchestrator, no external dependency required.

orchestrator-client

orchestrator-client is a client shell script that mimics the command line interface, while running curl | jq requests against the HTTP API. It simplifies your deployments: interacting with the orchestrator service via orchestrator-client only requires you to place a shell script (as opposed to installing the orchestrator binary plus a configuration file).

orchestrator-client is the way to interact with your orchestrator/raft cluster. orchestrator-client now has its own RPM/deb release package.

You may still use the web interface and web API; and a special --ignore-raft-setup option keeps power in your hands (use at your own risk).

State of orchestrator/raft

orchestrator/raft is a big change:

  • In the way it is deployed
  • In the way it is operated
  • In the high availability it provides
  • and more

This is why it has been tested in production for a few months.

orchestrator/raft now runs in production at GitHub; we've decommissioned the "old" orchestrator setup after having run both in parallel for a while. It drives our failovers and is deployed across three data centers.

We are using MySQL as backend to our orchestrator cluster. We will introduce more staging tests for SQLite-based setups.

Roadmap

There's much to do, and we chose to release a version that has a way to go. We expect:

  • Dynamic raft cluster join/leave operations (right now the cluster is static and configuration based)
  • New nodes joining the cluster to auto-populate data from the cluster. This is actually a built-in feature to the hashicorp/raft library that we use, however we intentionally did not use this functionality, and expect to re-introduce it. At this time, adding a newly provisioned node to the cluster requires a backup/restore or dump/load of DB data from an existing node.
  • Partitioning of probe tasks across nodes
  • Various thoughts on proxy integrations
  • More...

We will focus on making operations simpler, and of course keep stability and reliability at highest priority.

orchestrator tutorial

In two weeks time I will be presenting the practical orchestrator tutorial, a 3 hour practical walkthrough on deployment, configuration, reasoning and more.

by shlomi at September 12, 2017 08:39 AM

September 11, 2017

Peter Zaitsev

Updating InnoDB Table Statistics Manually

InnoDB Tables

In this post, we will discuss how to fix cardinality for InnoDB tables manually.

As a support engineer, I often see situations when the cardinality of a table is not correct. When InnoDB calculates the cardinality of an index, it does not scan the full table by default. Instead it looks at random pages, as determined by the options innodb_stats_sample_pages, innodb_stats_transient_sample_pages and innodb_stats_persistent_sample_pages, or by the CREATE TABLE option STATS_SAMPLE_PAGES. The default value for persistent statistics is 20 pages. This approach works fine when the number of unique values in your secondary key grows in step with the size of the table. But what if you have a column with a comparatively small number of unique values? This could be a many-to-many relationship table for a common service, or just a table containing a list of sell orders that belong to one of a dozen shops owned by the company. Such tables can grow to billions of rows with a small (fewer than 100) number of unique shop IDs.
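
To see what sampling your server currently uses, you can inspect the relevant variables; a quick illustration (variable names as in MySQL/Percona Server 5.7):

-- The three sampling options mentioned above:
mysql> show global variables like 'innodb_stats%sample_pages';
-- Persistent statistics should be enabled for the technique discussed in this post:
mysql> show global variables like 'innodb_stats_persistent';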

At some point, InnoDB will report wrong values for such indexes. Really! If 20 sampled pages contain 100 unique shop IDs, how many unique shop IDs would 20,000 pages contain? 100 times 1,000? The extrapolation seems logical, and after a certain number of rows such indexes end up with extraordinarily large cardinality values.

ANALYZE TABLE will not help, because it uses the same algorithm. Increasing the number of "stats" sample pages would help, but it has its own downside: the more pages you have to examine, the slower ANALYZE TABLE runs. While this command is not blocking, it still creates side effects, as described in this blog post. And the longer it runs, the less control you have.

Another issue with InnoDB statistics: even if they are persistent and STATS_AUTO_RECALC is set to 0, InnoDB still updates the values for secondary indexes, as shown in lp:1538765. Eventually, after you insert millions of rows, your statistics get corrupted. ANALYZE TABLE can fix it only if you specify a very large number of "stats" sample pages.

Can we do anything about it?

InnoDB stores statistics in the "mysql" database, in the tables innodb_table_stats and innodb_index_stats. Since they are regular MySQL tables, privileged users can access them. We can update them and modify statistics as we like. And these statistics are used by the Optimizer!
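
Access to these two tables is governed by ordinary MySQL privileges. As a hedged sketch (the account name and password are purely illustrative), a dedicated user could be limited to reading and adjusting just the statistics tables:

-- Hypothetical account that may read and modify only the persistent statistics tables:
mysql> create user 'stats_admin'@'localhost' identified by 'choose_a_strong_password';
mysql> grant select, update on mysql.innodb_table_stats to 'stats_admin'@'localhost';
mysql> grant select, update on mysql.innodb_index_stats to 'stats_admin'@'localhost';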

I created a small example showing how to do this trick. I used Percona Server for MySQL version 5.7.19, but the trick will work on any supported MySQL and Percona Server for MySQL version.

First, let's create the test tables. The first table holds shop profiles, with the shop ID and name:

create table shops(
  shop_id int not null auto_increment primary key,
  name varchar(32)
) engine=innodb;

The second table refers to the “shops” table:

create table goods(
  id int not null auto_increment primary key,
  shop_id int not null,
  name varchar(32),
  create_date datetime DEFAULT NULL,
  key (shop_id, create_date)
) engine=innodb;

Let’s check how many unique shops we have:

mysql> select count(distinct shop_id) from shops;
+-------------------------+
| count(distinct shop_id) |
+-------------------------+
| 100                     |
+-------------------------+
1 row in set (0.02 sec)

With 100 distinct shops, and a key on (shop_id, create_date), we expect the cardinality in table goods to be not much different from this query result:

mysql> select count(distinct id) as `Cardinality for PRIMARY`,
    -> count(distinct shop_id) as `Cardinality for shop_id column in index shop_id`,
    -> count(distinct shop_id, create_date) as `Cardinality for create_date column in index shop_id`
    -> from goods\G
*************************** 1. row ***************************
Cardinality for PRIMARY: 8000000
Cardinality for shop_id column in index shop_id: 100
Cardinality for create_date column in index shop_id: 169861
1 row in set (2 min 8.74 sec)

However, SHOW INDEX returns dramatically different values for the shop_id column:

mysql> show index from goods;
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| goods |          0 |  PRIMARY |            1 |          id |         A |     7289724 |     NULL |   NULL |      |      BTREE |         |               |
| goods |          1 |  shop_id |            1 |     shop_id |         A |       13587 |     NULL |   NULL |      |      BTREE |         |               |
| goods |          1 |  shop_id |            2 | create_date |         A |      178787 |     NULL |   NULL |  YES |      BTREE |         |               |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
3 rows in set (0.09 sec)

ANALYZE TABLE does not help:

mysql> analyze table goods;
+------------+---------+----------+----------+
|      Table |      Op | Msg_type | Msg_text |
+------------+---------+----------+----------+
| test.goods | analyze |   status |       OK |
+------------+---------+----------+----------+
1 row in set (0.88 sec)
mysql> show index from goods;
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| goods |          0 |  PRIMARY |            1 |          id |         A |     7765796 |     NULL |   NULL |      |      BTREE |         |               |
| goods |          1 |  shop_id |            1 |     shop_id |         A |       14523 |     NULL |   NULL |      |      BTREE |         |               |
| goods |          1 |  shop_id |            2 | create_date |         A |      168168 |     NULL |   NULL |  YES |      BTREE |         |               |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
3 rows in set (0.00 sec)

As a result, if we join the two tables, the Optimizer chooses the wrong JOIN order and query execution plan:

mysql> explain select goods.* from goods join shops using(shop_id) where create_date BETWEEN CONVERT_TZ('2015-11-01 00:00:00', 'MET','GMT') AND CONVERT_TZ('2015-11-07 23:59:59', 'MET','GMT') and goods.shop_id in(4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,401,402,403,404,405,406,407,408,409,410,411,412,413,414,415,417,418,419,420,421,422,423,424,425,426,427,428,429,430,431,432,433,434,435,436,437,438,439,440,441,442,443,444,445,446,447,448,449,450,451,452,453,454,455,456,457,458,459,460,461,462,463,464,465,466,467,468,470,471,472,473,474,475,476,477,478,479,480,481,482,483,484,485,486);
+----+-------------+-------+------------+-------+---------------+---------+---------+--------------------+------+----------+--------------------------+
| id | select_type | table | partitions |  type | possible_keys |     key | key_len |                ref | rows | filtered |                    Extra |
+----+-------------+-------+------------+-------+---------------+---------+---------+--------------------+------+----------+--------------------------+
|  1 |      SIMPLE | shops |       NULL | index |       PRIMARY | PRIMARY |       4 |               NULL |  100 |   100.00 | Using where; Using index |
|  1 |      SIMPLE | goods |       NULL |   ref |       shop_id | shop_id |       4 | test.shops.shop_id |  534 |    11.11 |    Using index condition |
+----+-------------+-------+------------+-------+---------------+---------+---------+--------------------+------+----------+--------------------------+
2 rows in set, 1 warning (0.13 sec)
mysql> \P md5sum
PAGER set to 'md5sum'
mysql> select goods.* from goods join shops using(shop_id) where create_date BETWEEN CONVERT_TZ('2015-11-01 00:00:00', 'MET','GMT') AND CONVERT_TZ('2015-11-07 23:59:59', 'MET','GMT') and goods.shop_id in(4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,401,402,403,404,405,406,407,408,409,410,411,412,413,414,415,417,418,419,420,421,422,423,424,425,426,427,428,429,430,431,432,433,434,435,436,437,438,439,440,441,442,443,444,445,446,447,448,449,450,451,452,453,454,455,456,457,458,459,460,461,462,463,464,465,466,467,468,470,471,472,473,474,475,476,477,478,479,480,481,482,483,484,485,486);
4a94dabc4bfbfb7dd225bcb50278055b -
31896 rows in set (43.32 sec)

Compare this to the STRAIGHT_JOIN order:

mysql> explain select goods.* from goods straight_join shops on(goods.shop_id = shops.shop_id) where create_date BETWEEN CONVERT_TZ('2015-11-01 00:00:00', 'MET','GMT') AND CONVERT_TZ('2015-11-07 23:59:59', 'MET','GMT') and goods.shop_id in(4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,401,402,403,404,405,406,407,408,409,410,411,412,413,414,415,417,418,419,420,421,422,423,424,425,426,427,428,429,430,431,432,433,434,435,436,437,438,439,440,441,442,443,444,445,446,447,448,449,450,451,452,453,454,455,456,457,458,459,460,461,462,463,464,465,466,467,468,470,471,472,473,474,475,476,477,478,479,480,481,482,483,484,485,486);
+----+-------------+-------+------------+--------+---------------+---------+---------+--------------------+-------+----------+-----------------------+
| id | select_type | table | partitions |   type | possible_keys |     key | key_len |                ref |  rows | filtered |                 Extra |
+----+-------------+-------+------------+--------+---------------+---------+---------+--------------------+-------+----------+-----------------------+
|  1 |      SIMPLE | goods |       NULL |  range |       shop_id | shop_id |      10 |               NULL | 31997 |   100.00 | Using index condition |
|  1 |      SIMPLE | shops |       NULL | eq_ref |       PRIMARY | PRIMARY |       4 | test.goods.shop_id |     1 |   100.00 |           Using index |
+----+-------------+-------+------------+--------+---------------+---------+---------+--------------------+-------+----------+-----------------------+
2 rows in set, 1 warning (0.14 sec)
mysql> \P md5sum
PAGER set to 'md5sum'
mysql> select goods.* from goods straight_join shops on(goods.shop_id = shops.shop_id) where create_date BETWEEN CONVERT_TZ('2015-11-01 00:00:00', 'MET','GMT') AND CONVERT_TZ('2015-11-07 23:59:59', 'MET','GMT') and goods.shop_id in(4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,401,402,403,404,405,406,407,408,409,410,411,412,413,414,415,417,418,419,420,421,422,423,424,425,426,427,428,429,430,431,432,433,434,435,436,437,438,439,440,441,442,443,444,445,446,447,448,449,450,451,452,453,454,455,456,457,458,459,460,461,462,463,464,465,466,467,468,470,471,472,473,474,475,476,477,478,479,480,481,482,483,484,485,486);
4a94dabc4bfbfb7dd225bcb50278055b -
31896 rows in set (7.94 sec)

The difference in execution time for a small 8M-row table is around six times! For a big table with many columns, it would be even larger.

Is STRAIGHT_JOIN the only solution for this case?

No! And it's not a great solution anyway: if the query is complicated and involves more than two tables, it may be affected by bug fixes and improvements in the Optimizer code, and the forced join order might no longer be optimal for new versions and updates. Therefore, you'll need to re-test such queries at each upgrade, including minor ones.

So why does ANALYZE TABLE not work? Because the default number of pages it uses to calculate statistics is too small to capture the difference. You can increase the table option STATS_SAMPLE_PAGES until you find a value that works. The drawback is that the greater you set STATS_SAMPLE_PAGES, the longer ANALYZE TABLE takes to finish. Also, if you update a large portion of the table, you are often affected by lp:1538765, and at some point the statistics will again be inaccurate.
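
For reference, raising the per-table sample size looks like this; the value 2048 is only an example, so pick one that balances accuracy against how long you can afford ANALYZE TABLE to run:

-- Sample more pages for this table only, then rebuild its persistent statistics:
mysql> alter table goods stats_sample_pages=2048;
mysql> analyze table goods;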

Now let’s try our manual statistics update trick

InnoDB stores its persistent statistics in the tables mysql.innodb_table_stats and mysql.innodb_index_stats:

mysql> alter table goods stats_persistent=1, stats_auto_recalc=0;
Query OK, 0 rows affected (0.11 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> select * from mysql.innodb_table_stats where table_name='goods';
+---------------+------------+---------------------+---------+----------------------+--------------------------+
| database_name | table_name |         last_update |  n_rows | clustered_index_size | sum_of_other_index_sizes |
+---------------+------------+---------------------+---------+----------------------+--------------------------+
|          test |      goods | 2017-09-05 00:21:12 | 7765796 |                34624 |                    17600 |
+---------------+------------+---------------------+---------+----------------------+--------------------------+
1 row in set (0.00 sec)
mysql> select * from mysql.innodb_index_stats where table_name='goods';
+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+
| database_name | table_name | index_name |         last_update |    stat_name | stat_value | sample_size |                  stat_description |
+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+
|          test |      goods |    PRIMARY | 2017-09-05 00:21:12 | n_diff_pfx01 |    7765796 |          20 |                                id |
|          test |      goods |    PRIMARY | 2017-09-05 00:21:12 | n_leaf_pages |      34484 |        NULL | Number of leaf pages in the index |
|          test |      goods |    PRIMARY | 2017-09-05 00:21:12 |         size |      34624 |        NULL |      Number of pages in the index |
|          test |      goods |    shop_id | 2017-09-05 00:21:12 | n_diff_pfx01 |      14523 |          20 |                           shop_id |
|          test |      goods |    shop_id | 2017-09-05 00:21:12 | n_diff_pfx02 |     168168 |          20 |               shop_id,create_date |
|          test |      goods |    shop_id | 2017-09-05 00:21:12 | n_diff_pfx03 |    8045310 |          20 |            shop_id,create_date,id |
|          test |      goods |    shop_id | 2017-09-05 00:21:12 | n_leaf_pages |      15288 |        NULL | Number of leaf pages in the index |
|          test |      goods |    shop_id | 2017-09-05 00:21:12 |         size |      17600 |        NULL |      Number of pages in the index |
+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+
8 rows in set (0.00 sec)

And we can update these tables directly:

mysql> update mysql.innodb_table_stats set n_rows=8000000 where table_name='goods';
Query OK, 1 row affected (0.18 sec)
Rows matched: 1 Changed: 1 Warnings: 0
mysql> update mysql.innodb_index_stats set stat_value=8000000 where stat_description in('id', 'shop_id,create_date,id') and table_name='goods';
Query OK, 2 rows affected (0.08 sec)
Rows matched: 2 Changed: 2 Warnings: 0
mysql> update mysql.innodb_index_stats set stat_value=100 where stat_description in('shop_id') and table_name='goods';
Query OK, 1 row affected (0.09 sec)
Rows matched: 1 Changed: 1 Warnings: 0
mysql> update mysql.innodb_index_stats set stat_value=169861 where stat_description in('shop_id,create_date') and table_name='goods';
Query OK, 1 row affected (0.08 sec)
Rows matched: 1 Changed: 1 Warnings: 0

I took index values from earlier, as calculated by this query:

select count(distinct id) as `Cardinality for PRIMARY`, count(distinct shop_id) as `Cardinality for shop_id column in index shop_id`, count(distinct shop_id, create_date) as `Cardinality for create_date column in index shop_id` from goods;

mysql> select * from mysql.innodb_table_stats where table_name='goods';
+---------------+------------+---------------------+---------+----------------------+--------------------------+
| database_name | table_name |         last_update |  n_rows | clustered_index_size | sum_of_other_index_sizes |
+---------------+------------+---------------------+---------+----------------------+--------------------------+
|          test |      goods | 2017-09-05 00:47:45 | 8000000 |                34624 |                    17600 |
+---------------+------------+---------------------+---------+----------------------+--------------------------+
1 row in set (0.00 sec)
mysql> select * from mysql.innodb_index_stats where table_name='goods';
+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+
| database_name | table_name | index_name |         last_update |    stat_name | stat_value | sample_size |                  stat_description |
+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+
|          test |      goods |    PRIMARY | 2017-09-05 00:48:32 | n_diff_pfx01 |    8000000 |          20 |                                id |
|          test |      goods |    PRIMARY | 2017-09-05 00:21:12 | n_leaf_pages |      34484 |        NULL | Number of leaf pages in the index |
|          test |      goods |    PRIMARY | 2017-09-05 00:21:12 |         size |      34624 |        NULL |      Number of pages in the index |
|          test |      goods |    shop_id | 2017-09-05 00:49:13 | n_diff_pfx01 |        100 |          20 |                           shop_id |
|          test |      goods |    shop_id | 2017-09-05 00:49:26 | n_diff_pfx02 |     169861 |          20 |               shop_id,create_date |
|          test |      goods |    shop_id | 2017-09-05 00:48:32 | n_diff_pfx03 |    8000000 |          20 |            shop_id,create_date,id |
|          test |      goods |    shop_id | 2017-09-05 00:21:12 | n_leaf_pages |      15288 |        NULL | Number of leaf pages in the index |
|          test |      goods |    shop_id | 2017-09-05 00:21:12 |         size |      17600 |        NULL |      Number of pages in the index |
+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+
8 rows in set (0.00 sec)

Now the statistics are up to date, but not yet used by the Optimizer:

mysql> explain select goods.* from goods join shops using(shop_id) where create_date BETWEEN CONVERT_TZ('2015-11-01 00:00:00', 'MET','GMT') AND CONVERT_TZ('2015-11-07 23:59:59', 'MET','GMT') and goods.shop_id in(4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,401,402,403,404,405,406,407,408,409,410,411,412,413,414,415,417,418,419,420,421,422,423,424,425,426,427,428,429,430,431,432,433,434,435,436,437,438,439,440,441,442,443,444,445,446,447,448,449,450,451,452,453,454,455,456,457,458,459,460,461,462,463,464,465,466,467,468,470,471,472,473,474,475,476,477,478,479,480,481,482,483,484,485,486);
+----+-------------+-------+------------+-------+---------------+---------+---------+--------------------+------+----------+--------------------------+
| id | select_type | table | partitions |  type | possible_keys |     key | key_len |                ref | rows | filtered |                    Extra |
+----+-------------+-------+------------+-------+---------------+---------+---------+--------------------+------+----------+--------------------------+
|  1 |      SIMPLE | shops |       NULL | index |       PRIMARY | PRIMARY |       4 |               NULL |  100 |   100.00 | Using where; Using index |
|  1 |      SIMPLE | goods |       NULL |   ref |       shop_id | shop_id |       4 | test.shops.shop_id |  534 |    11.11 |    Using index condition |
+----+-------------+-------+------------+-------+---------------+---------+---------+--------------------+------+----------+--------------------------+
2 rows in set, 1 warning (0.04 sec)

To finalize the changes, we need to run FLUSH TABLE goods:

mysql> FLUSH TABLE goods;
Query OK, 0 rows affected (0.00 sec)
mysql> explain select goods.* from goods join shops using(shop_id) where create_date BETWEEN CONVERT_TZ('2015-11-01 00:00:00', 'MET','GMT') AND CONVERT_TZ('2015-11-07 23:59:59', 'MET','GMT') and goods.shop_id in(4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,401,402,403,404,405,406,407,408,409,410,411,412,413,414,415,417,418,419,420,421,422,423,424,425,426,427,428,429,430,431,432,433,434,435,436,437,438,439,440,441,442,443,444,445,446,447,448,449,450,451,452,453,454,455,456,457,458,459,460,461,462,463,464,465,466,467,468,470,471,472,473,474,475,476,477,478,479,480,481,482,483,484,485,486);
+----+-------------+-------+------------+--------+---------------+---------+---------+--------------------+-------+----------+-----------------------+
| id | select_type | table | partitions |   type | possible_keys |     key | key_len |                ref |  rows | filtered |                 Extra |
+----+-------------+-------+------------+--------+---------------+---------+---------+--------------------+-------+----------+-----------------------+
|  1 |      SIMPLE | goods |       NULL |  range |       shop_id | shop_id |      10 |               NULL | 31997 |   100.00 | Using index condition |
|  1 |      SIMPLE | shops |       NULL | eq_ref |       PRIMARY | PRIMARY |       4 | test.goods.shop_id |     1 |   100.00 |           Using index |
+----+-------------+-------+------------+--------+---------------+---------+---------+--------------------+-------+----------+-----------------------+
2 rows in set, 1 warning (0.28 sec)
mysql> \P md5sum
PAGER set to 'md5sum'
mysql> select goods.* from goods join shops using(shop_id) where create_date BETWEEN CONVERT_TZ('2015-11-01 00:00:00', 'MET','GMT') AND CONVERT_TZ('2015-11-07 23:59:59', 'MET','GMT') and goods.shop_id in(4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,401,402,403,404,405,406,407,408,409,410,411,412,413,414,415,417,418,419,420,421,422,423,424,425,426,427,428,429,430,431,432,433,434,435,436,437,438,439,440,441,442,443,444,445,446,447,448,449,450,451,452,453,454,455,456,457,458,459,460,461,462,463,464,465,466,467,468,470,471,472,473,474,475,476,477,478,479,480,481,482,483,484,485,486);
4a94dabc4bfbfb7dd225bcb50278055b -
31896 rows in set (7.79 sec)

Now everything is good.

But FLUSH TABLE is a blocking operation, right? Won't it block queries and create a scenario even worse than the one described for ANALYZE TABLE in this post?

At first glance this is true. But we can use the same trick Percona Toolkit uses: set lock_wait_timeout to 1 and call FLUSH in a loop. To demonstrate how it works, I use a scenario similar to the one described in the ANALYZE TABLE blog post.

First, let's reset the statistics to ensure our FLUSH works as expected:

mysql> analyze table goods;
+------------+---------+----------+----------+
|      Table |      Op | Msg_type | Msg_text |
+------------+---------+----------+----------+
| test.goods | analyze |   status |       OK |
+------------+---------+----------+----------+
1 row in set (0.38 sec)
mysql> show indexes from goods;
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| goods |          0 |  PRIMARY |            1 |          id |         A |     7765796 |     NULL |   NULL |      |      BTREE |         |               |
| goods |          1 |  shop_id |            1 |     shop_id |         A |       14523 |     NULL |   NULL |      |      BTREE |         |               |
| goods |          1 |  shop_id |            2 | create_date |         A |      168168 |     NULL |   NULL |  YES |      BTREE |         |               |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
3 rows in set (0.00 sec)

Then we update the mysql.innodb_*_stats tables manually and check that the Optimizer still sees the outdated statistics:

mysql> update mysql.innodb_table_stats set n_rows=8000000 where table_name='goods';
Query OK, 1 row affected (0.09 sec)
Rows matched: 1 Changed: 1 Warnings: 0
mysql> update mysql.innodb_index_stats set stat_value=8000000 where stat_description in('id', 'shop_id,create_date,id') and table_name='goods';
Query OK, 2 rows affected (0.09 sec)
Rows matched: 2 Changed: 2 Warnings: 0
mysql> update mysql.innodb_index_stats set stat_value=100 where stat_description in('shop_id') and table_name='goods';
Query OK, 1 row affected (0.11 sec)
Rows matched: 1 Changed: 1 Warnings: 0
mysql> update mysql.innodb_index_stats set stat_value=169861 where stat_description in('shop_id,create_date') and table_name='goods';
Query OK, 1 row affected (0.10 sec)
Rows matched: 1 Changed: 1 Warnings: 0
mysql> show indexes from goods;
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| goods |          0 |  PRIMARY |            1 |          id |         A |     7765796 |     NULL |   NULL |      |      BTREE |         |               |
| goods |          1 |  shop_id |            1 |     shop_id |         A |       14523 |     NULL |   NULL |      |      BTREE |         |               |
| goods |          1 |  shop_id |            2 | create_date |         A |      168168 |     NULL |   NULL |  YES |      BTREE |         |               |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
3 rows in set (0.00 sec)

Now let's start a long-running query in one session that blocks our FLUSH TABLE command:

mysql> select sleep(1) from goods limit 1000, 300;

And let's run FLUSH TABLE in a loop:

sveta@Thinkie:~/build/ps-5.7/mysql-test$ until (`mysqlmtr -P13001 -e "set lock_wait_timeout=1; flush table goods;" test`); do sleep 1; done
ERROR 1205 (HY000) at line 1: Lock wait timeout exceeded; try restarting transaction
ERROR 1205 (HY000) at line 1: Lock wait timeout exceeded; try restarting transaction
ERROR 1205 (HY000) at line 1: Lock wait timeout exceeded; try restarting transaction
...

Now let’s ensure we can access the table:

mysql> select * from goods order by id limit 10;
^C

We cannot! We cannot even connect to the database where the table is stored:

sveta@Thinkie:~/build/ps-5.7/mysql-test$ mysqlmtr -P13001 test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
^C

The reason is that while the FLUSH TABLE command was killed due to the metadata lock wait timeout, it had already requested a table lock for flushing, which blocked other incoming queries.

But we can enclose FLUSH TABLE in LOCK TABLE ... WRITE; ... UNLOCK TABLES; operations. In this case, the LOCK TABLE command is blocked until all queries release their metadata locks on the table. Then it locks the table exclusively, FLUSH TABLE runs, and the script immediately unlocks the table. Since closing the session causes an implicit unlock, I used a PHP one-liner to keep everything in a single session:

$ php -r '
> $link = new mysqli("127.0.0.1", "root", "", "test", 13001);
> $link->query("set lock_wait_timeout=1");
> while(!$link->query("lock table goods write")) {sleep(1);}
> $link->query("flush table goods");
> $link->query("unlock tables");'

We can confirm that a parallel session can still access the table:

mysql> select * from goods order by id limit 10;
+----+---------+----------------------------------+---------------------+
| id | shop_id |                             name |         create_date |
+----+---------+----------------------------------+---------------------+
|  1 |      58 | 5K0z2sHTgjWKKdryTaniQdZmjGjA9wls | 2015-09-19 00:00:00 |
|  2 |      17 | xNll02kgUTWAFURj6j5lL1zXAubG0THG | 2013-10-19 00:00:00 |
|  3 |      30 | clHX7uQopKmoTtEFH5LYBgQncsxRtTIB | 2017-08-01 00:00:00 |
|  4 |      93 | bAzoQTN98AmFjPOZs7PGfbiGfaf9Ye4b | 2013-02-24 00:00:00 |
|  5 |      20 | rQuTO5GHjP60kDbN6WoPpE2S8TtMbrVL | 2017-08-05 00:00:00 |
|  6 |      37 | WxqxA5tBHxikaKbuvbIF84H9QuaCnqQ3 | 2013-10-18 00:00:00 |
|  7 |      13 | DoYnFpQZSVV8UswBsWklgGBUc8zW9mVW | 2017-02-06 00:00:00 |
|  8 |      81 | dkNxMQyZNZuTrONEX4gxRLa0DOedatIs | 2015-07-05 00:00:00 |
|  9 |      12 | Z0t2uQ9itexpPf01KUpa7qBWlT5fBmXR | 2014-06-25 00:00:00 |
| 10 |      90 | 6urABBQyaUVVyxljvd11D3kUxbdDRPRV | 2013-10-23 00:00:00 |
+----+---------+----------------------------------+---------------------+
10 rows in set (0.00 sec)
mysql> update goods set name='test' where id=100;
Query OK, 1 row affected (0.08 sec)
Rows matched: 1 Changed: 1 Warnings: 0

After the PHP script finishes its job, statistics are corrected:

mysql> show index from goods;
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| goods |          0 |  PRIMARY |            1 |          id |         A |     8000000 |     NULL |   NULL |      |      BTREE |         |               |
| goods |          1 |  shop_id |            1 |     shop_id |         A |         100 |     NULL |   NULL |      |      BTREE |         |               |
| goods |          1 |  shop_id |            2 | create_date |         A |      169861 |     NULL |   NULL |  YES |      BTREE |         |               |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
3 rows in set (0.00 sec)

Conclusion

We can manually update persistent InnoDB statistics to fix Optimizer plans for our queries, with almost no impact on a live server.

by Sveta Smirnova at September 11, 2017 07:00 PM

Upcoming Webinar Tuesday September 12: Differences between MariaDB® and MySQL®

MariaDB and MySQL

Join Percona's Chief Evangelist, Colin Charles (@bytebot), as he presents Differences Between MariaDB and MySQL on Tuesday, September 12, 2017, at 7:00 am PDT / 10:00 am EDT (UTC-7).

Are they syntactically similar? Where do these two query languages differ? Why would I use one over the other?

MariaDB is on the path to gradually diverge from MySQL. One obvious example is the internal data dictionary currently under development for MySQL 8. This is a major change to the way metadata is stored and used within the server, and MariaDB doesn’t have an equivalent feature. Implementing this feature could mark the end of datafile-level compatibility between MySQL and MariaDB.

There are also non-technical differences between MySQL and MariaDB, including:

  • Licensing: MySQL offers their code as open-source under the GPL, and provides the option of non-GPL commercial distribution in the form of MySQL Enterprise. MariaDB can only use the GPL because their work is derived from the MySQL source code under the terms of that license.
  • Support services: Oracle provides technical support, training, certification and consulting for MySQL, while MariaDB has their own support services. Some people will prefer working with smaller companies, as traditionally it affords them more leverage as a customer.
  • Community contributions: MariaDB touts the fact that they accept more community contributions than Oracle. Part of the reason for this disparity is that developers like to contribute features, bug fixes and other code without a lot of paperwork overhead (and they complain about the Oracle Contributor Agreement). However, MariaDB has its own MariaDB Contributor Agreement — which more or less serves the same purpose.

Colin will take a look at some of the differences between MariaDB and MySQL and help answer some of the common questions our Database Performance Experts get about the two databases.

You can register for the webinar here.

Colin Charles, Percona Chief Evangelist

Colin Charles is the Chief Evangelist at Percona. He was previously on the founding team of MariaDB Server in 2009, worked at MySQL from 2005, and has been a MySQL user since 2000. Before joining MySQL, he worked actively on the Fedora and OpenOffice.org projects. He's well known within open source communities in APAC and has spoken at many conferences.

by Emily Ikuta at September 11, 2017 03:28 PM

September 10, 2017

Henrik Ingo

New impress.js features 2017

It's been almost a year since my last update on features I've added to the impress.js presentation framework. Already a year ago I had pretty much merged or implemented all of the project's open backlog. But at the end of that post I still listed two popular requests I hadn't implemented: support for sub-steps and 2D navigation.

read more

by hingo at September 10, 2017 05:58 PM

September 09, 2017

Shlomi Noach

Remembering Jaakko Pesonen

I was sorrowed to hear that Jaakko Pesonen has passed away after battling cancer.

I first met Jaakko a few years back, during a Percona Live conference, and as community goes, our paths crossed again a few times. He spoke at and attended conferences where we'd have casual chats.

We were both expats in the Netherlands for a period. When I moved there from Israel, he was already working at Spil Games, having relocated from Finland, his home country. We shared expat experiences and longings for our homes. One day he pinged me that he was planning a trip to Israel - and the next few days were all about planning the best culinary experience for his travels (he approved of the results).

He was happy for the opportunity to work for Percona, as this allowed him to move back home to Finland.

Jaakko had the biggest, widest, most consuming smile, and this smile will surely be the most vivid memory of him that I'll keep.

I do not have personal pictures of Jaakko. This picture was taken by Julian Cash at Percona Live. A rare non-smiling appearance.

by shlomi at September 09, 2017 05:00 AM

September 08, 2017

Peter Zaitsev

Percona Live Europe Featured Talks: MongoDB – Sharded Cluster Tutorial with Jason Terpko and Antonios Giannopoulos

Percona Live Europe 2017

Welcome to another post in our series of interview blogs for the upcoming Percona Live Europe 2017 in Dublin. This series highlights a number of talks that will be at the conference and gives a short preview of what attendees can expect to learn from the presenter.

This blog post is with Jason Terpko and Antonios Giannopoulos, DBAs at ObjectRocket. Their tutorial is MongoDB – Sharded Cluster Tutorial. This tutorial guides you through the many considerations when deploying a sharded cluster. The talk covers the services that make up a sharded cluster, configuration recommendations for these services, shard key selection, use cases and how to manage data within a sharded cluster. In our conversation, we discussed how using a sharded cluster can benefit your database environment:

Percona: How did you get into database technology? What do you love about it?

Jason: Nowadays, a DBA must also be part sysadmin and part developer (and always awesome). Being a DBA gives me the opportunity to deal with the entire stack. I never get bored.

Antonios: I agree with Jason, and I have to add that today there are probably more databases than programming languages. Designing an application often involves using more than one database technology. The challenge is choosing the right ones each time. Honestly, how can you get bored with that?

Percona: You’re presenting a session called “MongoDB – Sharded Cluster Tutorial.” What is a MongoDB sharded cluster, and how is it useful in databases?

Jason: Scaling is one of the biggest challenges in the database world. A database can scale either vertically or horizontally. Vertical scaling means increasing the resources to deal with database underperformance: doubling the RAM, or increasing speed through faster CPUs. Unfortunately, doubling capacity doesn't necessarily mean doubling the performance; there is a breaking point where adding resources no longer affects performance. With horizontal scaling, we distribute the database workload among multiple servers, and each instance serves only a portion of the workload. If the database is underperforming, we simply add more servers. It's faster and cheaper compared to vertical scaling, and at the same time the capacity-to-performance ratio is much higher with horizontal scaling. Sharding is MongoDB's horizontal scaling mechanism.

Percona: How can a sharded cluster affect MongoDB performance (both negatively and positively)?

Antonios: Sharded clusters can have an immediate positive impact on application performance when the collection has been pre-distributed with a hashed shard key or manual splitting. These approaches allow your application to make use of all shards and resources from the start. For some newly deployed applications, this throughput is a requirement for a successful release.

The distribution of data, with its positive impact on performance, can also have a negative effect. Even with an evenly distributed collection, hot spotting can occur. This causes degradation for both targeted and broadcast operations. Also, write operations cause overhead when you need to move or balance this distributed data. This overhead can impact some applications when added to their workload at specific times.

Percona: What are some of the things you need to watch out for when deploying a sharded cluster?

Jason: First of all, there is the shard key selection. Choosing the right shard key makes your application rock (and you might earn the employee of the month award). Selecting a poor shard key may have a catastrophic effect on your business (and get you a much different type of company notice).

Secondly, after sharding a collection, any existing applications must continue to work error-free. Familiarizing yourself with shard key limitations and with what operations may not work on a sharded collection is very important. Doing the research beforehand will prevent issues later.

Thirdly, you need to monitor the resources you have deployed your sharded cluster on. Whether it is physical, virtualized or containerized, all components should have a similar performance profile and reliable communication. For broadcast operations, your operation is only as fast as the slowest shard. And if internal traffic is not reliable, your cluster can be prone to issues.

Percona: What do you want attendees to take away from your session? Why should they attend?

Antonios: Our attendees will feel like Hamlet: "To shard or not to shard?" At the end of the session, every attendee will be able to set up, maintain and troubleshoot a sharded cluster. Additionally, they will get their hands dirty in our labs. Don't get me wrong, our slides are great! But sharded cluster mastery requires practice. Finally, we encourage discussion during the tutorial, so please come, raise your hand and ask us sharding-related questions. We would love to learn about your use cases and help you in any way.

Percona: What are you most looking forward to at Percona Live Europe 2017?

Both: The free drinks of course! But seriously, Percona Live Europe is how community events should be. It's our fourth time attending, and every single time we meet people who share our passion for databases. They are open to conversation, and everywhere we discover new ways to solve complex problems, new technologies to look at and innovative ideas to try out. The Percona Live conferences are getting better and better every year, and we are 100% sure that Percona Live Europe 2017 will be a success.

Want to find out more about Jason, Antonios and sharded clusters? Register for Percona Live Europe 2017, and see their talk MongoDB – Sharded Cluster Tutorial. Register now to get the best price! Use discount code SeeMeSpeakPLE17 to get 10% off your registration.

Percona Live Open Source Database Conference Europe 2017 in Dublin is the premier European open source event for the data performance ecosystem. It is the place to be for the open source community as well as businesses that thrive in the MySQL, MariaDB, MongoDB, time series database, cloud, big data and Internet of Things (IoT) marketplaces. Attendees include DBAs, sysadmins, developers, architects, CTOs, CEOs, and vendors from around the world.

The Percona Live Open Source Database Conference Europe will be September 25-27, 2017 at the Radisson Blu Royal Hotel, Dublin.

by Dave Avery at September 08, 2017 06:32 PM

This Week in Data with Colin Charles #5: db tech showcase and Percona Live Europe

Colin Charles

Join Percona Chief Evangelist Colin Charles as he covers happenings, gives pointers and provides musings on the open source database community.

Percona Live Europe 2017 in Dublin

Have you registered for Percona Live Europe Dublin? We have announced some awesome keynotes, and our sponsor list is growing (and we’re always looking for more!).

There will also be a community dinner (Tuesday, September 26, 2017), so definitely watch the announcement that will be on the blog, and I’m sure on Twitter. Besides being fun, the Lightning Talks will happen during that time.

Releases

Link List

db tech showcase Tokyo, Japan

The annual db tech showcase Tokyo 2017 took place this week from 5-7 September. It was a fun event as always, with capacity for 800 people per day. The event grows larger each year, and reminds me of the heyday of the MySQL Conference & Expo.

The db tech showcase is a five-parallel-track show, with each talk approximately 50 minutes. The event started with a keynote by Richard Hipp, creator of SQLite (if you were a Percona Live Santa Clara 2017 attendee, you’d have also seen him there). The rest of the event is a mix between Japanese language content and English language content. The sponsor list is lengthy, and if you walk the floor you could collect a lot of datasheets.

One thing I really liked? At some talks, you'd get a clear folder with a contact form as well as the printed slide deck. This is a great way to let the speaker's company contact you. It's a common issue that I (and others) speak to large numbers of people and have no idea who's in the talk. I can only imagine our marketing and sales teams being much happier if they could get access to an attendee list! I wonder if this will work in other markets?

It's interesting to see that there is a Japan MariaDB User Group now. It's clear the MySQL user group needs a revival! I saw a talk from Toshiba on performance tests using MariaDB Server, but not with MySQL (a little odd?). The MongoDB content was pretty light, which is unsurprising because we don't see a huge MongoDB uptake or community in Japan (or South Korea, for that matter).

Will I go back? Absolutely. I’ve been going for a few years, and it’s a great place for people who are crazy about database technology. You really get a spectrum of database presentations, and I expect most people go back with many ideas of what they might want to evaluate for production.

I spoke about the Engineering that goes into Percona Server for MySQL 5.6 and 5.7, with a hint of MongoDB. The slides are in a mix of Japanese and English. The Japanese translation: Percona ServerをMySQL 5.6と5.7用に作るエンジニアリング(そしてMongoDBのヒント).

Upcoming Appearances

Percona’s website keeps track of community events, so check there to see where to listen to a Perconian speak. My upcoming appearances are:

Feedback

Did you try replication-manager last week? Guillaume Lefranc, the lead developer, writes in to talk about the new features such as support for MySQL 5.7, Binlog Flashback, multi-cluster mode and various stability fixes.

I look forward to feedback/tips via e-mail at colin.charles@percona.com or on Twitter @bytebot.

by Colin Charles at September 08, 2017 06:25 PM

Jean-Jerome Schmidt

Percona Live from the Emerald Isle: Containerised Dolphins, MySQL Load Balancers, MongoDB Management & more - for plenty of 9s

Yes, it’s that time of year again: the Percona Live Europe Conference is just around the corner.

And we’ll be broadcasting live from the Emerald Isle!

Quite literally, since we’re planning to be on our Twitter and Facebook channels during the course of the conference, so do make sure to tune in (if you’re not attending in person). And this year’s location is indeed Dublin, Ireland, so expect lots of banter and a bit of craic.

More specifically though, the Severalnines team will be well represented with three talks and a booth in the exhibition area. So do come and meet us there. If you haven’t registered yet, there’s still time to do so on the conference website: https://www.percona.com/live/e17/

Our talks at the conference:

Ashraf Sharif, Senior Support Engineer will be talking about MySQL on Docker and containerised dolphins (which only sounds like a good idea in the database world).

Krzysztof Książek, Senior Support Engineer, will share his knowledge and experience on all things MySQL load balancing.

And Ruairí Newman, also Senior Support Engineer, will be discussing what some of the main considerations are to think through when looking at automating and managing MongoDB - including a closer look at MongoDB Ops Manager and ClusterControl.

And since we don’t solely employ Senior Support Engineers at Severalnines, you’ll be pleased to know that Andrada Enache, Sales Manager, and myself will also be present to talk open source database management with ClusterControl at our booth.

This year’s conference agenda looks pretty exciting overall with a wide enough range of topics, so there’ll be something of interest for every open source database aficionado out there.

See you in Dublin ;-)

by jj at September 08, 2017 11:26 AM

September 07, 2017

Peter Zaitsev

Always Verify Examples When Comparing DB Products (PostgreSQL and MySQL)

PostgreSQL and MySQL

In this blog post, I'll look at a comparison of PostgreSQL and MySQL.

I came across a post from Hans-Juergen Schoenig, a Postgres consultant at Cybertec. In it, he dismissed MySQL and presented Postgres as better. While his post ignores most of the reasons why MySQL is better, I will focus on where his claims are less than accurate. Testing for MySQL was done with Percona Server for MySQL 5.7 using default settings.

Mr. Schoenig complains that MySQL changes data types automatically. He claims that inserting 1234.5678 into a numeric(4, 2) column produces an error on Postgres, while MySQL just rounds the number to fit. In my testing, I found this claim to be false:

mysql> CREATE TABLE data (
    -> id    integer NOT NULL,
    -> data  numeric(4, 2));
Query OK, 0 rows affected (0.07 sec)
mysql> INSERT INTO data VALUES (1, 1234.5678);
ERROR 1264 (22003): Out of range value for column 'data' at row 1

His next claim is that MySQL allows updating a key column to NULL and silently changes it to 0. This is also false:

mysql> INSERT INTO data VALUES (1, 12);
Query OK, 1 row affected (0.00 sec)
mysql> UPDATE data SET id = NULL WHERE id = 1;
ERROR 1048 (23000): Column 'id' cannot be null

In the original post, we never see the warnings, so we don't have the full details of his environment. Since he didn't specify which version he was testing on, I will point out that MySQL 5.7 does a far better out-of-the-box job of handling your data than 5.6 does, and SQL Mode has existed in MySQL for ages. Any user could set it to STRICT_ALL_TABLES or STRICT_TRANS_TABLES and get the behavior that is now the default in 5.7.
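
For example, on a 5.6 server (where strict mode is not the default) you could turn it on for a session to get the behavior shown above; a minimal sketch, keeping in mind that this replaces the session's mode list and that a permanent change belongs in my.cnf:

-- Enable strict mode for the current session only:
mysql> SET SESSION sql_mode = 'STRICT_ALL_TABLES';
-- Verify the active mode list:
mysql> SELECT @@SESSION.sql_mode;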

The author is also focusing on a narrow issue, using it to say Postgres is better. I feel this is misleading. I could point out factors in MySQL that are better than in Postgres as well.

This is another case of “don’t necessarily take our word for it”. A simple test of what you see on a blog can help you understand how things work in your environment and why.

by Manjot Singh at September 07, 2017 03:45 PM

Percona Live Europe Featured Talks: NoSQL Best Practices for PostgreSQL with Dmitry Dolgov

Welcome to another post in our series of interview blogs for the upcoming Percona Live Europe 2017 in Dublin. This series highlights a number of talks that will be at the conference and gives a short preview of what attendees can expect to learn from the presenter.

This blog post is with Dmitry Dolgov, a software engineer at Zalando SE. His talk is titled NoSQL Best Practices for PostgreSQL. One of PostgreSQL's most attractive features is the Jsonb data type. It allows efficient work with semi-structured data without sacrificing strong consistency or the ability to use all the power of proven relational technology. In our conversation, we discussed how to use this NoSQL feature in PostgreSQL:

Percona: How did you get into databases? What do you love about it?  

Dmitry: I grew extremely interested in databases not so long ago, mostly due to the influence of Oleg Bartunov, who is a longtime contributor to PostgreSQL. Initially, I just implemented one patch for the Jsonb data type that was eventually included in the core. After that I couldn’t stop. So I still try to help the PostgreSQL community as much as I can.

What I love is just that: PostgreSQL has an awesome community. And I mean it, there are a lot of people that are excited about databases and possess valuable expertise in this area. My most vivid memory so far about the community was someone asking a question in the hackers mailing list that got answered within minutes – even before I started to type my own reply.

Percona: How can NoSQL Jsonb data type get used effectively with PostgreSQL?

Dmitry: The trick is that you don’t need to do anything supernatural for that. Jsonb is already effective enough right out of the box. But as always there are some limitations, implementation details and tricks (which I’ll show in my talk).
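
For readers who have not tried the data type yet, here is a minimal sketch of what that out-of-the-box usage looks like in PostgreSQL (table, column and index names are purely illustrative):

-- A jsonb column lives next to ordinary relational columns:
CREATE TABLE events (id serial PRIMARY KEY, payload jsonb);

-- A GIN index speeds up containment queries on the documents:
CREATE INDEX events_payload_idx ON events USING gin (payload);

-- Extract a field as text and filter by containment:
SELECT payload->>'user_id' FROM events WHERE payload @> '{"type": "login"}';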

Percona: What do you want attendees to take away from your session? Why should they attend?

Dmitry: The biggest idea behind this talk is to show that we live in interesting times. It's not that easy to stick with only one data model/data storage, and to mitigate this issue, most modern databases are trying to provide more than one approach. We have to evaluate each of them carefully.

Or you can attend if you expect a holy war of PostgreSQL vs. MongoDB vs. MySQL vs. whatever else. But you won’t see anything like that, because we’re all grown up people. 🙂

Percona: What are you most looking forward to at Percona Live Europe 2017?

Dmitry: I look forward to meeting a lot of interesting people to collaborate with, and to share my own experiences.

Want to find out more about Dmitry and PostgreSQL and the Jsonb data type? Register for Percona Live Europe 2017, and see his talk NoSQL Best Practices for PostgreSQL. Register now to get the best price! Use discount code SeeMeSpeakPLE17 to get 10% off your registration.

Percona Live Open Source Database Conference Europe 2017 in Dublin is the premier European open source event for the data performance ecosystem. It is the place to be for the open source community as well as businesses that thrive in the MySQL, MariaDB, MongoDB, time series database, cloud, big data and Internet of Things (IoT) marketplaces. Attendees include DBAs, sysadmins, developers, architects, CTOs, CEOs, and vendors from around the world.

The Percona Live Open Source Database Conference Europe will be September 25-27, 2017 at the Radisson Blu Royal Hotel, Dublin.

by Dave Avery at September 07, 2017 03:12 PM

September 06, 2017

Peter Zaitsev

Upcoming Webinar Thursday, September 7: Using PMM to Troubleshoot MySQL Performance Issues

Troubleshooting MySQL Performance

Join Percona's Product Manager, Michael Coburn, as he presents Using Percona Monitoring and Management to Troubleshoot MySQL Performance Issues on Thursday, September 7, 2017, at 10:00 am PDT / 1:00 pm EDT (UTC-7).

Successful applications often become limited by MySQL performance. Michael will show you how to get great MySQL performance using Percona Monitoring and Management (PMM). There will be a demonstration of how to leverage the combination of the query analytics and metrics monitor when troubleshooting MySQL performance issues. We’ll review the essential components of PMM, and use some of the most common database slowness cases as examples of where to look and what to do.

By the end of the webinar you will have a better understanding of:

  • Query metrics, including bytes sent, lock time, rows sent, and more
  • Metrics monitoring
  • How to identify MySQL performance issues
  • Point-in-time visibility and historical trending of database performance

Register for the webinar here.

Michael Coburn, Product Manager

Michael joined Percona as a Consultant in 2012, after having worked with high-volume stock photography websites and email service provider platforms. With a foundation in systems administration, Michael enjoys working with SAN technologies and high availability solutions. A Canadian, Michael currently lives in the Nicoya, Costa Rica area with his wife, two children, and two dogs.

by Emily Ikuta at September 06, 2017 05:32 PM

MyRocks Experimental Now Available with Percona Server for MySQL 5.7.19-17

MyRocks with Percona Server for MySQL

Percona, in collaboration with Facebook, is proud to announce the first experimental release of MyRocks in Percona Server for MySQL 5.7, with packages.

Back in October of 2016, Peter Zaitsev announced that we were going to port MyRocks from Facebook MySQL to Percona Server for MySQL.

Then in April 2017, Vadim Tkachenko announced the availability of experimental builds of Percona Server for MySQL with the MyRocks storage engine.

Now in September 2017, we are pleased to announce the first full experimental release of MyRocks with packages for Percona Server for MySQL 5.7.19-17.

The MyRocks storage engine is based on the RocksDB key-value store, which uses a log-structured merge-tree (LSM). It needs much less space and generates a much smaller write volume (lower write amplification) than a B+ tree implementation such as InnoDB. As a result, MyRocks has the following advantages over other storage engines if your workload uses fast storage (such as SSD):

  • Requires less storage space
  • Provides more storage endurance
  • Ensures better IO capacity

Percona MyRocks is distributed as a separate package that can be enabled as a plugin for Percona Server for MySQL 5.7.19-17.

WARNING: Percona MyRocks is currently considered experimental and is not yet recommended for production use.

We are providing packages for the most popular 64-bit Linux distributions:

  • Debian 8 (“jessie”)
  • Debian 9 (“stretch”)
  • Ubuntu 14.04 LTS (Trusty Tahr)
  • Ubuntu 16.04 LTS (Xenial Xerus)
  • Ubuntu 16.10 (Yakkety Yak)
  • Ubuntu 17.04 (Zesty Zapus)
  • Red Hat Enterprise Linux or CentOS 6 (Santiago)
  • Red Hat Enterprise Linux or CentOS 7 (Maipo)

Installation instructions can be found here.
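
As a rough illustration of what enabling the engine looks like from the SQL side (the plugin and shared library names below are assumptions used for this sketch; follow the official installation instructions for the exact steps and any additional information_schema plugins that need to be registered):

-- Load the MyRocks storage engine plugin (library name is an assumption;
-- the packaged installation may register it differently).
INSTALL PLUGIN ROCKSDB SONAME 'ha_rocksdb.so';

-- Create a table that uses the MyRocks engine.
CREATE TABLE rocks_test (
   id INT NOT NULL PRIMARY KEY,
   payload VARCHAR(255)
) ENGINE=ROCKSDB;

-- Confirm that the ROCKSDB engine is now available.
SHOW ENGINES;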

Due to the differences between Facebook MySQL 5.6.35 and Percona Server for MySQL 5.7, there are some behavioral differences and additional limitations. Some of these are documented here.

We encourage you to install and experiment with MyRocks for Percona Server for MySQL and join the discussion here.

Any issues that you might find can be searched for and reported here.

We thank the RocksDB and MyRocks development teams at Facebook for providing the foundation and assistance in developing MyRocks for Percona Server for MySQL. Without their efforts, this would not have been possible.

by George O. Lorch III at September 06, 2017 05:29 PM

September 05, 2017

Peter Zaitsev

Revenge of Ransomware! MongoDB Security in the News Again . . .

MongoDB Security

A new set of MongoDB attacks and data breaches struck businesses this weekend, mirroring the attacks that hit back in January and putting MongoDB security back into the spotlight.

Like the last set, this new attack strategy focused on ransomware, with attackers demanding a ransom to unlock hijacked data. As with many security breaches, the attacks were preventable: correctly configured MongoDB databases would not have been exposed to these vulnerabilities.

From the ZDNet article above:

“So these attackers simply scan the entire IPv4 internet for a MongoDB running on port 27017,” Gevers told ZDNet. “When they detect these, they then simply try to get access to it with a script that automatically deletes the database and creates a similar one with only one record holding the ransom note.

“The databases that get hacked were running with default settings and were completely exposed to the internet.”

If you rely on databases to run your business, you need to guarantee database security and performance. Your administrators must protect you from situations like the one mentioned above.

Ultimately, database security comes down to two things: identifying the core areas of MongoDB security and knowing exactly what to monitor. For MongoDB to work as expected, you need to correctly set up your databases and monitor them regularly.

Percona experts have addressed many of these issues already on our blog and in our webinars. Here are some links to existing resources that can help you secure your MongoDB databases, and prevent security disasters:

by Dave Avery at September 05, 2017 07:03 PM

MariaDB AB

Automatic Partition Maintenance in MariaDB

geoff_montee_g, Tue, 09/05/2017 - 14:30

A MariaDB Support customer recently asked how they could automatically drop old partitions after 6 months. MariaDB does not have a mechanism to do this automatically out-of-the-box, but it is not too difficult to create a custom stored procedure and an event to call the procedure on the desired schedule. In fact, it is also possible to go even further and create a stored procedure that can also automatically add new partitions. In this blog post, I will show how to write stored procedures that perform these tasks.

Partitioned table definition

For this demonstration, I'll use a table definition based on one from MySQL's documentation on range partitioning, with some minor changes:

DROP TABLE IF EXISTS db1.quarterly_report_status;

CREATE TABLE db1.quarterly_report_status (
   report_id INT NOT NULL,
   report_status VARCHAR(20) NOT NULL,
   report_updated TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
) ENGINE=InnoDB
PARTITION BY RANGE ( UNIX_TIMESTAMP(report_updated) ) (
   PARTITION p_first VALUES LESS THAN ( UNIX_TIMESTAMP('2016-10-01 00:00:00')),
   PARTITION p201610 VALUES LESS THAN ( UNIX_TIMESTAMP('2016-11-01 00:00:00')),
   PARTITION p201611 VALUES LESS THAN ( UNIX_TIMESTAMP('2016-12-01 00:00:00')),
   PARTITION p201612 VALUES LESS THAN ( UNIX_TIMESTAMP('2017-01-01 00:00:00')),
   PARTITION p201701 VALUES LESS THAN ( UNIX_TIMESTAMP('2017-02-01 00:00:00')),
   PARTITION p201702 VALUES LESS THAN ( UNIX_TIMESTAMP('2017-03-01 00:00:00')),
   PARTITION p201703 VALUES LESS THAN ( UNIX_TIMESTAMP('2017-04-01 00:00:00')),
   PARTITION p201704 VALUES LESS THAN ( UNIX_TIMESTAMP('2017-05-01 00:00:00')),
   PARTITION p201705 VALUES LESS THAN ( UNIX_TIMESTAMP('2017-06-01 00:00:00')),
   PARTITION p201706 VALUES LESS THAN ( UNIX_TIMESTAMP('2017-07-01 00:00:00')),
   PARTITION p201707 VALUES LESS THAN ( UNIX_TIMESTAMP('2017-08-01 00:00:00')),
   PARTITION p201708 VALUES LESS THAN ( UNIX_TIMESTAMP('2017-09-01 00:00:00')),
   PARTITION p_future VALUES LESS THAN (MAXVALUE)
);

The most significant change is that the partition naming scheme is based on the date. This will allow us to more easily determine which partitions to remove.
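
As a quick illustration of the convention, the partition name for any given month can be derived directly from the date, which is exactly what the stored procedures below do internally:

-- Build a partition name of the form pYYYYMM from the current date,
-- e.g. 'p201709' for September 2017.
SELECT CONCAT('p', DATE_FORMAT(CURDATE(), '%Y%m')) AS partition_name;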

Stored procedure definition (create new partitions)

The stored procedure itself contains some comments that explain what it does, so I will let the code speak for itself, for the most part. One noteworthy item to point out is that we are not doing ALTER TABLE ... ADD PARTITION. This is because the partition p_future already covers the end range up to MAXVALUE, so we actually need to do ALTER TABLE ... REORGANIZE PARTITION instead.
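
For reference, a hand-written version of the statement that the procedure ultimately builds and executes would look something like this (the partition name and timestamp are example values):

-- Carve a new monthly partition out of p_future, leaving p_future
-- to continue covering everything up to MAXVALUE.
ALTER TABLE db1.quarterly_report_status
   REORGANIZE PARTITION p_future INTO (
      PARTITION p201709 VALUES LESS THAN ( UNIX_TIMESTAMP('2017-10-01 00:00:00') ),
      PARTITION p_future VALUES LESS THAN (MAXVALUE)
   );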

DROP PROCEDURE IF EXISTS db1.create_new_partitions;

DELIMITER $$
CREATE PROCEDURE db1.create_new_partitions(p_schema varchar(64), p_table varchar(64), p_months_to_add int)
   LANGUAGE SQL
   NOT DETERMINISTIC
   SQL SECURITY INVOKER
BEGIN  
   DECLARE done INT DEFAULT FALSE;
   DECLARE current_partition_name varchar(64);
   DECLARE current_partition_ts int;
   
   -- We'll use this cursor later to check
   -- whether a particular partition already exists.
   -- @partition_name_to_add will be
   -- set later.
   DECLARE cur1 CURSOR FOR 
   SELECT partition_name 
   FROM information_schema.partitions 
   WHERE TABLE_SCHEMA = p_schema 
   AND TABLE_NAME = p_table 
   AND PARTITION_NAME != 'p_first'
   AND PARTITION_NAME != 'p_future'
   AND PARTITION_NAME = @partition_name_to_add;
   
   -- We'll also use this cursor later 
   -- to query our temporary table.
   DECLARE cur2 CURSOR FOR 
   SELECT partition_name, partition_range_ts 
   FROM partitions_to_add;
   
   DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = TRUE;
   
   DROP TEMPORARY TABLE IF EXISTS partitions_to_add;
   
   CREATE TEMPORARY TABLE partitions_to_add (
      partition_name varchar(64),
      partition_range_ts int
   );
   
   SET @partitions_added = FALSE;
   SET @months_ahead = 0;
   
   -- Let's go through a loop and add each month individually between
   -- the current month and the month p_months_to_add in the future.
   WHILE @months_ahead <= p_months_to_add DO
      -- We figure out what the correct month is by adding the
      -- number of months to the current date
      SET @date = CURDATE();
      SET @q = 'SELECT DATE_ADD(?, INTERVAL ? MONTH) INTO @month_to_add';
      PREPARE st FROM @q;
      EXECUTE st USING @date, @months_ahead;
      DEALLOCATE PREPARE st;
      SET @months_ahead = @months_ahead + 1;
      
      -- Then we format the month in the same format used
      -- in our partition names.
      SET @q = 'SELECT DATE_FORMAT(@month_to_add, ''%Y%m'') INTO @formatted_month_to_add';
      PREPARE st FROM @q;
      EXECUTE st;
      DEALLOCATE PREPARE st;
      
      -- And then we use the formatted date to build the name of the
      -- partition that we want to add. This partition name is
      -- assigned to @partition_name_to_add, which is used in
      -- the cursor declared at the start of the procedure.
      SET @q = 'SELECT CONCAT(''p'', @formatted_month_to_add) INTO @partition_name_to_add';
      PREPARE st FROM @q;
      EXECUTE st;
      DEALLOCATE PREPARE st;
     
      SET done = FALSE; 
      SET @first = TRUE;
     
      -- And then we loop through the results returned by the cursor,
      -- and if a row already exists for the current partition, 
      -- then we do not need to create the partition.
      OPEN cur1;

      read_loop: LOOP
         FETCH cur1 INTO current_partition_name;
      
         -- The cursor returned 0 rows, so we can create the partition.
         IF done AND @first THEN
            SELECT CONCAT('Creating partition: ', @partition_name_to_add);

            -- Now we need to get the end date of the new partition.
            -- Note that the date is for the non-inclusive end range,
            -- so we actually need the date of the first day of the *next* month.

            -- First, let's get a date variable for the first of the partition month
            SET @q = 'SELECT DATE_FORMAT(@month_to_add, ''%Y-%m-01 00:00:00'') INTO @month_to_add';
            PREPARE st FROM @q;
            EXECUTE st;
            DEALLOCATE PREPARE st; 

            -- Then, let's add 1 month
            SET @q = 'SELECT DATE_ADD(?, INTERVAL 1 MONTH) INTO @partition_end_date';
            PREPARE st FROM @q;
            EXECUTE st USING @month_to_add;
            DEALLOCATE PREPARE st;

            -- We need the date in UNIX timestamp format.  
            SELECT UNIX_TIMESTAMP(@partition_end_date) INTO @partition_end_ts;
         
            -- Now insert the information into our temporary table
            INSERT INTO partitions_to_add VALUES (@partition_name_to_add, @partition_end_ts);
            SET @partitions_added = TRUE;
         END IF;
        
         -- Since we had at least one row returned, we know the
         -- partition already exists.
         IF ! @first THEN
            LEAVE read_loop;
         END IF;
        
         SET @first = FALSE;
      END LOOP;
     
     CLOSE cur1;
   END WHILE;
   
   -- Let's actually add the partitions now.
   IF @partitions_added THEN
      -- First we need to build the actual ALTER TABLE query.
      SET @schema = p_schema;
      SET @table = p_table;
      SET @q = 'SELECT CONCAT(''ALTER TABLE '', @schema, ''.'', @table, '' REORGANIZE PARTITION p_future INTO ( '') INTO @query';
      PREPARE st FROM @q;
      EXECUTE st;
      DEALLOCATE PREPARE st;
     
      SET done = FALSE;
      SET @first = TRUE;
     
      OPEN cur2;

      read_loop: LOOP
         FETCH cur2 INTO current_partition_name, current_partition_ts;
       
         IF done THEN
            LEAVE read_loop;
         END IF;
      
         -- If it is not the first partition, 
         -- then we need to add a comma
         IF ! @first THEN
            SET @q = 'SELECT CONCAT(@query, '', '') INTO @query';
            PREPARE st FROM @q;
            EXECUTE st;
            DEALLOCATE PREPARE st;
         END IF;

         -- Add the current partition
         SET @partition_name =  current_partition_name;
         SET @partition_ts =  current_partition_ts;         
         SET @q = 'SELECT CONCAT(@query, ''PARTITION '', @partition_name, '' VALUES LESS THAN ('', @partition_ts, '')'') INTO @query';
         PREPARE st FROM @q;
         EXECUTE st;
         DEALLOCATE PREPARE st;
       
         SET @first = FALSE;
      END LOOP;
     
      CLOSE cur2;
     
      -- We also need to include the p_future partition
      SET @q = 'SELECT CONCAT(@query, '', PARTITION p_future VALUES LESS THAN (MAXVALUE))'') INTO @query';
      PREPARE st FROM @q;
      EXECUTE st;
      DEALLOCATE PREPARE st;
     
      -- And then we prepare and execute the ALTER TABLE query.
      PREPARE st FROM @query;
      EXECUTE st;
      DEALLOCATE PREPARE st;  
   END IF;
   
   DROP TEMPORARY TABLE partitions_to_add;
END$$
DELIMITER ;

Let's try running the new procedure:

MariaDB [db1]> SHOW CREATE TABLE db1.quarterly_report_status\G
*************************** 1. row ***************************
       Table: quarterly_report_status
Create Table: CREATE TABLE `quarterly_report_status` (
  `report_id` int(11) NOT NULL,
  `report_status` varchar(20) NOT NULL,
  `report_updated` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
) ENGINE=InnoDB DEFAULT CHARSET=latin1
/*!50100 PARTITION BY RANGE ( UNIX_TIMESTAMP(report_updated))
(PARTITION p_first VALUES LESS THAN (1475294400) ENGINE = InnoDB,
 PARTITION p201610 VALUES LESS THAN (1477972800) ENGINE = InnoDB,
 PARTITION p201611 VALUES LESS THAN (1480568400) ENGINE = InnoDB,
 PARTITION p201612 VALUES LESS THAN (1483246800) ENGINE = InnoDB,
 PARTITION p201701 VALUES LESS THAN (1485925200) ENGINE = InnoDB,
 PARTITION p201702 VALUES LESS THAN (1488344400) ENGINE = InnoDB,
 PARTITION p201703 VALUES LESS THAN (1491019200) ENGINE = InnoDB,
 PARTITION p201704 VALUES LESS THAN (1493611200) ENGINE = InnoDB,
 PARTITION p201705 VALUES LESS THAN (1496289600) ENGINE = InnoDB,
 PARTITION p201706 VALUES LESS THAN (1498881600) ENGINE = InnoDB,
 PARTITION p201707 VALUES LESS THAN (1501560000) ENGINE = InnoDB,
 PARTITION p201708 VALUES LESS THAN (1504238400) ENGINE = InnoDB,
 PARTITION p_future VALUES LESS THAN MAXVALUE ENGINE = InnoDB) */
1 row in set (0.00 sec)

MariaDB [db1]> CALL db1.create_new_partitions('db1', 'quarterly_report_status', 3);
+--------------------------------------------------------+
| CONCAT('Creating partition: ', @partition_name_to_add) |
+--------------------------------------------------------+
| Creating partition: p201709                            |
+--------------------------------------------------------+
1 row in set (0.01 sec)

+--------------------------------------------------------+
| CONCAT('Creating partition: ', @partition_name_to_add) |
+--------------------------------------------------------+
| Creating partition: p201710                            |
+--------------------------------------------------------+
1 row in set (0.02 sec)

+--------------------------------------------------------+
| CONCAT('Creating partition: ', @partition_name_to_add) |
+--------------------------------------------------------+
| Creating partition: p201711                            |
+--------------------------------------------------------+
1 row in set (0.02 sec)

Query OK, 0 rows affected (0.09 sec)

MariaDB [db1]> SHOW CREATE TABLE db1.quarterly_report_status\G
*************************** 1. row ***************************
       Table: quarterly_report_status
Create Table: CREATE TABLE `quarterly_report_status` (
  `report_id` int(11) NOT NULL,
  `report_status` varchar(20) NOT NULL,
  `report_updated` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
) ENGINE=InnoDB DEFAULT CHARSET=latin1
/*!50100 PARTITION BY RANGE ( UNIX_TIMESTAMP(report_updated))
(PARTITION p_first VALUES LESS THAN (1475294400) ENGINE = InnoDB,
 PARTITION p201610 VALUES LESS THAN (1477972800) ENGINE = InnoDB,
 PARTITION p201611 VALUES LESS THAN (1480568400) ENGINE = InnoDB,
 PARTITION p201612 VALUES LESS THAN (1483246800) ENGINE = InnoDB,
 PARTITION p201701 VALUES LESS THAN (1485925200) ENGINE = InnoDB,
 PARTITION p201702 VALUES LESS THAN (1488344400) ENGINE = InnoDB,
 PARTITION p201703 VALUES LESS THAN (1491019200) ENGINE = InnoDB,
 PARTITION p201704 VALUES LESS THAN (1493611200) ENGINE = InnoDB,
 PARTITION p201705 VALUES LESS THAN (1496289600) ENGINE = InnoDB,
 PARTITION p201706 VALUES LESS THAN (1498881600) ENGINE = InnoDB,
 PARTITION p201707 VALUES LESS THAN (1501560000) ENGINE = InnoDB,
 PARTITION p201708 VALUES LESS THAN (1504238400) ENGINE = InnoDB,
 PARTITION p201709 VALUES LESS THAN (1506830400) ENGINE = InnoDB,
 PARTITION p201710 VALUES LESS THAN (1509508800) ENGINE = InnoDB,
 PARTITION p201711 VALUES LESS THAN (1512104400) ENGINE = InnoDB,
 PARTITION p_future VALUES LESS THAN MAXVALUE ENGINE = InnoDB) */
1 row in set (0.00 sec)

We can see that it appears to be working as expected.

Stored procedure definition (drop old partitions)

This additional stored procedure also contains some comments that explain what it does, so I will let the code speak for itself, for the most part. One noteworthy item to point out is that the stored procedure drops all old partitions individually with ALTER TABLE ... DROP PARTITION, and then it increases the range of the p_first partition with ALTER TABLE ... REORGANIZE PARTITION, so that it fills in the gap left behind.
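
In other words, for each expired partition the procedure builds and executes a statement like the first one below, and then finishes with a single REORGANIZE PARTITION that extends p_first (the partition names and timestamps are example values):

-- Drop one expired monthly partition.
ALTER TABLE db1.quarterly_report_status DROP PARTITION p201610;

-- Extend p_first to cover the range left behind by the dropped partitions,
-- while keeping the oldest remaining monthly partition intact.
ALTER TABLE db1.quarterly_report_status
   REORGANIZE PARTITION p_first, p201702 INTO (
      PARTITION p_first VALUES LESS THAN ( UNIX_TIMESTAMP('2017-02-01 00:00:00') ),
      PARTITION p201702 VALUES LESS THAN ( UNIX_TIMESTAMP('2017-03-01 00:00:00') )
   );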

DROP PROCEDURE IF EXISTS db1.drop_old_partitions;

DELIMITER $$
CREATE PROCEDURE db1.drop_old_partitions(p_schema varchar(64), p_table varchar(64), p_months_to_keep int, p_seconds_to_sleep int)
   LANGUAGE SQL
   NOT DETERMINISTIC
   SQL SECURITY INVOKER
BEGIN  
   DECLARE done INT DEFAULT FALSE;
   DECLARE current_partition_name varchar(64);
   
   -- We'll use this cursor later to get
   -- the list of partitions to drop.
   -- @last_partition_name_to_keep will be
   -- set later.
   DECLARE cur1 CURSOR FOR 
   SELECT partition_name 
   FROM information_schema.partitions 
   WHERE TABLE_SCHEMA = p_schema 
   AND TABLE_NAME = p_table 
   AND PARTITION_NAME != 'p_first'
   AND PARTITION_NAME != 'p_future'
   AND PARTITION_NAME < @last_partition_name_to_keep;
   
   DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = TRUE;
   
   -- Now we get the last month of data that we want to keep
   -- by subtracting p_months_to_keep from the current date.
   -- Note that it will actually keep p_months_to_keep+1 partitions,
   -- since the current month is not complete.
   SET @date = CURDATE();
   SET @months_to_keep = p_months_to_keep;   
   SET @q = 'SELECT DATE_SUB(?, INTERVAL ? MONTH) INTO @last_month_to_keep';
   PREPARE st FROM @q;
   EXECUTE st USING @date, @months_to_keep;
   DEALLOCATE PREPARE st;
   
   -- Then we format the last month in the same format used
   -- in our partition names.
   SET @q = 'SELECT DATE_FORMAT(@last_month_to_keep, ''%Y%m'') INTO @formatted_last_month_to_keep';
   PREPARE st FROM @q;
   EXECUTE st;
   DEALLOCATE PREPARE st;
   
   -- And then we use the formatted date to build the name of the
   -- last partition that we want to keep. This partition name is
   -- assigned to @last_partition_name_to_keep, which is used in
   -- the cursor declared at the start of the procedure.
   SET @q = 'SELECT CONCAT(''p'', @formatted_last_month_to_keep) INTO @last_partition_name_to_keep';
   PREPARE st FROM @q;
   EXECUTE st;
   DEALLOCATE PREPARE st;
   
   SELECT CONCAT('Dropping all partitions before: ', @last_partition_name_to_keep);
   
   SET @first = TRUE;
   
   -- And then we loop through all partitions returned by the cursor,
   -- and those partitions are dropped.
   OPEN cur1;

   read_loop: LOOP
      FETCH cur1 INTO current_partition_name;
   
      IF done THEN
         LEAVE read_loop;
      END IF;
     
      IF ! @first AND p_seconds_to_sleep > 0 THEN
         SELECT CONCAT('Sleeping for ', p_seconds_to_sleep, ' seconds');
         SELECT SLEEP(p_seconds_to_sleep);
      END IF;

      SELECT CONCAT('Dropping partition: ', current_partition_name);
   
      -- First we build the ALTER TABLE query.
      SET @schema = p_schema;
      SET @table = p_table;
      SET @partition = current_partition_name;
      SET @q = 'SELECT CONCAT(''ALTER TABLE '', @schema, ''.'', @table, '' DROP PARTITION '', @partition) INTO @query';
      PREPARE st FROM @q;
      EXECUTE st;
      DEALLOCATE PREPARE st;
      
      -- And then we prepare and execute the ALTER TABLE query.
      PREPARE st FROM @query;
      EXECUTE st;
      DEALLOCATE PREPARE st;
     
      SET @first = FALSE;
   END LOOP;
   
   CLOSE cur1;
   
   -- If no partitions were dropped, then we can also skip this.
   IF ! @first THEN
      -- Then we need to get the date of the new first partition.
      -- We need the date in UNIX timestamp format.
      SET @q = 'SELECT DATE_FORMAT(@last_month_to_keep, ''%Y-%m-01 00:00:00'') INTO @new_first_partition_date';
      PREPARE st FROM @q;
      EXECUTE st;
      DEALLOCATE PREPARE st;     
      SELECT UNIX_TIMESTAMP(@new_first_partition_date) INTO @new_first_partition_ts;
     
      -- We also need to get the date of the second partition
      -- since the second partition is also needed for REORGANIZE PARTITION.
      SET @q = 'SELECT DATE_ADD(@new_first_partition_date, INTERVAL 1 MONTH) INTO @second_partition_date';
      PREPARE st FROM @q;
      EXECUTE st;
      DEALLOCATE PREPARE st;
      SELECT UNIX_TIMESTAMP(@second_partition_date) INTO @second_partition_ts;
  
      SELECT CONCAT('Reorganizing first and second partitions. first partition date = ', @new_first_partition_date, ', second partition date = ', @second_partition_date);
   
      -- Then we build the ALTER TABLE query.
      SET @schema = p_schema;
      SET @table = p_table;
      SET @q = 'SELECT CONCAT(''ALTER TABLE '', @schema, ''.'', @table, '' REORGANIZE PARTITION p_first, '', @last_partition_name_to_keep, '' INTO ( PARTITION p_first VALUES LESS THAN ( '', @new_first_partition_ts, '' ), PARTITION '', @last_partition_name_to_keep, '' VALUES LESS THAN ( '', @second_partition_ts, '' ) ) '') INTO @query';
      PREPARE st FROM @q;
      EXECUTE st;
      DEALLOCATE PREPARE st;
     
      -- And then we prepare and execute the ALTER TABLE query.
      PREPARE st FROM @query;
      EXECUTE st;
      DEALLOCATE PREPARE st;
   END IF;
END$$
DELIMITER ;

Let's try running the new procedure:

MariaDB [db1]> SHOW CREATE TABLE db1.quarterly_report_status\G
*************************** 1. row ***************************
       Table: quarterly_report_status
Create Table: CREATE TABLE `quarterly_report_status` (
  `report_id` int(11) NOT NULL,
  `report_status` varchar(20) NOT NULL,
  `report_updated` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
) ENGINE=InnoDB DEFAULT CHARSET=latin1
/*!50100 PARTITION BY RANGE ( UNIX_TIMESTAMP(report_updated))
(PARTITION p_first VALUES LESS THAN (1475294400) ENGINE = InnoDB,
 PARTITION p201610 VALUES LESS THAN (1477972800) ENGINE = InnoDB,
 PARTITION p201611 VALUES LESS THAN (1480568400) ENGINE = InnoDB,
 PARTITION p201612 VALUES LESS THAN (1483246800) ENGINE = InnoDB,
 PARTITION p201701 VALUES LESS THAN (1485925200) ENGINE = InnoDB,
 PARTITION p201702 VALUES LESS THAN (1488344400) ENGINE = InnoDB,
 PARTITION p201703 VALUES LESS THAN (1491019200) ENGINE = InnoDB,
 PARTITION p201704 VALUES LESS THAN (1493611200) ENGINE = InnoDB,
 PARTITION p201705 VALUES LESS THAN (1496289600) ENGINE = InnoDB,
 PARTITION p201706 VALUES LESS THAN (1498881600) ENGINE = InnoDB,
 PARTITION p201707 VALUES LESS THAN (1501560000) ENGINE = InnoDB,
 PARTITION p201708 VALUES LESS THAN (1504238400) ENGINE = InnoDB,
 PARTITION p_future VALUES LESS THAN MAXVALUE ENGINE = InnoDB) */
1 row in set (0.00 sec)

MariaDB [db1]> CALL db1.drop_old_partitions('db1', 'quarterly_report_status', 6, 5);
+--------------------------------------------------------------------------+
| CONCAT('Dropping all partitions before: ', @last_partition_name_to_keep) |
+--------------------------------------------------------------------------+
| Dropping all partitions before: p201702                                  |
+--------------------------------------------------------------------------+
1 row in set (0.00 sec)

+--------------------------------------------------------+
| CONCAT('Dropping partition: ', current_partition_name) |
+--------------------------------------------------------+
| Dropping partition: p201610                            |
+--------------------------------------------------------+
1 row in set (0.00 sec)

+---------------------------------------------------------+
| CONCAT('Sleeping for ', p_seconds_to_sleep, ' seconds') |
+---------------------------------------------------------+
| Sleeping for 5 seconds                                  |
+---------------------------------------------------------+
1 row in set (0.02 sec)

+---------------------------+
| SLEEP(p_seconds_to_sleep) |
+---------------------------+
|                         0 |
+---------------------------+
1 row in set (5.02 sec)

+--------------------------------------------------------+
| CONCAT('Dropping partition: ', current_partition_name) |
+--------------------------------------------------------+
| Dropping partition: p201611                            |
+--------------------------------------------------------+
1 row in set (5.02 sec)

+---------------------------------------------------------+
| CONCAT('Sleeping for ', p_seconds_to_sleep, ' seconds') |
+---------------------------------------------------------+
| Sleeping for 5 seconds                                  |
+---------------------------------------------------------+
1 row in set (5.03 sec)

+---------------------------+
| SLEEP(p_seconds_to_sleep) |
+---------------------------+
|                         0 |
+---------------------------+
1 row in set (10.03 sec)

+--------------------------------------------------------+
| CONCAT('Dropping partition: ', current_partition_name) |
+--------------------------------------------------------+
| Dropping partition: p201612                            |
+--------------------------------------------------------+
1 row in set (10.03 sec)

+---------------------------------------------------------+
| CONCAT('Sleeping for ', p_seconds_to_sleep, ' seconds') |
+---------------------------------------------------------+
| Sleeping for 5 seconds                                  |
+---------------------------------------------------------+
1 row in set (10.05 sec)

+---------------------------+
| SLEEP(p_seconds_to_sleep) |
+---------------------------+
|                         0 |
+---------------------------+
1 row in set (15.05 sec)

+--------------------------------------------------------+
| CONCAT('Dropping partition: ', current_partition_name) |
+--------------------------------------------------------+
| Dropping partition: p201701                            |
+--------------------------------------------------------+
1 row in set (15.05 sec)

+--------------------------------------------------------------------------------------------------------------------------------------------------------------+
| CONCAT('Reorganizing first and second partitions. first partition date = ', @new_first_partition_date, ', second partition date = ', @second_partition_date) |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Reorganizing first and second partitions. first partition date = 2017-02-01 00:00:00, second partition date = 2017-03-01 00:00:00                            |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (15.06 sec)

Query OK, 0 rows affected (15.11 sec)

MariaDB [db1]> SHOW CREATE TABLE db1.quarterly_report_status\G
*************************** 1. row ***************************
       Table: quarterly_report_status
Create Table: CREATE TABLE `quarterly_report_status` (
  `report_id` int(11) NOT NULL,
  `report_status` varchar(20) NOT NULL,
  `report_updated` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
) ENGINE=InnoDB DEFAULT CHARSET=latin1
/*!50100 PARTITION BY RANGE ( UNIX_TIMESTAMP(report_updated))
(PARTITION p_first VALUES LESS THAN (1485925200) ENGINE = InnoDB,
 PARTITION p201702 VALUES LESS THAN (1488344400) ENGINE = InnoDB,
 PARTITION p201703 VALUES LESS THAN (1491019200) ENGINE = InnoDB,
 PARTITION p201704 VALUES LESS THAN (1493611200) ENGINE = InnoDB,
 PARTITION p201705 VALUES LESS THAN (1496289600) ENGINE = InnoDB,
 PARTITION p201706 VALUES LESS THAN (1498881600) ENGINE = InnoDB,
 PARTITION p201707 VALUES LESS THAN (1501560000) ENGINE = InnoDB,
 PARTITION p201708 VALUES LESS THAN (1504238400) ENGINE = InnoDB,
 PARTITION p_future VALUES LESS THAN MAXVALUE ENGINE = InnoDB) */
1 row in set (0.00 sec)

We can see that our changes seem to be working as expected. In addition to old partitions being dropped, we can also see that p_first's date range was updated.

Stored procedure definition (tie other procedures together)

It is probably going to be preferable in most cases to perform all partition maintenance at the same time. Therefore, we can create another stored procedure that calls our other two stored procedures. This is fairly straightforward.

DROP PROCEDURE IF EXISTS db1.perform_partition_maintenance;

DELIMITER $$
CREATE PROCEDURE db1.perform_partition_maintenance(p_schema varchar(64), p_table varchar(64), p_months_to_add int, p_months_to_keep int, p_seconds_to_sleep int)
   LANGUAGE SQL
   NOT DETERMINISTIC
   SQL SECURITY INVOKER
BEGIN 
   CALL db1.drop_old_partitions(p_schema, p_table, p_months_to_keep, p_seconds_to_sleep);
   CALL db1.create_new_partitions(p_schema, p_table, p_months_to_add);
END$$
DELIMITER ;

Let's reset our partitioned table to its original state, and then let's try running our new stored procedure.

MariaDB [db1]> SHOW CREATE TABLE db1.quarterly_report_status\G
*************************** 1. row ***************************
       Table: quarterly_report_status
Create Table: CREATE TABLE `quarterly_report_status` (
  `report_id` int(11) NOT NULL,
  `report_status` varchar(20) NOT NULL,
  `report_updated` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
) ENGINE=InnoDB DEFAULT CHARSET=latin1
/*!50100 PARTITION BY RANGE ( UNIX_TIMESTAMP(report_updated))
(PARTITION p_first VALUES LESS THAN (1475294400) ENGINE = InnoDB,
 PARTITION p201610 VALUES LESS THAN (1477972800) ENGINE = InnoDB,
 PARTITION p201611 VALUES LESS THAN (1480568400) ENGINE = InnoDB,
 PARTITION p201612 VALUES LESS THAN (1483246800) ENGINE = InnoDB,
 PARTITION p201701 VALUES LESS THAN (1485925200) ENGINE = InnoDB,
 PARTITION p201702 VALUES LESS THAN (1488344400) ENGINE = InnoDB,
 PARTITION p201703 VALUES LESS THAN (1491019200) ENGINE = InnoDB,
 PARTITION p201704 VALUES LESS THAN (1493611200) ENGINE = InnoDB,
 PARTITION p201705 VALUES LESS THAN (1496289600) ENGINE = InnoDB,
 PARTITION p201706 VALUES LESS THAN (1498881600) ENGINE = InnoDB,
 PARTITION p201707 VALUES LESS THAN (1501560000) ENGINE = InnoDB,
 PARTITION p201708 VALUES LESS THAN (1504238400) ENGINE = InnoDB,
 PARTITION p_future VALUES LESS THAN MAXVALUE ENGINE = InnoDB) */
1 row in set (0.00 sec)

MariaDB [db1]> CALL db1.perform_partition_maintenance('db1', 'quarterly_report_status', 3, 6, 5);
+--------------------------------------------------------------------------+
| CONCAT('Dropping all partitions before: ', @last_partition_name_to_keep) |
+--------------------------------------------------------------------------+
| Dropping all partitions before: p201702                                  |
+--------------------------------------------------------------------------+
1 row in set (0.00 sec)

+--------------------------------------------------------+
| CONCAT('Dropping partition: ', current_partition_name) |
+--------------------------------------------------------+
| Dropping partition: p201610                            |
+--------------------------------------------------------+
1 row in set (0.00 sec)

+---------------------------------------------------------+
| CONCAT('Sleeping for ', p_seconds_to_sleep, ' seconds') |
+---------------------------------------------------------+
| Sleeping for 5 seconds                                  |
+---------------------------------------------------------+
1 row in set (0.02 sec)

+---------------------------+
| SLEEP(p_seconds_to_sleep) |
+---------------------------+
|                         0 |
+---------------------------+
1 row in set (5.02 sec)

+--------------------------------------------------------+
| CONCAT('Dropping partition: ', current_partition_name) |
+--------------------------------------------------------+
| Dropping partition: p201611                            |
+--------------------------------------------------------+
1 row in set (5.02 sec)

+---------------------------------------------------------+
| CONCAT('Sleeping for ', p_seconds_to_sleep, ' seconds') |
+---------------------------------------------------------+
| Sleeping for 5 seconds                                  |
+---------------------------------------------------------+
1 row in set (5.03 sec)

+---------------------------+
| SLEEP(p_seconds_to_sleep) |
+---------------------------+
|                         0 |
+---------------------------+
1 row in set (10.03 sec)

+--------------------------------------------------------+
| CONCAT('Dropping partition: ', current_partition_name) |
+--------------------------------------------------------+
| Dropping partition: p201612                            |
+--------------------------------------------------------+
1 row in set (10.03 sec)

+---------------------------------------------------------+
| CONCAT('Sleeping for ', p_seconds_to_sleep, ' seconds') |
+---------------------------------------------------------+
| Sleeping for 5 seconds                                  |
+---------------------------------------------------------+
1 row in set (10.06 sec)

+---------------------------+
| SLEEP(p_seconds_to_sleep) |
+---------------------------+
|                         0 |
+---------------------------+
1 row in set (15.06 sec)

+--------------------------------------------------------+
| CONCAT('Dropping partition: ', current_partition_name) |
+--------------------------------------------------------+
| Dropping partition: p201701                            |
+--------------------------------------------------------+
1 row in set (15.06 sec)

+--------------------------------------------------------------------------------------------------------------------------------------------------------------+
| CONCAT('Reorganizing first and second partitions. first partition date = ', @new_first_partition_date, ', second partition date = ', @second_partition_date) |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Reorganizing first and second partitions. first partition date = 2017-02-01 00:00:00, second partition date = 2017-03-01 00:00:00                            |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (15.08 sec)

+--------------------------------------------------------+
| CONCAT('Creating partition: ', @partition_name_to_add) |
+--------------------------------------------------------+
| Creating partition: p201709                            |
+--------------------------------------------------------+
1 row in set (15.16 sec)

+--------------------------------------------------------+
| CONCAT('Creating partition: ', @partition_name_to_add) |
+--------------------------------------------------------+
| Creating partition: p201710                            |
+--------------------------------------------------------+
1 row in set (15.17 sec)

+--------------------------------------------------------+
| CONCAT('Creating partition: ', @partition_name_to_add) |
+--------------------------------------------------------+
| Creating partition: p201711                            |
+--------------------------------------------------------+
1 row in set (15.17 sec)

Query OK, 0 rows affected (15.26 sec)

MariaDB [db1]> SHOW CREATE TABLE db1.quarterly_report_status\G
*************************** 1. row ***************************
       Table: quarterly_report_status
Create Table: CREATE TABLE `quarterly_report_status` (
  `report_id` int(11) NOT NULL,
  `report_status` varchar(20) NOT NULL,
  `report_updated` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
) ENGINE=InnoDB DEFAULT CHARSET=latin1
/*!50100 PARTITION BY RANGE ( UNIX_TIMESTAMP(report_updated))
(PARTITION p_first VALUES LESS THAN (1485925200) ENGINE = InnoDB,
 PARTITION p201702 VALUES LESS THAN (1488344400) ENGINE = InnoDB,
 PARTITION p201703 VALUES LESS THAN (1491019200) ENGINE = InnoDB,
 PARTITION p201704 VALUES LESS THAN (1493611200) ENGINE = InnoDB,
 PARTITION p201705 VALUES LESS THAN (1496289600) ENGINE = InnoDB,
 PARTITION p201706 VALUES LESS THAN (1498881600) ENGINE = InnoDB,
 PARTITION p201707 VALUES LESS THAN (1501560000) ENGINE = InnoDB,
 PARTITION p201708 VALUES LESS THAN (1504238400) ENGINE = InnoDB,
 PARTITION p201709 VALUES LESS THAN (1506830400) ENGINE = InnoDB,
 PARTITION p201710 VALUES LESS THAN (1509508800) ENGINE = InnoDB,
 PARTITION p201711 VALUES LESS THAN (1512104400) ENGINE = InnoDB,
 PARTITION p_future VALUES LESS THAN MAXVALUE ENGINE = InnoDB) */
1 row in set (0.00 sec)

This stored procedure also seems to be working as expected.

Running the procedure more often than necessary

It should be noted that these stored procedures can be run more often than is necessary. If the procedures are run when no partitions need to be added or deleted, then they will not perform any work. Let's reset our table definition and try it out.

MariaDB [db1]> CALL db1.perform_partition_maintenance('db1', 'quarterly_report_status', 3, 6, 5);
+--------------------------------------------------------------------------+
| CONCAT('Dropping all partitions before: ', @last_partition_name_to_keep) |
+--------------------------------------------------------------------------+
| Dropping all partitions before: p201702                                  |
+--------------------------------------------------------------------------+
1 row in set (0.00 sec)

+--------------------------------------------------------+
| CONCAT('Dropping partition: ', current_partition_name) |
+--------------------------------------------------------+
| Dropping partition: p201610                            |
+--------------------------------------------------------+
1 row in set (0.00 sec)

+---------------------------------------------------------+
| CONCAT('Sleeping for ', p_seconds_to_sleep, ' seconds') |
+---------------------------------------------------------+
| Sleeping for 5 seconds                                  |
+---------------------------------------------------------+
1 row in set (0.03 sec)

+---------------------------+
| SLEEP(p_seconds_to_sleep) |
+---------------------------+
|                         0 |
+---------------------------+
1 row in set (5.03 sec)

+--------------------------------------------------------+
| CONCAT('Dropping partition: ', current_partition_name) |
+--------------------------------------------------------+
| Dropping partition: p201611                            |
+--------------------------------------------------------+
1 row in set (5.03 sec)

+---------------------------------------------------------+
| CONCAT('Sleeping for ', p_seconds_to_sleep, ' seconds') |
+---------------------------------------------------------+
| Sleeping for 5 seconds                                  |
+---------------------------------------------------------+
1 row in set (5.06 sec)

+---------------------------+
| SLEEP(p_seconds_to_sleep) |
+---------------------------+
|                         0 |
+---------------------------+
1 row in set (10.06 sec)

+--------------------------------------------------------+
| CONCAT('Dropping partition: ', current_partition_name) |
+--------------------------------------------------------+
| Dropping partition: p201612                            |
+--------------------------------------------------------+
1 row in set (10.06 sec)

+---------------------------------------------------------+
| CONCAT('Sleeping for ', p_seconds_to_sleep, ' seconds') |
+---------------------------------------------------------+
| Sleeping for 5 seconds                                  |
+---------------------------------------------------------+
1 row in set (10.08 sec)

+---------------------------+
| SLEEP(p_seconds_to_sleep) |
+---------------------------+
|                         0 |
+---------------------------+
1 row in set (15.09 sec)

+--------------------------------------------------------+
| CONCAT('Dropping partition: ', current_partition_name) |
+--------------------------------------------------------+
| Dropping partition: p201701                            |
+--------------------------------------------------------+
1 row in set (15.09 sec)

+--------------------------------------------------------------------------------------------------------------------------------------------------------------+
| CONCAT('Reorganizing first and second partitions. first partition date = ', @new_first_partition_date, ', second partition date = ', @second_partition_date) |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Reorganizing first and second partitions. first partition date = 2017-02-01 00:00:00, second partition date = 2017-03-01 00:00:00                            |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (15.11 sec)

+--------------------------------------------------------+
| CONCAT('Creating partition: ', @partition_name_to_add) |
+--------------------------------------------------------+
| Creating partition: p201709                            |
+--------------------------------------------------------+
1 row in set (15.18 sec)

+--------------------------------------------------------+
| CONCAT('Creating partition: ', @partition_name_to_add) |
+--------------------------------------------------------+
| Creating partition: p201710                            |
+--------------------------------------------------------+
1 row in set (15.18 sec)

+--------------------------------------------------------+
| CONCAT('Creating partition: ', @partition_name_to_add) |
+--------------------------------------------------------+
| Creating partition: p201711                            |
+--------------------------------------------------------+
1 row in set (15.18 sec)

Query OK, 0 rows affected (15.28 sec)

MariaDB [db1]> CALL db1.perform_partition_maintenance('db1', 'quarterly_report_status', 3, 6, 5);
+--------------------------------------------------------------------------+
| CONCAT('Dropping all partitions before: ', @last_partition_name_to_keep) |
+--------------------------------------------------------------------------+
| Dropping all partitions before: p201702                                  |
+--------------------------------------------------------------------------+
1 row in set (0.01 sec)

Query OK, 0 rows affected (0.02 sec)

As we can see from the above output, the procedure did not perform any work the second time.

Event definition

We want our stored procedure to run automatically every month, so we can use an event to do that. Before testing the event, we need to do two things:

  • We need to recreate the table with the original definition, so that it has all of the original partitions.
  • We need to ensure that event_scheduler=ON is set, and if not, we need to set it.

MariaDB [(none)]> SHOW GLOBAL VARIABLES LIKE 'event_scheduler';
+-----------------+-------+
| Variable_name   | Value |
+-----------------+-------+
| event_scheduler | OFF   |
+-----------------+-------+
1 row in set (0.00 sec)

MariaDB [(none)]> SET GLOBAL event_scheduler=ON;
Query OK, 0 rows affected (0.00 sec)

MariaDB [(none)]> SHOW GLOBAL VARIABLES LIKE 'event_scheduler';
+-----------------+-------+
| Variable_name   | Value |
+-----------------+-------+
| event_scheduler | ON    |
+-----------------+-------+
1 row in set (0.00 sec)
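
Keep in mind that SET GLOBAL only affects the running server. To make sure the event scheduler is still enabled after a restart, you can also set it in the server configuration file, for example:

[mysqld]
event_scheduler=ON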

And then we can run the following:

DROP EVENT db1.monthly_perform_partition_maintenance_event;

CREATE EVENT db1.monthly_perform_partition_maintenance_event
   ON SCHEDULE
   EVERY 1 MONTH
   STARTS NOW()
DO
   CALL db1.perform_partition_maintenance('db1', 'quarterly_report_status', 3, 6, 5);

However, there is another improvement we can make here. It might not be ideal to only run the procedure once per month, because if the procedure fails for whatever reason, then it might not get another chance to run again until the next month. For that reason, it might be better to run the procedure more often, such as once per day. As mentioned above, the procedure will only do work when partition maintenance is actually necessary, so it should not cause any issues to execute the procedure more often.

If we wanted to run the procedure once per day, then the event definition would become:

DROP EVENT db1.monthly_perform_partition_maintenance_event;

CREATE EVENT db1.monthly_perform_partition_maintenance_event
   ON SCHEDULE
   EVERY 1 DAY
   STARTS NOW()
DO
   CALL db1.perform_partition_maintenance('db1', 'quarterly_report_status', 3, 6, 5);

Conclusion

Thanks to the flexibility of stored procedures and events, it is relatively easy to automatically perform partition maintenance in MariaDB. Has anyone else implemented something like this?


Shlomi Noach

Tue, 09/05/2017 - 13:49

Partition management via common_schema

A shameless plug for common_schema's `sql_range_partitions` view, which generates the correct `DROP` statements for purging old partitions, as well as `CREATE` statements for creating the next partitions: sql_range_partitions

Also see Ike Walker's blog post on this topic: http://mechanics.flite.com/blog/2016/03/28/simplifying-mysql-partition-management-using-common-schema/


by geoff_montee_g at September 05, 2017 06:30 PM

Peter Zaitsev

Webinar Wednesday, September 6, 2017: Percona Roadmap and Software News Update – Q3 2017

Percona Roadmap

Come and listen to Percona CEO Peter Zaitsev on Wednesday, September 6, 2017 at 10am PT / 1pm ET (UTC-7) as he discusses the Percona roadmap, as well as what’s new in Percona open source software.

 

During this webinar Peter will talk about newly released features in Percona software, show a few quick demos and share with you highlights from the Percona open source software roadmap. This discussion will cover Percona Server for MySQL and MongoDB, Percona XtraBackup, Percona Toolkit, Percona XtraDB Cluster and Percona Monitoring and Management.

Peter will also talk about new developments in Percona commercial services and finish with a Q&A.

Register now before seats fill up for this exciting webinar on Wednesday, September 6, 2017 at 10am PT / 1pm ET (UTC-7).

Peter Zaitsev, Percona CEO and Co-Founder

Peter Zaitsev co-founded Percona and assumed the role of CEO in 2006. As one of the foremost experts on MySQL strategy and optimization, Peter leveraged both his technical vision and entrepreneurial skills to grow Percona from a two-person shop to one of the most respected open source companies in the business. With over 140 professionals in 30+ countries, Peter’s venture now serves over 3000 customers – including the “who’s who” of internet giants, large enterprises and many exciting startups. Percona was named to the Inc. 5000 in 2013, 2014, 2015 and 2016. Peter was an early employee at MySQL AB, eventually leading the company’s High Performance Group. A serial entrepreneur, Peter co-founded his first startup while attending Moscow State University, where he majored in Computer Science. Peter is a co-author of High Performance MySQL: Optimization, Backups, and Replication, one of the most popular books on MySQL performance. Peter frequently speaks as an expert lecturer at MySQL and related conferences, and regularly posts on the Percona Database Performance Blog. Fortune and DZone have both tapped Peter as a contributor, and his recent ebook Practical MySQL Performance Optimization is one of percona.com’s most popular downloads.

by Emily Ikuta at September 05, 2017 05:15 PM

Jean-Jerome Schmidt

Now live! MySQL on Docker: Understanding the Basics - The Webinar

We’re excited to announce the live version of our blog ‘MySQL on Docker: Understanding the Basics’, which will be presented by its author, our colleague Ashraf Sharif, on September 27th during this new webinar.

With 100K+ views to date, ‘MySQL on Docker: Understanding the Basics’ has become a popular go-to resource for MySQL users worldwide who are looking to get an initial understanding of and start experimenting with Docker.

So if you’re looking to take your first steps with Docker for MySQL, then this webinar is made for you :-)

Docker is quickly becoming mainstream as a method to package and deploy self-sufficient applications in primarily stateless Linux containers. This could be a challenge though for a stateful service like a database. As a database user you might be asking yourself: How do I best configure MySQL in a container environment? What can go wrong? Should I even run my databases in a container environment? How does performance compare with e.g. running on virtual machines or bare-metal servers? How do I manage replicated or clustered setups, where multiple containers need to be created, upgraded and made highly available? We’ll look at helping you answer these questions with our new Docker webinar series.

Check out the agenda and sign up for its first installment below; we look forward to “seeing” you there.

Date, Time & Registration

Europe/MEA/APAC

Wednesday, September 27th at 09:00 BST / 10:00 CEST (Germany, France, Sweden)

Register Now

North America/LatAm

Wednesday, September 27th at 09:00 PST (US) / 12:00 EST (US)

Register Now

Agenda

This webinar is for MySQL users who are Docker beginners and who would like to understand the basics of running a MySQL container on Docker. We are going to cover:

  • Docker and its components
  • Concept and terminology
  • How a Docker container works
  • Advantages and disadvantages
  • Stateless vs stateful
  • Docker images for MySQL
  • Running a simple MySQL container
  • The ClusterControl Docker image
  • Severalnines on Docker Hub

Speaker

Ashraf Sharif is a System Support Engineer at Severalnines. He was previously involved in the hosting world and the LAMP stack, where he worked as a principal consultant and head of a support team, delivering clustering solutions for large websites in the South East Asia region. His professional interests are system scalability and high availability.

by jj at September 05, 2017 02:54 PM

September 03, 2017

Shlomi Noach

Speaking at August Penguin, MySQL Track, GitHub sponsored

This Thursday I'll be presenting at August Penguin, conveniently taking place September 7th and 8th in Ramat Gan, Israel.

I will be speaking as part of the MySQL track, 2nd half of Thursday. The (Hebrew) schedule is here.

My talk is titled Reliable failovers, safe schema migrations: open source solutions to MySQL problems. I will describe some of the open source MySQL infrastructure work we run at GitHub, and how it addresses reliability, availability and usability. I'll describe some of our internal workflows and our use of chat and chatops.

I'm proud to announce GitHub sponsors the event. We won't have a booth, but please do grab me in the hallways or over lunch to chat!

And, yes, octocat stickers will be made available 🙂

 

by shlomi at September 03, 2017 10:42 AM

September 02, 2017

Valeriy Kravchuk

MySQL Support Engineer's Chronicles, Issue #8

This week is special and full of anniversaries for me. This week 5 years ago I left Oracle behind and joined Percona... The same week 5 years ago I wrote something about MySQL in this blog for the first time in my life. 5 years ago I created my Facebook account that I actively (ab)use for discussing work-related issues. So, to summarize, it's a five-year anniversary of my coming out as a MySQL Entomologist, somebody who writes and speaks about MySQL and bugs in MySQL in public! These 5 years were mostly awesome.

I decided to celebrate with yet another post in this relatively new series and briefly summarize what interesting things I studied, noticed or had to work on this week while providing support to customers using all kinds of MySQL.

This week started for me with the need to find out why mariabackup fails on Windows for one of our customers. If you missed it, MariaDB created a tool for online backup based on Percona's XtraBackup that supports a few additional features (like data-at-rest encryption of MariaDB Server) and works on Windows as well, included it in MariaDB Server, and even declared it "Stable" as of MariaDB 10.1.26. In the process of working on that problem I had to use the procmon tool, based on this KB article. The root cause of the problem was NTFS compression used for the target directory (see MDEV-13691 by Vladislav Vaintroub, who forces lazy me to improve my rudimentary Windows user skills from time to time, for some related details). So, better not to use NTFS compression for the backup destination if you need to back up big enough (50G+) tables. I really enjoyed working with procmon, which helped to find out what could cause (somewhat random) "error 22" failures.

I was (positively) surprised to find out that there is a MariaDB KB article on such a specific topic as troubleshooting on Windows. Besides this one, I had to use the following KB articles while working with customers this week:
and found something new (for me) there. I had never previously cared to find out what join_cache_level is used for, for example.

Besides mariabackup, this week I had to discuss various problems related to backing up TokuDB tables, so you should expect my blog posts related to this topic soon.

My colleague Geoff Montee published a useful post this week, "Automatically Dropping Old Partitions in MySQL and MariaDB". Make sure to check the comments and his follow-up. Geoff also reported the nice Bug #87589 - "Documentation incorrectly states that LOAD DATA LOCAL INFILE does not use tmpdir", which is still "Open" for some reason.

During such a great week I had to report a MySQL bug of my own, so I did. Check Bug #87624 - "Wrong statement-based binary logging for CONCAT() in stored procedure". I've seen all these "Illegal mix of collations..." errors way too often over the years.

Other bugs that attracted my attention were:
  • Bug #84108 - "Mysql 5.7.14 hangs and kills itself during drop database statement". This bug should probably become "Open" again and be properly processed, as its current "Duplicate" status is questionable at best. Arjen Lentz drew my (and not only my) attention to this old and improperly handled bug report.
  • Bug #87619 - "InnoDB partition table will lock into the near record as a condition in the use". Nice to see this regression bug "Verified" after all. It seems native partitioning in MySQL 5.7 came at the cost of some extra/different locking.
Time to stop writing and prepare for further celebrations, fun and anniversaries. Stay tuned!

by Valeriy Kravchuk (noreply@blogger.com) at September 02, 2017 03:58 PM

September 01, 2017

Peter Zaitsev

How Life360 Used ProxySQL to Lower Its Database Load

ProxySQL

In this blog post, we’ll look at how to use ProxySQL to reduce database load by handling PINGs.

I’ve blogged before about one of our regular clients, Life360. One of the issues they recently had was the PING command taking about 30%-40% of total queries per second across their database infrastructure. This is a non-trivial amount and was easily tens of thousands of pings per second. This added a significant amount of latency to real queries.

The large number of pings was due to the use of PHP PDO with persistent connections. Persistence, or pooling, is necessary to reduce the time spent on connecting, disconnecting and reconnecting.

Unfortunately, in PHP (and other) implementations, the driver checks if the database is still alive with a PING before sending the actual command. Logic dictates that you could use the actual command as the PING, and if it fails it could return the same error it would have if the ping itself failed. Baron Schwartz has a lot to say about how unwise the use of a PING is within the drivers.

Barring rewriting PHP PDO, we thought up another solution: ProxySQL.

Before testing ProxySQL, we didn’t know how much gets forwarded to the actual hosts (including these com_admin commands). We wanted to run a quick PoC to discover what the actual behavior of ProxySQL was with respect to these commands. We had two hypotheses that we wanted to check:

  1. ProxySQL forwards everything including com_commands
  2. ProxySQL responds to the com_commands itself

In the event that ProxySQL forwarded everything, we set up a “decoy” MySQL instance to respond to the pings using ProxySQL query filtering. As it turned out, we found that ProxySQL quickly and silently replies to PINGs and doesn’t forward them on to the underlying backend database servers. This is the case for other commands as well, as ProxySQL isn’t strictly a forwarding proxy (but more of a reverse proxy).
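A quick way to confirm this behavior on your own setup is to watch the backend MySQL servers directly: protocol-level pings are counted there (together with a few other admin commands) under the Com_admin_commands status counter. A rough sketch: if the counter stops growing once the application connects through ProxySQL while the application keeps “pinging”, ProxySQL is absorbing the pings.

-- run this on a backend MySQL server, before and after switching the app to ProxySQL
SHOW GLOBAL STATUS LIKE 'Com_admin_commands';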

By placing ProxySQL on the application servers, Life360 was able to reduce QPS significantly. The other advantage of introducing ProxySQL is that it does connection pooling and multiplexing for you.

Here is the graph of com_ping on the day of the deployment:

ProxySQL

Overall, we are talking about going from hundreds of millions of pings per day down to zero.

We can see that the vanilla install of ProxySQL also reduced active threads significantly:

ProxySQL

This change has enabled Life360 to put off some of their scaling plans and provided other operational gains.

In conclusion, you can use ProxySQL as a simple (or advanced) firewall between your application and database. It consumes very few resources, but provides an immense performance gain.

While ProxySQL can also be used as a more advanced firewall, those features (and the very specific ways to configure them) are beyond the scope of this post. We plan to blog more on this soon.

by Manjot Singh at September 01, 2017 07:43 PM

This Week in Data with Colin Charles #4: Percona Server for MySQL with MyRocks

Colin Charles

Join Percona Chief Evangelist Colin Charles as he covers happenings, gives pointers and provides musings on the open source database community.

Percona Live Europe Dublin

Have you registered for Percona Live Europe Dublin? We’ve more or less finalized the schedule, and the conference grid looks 100% full. We’re four weeks away, so I suggest you register ASAP!

I should also mention that no event can be pulled off without sponsors, so thank you sponsors of Percona Live Europe 2017. I sincerely hope to see more sign up. Feel free to ask me more about it, or just check out our sponsor prospectus.

Releases

  • MariaDB/MySQL Replication Manager 1.1.1 release. There was recently a talk accepted at Percona Live Europe 2017 that referenced “MRM”. I was asked about it, and I think this tool needs more marketing! MRM is a high availability solution to manage MariaDB 10.x and MySQL and Percona Server for MySQL 5.7 GTID replication topologies. It has a new 1.1.1 release that provides improvements for MariaDB Server and MariaDB MaxScale (this tool itself gained MySQL GTID support back in April 2017). Do you use MRM?
  • Percona Server 5.7.19-17 is now released! Why is this exciting? Because it comes with the MyRocks storage engine! Yes, the engine is experimental, and no, it isn’t recommended for production – but why not get started with the MyRocks Introduction? I tried the installation guide and got everything started very quickly. Read about the current limitations and differences between Percona MyRocks and Facebook MyRocks (considering you’ll really want to use MyRocks in a shipping release – Facebook’s MyRocks requires compiling their tree, and this is really not the recommended way to get going!).

Link List

Upcoming Appearances

Percona’s web site tracks community events, so check that out and see where to listen to Perconians speak. My upcoming appearances are:

  1. db tech showcase Tokyo 2017. 5-7 September 2017, Tokyo, Japan
  2. Open Source Summit North America. 11-14 September 2017, Los Angeles, CA, USA
  3. Percona Live Europe Dublin. 25-27 September 2017, Dublin, Ireland
  4. Velocity Europe. 17-20 October 2017, London, UK
  5. Open Source Summit Europe. 23-26 October 2017, Prague, Czech Republic

I’ve been spending time on writing my db tech showcase talk. Will you be in Tokyo, Japan next week? Want to meet up? Don’t hesitate to drop me an email: colin.charles@percona.com.

Feedback

bet365 has purchased Basho's assets. The good news for Riak users? “It is our intention to open source all of Basho’s products and all of the source code that they have been working on.” The Register covers this, too.

I look forward to feedback/tips via e-mail at colin.charles@percona.com, or on Twitter @bytebot.

by Colin Charles at September 01, 2017 06:14 PM

August 31, 2017

Peter Zaitsev

Percona Live Europe Featured Talks: Orchestrating ProxySQL with Orchestrator and Consul with Avraham Apelbaum


Welcome to another post in our series of interview blogs for the upcoming Percona Live Europe 2017 in Dublin. This series highlights a number of talks that will be at the conference and gives a short preview of what attendees can expect to learn from the presenter.

This blog post is with Avraham Apelbaum, DBA and DevOps at Wix.com. His talk is titled Orchestrating ProxySQL with Orchestrator and Consul. The combination of ProxySQL and Orchestrator solves many problems, but it still requires some manual labor when the configuration changes, for example as the result of a network split (and other scenarios). In our conversation, we discussed using Consul to solve some of these issues:

Percona: How did you get into database technology? What do you love about it?

Avraham: On my first day as a soldier in a technology unit of the IDF, I received a HUGE Oracle 8 book and a very low-level design of a DB-based system. “You have one month,” they told me. I finished it all within ten days. Before that, I didn’t even know what a DB was. Today, I’m at Wix managing hundreds of databases that support 100M users!

Percona: You’re presenting a session called “Orchestrating ProxySQL with Orchestrator and Consul”. How do these technologies work together to help provide a high availability solution?

Avraham: ProxySQL is supposed to help you out with high availability (HA) and disaster recovery (DR) for MySQL servers, but it still requires some manual labor when the configuration changes – as a result of a network split, for example. Somehow all ProxySQL servers need to get the new MySQL cluster topology. So to automate all that, I added two more parts: a Consul KV store and a Consul template, which are responsible for updating ProxySQL on every architecture change in the MySQL cluster.

Percona: What is special about this combination of products that works better than other solutions? Is it right all the time, or does it depend on the workload?

Avraham: As a DevOps engineer, I prefer not to do anything manually. What’s more, no one wants to wake up in the middle of the night just because any one of our DB servers can fail. Almost everyone, I guess, will have more than one ProxySQL server in their system at some point, so this solution can help them use ProxySQL and Orchestrator.

Percona: What do you want attendees to take away from your session? Why should they attend?

Avraham: I am hoping to help people automate their HA and DR solutions. If, as a result of my talk, someone shaves off even one minute of downtime, I’ll be happy.

Percona: What are you most looking forward to at Percona Live Europe 2017?

Avraham: In the DevOps and open source world, it’s all about sharing ideas. It was actually when I attended the talks by ProxySQL and Orchestrator’s creators that I thought of putting it all together to solve our own problem. So I am looking forward to sharing my idea with others, and getting input from the audience so that everyone can benefit.

Want to find out more about Avraham, ProxySQL and Orchestrator? Register for Percona Live Europe 2017, and see his talk Orchestrating ProxySQL with Orchestrator and Consul. Register now to get the best price! Use discount code SeeMeSpeakPLE17 to get 10% off your registration.

Percona Live Open Source Database Conference Europe 2017 in Dublin is the premier European open source event for the data performance ecosystem. It is the place to be for the open source community as well as businesses that thrive in the MySQL, MariaDB, MongoDB, time series database, cloud, big data and Internet of Things (IoT) marketplaces. Attendees include DBAs, sysadmins, developers, architects, CTOs, CEOs, and vendors from around the world.

The Percona Live Open Source Database Conference Europe will be September 25-27, 2017 at the Radisson Blu Royal Hotel, Dublin.

by Dave Avery at August 31, 2017 07:17 PM

Percona Server for MySQL 5.7.19-17 Is Now Available


Percona announces the release of Percona Server for MySQL 5.7.19-17 on August 31, 2017. Download the latest version from the Percona web site or the Percona Software Repositories. You can also run Docker containers from the images in the Docker Hub repository.

Based on MySQL 5.7.19, and including all the bug fixes in it, Percona Server for MySQL 5.7.19-17 is now the current GA release in the Percona Server for MySQL 5.7 series. Percona Server for MySQL is open-source and free – this is the latest release of our enhanced, drop-in replacement for MySQL. Complete details of this release are available in the 5.7.19-17 milestone on Launchpad.

NOTE: Percona software no longer supports Red Hat Enterprise Linux 5 (including CentOS 5 and other derivatives), Ubuntu 12.04 and older versions. These platforms have reached end of life, won’t be updated and are not recommended for use in production.

 

New Features

  • Included the Percona MyRocks storage engine

    NOTE: MyRocks for Percona Server is currently experimental and not recommended for production deployments until further notice. You are encouraged to try it in a testing environment and provide feedback or report bugs.

  • #1708087: Added the mysql-helpers script to handle checking for missing datadir during startup. Also fixes #1635364.

Platform Support

  • Stopped providing packages for Ubuntu 12.04 due to its end of life.

Bugs Fixed

  • #1669414: Fixed handling of failure to set O_DIRECT on parallel doublewrite buffer file.
  • #1705729: Fixed the postinst script to correctly locate the datadir. Also fixes #1698019.
  • #1709811: Fixed yum upgrade to not enable the mysqld service if it was disabled before the upgrade.
  • #1709834: Fixed the mysqld_safe script to correctly locate the basedir.
  • Other fixes: #1698996, #1706055, #1706262, #1706981

TokuDB Changes

  • TDB-70: Removed a redundant fsync of the TokuDB redo log during the binlog group commit flush stage. This fixes an issue that prevented TokuDB from running in reduced durability mode when the binlog was enabled.
  • TDB-72: Fixed an issue with renaming a table with non-alphanumeric characters in its name.

Release notes for Percona Server for MySQL 5.7.19-17 are available in the online documentation. Please report any bugs on the launchpad bug tracker.

by Alexey Zhebel at August 31, 2017 06:25 PM

August 30, 2017

Peter Zaitsev

Nested Data Structures in ClickHouse

Nested Data Structures

In this blog post, we’ll look at nested data structures in ClickHouse and how this can be used with PMM to look at queries.

Nested structures are not common in Relational Database Management Systems. Usually, it’s just flat tables. Sometimes it would be convenient to store unstructured information in structured databases.

We are working to adapt ClickHouse as a long term storage for Percona Monitoring and Management (PMM), and particularly to store detailed information about queries. One of the problems we are trying to solve is to count the different errors that cause a particular query to fail.

For example, for date 2017-08-17 the query:

"SELECT foo FROM bar WHERE id=?"

was executed 1000 times. 25 times it failed with error code “1212”, and eight times it failed with error code “1250”. Of course, the traditional way to store this in relational data would be to have a table "Date, QueryID, ErrorCode, ErrorCnt" and then perform a JOIN to this table. Unfortunately, columnar databases don’t perform well with multiple joins, and often the recommendation is to have de-normalized tables.

We can create a column for each possible ErrorCode, but this is not an optimal solution. There could be thousands of them, and most of the time they would be empty.

In this case, ClickHouse proposes Nested data structures. For our case, these can be defined as:

CREATE TABLE queries
(
    Period Date,
    QueryID UInt32,
    Fingerprint String,
    Errors Nested
    (
        ErrorCode String,
        ErrorCnt UInt32
    )
)Engine=MergeTree(Period,QueryID,8192);

This solution has obvious questions: How do we insert data into this table? How do we extract it?

Let’s start with INSERT. Insert can look like:

INSERT INTO queries VALUES ('2017-08-17',5,'SELECT foo FROM bar WHERE id=?',['1220','1230','1212'],[5,6,2])

which means that the inserted query during 2017-08-17 gave error 1220 five times, error 1230 six times and error 1212 two times.

Now, during a different date, it might produce different errors:

INSERT INTO queries VALUES ('2017-08-18',5,'SELECT foo FROM bar WHERE id=?',['1220','1240','1260'],[3,16,12])

Let’s take a look at ways to SELECT data. A very basic SELECT:

SELECT *
FROM queries
┌─────Period─┬─QueryID─┬─Fingerprint─┬─Errors.ErrorCode───────┬─Errors.ErrorCnt─┐
│ 2017-08-17 │       5 │ SELECT foo  │ ['1220','1230','1212'] │ [5,6,2]         │
│ 2017-08-18 │       5 │ SELECT foo  │ ['1220','1240','1260'] │ [3,16,12]       │
└────────────┴─────────┴─────────────┴────────────────────────┴─────────────────┘

If we want to use a more familiar tabular output, we can use the ARRAY JOIN extension:

SELECT *
FROM queries
ARRAY JOIN Errors
┌─────Period─┬─QueryID─┬─Fingerprint─┬─Errors.ErrorCode─┬─Errors.ErrorCnt─┐
│ 2017-08-17 │       5 │ SELECT foo  │ 1220             │            5    │
│ 2017-08-17 │       5 │ SELECT foo  │ 1230             │            6    │
│ 2017-08-17 │       5 │ SELECT foo  │ 1212             │            2    │
│ 2017-08-18 │       5 │ SELECT foo  │ 1220             │            3    │
│ 2017-08-18 │       5 │ SELECT foo  │ 1240             │           16    │
│ 2017-08-18 │       5 │ SELECT foo  │ 1260             │           12    │
└────────────┴─────────┴─────────────┴──────────────────┴─────────────────┘

However, usually we want to see the aggregation over multiple periods, which can be done with traditional aggregation functions:

SELECT
    QueryID,
    Errors.ErrorCode,
    SUM(Errors.ErrorCnt)
FROM queries
ARRAY JOIN Errors
GROUP BY
    QueryID,
    Errors.ErrorCode
┌─QueryID─┬─Errors.ErrorCode─┬─SUM(Errors.ErrorCnt)─┐
│       5 │ 1212             │                 2    │
│       5 │ 1230             │                 6    │
│       5 │ 1260             │                12    │
│       5 │ 1240             │                16    │
│       5 │ 1220             │                 8    │
└─────────┴──────────────────┴──────────────────────┘
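Filtering on the nested values works the same way once ARRAY JOIN has unfolded them; for example, a quick sketch of pulling only the rows for one particular error code:

SELECT
    Period,
    QueryID,
    Errors.ErrorCnt
FROM queries
ARRAY JOIN Errors
WHERE Errors.ErrorCode = '1220'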

If we want to get really creative and return only one row per QueryID, we can do that as well:

SELECT
    QueryID,
    groupArray((ecode, cnt))
FROM
(
    SELECT
        QueryID,
        ecode,
        sum(ecnt) AS cnt
    FROM queries
    ARRAY JOIN
        Errors.ErrorCode AS ecode,
        Errors.ErrorCnt AS ecnt
    GROUP BY
        QueryID,
        ecode
)
GROUP BY QueryID
┌─QueryID─┬─groupArray(tuple(ecode, cnt))──────────────────────────────┐
│       5 │ [('1230',6),('1212',2),('1260',12),('1220',8),('1240',16)] │
└─────────┴────────────────────────────────────────────────────────────┘

Conclusion

ClickHouse provides flexible ways to store data in a less structured manner and a variety of functions to extract and aggregate it – despite being a columnar database.

Happy data warehousing!

by Vadim Tkachenko at August 30, 2017 06:21 PM

Jean-Jerome Schmidt

Multiple Data Center Setups Using Galera Cluster for MySQL or MariaDB

Building high availability, one step at a time

When it comes to database infrastructure, we all want it to be highly available. We all strive to build a highly available setup. Redundancy is the key. We start to implement redundancy at the lowest level and continue up the stack. It starts with hardware: redundant power supplies, redundant cooling, hot-swap disks. At the network layer: multiple NICs bonded together and connected to different switches that use redundant routers. For storage we use disks set up in RAID, which gives better performance as well as redundancy. Then, on the software level, we use clustering technologies: multiple database nodes working together to implement redundancy, such as MySQL Cluster or Galera Cluster.

All of this is no good if you have everything in a single datacenter: when a datacenter goes down, or some important services go offline, or even if you just lose connectivity to the datacenter, your service will go down, no matter the amount of redundancy in the lower levels. And yes, those things happen.

  • S3 service disruption wreaked havoc in US-East-1 region in February, 2017
  • EC2 and RDS Service Disruption in US-East region in April, 2011
  • EC2, EBS and RDS were disrupted in EU-West region in August, 2011
  • Power outage brought down Rackspace Texas DC in June, 2009
  • UPS failure caused hundreds of servers to go offline in Rackspace London DC in January, 2010

This is by no means a complete list of failures; it’s just the result of a quick Google search. These serve as examples that things may and will go wrong if you put all your eggs into the same basket. One more example would be Hurricane Sandy, which caused an enormous exodus of data from US-East to US-West DCs: at that time you could hardly spin up instances in US-West, as everyone rushed to move their infrastructure to the other coast in the expectation that the North Virginia DC would be seriously affected by the weather.

So, multi-datacenter setups are a must if you want to build a high availability environment. In this blog post, we will discuss how to build such infrastructure using Galera Cluster for MySQL/MariaDB.

Galera concepts

Before we look into particular solutions, let us spend some time explaining two concepts which are very important in highly available, multi-DC Galera setups.

Quorum

High availability requires resources - namely, you need a number of nodes in the cluster to make it highly available. A cluster can tolerate the loss of some of its members, but only to a certain extent. Beyond a certain failure rate, you might be looking at a split-brain scenario.

Let’s take an example with a two-node setup. If one of the nodes goes down, how can the other one know that its peer crashed and that it’s not a network failure? In that case, the other node might as well be up and running, serving traffic. There is no good way to handle such a case… This is why fault tolerance usually starts from three nodes. Galera uses a quorum calculation to determine if it is safe for the cluster to handle traffic, or if it should cease operations. After a failure, all remaining nodes attempt to connect to each other and determine how many of them are up. This is then compared to the previous state of the cluster, and as long as more than 50% of the nodes are up, the cluster can continue to operate.

This results in the following:
2 node cluster - no fault tolerance
3 node cluster - up to 1 crash
4 node cluster - up to 1 crash (if two nodes crashed, only 50% of the cluster would be available; you need more than 50% of the nodes to survive)
5 node cluster - up to 2 crashes
6 node cluster - up to 2 crashes

You probably see the pattern - you want your cluster to have an odd number of nodes - in terms of high availability there’s no point in moving from 5 to 6 nodes in the cluster. If you want better fault tolerance, you should go for 7 nodes.
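On a running node, you can always check how Galera currently sees the cluster using the standard wsrep status counters; for example:

-- number of nodes in the component this node belongs to
SHOW GLOBAL STATUS LIKE 'wsrep_cluster_size';
-- 'Primary' means this component has quorum and can serve traffic
SHOW GLOBAL STATUS LIKE 'wsrep_cluster_status';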

Segments

Typically, in a Galera cluster, all communication follows the all to all pattern. Each node talks to all the other nodes in the cluster.

As you may know, each writeset in Galera has to be certified by all of the nodes in the cluster - therefore every write that happened on a node has to be transferred to all of the nodes in the cluster. This works ok in a low-latency environment. But if we are talking about multi-DC setups, we need to consider much higher latency than in a local network. To make it more bearable in clusters spanning over Wide Area Networks, Galera introduced segments.

They work by containing the Galera traffic within a group of nodes (segment). All nodes within a single segment act as if they were in a local network - they assume one to all communication. For cross-segment traffic, things are different - in each of the segments, one “relay” node is chosen, all of the cross-segment traffic goes through those nodes. When a relay node goes down, another node is elected. This does not reduce latency by much - after all, WAN latency will stay the same no matter if you make a connection to one remote host or to multiple remote hosts, but given that WAN links tend to be limited in bandwidth and there might be a charge for the amount of data transferred, such approach allows you to limit the amount of data exchanged between segments. Another time and cost-saving option is the fact that nodes in the same segment are prioritized when a donor is needed - again, this limits the amount of data transferred over the WAN and, most likely, speeds up SST as a local network almost always will be faster than a WAN link.
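Segment membership is assigned per node through the Galera provider options. A minimal sketch for a node in the second datacenter might look like this (keep in mind that all provider options have to be combined into the single wsrep_provider_options string, together with whatever else you already set there):

[mysqld]
# 0 is the default segment; give nodes in the second datacenter their own segment id
wsrep_provider_options="gmcast.segment=1"

With segments assigned this way, Galera routes cross-DC replication traffic through one relay node per segment, as described above.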


Galera in multi-DC setups

Now that we’ve got some of these concepts out of the way, let’s look at some other important aspects of multi-DC setups for Galera cluster.

Issues you are about to face

When working in environments spanning across a WAN, there are a couple of issues you need to take into consideration when designing your environment.

Quorum calculation

In the previous section, we described what a quorum calculation looks like in a Galera cluster - in short, you want to have an odd number of nodes to maximize survivability. All of that is still true in multi-DC setups, but some more elements are added into the mix. First of all, you need to decide if you want Galera to handle a datacenter failure automatically. This will determine how many datacenters you are going to use. Let’s imagine two DCs: if you split your nodes 50/50 and one datacenter goes down, the remaining one doesn’t have 50%+1 nodes to maintain its “primary” state. If you split your nodes unevenly, keeping the majority of them in the “main” datacenter, then when that datacenter goes down, the “backup” DC won’t have 50%+1 nodes to form a quorum. You can assign different weights to nodes, but the result will be exactly the same - there’s no way to automatically fail over between two DCs without manual intervention. To implement automated failover, you need more than two DCs. Again, ideally an odd number - three datacenters is a perfectly fine setup. Next, the question is: how many nodes do you need? You want them evenly distributed across the datacenters. The rest is just a matter of how many failed nodes your setup has to handle.
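As an aside, the node weights mentioned above are also just a provider option; a sketch (again, merged into the one wsrep_provider_options string):

[mysqld]
# count this node as two votes in quorum calculations (the default weight is 1)
wsrep_provider_options="pc.weight=2"

As explained above, though, weights alone won't buy you automated failover with only two datacenters.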

Minimal setup will use one node per datacenter - it has serious drawbacks, though. Every state transfer will require moving data across the WAN and this results in either longer time needed to complete SST or higher costs.

A quite typical setup is to have six nodes, two per datacenter. This setup seems unexpected as it has an even number of nodes. But, when you think of it, it might not be that big of an issue: it’s quite unlikely that three nodes will go down at once, and such a setup will survive a crash of up to two nodes. A whole datacenter may go offline and the two remaining DCs will continue operations. It also has a huge advantage over the minimal setup - when a node goes offline, there’s always a second node in the datacenter which can serve as a donor. Most of the time, the WAN won’t be used for SST.

Of course, you can increase the number of nodes to three per datacenter, nine in total. This gives you even better survivability: up to four nodes may crash and the cluster will still survive. On the other hand, you have to keep in mind that, even with the use of segments, more nodes means higher overhead of operations and you can scale out Galera cluster only to a certain extent.

It may happen that there’s no need for a third datacenter because, let’s say, your application is located in only two of them. Of course, the requirement of three datacenters is still valid, so you can’t get around it, but it is perfectly fine to use a Galera Arbitrator (garbd) instead of fully loaded database servers.

Garbd can be installed on smaller nodes, even virtual servers. It does not require powerful hardware, and it does not store any data nor apply any of the writesets. But it does see all the replication traffic, and it takes part in the quorum calculation. Thanks to it, you can deploy setups like four nodes, two per DC, plus garbd in the third one: you have five nodes in total, and such a cluster can tolerate up to two failures. That means it can survive a full shutdown of one of the datacenters.
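Starting garbd is straightforward; a minimal sketch (host names, ports and the group name here are made up):

# join the cluster group as an arbitrator, placed in its own segment
garbd --group=my_galera_cluster \
      --address="gcomm://db1.dc1:4567,db2.dc1:4567,db1.dc2:4567,db2.dc2:4567" \
      --options="gmcast.segment=2" \
      --daemon

The group name has to match the wsrep_cluster_name of the cluster it is joining.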

Which option is better for you? There is no best solution for all cases, it all depends on your infrastructure requirements. Luckily, there are different options to pick from: more or less nodes, full 3 DC or 2 DC and garbd in the third one - it’s quite likely you’ll find something suitable for you.

Network latency

When working with multi-DC setups, you have to keep in mind that network latency will be significantly higher than what you’d expect from a local network environment. This may seriously reduce performance of the Galera cluster when you compare it with a standalone MySQL instance or a MySQL replication setup. The requirement that all of the nodes have to certify a writeset means that all of the nodes have to receive it, no matter how far away they are. With asynchronous replication, there’s no need to wait before a commit. Of course, replication has other issues and drawbacks, but latency is not the major one. The problem is especially visible when your database has hot spots: rows which are frequently updated (counters, queues, etc.). Those rows cannot be updated more often than once per network round trip. For clusters spanning across the globe, this can easily mean that you won’t be able to update a single row more often than 2 - 3 times per second. If this becomes a limitation for you, it may mean that Galera cluster is not a good fit for your particular workload.

Proxy layer in multi-DC Galera cluster

It’s not enough to have Galera cluster spanning across multiple datacenters, you still need your application to access them. One of the popular methods to hide complexity of the database layer from an application is to utilize a proxy. Proxies are used as an entry point to the databases, they track the state of the database nodes and should always direct traffic to only the nodes that are available. In this section, we’ll try to propose a proxy layer design which could be used for a multi-DC Galera cluster. We’ll use ProxySQL, which gives you quite a bit of flexibility in handling database nodes, but you can use another proxy, as long as it can track the state of Galera nodes.

Where to locate the proxies?

In short, there are two common patterns here: you can either deploy ProxySQL on separate nodes or deploy it on the application hosts. Let’s take a look at the pros and cons of each of these setups.

Proxy layer as a separate set of hosts

The first pattern is to build a proxy layer using separate, dedicated hosts. You can deploy ProxySQL on a couple of hosts, and use Virtual IP and keepalived to maintain high availability. An application will use the VIP to connect to the database, and the VIP will ensure that requests will always be routed to an available ProxySQL. The main issue with this setup is that you use at most one of the ProxySQL instances - all standby nodes are not used for routing the traffic. This may force you to use more powerful hardware than you’d typically use. On the other hand, it is easier to maintain the setup - you will have to apply configuration changes on all of the ProxySQL nodes, but there will be just a handful of them. You can also utilize ClusterControl’s option to sync the nodes. Such setup will have to be duplicated on every datacenter that you use.
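A minimal keepalived sketch for the VIP part could look like the one below (the interface name, router id and address are assumptions, and the ProxySQL health check is deliberately crude):

vrrp_script chk_proxysql {
    script "killall -0 proxysql"   # crude liveness check for the local ProxySQL process
    interval 2
}

vrrp_instance PROXYSQL_VIP {
    state BACKUP
    interface eth0
    virtual_router_id 51
    priority 100
    virtual_ipaddress {
        10.0.0.100/24
    }
    track_script {
        chk_proxysql
    }
}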

Proxy installed on application instances

Instead of having a separate set of hosts, ProxySQL can also be installed on the application hosts. The application will connect directly to ProxySQL on localhost; it could even use a unix socket to minimize the overhead of a TCP connection. The main advantage of such a setup is that you have a large number of ProxySQL instances, and the load is evenly distributed across them. If one goes down, only that application host will be affected. The remaining nodes will continue to work. The most serious issue to face is configuration management. With a large number of ProxySQL nodes, it is crucial to come up with an automated method of keeping their configurations in sync. You could use ClusterControl, or a configuration management tool like Puppet.

Tuning of Galera in a WAN environment

Galera defaults are designed for local network and if you want to use it in a WAN environment, some tuning is required. Let’s discuss some of the basic tweaks you can make. Please keep in mind that the precise tuning requires production data and traffic - you can’t just make some changes and assume they are good, you should do proper benchmarking.

Operating system configuration

Let’s start with the operating system configuration. Not all of the modifications proposed here are WAN-related, but it’s always good to remind ourselves what is a good starting point for any MySQL installation.

vm.swappiness = 1

Swappiness controls how aggressively the operating system will use swap. It should not be set to zero because, in more recent kernels, that prevents the OS from using swap at all, which may cause serious performance issues.

/sys/block/*/queue/scheduler = deadline/noop

The I/O scheduler for the block device that MySQL uses should be set to either deadline or noop. The exact choice depends on your benchmarks, but both settings should deliver similar performance, better than the default scheduler, CFQ.

For MySQL, you should consider using EXT4 or XFS, depending on the kernel (performance of those filesystems changes from one kernel version to another). Perform some benchmarks to find the better option for you.

In addition to this, you may want to look into sysctl network settings. We will not discuss them in detail (you can find documentation here) but the general idea is to increase buffers, backlogs and timeouts, to make it easier to accommodate stalls and an unstable WAN link.

net.core.optmem_max = 40960
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.core.rmem_default = 16777216
net.core.wmem_default = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 87380 16777216
net.core.netdev_max_backlog = 50000
net.ipv4.tcp_max_syn_backlog = 30000
net.ipv4.tcp_congestion_control = htcp
net.ipv4.tcp_mtu_probing = 1
net.ipv4.tcp_max_tw_buckets = 2000000
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_fin_timeout = 30
net.ipv4.tcp_slow_start_after_idle = 0

In addition to OS tuning, you should consider tweaking Galera's network-related settings.

evs.suspect_timeout
evs.inactive_timeout

You may want to consider changing the default values of these variables. Both timeouts govern how the cluster evicts failed nodes. Suspect timeout takes place when all of the nodes cannot reach the inactive member. Inactive timeout defines a hard limit of how long a node can stay in the cluster if it’s not responding. Usually you’ll find that the default values work well. But in some cases, especially if you run your Galera cluster over a WAN (for example, between AWS regions), increasing those variables may result in more stable performance. We’d suggest setting both of them to PT1M, to make it less likely that WAN link instability will throw a node out of the cluster.

evs.send_window
evs.user_send_window

These variables, evs.send_window and evs.user_send_window, define how many packets can be sent via replication at the same time (evs.send_window) and how many of them may contain data (evs.user_send_window). For high latency connections, it may be worth increasing those values significantly (512 or 1024 for example).

evs.inactive_check_period

The above variable may also be changed. evs.inactive_check_period, by default, is set to one second, which may be too often for a WAN setup. We’d suggest setting it to PT30S.

gcs.fc_factor
gcs.fc_limit

Here we want to minimize the chance that flow control will kick in, therefore we’d suggest setting gcs.fc_factor to 1 and increasing gcs.fc_limit to, for example, 260.

gcs.max_packet_size

As we are working over a WAN link, where latency is significantly higher, we want to increase the size of the packets. A good starting point would be 2097152.
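Pulling the suggestions above together, a starting point for the Galera side could look like the sketch below; all of these go into the single wsrep_provider_options string, merged with whatever you already have there:

[mysqld]
wsrep_provider_options="evs.suspect_timeout=PT1M;evs.inactive_timeout=PT1M;evs.inactive_check_period=PT30S;evs.send_window=512;evs.user_send_window=512;gcs.fc_factor=1;gcs.fc_limit=260;gcs.max_packet_size=2097152"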

As we mentioned earlier, it is virtually impossible to give a simple recipe on how to set these parameters as it depends on too many factors - you will have to do your own benchmarks, using data as close to your production data as possible, before you can say your system is tuned. Having said that, those settings should give you a starting point for the more precise tuning.

That’s it for now. Galera works pretty well in WAN environments, so do give it a try and let us know how you get on.

by krzysztof at August 30, 2017 08:38 AM

August 29, 2017

MariaDB Foundation

MariaDB 10.3.1 now available

The MariaDB project is pleased to announce the availability of MariaDB 10.3.1, the 2nd alpha release in the MariaDB 10.3 series. See the release notes and changelogs for details.

  • Download MariaDB 10.3.1
  • Release Notes
  • Changelog
  • What is MariaDB 10.3?
  • MariaDB APT and YUM Repository Configuration Generator

Thanks, and enjoy MariaDB!

The post MariaDB 10.3.1 now available appeared first on MariaDB.org.

by Ian Gilfillan at August 29, 2017 08:18 PM

Peter Zaitsev

Percona Live Europe Featured Talks: Migrating To and Living on RDS/Aurora with Balazs Pocze


Welcome to another post in our series of interview blogs for the upcoming Percona Live Europe 2017 in Dublin. This series highlights a number of talks that will be at the conference and gives a short preview of what attendees can expect to learn from the presenter.

This blog post is with Balazs Pocze, Senior Datastore Engineer at Gizmodo. His talk is titled Migrating To and Living on RDS/Aurora. Gizmodo migrated their platform (Kinja) from a datacenter-based approach to AWS, including the migration of standalone MySQL hosts to RDS/Aurora. In our conversation, we discussed how they achieved this migration:

Percona: How did you get into database technology? What do you love about it?

Balazs: I worked as an Operations/DevOps guy for years before I started working with databases. I guess it happened because, at the company I worked for at the time, I was the person who dared to deal with the database when something strange happened. Somebody had to hold the hot potato. 😀

I love that being a DBA is like being a bass player in a rock band. When you do your job perfectly, no one ever notices you are there – but the entire show depends on your work.

Percona: You’re presenting a session called “Migrating To and Living on RDS/Aurora”. What reasons were crucial in the decision to migrate to a cloud platform? Performance? Less management? Database demands?

Balazs: Actually, we migrated the entire Kinja (our platform) to the cloud, so migrating the database wasn’t a question even for a second. We moved to the cloud because we didn’t want to deal with hardware anyway; we need flexibility. In the data center days, we had to size the DCs to handle all of our traffic at any given moment. This meant we had to burn a lot of money on underutilized machines. In the cloud, we can spin up machines when we need more computing power. On top of that, our hardware had just gotten old enough that it made sense to consider which was the better idea: buying lots of expensive hardware, keeping it running and dealing with it (the majority of the ops team lives on a different continent than our servers!), or simply migrating everything to the cloud. That was simpler and safer.

But we didn’t just migrate to the cloud, we also migrated to RDS – managed database service instead of servers with a database on them. The reason to start using RDS was that I didn’t want to re-implement all of the automation stacks we had on the data centers. That seemed like too much work with too many points of failure. When I checked how to fix those failure points, the entire project started to look like the Deathstar. The original database stack was growing organically in the given data center scenario, and reimplementing it for the cloud seemed unsafe.

Percona: How smoothly was the transition, and did you hit unexpected complications? How did you overcome them?

Balazs: The transition was smooth and, from our readers’ view, unnoticeable. Since the majority of my talk will be about those complications and the ways we solved them, I think it would be best if I answer this question during my session. 😉

But here’s a non-exhaustive list: we had to switch back from GTID to old-fashioned replication, we had to set up SSL proxies to securely connect the data center and the cloud environment, and afterwards we had to debug a lot of packet loss and TCP overload on the VPN channel. It was fun, actually.

Percona: What do you want attendees to take away from your session? Why should they attend?

Balazs: This session will be about how we had to change our view of the database, and what differences we met in the cloud compared to the hardware world. If somebody plans to migrate to the cloud (especially AWS/RDS), I recommend they check out my talk, because some of the paths we walked down were dead ends. I’ll share what we found, so you don’t have to make the same mistakes we did. It will spare you some time.

Percona: What are you most looking forward to at Percona Live Europe 2017?

Balazs: Three things: hearing about new technologies, learning best practices, and most importantly meeting up with the people I always meet at Percona conferences. There is a really good community with lots of great people. I am always looking forward to seeing them again.

Want to find out more about Balazs and RDS migration? Register for Percona Live Europe 2017, and see his talk Migrating To and Living on RDS/Aurora. Register now to get the best price! Use discount code SeeMeSpeakPLE17 to get 10% off your registration.

Percona Live Open Source Database Conference Europe 2017 in Dublin is the premier European open source event for the data performance ecosystem. It is the place to be for the open source community as well as businesses that thrive in the MySQL, MariaDB, MongoDB, time series database, cloud, big data and Internet of Things (IoT) marketplaces. Attendees include DBAs, sysadmins, developers, architects, CTOs, CEOs, and vendors from around the world.

The Percona Live Open Source Database Conference Europe will be September 25-27, 2017 at the Radisson Blu Royal Hotel, Dublin.

by Dave Avery at August 29, 2017 08:15 PM

MariaDB AB

MariaDB Server 10.3.1 Alpha available


The MariaDB project is pleased to announce the immediate availability of MariaDB Server 10.3.1. See the release notes and changelog for details and visit mariadb.com/downloads to download.

Download MariaDB Server 10.3.1 alpha

Release Notes Changelog What is MariaDB Server 10.3?



by dbart at August 29, 2017 05:45 PM

August 28, 2017

Peter Zaitsev

Looking at Disk Utilization and Saturation

DIsk Utilization and Saturation small

In this blog post, I will look at disk utilization and saturation.

In my previous blog post, I wrote about CPU utilization and saturation, the practical difference between them and how different CPU utilization and saturation impact response times. Now we will look at another critical component of database performance: the storage subsystem. In this post, I will refer to the storage subsystem as “disk” (as a casual catch-all). 

The most common tool for command line IO performance monitoring is iostat, which shows information like this:

root@ts140i:~# iostat -x nvme0n1 5
Linux 4.4.0-89-generic (ts140i)         08/05/2017      _x86_64_        (4 CPU)
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          0.51    0.00    2.00    9.45    0.00   88.04
Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
nvme0n1           0.00     0.00 3555.57 5887.81 52804.15 87440.73    29.70     0.53    0.06    0.13    0.01   0.05  50.71
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          0.60    0.00    1.06   20.77    0.00   77.57
Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
nvme0n1           0.00     0.00 7612.80    0.00 113507.20     0.00    29.82     0.97    0.13    0.13    0.00   0.12  93.68
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          0.50    0.00    1.26    6.08    0.00   92.16
Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
nvme0n1           0.00     0.00 7653.20    0.00 113497.60     0.00    29.66     0.99    0.13    0.13    0.00   0.12  93.52

The first line shows the average performance since system start. In some cases, it is useful to compare the current load to the long term average. In this case, as it is a test system, it can be safely ignored. The next line shows the current performance metrics over five seconds intervals (as specified in the command line).

The iostat command reports utilization information in the %util column, and you can look at saturation by either looking at the average request queue size (the avgqu-sz column) or looking at the r_await and w_await columns (which show the average wait for read and write operations). If it goes well above “normal” then the device is over-saturated.

As in my previous blog post, we’ll perform some Sysbench runs and observe how the iostat command line tool and Percona Monitoring and Management graphs behave.

To focus specifically on the disk, we’re using the Sysbench fileio test. I’m using just one 100GB file, as I’m using DirectIO so all requests hit the disk directly. I’m also using “sync” request submission mode so I can get better control of request concurrency.

I’m using an Intel 750 NVME SSD in this test (though it does not really matter).
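If you want to repeat these runs, keep in mind that the test file has to be created first with the prepare stage, matching the file options used in the run commands below:

# create the single 100GB file used by the fileio tests
sysbench fileio --file-num=1 --file-total-size=100G prepare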

Sysbench FileIO 1 Thread

root@ts140i:/mnt/data# sysbench  --threads=1 --time=600 --max-requests=0  fileio --file-num=1 --file-total-size=100G --file-io-mode=sync --file-extra-flags=direct --file-test-mode=rndrd run
File operations:
   reads/s:                      7113.16
   writes/s:                     0.00
   fsyncs/s:                     0.00
Throughput:
   read, MiB/s:                  111.14
   written, MiB/s:               0.00
General statistics:
   total time:                          600.0001s
   total number of events:              4267910
Latency (ms):
        min:                                  0.07
        avg:                                  0.14
        max:                                  6.18
        95th percentile:                      0.17

A single thread run is always great as a baseline, as with only one request in flight we should expect the best response time possible (though typically not the best throughput possible).

Iostat
Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
nvme0n1           0.00     0.00 7612.80    0.00 113507.20     0.00    29.82     0.97    0.13    0.13    0.00   0.12  93.68

Disk Latency

DIsk Utilization and Saturation

The Disk Latency graph confirms the disk IO latency we saw in the iostat output, and it will be highly device-specific. We use it as a baseline to compare against the changes we see with higher concurrency.

Disk IO Utilization

DIsk Utilization and Saturation 2

Disk IO utilization is close to 100% even though we have just one outstanding IO request (queue depth). This is the problem with Linux disk utilization reporting: unlike CPUs, Linux does not have direct visibility on how the IO device is designed. How many “execution units” does it really have? How are they utilized?  Single spinning disks can be seen as a single execution unit while RAID, SSDs and cloud storage (such as EBS) are more than one.

Disk Load

DIsk Utilization and Saturation 3

This graph shows the disk load (or request queue size), which roughly matches the number of threads that are hitting disk as hard as possible.

Saturation (IO Load)

DIsk Utilization and Saturation 4

The IO load on the Saturation Metrics graph shows pretty much the same numbers. The only difference is that unlike Disk IO statistics, it shows the summary for the whole system.

Sysbench FileIO 4 Threads

Now let’s increase IO to four concurrent threads and see how disk responds:

sysbench  --threads=4 --time=600 --max-requests=0  fileio --file-num=1 --file-total-size=100G --file-io-mode=sync --file-extra-flags=direct --file-test-mode=rndrd run
File operations:
   reads/s:                      26248.44
   writes/s:                     0.00
   fsyncs/s:                     0.00
Throughput:
   read, MiB/s:                  410.13
   written, MiB/s:               0.00
General statistics:
   total time:                          600.0002s
   total number of events:              15749205
Latency (ms):
        min:                                  0.06
        avg:                                  0.15
        max:                                  8.73
        95th percentile:                      0.21

We can see the number of requests scales almost linearly, while request latency changes very little: 0.14ms vs. 0.15ms. This shows the device has enough execution units internally to handle the load in parallel, and there are no other bottlenecks (such as the connection interface).

Iostat
Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
nvme0n1           0.00     0.00 28808.60    0.00 427668.00     0.00    29.69     4.05    0.14    0.14    0.00   0.03  99.92

Disk Latency

DIsk Utilization and Saturation 5

Disk Utilization

DIsk Utilization and Saturation 6

Disk Load

DIsk Utilization and Saturation 7

Saturation Metrics (IO Load)

DIsk Utilization and Saturation 8

These stats and graphs show an interesting picture: we barely see a response time increase for IO requests, while utilization inches closer to 100% (with four threads submitting requests all the time, it is hard to catch a moment when the disk does not have any requests in flight). The load is near four (showing the disk has to handle four requests at a time on average).

Sysbench FileIO 16 Threads

root@ts140i:/mnt/data# sysbench  --threads=16 --time=600 --max-requests=0  fileio --file-num=1 --file-total-size=100G --file-io-mode=sync --file-extra-flags=direct --file-test-mode=rndrd run
File operations:
   reads/s:                      76845.96
   writes/s:                     0.00
   fsyncs/s:                     0.00
Throughput:
   read, MiB/s:                  1200.72
   written, MiB/s:               0.00
General statistics:
   total time:                          600.0003s
   total number of events:              46107727
Latency (ms):
        min:                                  0.07
        avg:                                  0.21
        max:                                  9.72
        95th percentile:                      0.36

Going from four to 16 threads, we again see a good throughput increase with a mild response time increase. If you look at the results closely, you will notice one more interesting thing: the average response time has increased from 0.15ms to 0.21ms (which is a 40% increase), while the 95% response time has increased from 0.21ms to 0.36ms (which is 71%). I also ran a separate test measuring 99% response time, and the difference is even larger: 0.26ms vs. 0.48ms (or 84%).

This is an important observation to make: once saturation starts to happen, the variance is likely to increase and some of the requests will be disproportionately affected (beyond what the average response time shows).

Iostat
Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
nvme0n1           0.00     0.00 82862.20    0.00 1230567.20     0.00    29.70    16.33    0.20    0.20    0.00   0.01 100.00

Disk IO Latency

DIsk Utilization and Saturation 9

Disk IO Utilization

DIsk Utilization and Saturation 10

Disk Load

DIsk Utilization and Saturation 11

Saturation Metrics IO Load

DIsk Utilization and Saturation 12

The graphs show an expected figure: the disk load and IO load from saturation are up to about 16, and utilization remains at 100%.

One thing to notice is increased jitter in the graphs. IO utilization jumps to over 100% and disk IO load spikes to 18, when there should not be that many requests in flight. This comes from how this information is gathered. An attempt is made to sample this data every second, but on a loaded system it takes time for this process to work: sometimes we try to get the data for a one-second interval but really get data for a 1.05- or 0.95-second interval. When the math is applied to the data, it creates spikes and dips in the graph where there should be none. You can just ignore them if you’re looking at the big picture.

Sysbench FileIO 64 Threads

Finally, let’s run sysbench with 64 concurrent threads hitting the disk:

root@ts140i:/mnt/data# sysbench  --threads=64 --time=600 --max-requests=0  fileio --file-num=1 --file-total-size=100G --file-io-mode=sync --file-extra-flags=direct --file-test-mode=rndrd run
File operations:
   reads/s:                      127840.59
   writes/s:                     0.00
   fsyncs/s:                     0.00
Throughput:
   read, MiB/s:                  1997.51
   written, MiB/s:               0.00
General statistics:
   total time:                          600.0014s
   total number of events:              76704744
Latency (ms):
        min:                                  0.08
        avg:                                  0.50
        max:                                  9.34
        95th percentile:                      1.25

We can see the average has risen from 0.21ms to 0.50ms (more than two times), and the 95th percentile has almost tripled from 0.36ms to 1.25ms. From a practical standpoint, we can see some saturation starting to happen, but we’re still not seeing a linear response time increase with increasing numbers of parallel operations as we have seen with CPU saturation. I guess this points to the fact that this IO device has a lot of parallel capacity inside and can process requests more effectively (even going from 16 to 64 concurrent threads).

Over the series of tests, as we increased concurrency from one to 64, we saw response times increase from 0.14ms to 0.5ms (or approximately three times). The 95% response time at this time grew from 0.17ms to 1.25ms (or about seven times). For practical purposes, this is where we see the IO device saturation start to show.

Iostat
Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
nvme0n1           0.00     0.00 138090.20    0.00 2049791.20     0.00    29.69    65.99    0.48    0.48    0.00   0.01 100.24

We’ll skip the rest of the graphs as they basically look the same, just with higher latency and 64 requests in flight.

Sysbench FileIO 256 Threads

root@ts140i:/mnt/data# sysbench  --threads=256 --time=600 --max-requests=0  fileio --file-num=1 --file-total-size=100G --file-io-mode=sync --file-extra-flags=direct --file-test-mode=rndrd run
File operations:
   reads/s:                      131558.79
   writes/s:                     0.00
   fsyncs/s:                     0.00
Throughput:
   read, MiB/s:                  2055.61
   written, MiB/s:               0.00
General statistics:
   total time:                          600.0026s
   total number of events:              78935828
Latency (ms):
        min:                                  0.10
        avg:                                  1.95
        max:                                 17.08
        95th percentile:                      3.89

Iostat
Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
nvme0n1           0.00     0.00 142227.60    0.00 2112719.20     0.00    29.71   268.30    1.89    1.89    0.00   0.01 100.00

With 256 threads, finally we’re seeing the linear growth of the average response time that indicates overload and queueing to process requests. There is no easy way to tell if it is due to the IO bus saturation (we’re reading 2GB/sec here) or if it is the internal device processing ability.  

As we’ve seen a less than linear increase in response time going from 16 to 64 connections, and a linear increase going from 64 to 256, we can see the “optimal” concurrency for this device: somewhere between 16 and 64 connections. This allows for peak throughput without a lot of queuing.

Before we get to the summary, I want to make an important note about this particular test. The test is a random reads test, which is a very important pattern for many database workloads, but it might not be the dominant load for your environment. You might be write-bound as well, or have mainly sequential IO access patterns (which could behave differently). For those other workloads, I hope this gives you some ideas on how to also analyze them.

Another Way to Think About Saturation

When I asked the Percona staff for feedback on this blog post, my colleague Yves Trudeau provided another way to think about saturation: measure saturation as the percent increase in the average response time compared to the single-user case. Like this:

Threads   Avg Response Time (ms)   Saturation
1         0.14                     -
4         0.15                     1.07x or 7%
16        0.21                     1.5x or 50%
64        0.50                     3.6x or 260%
256       1.95                     13.9x or 1290%

 

Summary

We can see how understanding disk utilization and saturation is much more complicated than for the CPU:

  • The Utilization metric (as reported by iostat and by PMM) is not very helpful for showing true storage utilization, as it only measures the time when there is at least one request in flight. If you had the same metric for the CPU, it would correspond to something running on at least one of the cores (not very useful for highly parallel systems).
  • Unlike a CPU, Linux tools do not provide us with information about the structure of the underlying storage and how much parallel load it should be able to handle without saturation. Even more so, storage might well have different low-level resources that cause saturation. For example, it could be the network connection, SATA BUS or even the kernel IO stack for older kernels and very fast storage.
  • Saturation as measured by the number of requests in flight is helpful for guessing if there might be saturation, but since we do not know how many requests the device can efficiently process concurrently, just looking the raw metric doesn’t let us determine that the device is overloaded.
  • Avg Response Time is a great metric for looking at saturation, but you can’t say in isolation what response time is good or bad for a given device. You need to look at it in context and compare it to the baseline. When you’re looking at the Avg Response Time, make sure you’re looking at read request response time and write request response time separately, and keep the average request size in mind to ensure you are comparing apples to apples.

by Peter Zaitsev at August 28, 2017 04:54 PM

August 25, 2017

Peter Zaitsev

This Week in Data with Colin Charles #3: More Percona Live Europe!

Colin Charles

Join Percona Chief Evangelist Colin Charles as he covers happenings, gives pointers and provides musings on the open source database community.

We are five weeks out from the conference! The tutorials and the sessions have been released, and there’s an added bonus – you can now look at all of this in a grid view: tutorials, day one and day two. Now that you can visualize what’s being offered, don’t forget to register.

If you want a discount code, feel free to email me at colin.charles@percona.com.

We have some exciting keynotes as well. Some highlights:

  1. MySQL as a Layered Service: How to Use ProxySQL to Control Traffic and Scale Out, given by René Cannaò, the creator of ProxySQL
  2. Why Open Sourcing Our Database Tooling was the Smart Decision, given by Shlomi Noach, creator of Orchestrator, many other tools, and developer at GitHub (so expect some talk about gh-ost)
  3. MyRocks at Facebook and a Roadmap, given by Yoshinori Matsunobu, shepherd of the MyRocks project at Facebook
  4. Real Time DNS Analytics at CloudFlare with ClickHouse, given by Tom Arnfeld
  5. Prometheus for Monitoring Metrics, given by Brian Brazil, core developer of Prometheus
  6. A Q&A session with Charity Majors and Laine Campbell on Database Reliability Engineering, their new upcoming book!

Let’s not forget the usual State of the Dolphin, an update from Oracle’s MySQL team (representative: Geir Høydalsvik), as well as a keynote by Peter Zaitsev (CEO, Percona) and Continuent. There will also be a couple of Percona customers keynoting, so expect information-packed fun mornings! You can see more details about the keynotes here: day one and day two.

Releases

  • Tarantool 1.7.5 stable. The first stable release in the 1.7 series, it also comes with its own Log-Structured Merge Tree (LSM) engine called Vinyl, which they wrote after finding RocksDB insufficient for their needs. Slides: Vinyl: why we wrote our own write-optimized storage engine rather than chose RocksDB (and check out the video).
  • MariaDB Server 10.2.8. As per my previous column, this build merges TokuDB from Percona Server 5.6.36-82.1 (fixing some bugs). There is also a new InnoDB from MySQL 5.7.19 (the current GA release). Have you tried MariaDB Backup yet? There are some GIS compatibility fixes (i.e., to make it behave like MySQL 5.7). One thing that piqued my interest is that the CONNECT storage engine (typically used for ETL operations) now has beta support for the MONGO table type. No surprises: it’s meant to read MongoDB collections via the MongoDB C Driver API. Definitely something to try!

Upcoming Appearances

Percona’s website keeps track of community events, so check out where to listen to a Perconian speak. My upcoming appearances are:

  1. db tech showcase Tokyo 2017 – 5-7 September 2017, Tokyo, Japan
  2. Open Source Summit North America – 11-14 September 2017, Los Angeles, CA, USA
  3. Percona Live Europe Dublin – 25-27 September 2017, Dublin, Ireland
  4. Velocity Europe – 17-20 October 2017, London, UK
  5. Open Source Summit Europe – 23-26 October 2017, Prague, Czech Republic

Feedback

Bill Bogasky (MariaDB Corporation) says that if you’re looking for commercial support for Riak now that Basho has gone under, you could get it from Erlang Solutions or TI Tokyo. See their announcement: Riak commercial support now available post-Basho. Thanks, Bill!

I look forward to feedback/tips via e-mail at colin.charles@percona.com or on Twitter @bytebot.

by Colin Charles at August 25, 2017 09:37 PM

Percona Server for MySQL 5.6.37-82.2 Is Now Available

Percona announces the release of Percona Server for MySQL 5.6.37-82.2 on August 25, 2017. Download the latest version from the Percona web site or the Percona Software Repositories. You can also run Docker containers from the images in the Docker Hub repository.

Based on MySQL 5.6.37, and including all the bug fixes in it, Percona Server for MySQL 5.6.37-82.2 is now the current GA release in the Percona Server for MySQL 5.6 series. Percona Server for MySQL is open-source and free – this is the latest release of our enhanced, drop-in replacement for MySQL. Complete details of this release are available in the 5.6.37-82.2 milestone on Launchpad.

NOTE: Red Hat Enterprise Linux 5 (including CentOS 5 and other derivatives), Ubuntu 12.04 and older versions are no longer supported by Percona software. The reason for this is that these platforms reached end of life, will not receive updates and are not recommended for use in production.

Bugs Fixed

  • #1703105: Fixed overwriting of error log on server startup.
  • #1705729: Fixed the postinst script to correctly locate the datadir.
  • #1709834: Fixed the mysqld_safe script to correctly locate the basedir.
  • Other fixes: #1706262

TokuDB Changes

  • TDB-72: Fixed issue when renaming a table with non-alphanumeric characters in its name.

Platform Support

  • Stopped providing packages for RHEL 5 (CentOS 5) and Ubuntu 12.04.

Release notes for Percona Server for MySQL 5.6.37-82.2 are available in the online documentation. Please report any bugs on the launchpad bug tracker.

by Alexey Zhebel at August 25, 2017 07:04 PM

Jean-Jerome Schmidt

A How-To Guide for Galera Cluster - Updated Tutorial

Since it was originally published, more than 63,000 people (to date) have leveraged the MySQL for Galera Cluster Tutorial to both learn about and get started using MySQL Galera Cluster.

Galera Cluster for MySQL is a true Multi-master Cluster which is based on synchronous replication. Galera Cluster is an easy-to-use, high-availability solution, which provides high system uptime, no data loss and scalability to allow for future growth.

Severalnines was a very early adopter of the Galera Cluster technology, which was created by Codership and has since expanded to include versions from Percona and MariaDB.

Included in this newly updated tutorial are topics like…

  • An introduction to Galera Cluster
  • An explanation of the differences between MySQL Replication and Galera Replication
  • Deployment of Galera Cluster
  • Accessing the Galera Cluster
  • Failure Handling
  • Management and Operations
  • FAQs and Common Questions

Check out the updated tutorial MySQL for Galera Cluster here.

ClusterControl
Single Console for Your Entire Database Infrastructure
Find out what else is new in ClusterControl

ClusterControl for Galera

ClusterControl makes it easy for those new to Galera to use the technology and deploy their first clusters. It centralizes the database management into a single interface. ClusterControl automation ensures DBAs and SysAdmins make critical changes to the cluster efficiently with minimal risks.

ClusterControl delivers on an array of features to help manage and monitor your open source database environments:

  • Deploy Database Clusters
  • Add Node, Load Balancer (HAProxy, ProxySQL) or Replication Slave
  • Backup Management
  • Configuration Management
  • Full stack monitoring (DB/LB/Host)
  • Query Monitoring
  • Enable SSL Encryption for Galera Replication
  • Node Management
  • Developer Studio with Advisors

Learn more about how ClusterControl can help you drive high availability with Galera Cluster here.

by Severalnines at August 25, 2017 09:59 AM

August 24, 2017

Peter Zaitsev

Percona Live Europe Featured Talks: A Quick Tour of MySQL 8.0 Roles with Giuseppe Maxia

Percona Live Europe 2017

Welcome to our series of interview blogs for the upcoming Percona Live Europe 2017 in Dublin. This series highlights a number of talks that will be at the conference and gives a short preview of what attendees can expect to learn from the presenter.

This first blog post is with Giuseppe Maxia of VMware. His talk is titled A Quick Tour of MySQL 8.0 Roles. MySQL 8.0 introduced roles, a new security and administrative feature that allows DBAs to simplify user management and increase the security of multi-user environments. In our conversation, we discussed MySQL roles and how they can help MySQL DBAs:

Percona: Hello Giuseppe, nice to interview you again (in our last blog we discussed “MySQL Document Store: SQL and NoSQL United”)! What have you been up to since the last Percona Live?

Giuseppe: Hi Dave, glad to be sharing ideas with you again. Since the last Percona conference, I’ve been going down two separate technical paths. For my day job, I work as a software explorer on a large high-availability-in-the-cloud project. Exploratory software testing is a branch of QA that can easily qualify as a “dream job” for most senior QA engineers. For me, it’s a dynamic process that allows me to combine experience, creativity and development skills. Wildly interesting as it is, this job doesn’t require any MySQL skills. Thus my second technical path goes on in private, and I keep myself up to date with the MySQL world. This year I presented at several conferences and user groups in Europe, Asia, and America. I plan to keep working on MySQL in my own time, just because I like the topic.

Percona: You’re presenting a session called “A Quick Tour of MySQL 8.0 Roles”. What are MySQL roles, and why are they important?

Giuseppe: Roles are a way of simplifying user management. In database administration, users are granted privileges to access schemas, tables or columns (depending on the business needs). When many different users require authorization for different sets of privileges, administrators have to repeat the process of granting privileges several times. This is both tedious and error-prone. Using roles, administrators can define sets of privileges for a user category, and then user authorization becomes a single-statement operation.

In a well-regulated and security-minded organization, administrators should only use roles for privilege management. This policy not only simplifies user management, but also provides meaningful data on privilege usage.
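
For readers who have not tried roles yet, the workflow described above looks roughly like this sketch (all names and the password are placeholders; see the MySQL 8.0 documentation for the full syntax):

-- Define a role once and grant it the privileges a whole category of users needs
CREATE ROLE 'app_read';
GRANT SELECT ON appdb.* TO 'app_read';

-- Authorizing a new user then becomes a single-statement operation
CREATE USER 'alice'@'%' IDENTIFIED BY 'a-strong-password';
GRANT 'app_read' TO 'alice'@'%';
SET DEFAULT ROLE 'app_read' TO 'alice'@'%';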

Percona: When getting into role assignment, what are some of the things that DBAs need to watch out for?

Giuseppe: Although roles make everyday tasks easier, they also present some additional challenges. In the MySQL implementation, roles are users with some small differences. While this similarity allows admins to get lazy and assign pre-existing users as roles to other users, this practice would make administration more difficult in the long run. To truly benefit from this new feature, DBAs must get organized, and spend some time planning how they want to orchestrate their roles for maximum efficiency before plunging in.

Percona: Could you share your personal experience with this feature?

Giuseppe: Roles have been on the MySQL community’s wish list for a long time. I remember several third party solutions that tried to implement roles as a hack on top of the existing privileges granting system. I created my own solution many years ago when I had to administer a large set of users with different levels of access.

Anytime a new project promised to ease the roles problem, I gave it a try. None of them truly delivered a secure solution.

When I saw the roles feature in MySQL 8, I tested it extensively, provided feedback to the MySQL team and asked for better usability. I was pleased to see that in the latest release (8.0.2) the team addressed some of my concerns, making the roles both easier to use and more powerful – although at the same time they introduced a new extension (mandatory roles) that could create more problems. All in all, I am pleased with the attitude of the MySQL team: they were willing to listen to my feedback and my proposals for improvement.

Percona: What do you want attendees to take away from your session? Why should they attend?

Giuseppe: When I first proposed this session at Percona Live in Santa Clara, my goal was to explain the various aspects of the new feature. Many users, when hearing about roles, think that it’s a straightforward extension of the existing privileges system. In practice, roles usage is a minefield. Many commands perform seemingly the same operation but often lead to unexpected results.

My session should make the basic operations clear, and teach attendees how to avoid the most common pitfalls.

Percona: What are you most looking forward to at Percona Live Europe 2017?

Giuseppe: First of all, Dublin! I have been an advocate of moving the conference to Ireland, and as soon as I saw that the venue was confirmed, I booked a flight and hotel without waiting for the CfP. At the very minimum, I will be a happy tourist there! As for the conference, there are several topics that I want to follow. The continued improvement of MySQL 8.0 is one, which now seems poised for a GA release in the near future. The explosion of technical solutions that are conquering the community is another: ProxySQL, Orchestrator, gh-ost, MyRocks. I am also interested in the evolution of InnoDB Cluster, which one year ago was presented as the solution to every DBA need (but so far has been less than overwhelming).

There are always lots of sessions with intriguing subjects, and I know already that I won’t be able to attend them all. But I am sure I will learn some new technique or methodology that comes in handy, as happens to me at every MySQL conference.

Want to find out more about Giuseppe and MySQL roles? Register for Percona Live Europe 2017, and see his talk A Quick Tour of MySQL 8.0 Roles. Register now to get the best price! Use discount code SeeMeSpeakPLE17 to get 10% off your registration.

Percona Live Open Source Database Conference Europe 2017 in Dublin is the premier European open source event for the data performance ecosystem. It is the place to be for the open source community as well as businesses that thrive in the MySQL, MariaDB, MongoDB, time series database, cloud, big data and Internet of Things (IoT) marketplaces. Attendees include DBAs, sysadmins, developers, architects, CTOs, CEOs, and vendors from around the world.

The Percona Live Open Source Database Conference Europe will be September 25-27, 2017 at the Radisson Blu Royal Hotel, Dublin.

by Dave Avery at August 24, 2017 06:49 PM

Jean-Jerome Schmidt

Galera Cluster Comparison - Codership vs Percona vs MariaDB

Galera Cluster is a synchronous multi-master replication plugin for InnoDB or XtraDB storage engine. It offers a number of outstanding features that standard MySQL replication doesn’t - read-write to any cluster node, automatic membership control, automatic node joining, parallel replication on row-level, and still keeping the native look and feel of a MySQL server. This plug-in is open-source and developed by Codership as a patch for standard MySQL. Percona and MariaDB leverage the Galera library in Percona XtraDB Cluster (PXC) and MariaDB Server (MariaDB Galera Cluster for pre 10.1) respectively.

We often get the question - which version of Galera should I use? Percona? MariaDB? Codership? This is not an easy one, since they all use the same Galera plugin that is developed by Codership. Nevertheless, let’s give it a try.

In this blog post, we’ll compare the three vendors and their Galera Cluster releases. We will be using the latest stable version of each vendor available at the time of writing - Galera Cluster for MySQL 5.7.18, Percona XtraDB Cluster 5.7.18 and MariaDB 10.2.7, all of which ship with InnoDB storage engine 5.7.18.

Database Release

A database vendor who wishes to leverage Galera Cluster technology needs to incorporate the WriteSet Replication (wsrep) API patch into its server codebase. This allows the Galera plugin to work as a wsrep provider, communicating and replicating transactions (writesets in Galera terms) via a group communication protocol.

The following diagram illustrates the difference between the standalone MySQL server, MySQL Replication and Galera Cluster:

Codership releases the wsrep-patched version of Oracle’s MySQL. MySQL 5.7 has been General Availability (GA) since October 2015. However, the first beta of the wsrep-patched MySQL 5.7 was released about a year later, around October 2016, and it became GA in January 2017. In other words, it took more than a year to incorporate Galera Cluster into Oracle’s MySQL 5.7 release line.

Percona releases the wsrep-patched version of its Percona Server for MySQL called Percona XtraDB Cluster (PXC). Percona Server for MySQL comes with XtraDB storage engine (a drop-in replacement of InnoDB) and follows the upstream Oracle MySQL releases very closely (including all the bug fixes in it) with some additional features like MyRocks storage engine, TokuDB as well as Percona’s own bug fixes. In a way, you can think of it as an improved version of Oracle’s MySQL, embedded with Galera technology.

MariaDB releases the wsrep-patched version of its MariaDB Server, and it has been embedded since MariaDB 10.1, so you don’t have to install separate packages for Galera. In previous versions (5.5 and 10.0 in particular), the Galera variant of MariaDB was called MariaDB Galera Cluster (MGC) and shipped as separate builds. MariaDB has its own path of releases and versioning and does not follow any upstream like Percona does. The MariaDB server functionality has started diverging from MySQL, so it might not be as straightforward a replacement for MySQL. It still comes with a bunch of great features and performance improvements though.

System Status

Monitoring Galera nodes and the cluster requires the wsrep API to report several statuses, which is exposed through SHOW STATUS statement:

mysql> SHOW STATUS LIKE 'wsrep%';

PXC does have a number of extra statuses compared to the other variants. The following list shows the wsrep-related statuses that can only be found in PXC:

  • wsrep_flow_control_interval
  • wsrep_flow_control_interval_low
  • wsrep_flow_control_interval_high
  • wsrep_flow_control_status
  • wsrep_cert_bucket_count
  • wsrep_gcache_pool_size
  • wsrep_ist_receive_status
  • wsrep_ist_receive_seqno_start
  • wsrep_ist_receive_seqno_current
  • wsrep_ist_receive_seqno_end

MariaDB has only one extra wsrep status compared to the Galera version provided by Codership:

  • wsrep_thread_count

The above does not necessarily tell us that PXC is superior to the others. It means that you can get better insights with more statuses.

Configuration Options

Since Galera is part of MariaDB 10.1 and later, you have to explicitly enable the following option in the configuration file:

wsrep_on=ON

Note that if you do not enable this option, the server will act as a standard MariaDB installation. For Codership and Percona, this option is enabled by default.
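
For reference, a minimal Galera section in the MariaDB configuration file might look roughly like the sketch below. The addresses, names and the provider library path are placeholders (the path in particular varies by distribution), so treat this as an illustration rather than a copy-paste recipe:

[mysqld]
binlog_format            = ROW
default_storage_engine   = InnoDB
innodb_autoinc_lock_mode = 2

wsrep_on              = ON
wsrep_provider        = /usr/lib/galera/libgalera_smm.so        # path varies by distribution
wsrep_cluster_name    = my_galera_cluster                       # placeholder
wsrep_cluster_address = gcomm://10.0.0.11,10.0.0.12,10.0.0.13   # placeholder node IPs
wsrep_node_address    = 10.0.0.11                               # this node's address (placeholder)
wsrep_sst_method      = mariabackup                             # or rsync / xtrabackup-v2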

Some Galera-related variables are NOT available across all Galera variants:

Database Server Variable name
Codership’s MySQL Galera Cluster 5.7.18, wsrep 25.12
  • wsrep_mysql_replication_bundle
  • wsrep_preordered
  • wsrep_reject_queries
Percona XtraDB Cluster 5.7.18, wsrep 29.20
  • wsrep_preordered
  • wsrep_reject_queries
  • pxc_encrypt_cluster_traffic
  • pxc_maint_mode
  • pxc_maint_transition_period
  • pxc_strict_mode
MariaDB 10.2.7, wsrep 25.19
  • wsrep_gtid_domain_id
  • wsrep_gtid_mode
  • wsrep_mysql_replication_bundle
  • wsrep_patch_version

The above list might change once the vendor releases a new version. The only point that we would like to highlight here is, do not expect that Galera nodes hold the same set of configuration parameters across all variants. Some configuration variables were introduced by a vendor to specifically complement and improve the database server.

Contributions and Improvements

Database performance is not easily comparable, as it can vary a lot depending on the workload. For general workloads, replication performance is fairly similar across all variants. Under some specific workloads, it could differ.

Looking at the latest claims, Percona did an amazing job improving IST performance by up to 4x, as well as speeding up the commit operation. MariaDB also contributes a number of useful features, for example the WSREP_INFO plugin. On the other hand, Codership is focusing more on core Galera issues, including bug fixing and new features. The upcoming Galera 4.0 brings features like intelligent donor selection, huge transaction support and non-blocking DDL.

The introduction of Percona XtraBackup (a.k.a. xtrabackup) as a Galera SST method improved SST performance significantly: the syncing process became faster and non-blocking for the donor. MariaDB then came up with its own xtrabackup fork called MariaDB Backup (mariabackup), which is supported as a Galera SST method via wsrep_sst_method=mariabackup. It also supports installation on Microsoft Windows.

Support

All Galera Cluster variants are open source and available for free. This includes the syncing software supported by Galera, such as mysqldump, rsync, Percona XtraBackup and MariaDB Backup. Community users can seek support, ask questions, file bug reports and feature requests, or even submit pull requests through each vendor’s respective channels:

  • Database server public issue tracker: MySQL wsrep on GitHub (Codership), Percona XtraDB Cluster on Launchpad (Percona), MariaDB Server on JIRA (MariaDB)
  • Galera issue tracker: Galera on GitHub (Codership)
  • Documentation: Galera Cluster Documentation, Percona XtraDB Cluster Documentation, MariaDB Documentation
  • Support forum: Codership Team Groups, Percona Forum, MariaDB Open Questions

Each vendor provides commercial support services.

Summary

We hope that this comparison gives you a clearer picture and helps you determine which vendor better suits your needs. They all use pretty much the same wsrep library; the differences are mainly on the server side - for instance, if you want to leverage some specific features in MariaDB or Percona Server. You might want to check out this blog that compares the different servers (Oracle MySQL, MariaDB and Percona Server). ClusterControl supports all three vendors, so you can easily deploy different clusters and compare them yourself with your own workload, on your own hardware. Do give it a try.

by ashraf at August 24, 2017 09:59 AM

MariaDB AB

5 Essential Practices for Database Security

Data breaches are expensive. Between business disruption, loss of customer confidence, legal costs, regulatory fines and any direct losses that may result from a ransomware attack, for instance, the effects can add up to millions. The best defense is a good offense, so let’s look at five key practices to keep your database secure: protect, audit, manage, update and encrypt.

1. Protect against attacks with a database proxy

A database proxy, or gateway proxy, sits between your application and your database, accepting connections from applications and then, on behalf of those applications, connecting to the database. An intelligent database proxy such as our own MaxScale provides filters and modules to deliver security, reliability, scalability, and performance benefits.

The MaxScale Database Firewall Filter parses queries as they come through the filter, and can block things that don't match the whitelist of query types that you want to let through. For example, you might say that a given connection can only do updates and inserts, and that another connection must match certain regular expressions, etc.
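
To make this more concrete, here is a rough sketch of what attaching the firewall filter to a service can look like in maxscale.cnf. The service, server and credential names are placeholders, and the rule file syntax depends on your MaxScale version, so consult the dbfwfilter documentation rather than copying this verbatim:

# maxscale.cnf fragment: define the firewall filter and attach it to a service
[Database-Firewall]
type=filter
module=dbfwfilter
rules=/etc/maxscale/firewall_rules.txt

[RW-Service]
type=service
router=readwritesplit
servers=server1
user=maxscale_user
password=maxscale_password
filters=Database-Firewall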

Proxies like MaxScale also protect you against DDoS attacks: when too many connections come directly into the database server, it can become overloaded. A proxy absorbs some of that load and limits the effects of such an attack.

2. Set up auditing and robust logging

Auditing and logging go hand-in-hand, but audit logs are much more sophisticated than the general log. Audit logs give you all the information you need to investigate suspicious activities and to conduct root-cause analysis if you do experience a breach. Furthermore, audit logs help ensure compliance with regulations such as GDPR, PCI, HIPAA and SOX. (Learn more about addressing the GDPR with MariaDB TX.)

The MariaDB audit plugin can log a lot of information: all incoming connections, all query executions, and even all accesses of individual tables. You can see who has accessed a table at a given time, and who has inserted or deleted data. The audit plugin can log to a file or the syslog, so if you have existing workflows that rely on the syslog, you can tie straight into those.  
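
As a rough illustration of how this is typically switched on (the values are examples only; check the audit plugin documentation for your MariaDB version):

-- Load the MariaDB audit plugin and turn on logging
INSTALL SONAME 'server_audit';
SET GLOBAL server_audit_logging = ON;

-- Log connections, queries and table accesses
SET GLOBAL server_audit_events = 'CONNECT,QUERY,TABLE';

-- Write to a file; set the output type to 'syslog' to tie into existing syslog workflows
SET GLOBAL server_audit_output_type = 'file';
SET GLOBAL server_audit_file_path = '/var/log/mysql/server_audit.log';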

3. Practice stringent user account management

It’s vital that you manage your database user accounts carefully. This is true for nearly all aspects of your IT ecosystem, so we won’t go into detail here. Instead, we’ll simply remind you of the key aspects of user account management (a short SQL sketch follows the list):

  • Allow root access only from local clients.

  • Always use strong passwords.

  • Have a separate database user account for each of your applications.

  • Restrict the number of IP addresses that can access your database server.
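
Putting the points above together, a minimal SQL sketch might look like this (the account names, password and network ranges are placeholders):

-- One dedicated account per application, limited to the privileges it needs
-- and to the application's subnet
CREATE USER 'orders_app'@'10.10.1.%' IDENTIFIED BY 'use-a-strong-password-here';
GRANT SELECT, INSERT, UPDATE, DELETE ON orders_db.* TO 'orders_app'@'10.10.1.%';

-- Keep root access local only: drop any remote root account that may exist
DROP USER IF EXISTS 'root'@'%';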

4. Keep your database software and OS up-to-date

We all know the reasons to keep your software up-to-date, but that doesn’t stop a great many of us from running legacy operating systems and several-versions-old database servers. Let this serve as a reminder that keeping everything current is the only way to protect your data from all the latest threats.

This applies not only to your server software, but to your OS. The WannaCry ransomware attack was made possible by lackadaisical application of Windows OS security patches, after all.

5. Encrypt sensitive data – in your app, in transit, and at rest

We’ve saved the least commonly implemented practice for last. Many organizations give encryption short shrift, but it can be quite valuable. After all, it reduces the incentive for hackers if the work of attempting to break a cipher awaits them after they gain access.

The first phase of encryption happens in the application, before the data gets to your database. If the data is encrypted in the application, then a hacker who compromises your database can’t see what the data is. (This works only for data that is not a key, however.)

Next is encryption of data in transit. That means the data is encrypted over the network as it moves from the client onto your database server (or onto a proxy such as MaxScale). This is basically the same concept as using HTTPS in your web browser. Obviously the server can see the information because it needs to read the form you filled out, and you can read the information because you typed it into the form, but no one in between you and the server should be able to read it.

Finally, we come to encryption of data at rest. You can use this to encrypt InnoDB tablespaces, the InnoDB redo log and the binary log. This means that you can encrypt almost everything written to disk on a MariaDB server (10.1 and later – remember what we said about keeping your software up to date?).
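
As a minimal sketch of data-at-rest encryption on MariaDB 10.1 and later using the bundled file_key_management plugin (the key file path is a placeholder, and a dedicated key management service is preferable in production):

[mysqld]
# Key management: a simple key file for illustration only
plugin_load_add = file_key_management
file_key_management_filename = /etc/mysql/encryption/keyfile
file_key_management_encryption_algorithm = AES_CTR

# Encrypt InnoDB tablespaces, the redo log and the binary log
innodb_encrypt_tables     = ON
innodb_encrypt_log        = ON
innodb_encryption_threads = 4
encrypt_binlog            = ON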

Now that you know these five essential security practices, how about delving deeper? Download the new white paper: MariaDB TX: Security Overview.

by Shane Johnson at August 24, 2017 05:02 AM

Peter Zaitsev

Percona Server for MySQL 5.5.57-38.9 Is Now Available

Percona announces the release of Percona Server for MySQL 5.5.57-38.9 on August 23, 2017. Based on MySQL 5.5.57, including all of its bug fixes, Percona Server for MySQL 5.5.57-38.9 is now the current stable release in the 5.5 series.

Percona Server for MySQL is open-source and free. You can find release details in the 5.5.57-38.9 milestone on Launchpad. Downloads are available here and from the Percona Software Repositories.

NOTE: Red Hat Enterprise Linux 5 (including CentOS 5 and other derivatives), Ubuntu 12.04 and older versions are no longer supported by Percona software. The reason for this is that these platforms reached the end of life, will not receive updates and are not recommended for use in production.

New Features

  • #1702903: Added support of OpenSSL 1.1.

Platform Support

  • Added support and packages for Debian 9 (stretch). Covers only the amd64 architecture.
  • Stopped providing packages for RHEL 5 (CentOS 5) and Ubuntu 12.04.

Bugs Fixed

  • #1622985: Downgraded diagnostic severity from warning to normal note for successful doublewrite buffer recovery.
  • #1661488: Fixed crash of debug server build when two clients connected, one of them enabled userstat and ran either FLUSH CLIENT_STATISTICS or FLUSH USER_STATISTICS, and then both clients exited.
  • #1673656: Added support for wildcards and Subject Alternative Names (SAN) in SSL certificates for --ssl-verify-server-cert. For more information, see the compatibility matrix at the end of this post.
  • #1705729: Fixed the postinst script to correctly locate the datadir.
  • #1709834: Fixed the mysqld_safe script to correctly locate the basedir.
  • Minor fixes: #1160986, #1684601, #1689998, #1690012.

 

Compatibility Matrix

Feature                   YaSSL   OpenSSL < 1.0.2   OpenSSL >= 1.0.2
‘commonName’ validation   Yes     Yes               Yes
SAN validation            No      Yes               Yes
Wildcards support         No      No                Yes

Find the release notes for Percona Server 5.5.57-38.9 in our online documentation. Report bugs on the launchpad bug tracker.

by Alexey Zhebel at August 24, 2017 12:09 AM

August 23, 2017

Peter Zaitsev

Percona Monitoring and Management 1.2.2 is Now Available

Percona announces the release of Percona Monitoring and Management 1.2.2 on August 23, 2017.

For install and upgrade instructions, see Deploying Percona Monitoring and Management.

This release contains bug fixes related to performance and introduces various improvements. It also contains an updated version of Grafana.

Changes in PMM Server

We introduced the following changes in PMM Server 1.2.2:

Bug fixes

  • PMM-927: The error “Cannot read property ‘hasOwnProperty’ of undefined” was displayed on the QAN page for MongoDB.

    After enabling monitoring and generating data for MongoDB, the PMM client showed the following error message on the QAN page: “Cannot read property ‘hasOwnProperty’ of undefined”. This bug is now fixed.

  • PMM-949: Percona Server was not detected properly, and the log_slow_* variables were not detected correctly.

  • PMM-1081: The Performance Schema monitor treated queries that didn’t show up in every snapshot as new queries, reporting an incorrect count between snapshots.

  • PMM-1272: MongoDB queries were displayed with an empty abstract. This bug is now fixed.

  • PMM-1277: The QPS graph used an incorrect Prometheus query. This bug is now fixed.

  • PMM-1279: The MongoDB summary did not work in QAN2 if mongodb authentication was activated. This bug is now fixed.

  • PMM-1284: Dashboards pointed to QAN2 instead of QAN. This bug is now fixed.

Improvements

  • PMM-586: The wsrep_evs_repl_latency parameter is now monitored in Grafana dashboards

  • PMM-624: The Grafana User ID remains the same in the pmm-server docker image

  • PMM-1209: OpenStack support is now enabled during the OVA image creation

  • PMM-1211: It is now possible to configure a static IP for an OVA image

    The root password can only be set from the console. If the root password is not changed from the default, a warning message appears on the console asking the user to change the root password at the first root login from the console. Web/SSH users can neither use the root account password nor detect whether the root password is set to the default value.

  • PMM-1221: Grafana updated to version 4.4.3

About Percona Monitoring and Management

Percona Monitoring and Management (PMM) is an open-source platform for managing and monitoring MySQL and MongoDB performance. Percona developed it in collaboration with experts in the field of managed database services, support and consulting.

PMM is a free and open-source solution that you can run in your own environment for maximum security and reliability. It provides thorough time-based analysis for MySQL and MongoDB servers to ensure that your data works as efficiently as possible.

A live demo of PMM is available at pmmdemo.percona.com.

We’re always happy to help! Please provide your feedback and questions on the PMM forum.

If you would like to report a bug or submit a feature request, please use the PMM project in JIRA.

by Borys Belinsky at August 23, 2017 09:16 PM

Migrating Data from an Encrypted Amazon MySQL RDS Instance to an Encrypted Amazon Aurora Instance

In this blog post, we’ll discuss migrating data from encrypted Amazon MySQL RDS to encrypted Amazon Aurora.

One of my customers wanted to migrate from an encrypted MySQL RDS instance to an encrypted Aurora instance. They have a pretty large database, so using mysqldump or a similar tool was not suitable for them. They also wanted to set up replication between the old MySQL RDS instance and the new Aurora instance.

Spoiler: this is possible without any logical dump.

At first, I checked Amazon’s documentation on encryption and found nothing about this type of migration. What’s more, if I trust the documentation, it looks like they don’t support replication or migration between encrypted MySQL RDS and encrypted Aurora. All instructions are for either “MySQL RDS to MySQL RDS” or “Aurora to Aurora” setups. For example, the documentation says here:

You can create Read Replicas of both encrypted and unencrypted DB clusters. The Read Replica must be encrypted if the source DB cluster is encrypted.

When I tried to create an Aurora read replica of my encrypted MySQL RDS instance, however, the “Enable Encryption” select control was grayed out and I could not change “No” to “Yes”.

I had to find a workaround.

Another idea was creating an encrypted MySQL RDS replica and migrating it to Aurora. While creating an encrypted MySQL replica is certainly possible (actually, all replicas of encrypted instances must be encrypted), it was not possible to migrate it to any other instance using the standard “Migrate Latest Snapshot” option:

However, the documentation specified that Aurora and MySQL RDS use the same AWS KMS key. As a result, both kinds of encryption should be compatible (if not practically the same). Amazon also has the “AWS Database Migration Service“, which has this promising section in its FAQ:

Q. Can I replicate data from encrypted data sources?

Yes, AWS Database Migration Service can read and write from and to encrypted databases. AWS Database Migration Service connects to your database endpoints on the SQL interface layer. If you use the Transparent Data Encryption features of Oracle or SQL Server, AWS Database Migration Service will be able to extract decrypted data from such sources and replicate it to the target. The same applies to storage-level encryption. As long as AWS Database Migration Service has the correct credentials to the database source, it will be able to connect to the source and propagate data (in decrypted form) to the target. We recommend using encryption-at-rest on the target to maintain the confidentiality of your information. If you use application-level encryption, the data will be transmitted through AWS Database Migration Service as is, in encrypted format, and then inserted into the target database.

I decided to give it a try. And it worked!

The next step was to make this newly migrated, encrypted Aurora instance a read replica of the original MySQL RDS instance. This is made easier by the great how-to on migration by Adrian Cantrill. As suggested, you only need to find the master’s binary log file and current position, supply them to the stored procedure mysql.rds_set_external_master, and then start replication using the stored procedure mysql.rds_start_replication.
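
Roughly, the steps look like the sketch below. The endpoint, credentials and binary log coordinates are placeholders, and the exact argument list of the stored procedure is documented by AWS:

-- On the source MySQL RDS instance: note the current binary log coordinates
SHOW MASTER STATUS;

-- On the new Aurora instance: point it at the source and start replicating
-- (the last argument controls SSL for the replication connection)
CALL mysql.rds_set_external_master(
  'source-instance.abcdefgh1234.us-east-1.rds.amazonaws.com', 3306,
  'repl_user', 'repl_password',
  'mysql-bin-changelog.000123', 120, 0);
CALL mysql.rds_start_replication;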

Conclusion: While AWS Database Migration Service has limitations for both source and target databases, this solution allows you to migrate encrypted instances easily and securely.

by Sveta Smirnova at August 23, 2017 07:51 PM

August 22, 2017

Peter Zaitsev

The MySQL High Availability Landscape in 2017 (The Adults)

In this blog post, we’ll look at some of the MySQL high availability solution options.

In the previous post of this series, we looked at the MySQL high availability (HA) solutions that have been around for a long time. I called these solutions “the elders.” Some of these solutions (like replication) are heavily used today and have been improved from release to release of MySQL.

This post focuses on the MySQL high availability solutions that have appeared over the last five years and gained a fair amount of traction in the community. I chose to include only two solutions in this group: Galera and RDS Aurora. I’ll use the term “Galera” generically: it covers Galera Cluster, MariaDB Cluster and Percona XtraDB Cluster. I debated for some time whether or not to include Aurora. I don’t like the fact that it uses closed source code. Given the tight integration with the AWS environment, what is the commercial risk of opening the source code? That question evades me, but I am not on the business side of technology. 🙂

Galera

When I say “Galera,” I mean a replication protocol implemented by a library provided by Codership, a Finnish company. The library needs hooks inside the MySQL source code for the replication protocol to work. In Percona XtraDB Cluster, I counted 66 .cc files where the word “wsrep” is present. As you can see, it is not a small task to add support for Galera to MySQL. Not all the implementations are similar. Percona, for example, focused more on stability and usability at the expense of new features.

high availability

Let’s start with a quick description of a Galera-based cluster. Unless you don’t care about split-brain, a Galera cluster needs at least three nodes. The Galera replication protocol is nearly synchronous, which is a huge gain compared to regular MySQL replication. It performs transactions almost simultaneously on all the nodes, and the protocol ensures the same commit order. The transactions are almost synchronous because there are incoming queues on each node to improve performance. The presence of these incoming queues forces an extra step: certification. The certification process compares an incoming transaction with the ones already queued and, if there is a conflict, returns a deadlock error.

For performance reasons, the certification process must be quick so that the incoming queue stays in memory. Since the number of transactions defines the size of the queue, the presence of large transactions uses a lot of memory. There are safeguards against memory overload, so be aware that transactions like:

update ATableWithMillionsRows set colA=1;

will likely fail. That’s the first important limitation of a Galera-based cluster: the size of transactions is limited.

It is also critical to uniquely identify conflicting rows. The best way to achieve an efficient row comparison is to make sure all the tables have a primary key. In a Galera-based cluster, your tables need primary keys, otherwise you’ll run into trouble. That’s the second limitation of a Galera-based cluster: the need for primary keys. Personally, I think that a table should always have a primary key – but I have seen many oddities…

Another design characteristic is the need for an acknowledgment by all the nodes when a transaction commits. That means the network link with the largest latency between two nodes will set the floor value of the transactional latency. It is an important factor to consider when deploying a Galera-based cluster over a WAN. Similarly, an overloaded node can slow down the cluster if it cannot acknowledge the transaction in a timely manner. In most cases, adding slave threads will allow you to overcome the throughput limitations imposed by the network latency. Each transaction will suffer from latency, but more of them will be able to run at the same time and maintain the throughput.

The exception here is when there are “hot rows.” A hot row is a row that is hammered by updates all the time. The transactions affecting hot rows cannot be executed in parallel and are thus limited by the network latency.
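
For reference, the parallel apply capability mentioned above is controlled by a single setting; the value below is only a placeholder to tune against your own workload:

# Number of threads applying replicated writesets in parallel on each node
wsrep_slave_threads = 16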

Since Galera-based clusters are very popular, they must also have some good points. The first and most obvious is the full durability support. Even if the node on which you executed a transaction crashes a fraction of a second after the commit statement returned, the data is present in the incoming queues of the other nodes. In my opinion, it is the main reason for the demise of the shared storage solution. Before Galera-based clusters, shared storage was the only other solution guaranteeing no data loss in case of a crash.

While the standby node is unusable with the shared storage solution, all the nodes of a Galera-based cluster are available and are almost in sync. All the nodes can be used for reads without stressing too much about replication lag. If you accept a higher risk of deadlock errors, you can even write on all nodes.

Finally, and not least, there is an automatic provisioning service for new nodes called SST. During the SST process, a joiner node asks the cluster for a donor. One of the existing nodes agrees to be the donor and initiates a full backup. The backup is streamed over the network to the joiner and restored there. When the backup completes, the joiner performs an IST to get the most recent updates and, once they are applied, joins the cluster. The most common SST method uses the Percona XtraBackup utility. When using XtraBackup for SST, the cluster is fully available during the SST, although performance may be degraded. This feature really simplifies the operational side of things.
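
Selecting XtraBackup for SST is again a matter of a couple of settings; the credentials below are placeholders for a dedicated SST user:

# Use Percona XtraBackup for non-blocking state snapshot transfers
wsrep_sst_method = xtrabackup-v2
wsrep_sst_auth   = sstuser:sstpassword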

The technology is very popular. Of course, I am a bit biased since I work for Percona and one of our flagship products is Percona XtraDB Cluster – an implementation of the Galera protocol. Other than standard MySQL replication, it is by far the most common HA solution used by the customers I work with.

RDS Aurora

The second “adult” MySQL high availability solution is RDS Aurora. I hesitated to add Aurora here, mainly because it is not an open-source technology. I must also admit that I haven’t followed the latest developments around Aurora very closely. So, let’s describe Aurora.

There are three major parts in Aurora: at least one database server, the writer node and the storage.

high availability

What makes Aurora special is that the storage layer has its own processing logic. I don’t know if the processing logic is part of the writer node AWS instance or part of the storage service directly, since the source code is not available. Anyway, I’ll call that layer the appliers. The appliers’ role is to apply redo log fragments, which allows the writer node to write only those redo log fragments (normally written to the InnoDB log files) instead of full pages. The appliers read the fragments and modify the pages in the storage. If a node requests a page that has pending redo fragments to be applied, they get applied before the page is returned.

From the writer node’s perspective, there are far fewer writes. There is also no direct upper bound on the number of fragments that can be queued, so it is a bit like having innodb_log_file_size set to an extremely large value. Also, since Aurora doesn’t need to flush pages, if the writer node needs to read from the storage and there are no free pages in the buffer pool, it can just discard one, even if it is “dirty.” Actually, there are no dirty pages in the buffer pool.

So far, that seems to be very good for high write loads with spikes. What about the reader nodes? These reader nodes receive the updates from the writer nodes. If the update concerns a page they have in their buffer pool, they can modify it in place or discard it and read again from the storage. Again, without the source code, it is hard to tell the implementation. The point is, the readers have the same data as the master, they just can’t lag behind.

Apart from the impossibility of any reader lag, the other good point of Aurora is the storage. InnoDB pages are saved as objects in an object store, not like in a regular file on a file system. That means you don’t need to over-provision your storage ahead of time. You pay for what you are using – actually the maximum you ever use. InnoDB tablespaces do not shrink, even with Aurora.

Furthermore, if you have a 5TB dataset and your workload is such that you would need ten servers (one writer and nine readers), you still need only 5TB of storage if you are not replicating to other AZ. If we compare with regular MySQL and replication, you would have one master and nine slaves, each with 5TB of storage, for a total of 50TB. On top of that, you’ll have at least ten times the write IOPS.

So, storage wise, we have something that could be very interesting for applications with large datasets and highly concurrent read heavy workloads. You ensure high availability with the ability to promote a reader to writer automatically. You access the primary or the readers through endpoints that automatically connect to the correct instances. Finally, you can replicate the storage to multiple availability zones for DR.

Of course, such an architecture comes with a trade-off. If you experiment with Aurora, you’ll quickly discover that the smallest instance types underperform while the largest ones perform in a more expected manner. It is also quite easy to overload the appliers. Just perform the following queries:

update ATableWithMillionsRows set colA=1;
select count(*) from ATableWithMillionsRows where colA=1;

given that the table ATableWithMillionsRows is larger than the buffer pool. The select will hang for a long time because the appliers are overloaded by the number of pages to update.

In terms of adoption, we have some customers at Percona using Aurora, but not that many. It could be that users of Aurora do not naturally go to Percona for services and support. I also wonder about the decision to keep the source code closed. It is certainly not a positive marketing factor in a community like the MySQL community. Since the Aurora technology seems tightly bound to the AWS ecosystem, is there really a risk of the technology being reused by a competitor? With a better understanding of the technology through open access to the source, Amazon could have received valuable contributions. It would also be much easier to understand, tune and recommend Aurora.

by Yves Trudeau at August 22, 2017 05:51 PM

Jean-Jerome Schmidt

Galera Cluster: All the Severalnines Resources

Galera Cluster is a true multi-master cluster solution for MySQL and MariaDB, based on synchronous replication. Galera Cluster is easy-to-use, provides high-availability, as well as scalability for certain workloads.

ClusterControl provides advanced deployment, management, monitoring, and scaling functionality to get your Galera clusters up-and-running using proven methodologies.

Here are just some of the great resources we’ve developed for Galera Cluster over the last few years...

Tutorials

Galera Cluster for MySQL

Galera allows applications to read and write from any MySQL server. Galera enables synchronous replication for InnoDB, creating a true multi-master cluster of MySQL servers, and allows for synchronous replication between data centers. Our tutorial covers MySQL Galera concepts and explains how to deploy and manage a Galera cluster.

Read the Tutorial

Deploying a Galera Cluster for MySQL on Amazon VPC

This tutorial shows you how to deploy a multi-master synchronous Galera Cluster for MySQL with Amazon's Virtual Private Cloud (Amazon VPC) service.

Read the Tutorial

Training: Galera Cluster For System Administrators, DBAs And DevOps

The course is designed for system administrators & database administrators looking to gain more in depth expertise in the automation and management of Galera Clusters.

Book Your Seat

On-Demand Webinars

MySQL Tutorial - Backup Tips for MySQL, MariaDB & Galera Cluster

In this webinar, Krzysztof Książek, Senior Support Engineer at Severalnines, discusses backup strategies and best practices for MySQL, MariaDB and Galera clusters; including a live demo on how to do this with ClusterControl.

Watch the replay

9 DevOps Tips for Going in Production with Galera Cluster for MySQL / MariaDB

In this webinar replay, we guide you through 9 key tips to consider before taking Galera Cluster for MySQL / MariaDB into production.

Watch the replay

Deep Dive Into How To Monitor MySQL or MariaDB Galera Cluster / Percona XtraDB Cluster

Our colleague Krzysztof Książek provided a deep-dive session on what to monitor in Galera Cluster for MySQL & MariaDB. Krzysztof is a MySQL DBA with experience in managing complex database environments for companies like Zendesk, Chegg, Pinterest and Flipboard.

Watch the replay

Become a MySQL DBA - webinar series: Schema Changes for MySQL Replication & Galera Cluster

In this webinar, we discuss how to implement schema changes in the least impacting way to your operations and ensure availability of your database. We also cover some real-life examples and discuss how to handle them.

Watch the replay

Migrating to MySQL, MariaDB Galera and/or Percona XtraDB Cluster

In this webinar, we walk you through what you need to know in order to migrate from standalone or a master-slave MySQL / MariaDB setup to Galera Cluster.

Watch the replay

Introducing Galera 3.0

In this webinar you'll learn all about the new Galera Cluster capabilities in version 3.0.

Watch the replay

Top Blogs

MySQL on Docker: Running Galera Cluster on Kubernetes

In our previous posts, we showed how one can run Galera Cluster on Docker Swarm, and discussed some of the limitations with regards to production environments. Kubernetes is widely used as orchestration tool, and we’ll see whether we can leverage it to achieve production-grade Galera Cluster on Docker.

Read More

ClusterControl for Galera Cluster for MySQL

Galera Cluster is widely supported by ClusterControl. With over four thousand deployments and more than sixteen thousand configurations, you can be assured that ClusterControl is more than capable of helping you manage your Galera setup.

Read More

How Galera Cluster Enables High Availability for High Traffic Websites

This post gives an insight into how Galera can help to build HA websites.

Read More

How to Set Up Asynchronous Replication from Galera Cluster to Standalone MySQL server with GTID

Hybrid replication, i.e. combining Galera and asynchronous MySQL replication in the same setup, became much easier since GTID was introduced in MySQL 5.6. In this blog post, we will show you how to replicate a Galera Cluster to a MySQL server with GTID, and how to fail over the replication in case the master node fails.

Read More

Full Restore of a MySQL or MariaDB Galera Cluster from Backup

Performing regular backups of your database cluster is imperative for high availability and disaster recovery. This blog post provides a series of best practices on how to fully restore a MySQL or MariaDB Galera Cluster from backup.

Read More

How to Bootstrap MySQL or MariaDB Galera Cluster

Unlike standard MySQL server and MySQL Cluster, the way to start a MySQL or MariaDB Galera Cluster is a bit different. Galera requires you to start a node in a cluster as a reference point, before the remaining nodes are able to join and form the cluster. This process is known as cluster bootstrap. Bootstrapping is an initial step to introduce a database node as primary component, before others see it as a reference point to sync up data.

Read More

Schema changes in Galera cluster for MySQL and MariaDB - how to avoid RSU locks

This post shows you how to avoid locking existing queries when performing rolling schema upgrades in Galera Cluster for MySQL and MariaDB.

Read More

Deploy an asynchronous slave to Galera Cluster for MySQL - The Easy Way

Due to its synchronous nature, Galera performance can be limited by the slowest node in the cluster. So running heavy reporting queries or making frequent backups on one node, or putting a node across a slow WAN link to a remote data center, might indirectly affect cluster performance. Combining Galera and asynchronous MySQL replication in the same setup, aka Hybrid Replication, can help.

Read More

Top Videos

ClusterControl for Galera Cluster - All Inclusive Database Management System

Watch the Video

Galera Cluster - ClusterControl Product Demonstration

Watch the Video

ClusterControl
Single Console for Your Entire Database Infrastructure
Find out what else is new in ClusterControl

ClusterControl for Galera

ClusterControl makes it easy for those new to Galera to use the technology and deploy their first clusters. It centralizes the database management into a single interface. ClusterControl automation ensures DBAs and SysAdmins make critical changes to the cluster efficiently with minimal risks.

ClusterControl delivers on an array of features to help manage and monitor your open source database environments:

  • Deploy Database Clusters
  • Add Node, Load Balancer (HAProxy, ProxySQL) or Replication Slave
  • Backup Management
  • Configuration Management
  • Full stack monitoring (DB/LB/Host)
  • Query Monitoring
  • Enable SSL Encryption for Galera Replication
  • Node Management
  • Developer Studio with Advisors

Learn more about how ClusterControl can help you drive high availability with Galera Cluster here.

We hope that these resources prove useful!

Happy Clustering!

by Severalnines at August 22, 2017 09:18 AM

August 21, 2017

Peter Zaitsev

Percona Server for MongoDB 3.2.16-3.6 is Now Available

Percona announces the release of Percona Server for MongoDB 3.2.16-3.6 on August 21, 2017. Downloads are available here and from the Percona Software Repositories.

Percona Server for MongoDB is a highly scalable, zero-maintenance downtime database supporting the MongoDB v3.2 protocol and drivers. It extends MongoDB with MongoRocks and Percona Memory Engine storage engines, as well as several enterprise features. Percona Server for MongoDB requires no changes to MongoDB applications or code.

Note:

Percona deprecated the PerconaFT storage engine. Future releases won’t contain this engine.

This release is based on MongoDB 3.2.16 and does not include any additional changes.

by Borys Belinsky at August 21, 2017 04:34 PM

August 20, 2017

Valeriy Kravchuk

Fun with Bugs #55 - On Some Public Bugs Fixed in MySQL 8.0.2

I do not care much about MySQL 8.0.x at the moment, as it’s far from being GA and is a work in progress. It is not yet used by the customers I have to support. But I know about many interesting changes and improvements there that, eventually, are going to influence all the main forks and kinds of MySQL. So, it would not be wise to ignore MySQL 8.0.x entirely, even for me.

For this post I decided to briefly check which community-reported bugs were fixed in the recent release, 8.0.2, based on the release notes. For me it’s a measure of community interest in MySQL 8.0.x and of Oracle’s interest in further working with the MySQL Community. I ended up with the following, short enough list of bug fixes in the categories I usually care about (InnoDB, partitioning, replication and optimizer):
  • The very first InnoDB bug mentioned in the release notes, Bug #85043, is private. I fail to see any valid reason for a bug in a version that is still under development and not declared GA to remain private after the fix is released, unless it also affects GA versions - and this is the case here. The bug is fixed in 5.7.19 as well, as you can see in my previous post.
  • Another bug that is related to InnoDB and optimizer, is Bug #81031. It was also fixed in MySQL 5.6.37 and 5.7.19.
  • Bug #84038 - "Errors when restarting MySQL after FLUSH TABLES FOR EXPORT, RENAME and DROP", was also fixed in MySQL 5.7.19. I am actually surprised that at this stage we still have the older InnoDB internal data dictionary tables in MySQL 8.0.x.
  • Group replication-related Bug #85667, Bug #85047, Bug #84728 and Bug #84733 were also listed as fixed in MySQL 5.7.19.
  • The same situation applies to normal async replication bugs: Bug #83184, Bug #82283, Bug #81232, Bug #77406 etc. It's expected to see fixes applied to the oldest affected version first and then merged into newer versions.
  • The first really unique fix in 8.0.2 that I found was Bug #85639 - "XA transactions are 'unsafe' for RPL using SBR". It was reported by João Gramacho (who probably works for Oracle) originally for MySQL 5.7 and is going to be fixed in MySQL 5.7.20 also.
  • Replication-related Bug #85739 is still private. Release notes say:
    "Issuing SHOW SLAVE STATUS FOR CHANNEL 'group_replication_recovery' following a restart of a server using group replication led to an unplanned shutdown."
  • More private replication bugs fixed in 8.0.2 are: Bug #85405, Bug #85084, Bug #84646, Bug #84471, Bug #82467 and Bug #80368. I do not know who reported them, which versions are affected (though I suspect 5.7.x as well) or why they remain private after being fixed in 8.0.2.
  • The first 8.0.x specific public bug report I've found in the release notes was reported by Andrey Hristov and verified by Umesh Shastry. It is Bug #85937 - "Unchecked read after allocated buffer".
  • Bug #86120 - "Upgrading a MySQL instance with a table name of >64 chars results in breakage", was reported by Daniël van Eeden and verified by Umesh Shastry. I expect a lot of problem reports when users start to upgrade to 8.0... See also Bug #84889 - "MYSQL 8.0.1 - MYSQLD ERRORLOG UPGRADE ERRORS AT SERVER START LIVE UPGRADE", by Susan Koerner.
  • Bug #85994 - "Out-of-bounds read in mysqld.cc fix_paths", was reported by Laurynas Biveinis and verified by Umesh Shastry. Percona seems to care a lot about improving MySQL 8.0 (as well as other Oracle MySQL GA versions). See also Bug #85678 by Laurynas and Bug #85059 by Roel Van de Paar.
  • Jon Olav Hauglid seems to care about the new data dictionary code quality, so he had reported related public bugs: Bug #85811, Bug #85800, and Bug #83473.
  • Bug #85704 - "mysql 8.0.x crashes when old-style trigger misses the "created" line in .TRG", was reported by Shane Bester. A workaround was also suggested by Jesper Krogh. Shane also reported a nice regression, Bug #83019 - "queries in "show processlist" oscillate with constant times higher each day", that is now fixed in 8.0.2.
  • Bug #85614 - "alter table fails when default character set changes to utf8mb4", was reported by Tor Didriksen. He had also reported Bug #85224 - "Illegal mix of collations for time/varchar".
  • Bug #85518 - "Distinct operations on temp tables allocate too little memory for sort keys", was reported by Steinar Gunderson. He had also reported Bug #85487 - "num_tmp_files in filesort optimizer trace is nonsensical".
  • Bug #85179 - "Assert in sql/field.cc:... virtual String* Field_varstring::val_str", was reported by Matthias Leich.
I skipped several bugs that are also fixed in older versions. Many of them were already discussed in my posts. I also skipped all build/compilation/packaging bugs for now.

To summarize, while the total number of public bug reports fixed in MySQL 8.0.2 is notable, many of these bugs were reported by the few Oracle engineers who are still brave enough to report bugs in public. From the community, it seems mostly Percona and Booking.com engineers care to check MySQL 8.0.x at this early stage and report bugs. I am especially concerned by the number of private bug reports mentioned in the release notes of 8.0.2...

by Valeriy Kravchuk (noreply@blogger.com) at August 20, 2017 02:56 PM

August 19, 2017

MariaDB Foundation

MariaDB 10.2.8 and MariaDB Galera Cluster 10.0.32 now available

The MariaDB project is pleased to announce the availability of MariaDB 10.2.8 and MariaDB Galera Cluster 10.0.32. See the release notes and changelogs for details. Download MariaDB 10.2.8 Release Notes Changelog What is MariaDB 10.2? MariaDB APT and YUM Repository Configuration Generator Download MariaDB Galera Cluster 10.0.32 Release Notes Changelog What is MariaDB Galera Cluster? […]

The post MariaDB 10.2.8 and MariaDB Galera Cluster 10.0.32 now available appeared first on MariaDB.org.

by Ian Gilfillan at August 19, 2017 05:56 PM

August 18, 2017

MariaDB AB

MariaDB Server 10.2.8 now available

MariaDB Server 10.2.8 now available dbart Fri, 08/18/2017 - 14:06

The MariaDB project is pleased to announce the immediate availability of MariaDB Server 10.2.8. See the release notes and changelog for details and visit mariadb.com/downloads to download.

Download MariaDB Server 10.2.8

Release Notes Changelog What is MariaDB Server 10.2?


MariaDB Galera Cluster 10.0.32 is also now available for download.

Download MariaDB Galera Cluster 10.0.32

Release Notes Changelog What is MariaDB Galera Cluster?

Login or Register to post comments

by dbart at August 18, 2017 06:06 PM

Peter Zaitsev

This Week in Data with Colin Charles: Percona Live Europe!

Colin Charles

Join Percona Chief Evangelist Colin Charles as he covers happenings, gives pointers and provides musings on the open source database community.

Has a week passed already? Welcome back to the second column. A lot of time has been spent neck deep in getting speakers accepted and scheduled for Percona Live Open Source Database Conference Europe 2017 in Dublin, as well as organizing the conference sponsors.

Percona Live Europe Dublin

At the time of writing, we are six weeks away from the conference, so a little over a month! Have you registered yet?

We have 12 tutorials that cover a wide range of topics: ProxySQL (from the author Rene Cannao), Orchestrator (from the author Shlomi Noach), practical Couchbase (to name a few). If we did a technology word cloud, the coverage includes MongoDB, Docker, Elastic, Percona Monitoring and Management (PMM), Percona XtraDB Cluster 5.7, MySQL InnoDB Cluster and Group Replication.

In addition to that, if you’re a MySQL beginner (or thinking of a career change) there is a six-hour boot camp titled MySQL in a Nutshell (Part 1 and Part 2)! Come prepared with your laptop, and leave a MySQL DBA!

Sessions are scheduled, and most of the content is already online: check out day 1, and day 2. We have 104 sessions scheduled, so there’s plenty to choose from.

Remember that you have till 7:00 a.m. UTC-1, August 16th, 2017 to book the group rate at the event venue for €250/night. Use code PERCON.

Releases

  • orchestrator/raft: Pre-release 3.0 is available. I’m a huge fan of Orchestrator, and now you can set up high availability for Orchestrator via the Raft consensus protocol.
  • MariaDB 10.0.32 is out, and it comes with a new Percona XtraDB, Percona TokuDB and a new InnoDB. You’ll want this release if you’re using TokuDB, as it merges TokuDB 5.6.36-82.1 (which fixes two known issues).
  • If you encountered the TokuDB problems above, you’ll want to look at MariaDB 10.1.26. One surprise hidden in the release notes: MariaDB Backup is now a stable/GA release. Have you used it yet?

Link List

I look forward to feedback/tips via e-mail at colin.charles@percona.com or I’m @bytebot on Twitter.

by Colin Charles at August 18, 2017 04:36 PM

August 17, 2017

Peter Zaitsev

IMDb Data in a Graph Database

Graph Database 1

In this first-of-its-kind post, Percona welcomes Dehowe Feng, Software Developer from Bitnine, as a guest blogger. In his blog post, Dehowe discusses how importing data from IMDb into a graph database (AgensGraph) lets you quickly see how data nodes relate to each other. This blog echoes a talk given by Bitnine at the Percona Live Open Source Database Conference 2017.

Graphs help illustrate the relationships between entities through nodes, drawing connections between people and objects. Relationships in IMDb are inherently visual. Seeing how things are connected gives us a better understanding of the underlying context. By importing IMDb data as graph data, you simplify the schema and can obtain key insights.

In this post, we will examine how importing IMDb into a graph database (in this case, AgensGraph) allows us to look at data relationships in a much more visual way, providing more intuitive insights into the nature of related data.

For installation instructions for the import scripts, go here.

The Internet Movie Database (IMDb), owned by Amazon.com, is one of the largest movie databases. It contains 4.1 million titles and 7.7 million personalities (https://en.wikipedia.org/wiki/IMDb).

Relational Schema for IMDb

Graph Database 2

Relational Schema of IMDb Info

Picture courtesy of user ofthelit on StackOverflow, https://goo.gl/SpS6Ca

Because IMDb’s file format is not easy to read and parse, rather than importing the files directly we use an additional step to load them into relational tables. For this project, we used IMDbpy to load the relational data into AgensGraph in relational form. The figure above is the relational schema which IMDbpy created. This schema is somewhat complicated, but essentially there are four basic entities: Production, Person, Company and Keyword. Because there are many N-to-N relationships between these entities, the relational schema has more tables than the number of entities, which makes the schema harder to understand. For example, a person can be related to many movies and a movie can have many characters.

Concise Graph Modeling

From there, we developed our own graph schema using Production, Person, Company and Keyword as our nodes (or end data points).

Productions lie at the “center” of the graph, with everything leading to them. Keywords describe Productions, while Persons and Companies are credited for their contributions to Productions. Productions are linked to other Productions as well.

Graph Database 3

Simplified Graph Database Schema

With the data in graph form, one can easily see the connections between all the nodes. The data can be visualized as a network and querying the data with Cypher allows users to explore the connections between entities.

Compared to the relational schema of IMDb, the graph schema is much simpler to understand. By merging the related information for the main entities into nodes, we can access all information relevant to a node through the node itself, rather than having to match IDs across tables to get the information we need. If we want to examine how a node relates to another node, we can query its edges to see the connections it forms. Being able to visually “draw a connection” from one node to another helps to illustrate how they are connected.

Furthermore, the labels of the edges describe how the nodes are connected. Edge labels in the IMDb Graph describe what kind of connection is formed, and pertinent information may be stored in attributes in the edges. For example, for the connections ACTOR_IN and ACTRESS_IN, we store role data, such as character name and character id.

Data Migration

To build the vertex and edge properties we use “views”, which join the related tables. The data is migrated into a graph format by querying the relational data, using selects and joins, into a single table with the information necessary for creating each node.

For example, here is the SQL query used to create the jsonb_keyword view:

CREATE VIEW jsonb_keyword AS
SELECT row_to_json(row(keyword)) AS data
FROM keyword;

We use a view to make importing queries simpler. Once this view is created, its content can be migrated into the graph. After the graph is created, the graph_path is set, and the VLABEL is created, we can use the convenient LOAD keyword to load the JSON values from the relational table into the graph:

LOAD FROM jsonb_keyword AS keywords
CREATE (a:Keyword = data(keywords) );

Note that here LOAD is used to load data in from a relational table, but LOAD can also be used to load data from external sources as well.

Creating edges is a similar process. After creating their ELABELs, we load edges from the tables that store the id tuples linking the entities:

LOAD FROM movie_keyword AS rel_key_movie
MATCH (a:Keyword), (b:Production)
WHERE a.id::int = (rel_key_movie).keyword_id AND
b.id::int = (rel_key_movie).movie_id
CREATE (a)-[:KEYWORD_OF]->(b);

As you can see, AgensGraph is not restricted to the CSV format when importing data. We can import relational data into its graph portion using the LOAD feature and SQL statements to refine our data sets.

How is information stored?

Most of the pertinent information is held in the nodes (vertexes). Nodes are labeled either as Productions, Persons, Companies or Keywords, and their related information is stored as JSON. Since IMDb information is constantly updated, many fields for certain entities are left incomplete. Since JSON is semi-structured, if an entity does not have a certain piece of information the field will not exist at all – rather than having a field and marking it as NULL.

We also use nested JSON arrays to store data that may have multiple fields, such as quotes that persons might have said or alternate titles to productions. This makes it possible to store “duplicate” fields in each node.

How can this information be used?

In the graph IMDb database, querying between entities is very easy to learn. Using the Cypher Query Language, a user can find things such as all actors that acted in a certain production, all productions that a person has worked on, or all other companies that have worked with a certain company on any production. A graph database's strength is the simplicity of visualizing the data. There are many ways you can query a graph database to find what you need!

Find the name of all actors that acted in Night at the Museum:

MATCH (a:Person)-[:ACTOR_IN]->(b:Production)
WHERE b.title = 'Night at the Museum'
RETURN a.name,b.title;

Result:

name | title
-----------------------+---------------------
Asprinio, Stephen | Night at the Museum
Blais, Richard | Night at the Museum
Bougere, Teagle F. | Night at the Museum
Bourdain, Anthony | Night at the Museum
Cherry, Jake | Night at the Museum
Cheng, Paul Chih-Ping | Night at the Museum
...
(56 rows)

Find all productions that Ben Stiller worked on:

MATCH (a:Person)-[b]->(c:Production)
WHERE a.name = 'Stiller, Ben'
RETURN a.name,label(b),c.title;

Result:

name | label | title
-------------+-------------+-----------------------------------------------
...
Stiller, Ben | actor_in | The Heartbreak Kid: The Egg Toss
Stiller, Ben | producer_of | The Hardy Men
Stiller, Ben | actor_in | The Heartbreak Kid: Ben & Jerry
Stiller, Ben | producer_of | The Polka King
Stiller, Ben | actor_in | The Heartbreak Kid
Stiller, Ben | actor_in | The Watch
Stiller, Ben | actor_in | The History of 'Walter Mitty'
Stiller, Ben | producer_of | The Making of 'The Pick of Destiny'
Stiller, Ben | actor_in | The Making of 'The Pick of Destiny'
...
(901 rows)

Find all actresses that worked with Sarah Jessica Parker:

MATCH (a:Person)-[b:ACTRESS_IN]->(c:Production)<-[d:ACTRESS_IN]-(e:Person)
WHERE a.name = 'Parker, Sarah Jessica'
RETURN DISTINCT e.name;

Result:

name
---------------------------------
Aaliyah
Aaron, Caroline
Aaron, Kelly
Abascal, Nati
Abbott, Diane
Abdul, Paula
...
(3524 rows)

Summary

The most powerful aspects of a graph database are flexibility and visualization capabilities.

In the future, we plan to implement a one-step importing script. Currently, the importing script is two-phased: the first step is to load into relational tables and the second step is to load into the graph. Additionally, AgensGraph has worked with Gephi to release a data import plugin. The Gephi Connector allows for graph visualization and analysis. For more information, please visit www.bitnine.net and www.agensgraph.com.

by Dehowe Feng at August 17, 2017 07:36 PM

August 16, 2017

Peter Zaitsev

Percona Monitoring and Management 1.2.1 is Now Available

Percona Monitoring and Management (PMM)

Percona announces the release of Percona Monitoring and Management 1.2.1 on August 16, 2017.

For install and upgrade instructions, see Deploying Percona Monitoring and Management.

This hotfix release improves memory consumption.

Changes in PMM Server

We’ve introduced the following changes in PMM Server 1.2.1:

Bug fixes

  • PMM-1280: PMM Server affected by nginx CVE-2017-7529. An integer overflow exploit could result in a DoS (Denial of Service) of the affected nginx service when the max_ranges directive is not set. This problem is solved by setting the max_ranges directive to 1 in the nginx configuration.

Improvements

  • PMM-1232: Update the default value of the METRICS_MEMORY configuration setting. Previous versions of PMM Server used a value for the METRICS_MEMORY configuration setting that allowed Prometheus to use up to 768 MB of memory. PMM Server 1.2.0 switched to the storage.local.target-heap-size setting, whose default value is 256 MB. Unintentionally, this reduced the amount of memory that Prometheus could use, and the performance of Prometheus was affected as a result. To improve the performance of Prometheus, the default storage.local.target-heap-size has been set to 768 MB.

About Percona Monitoring and Management

Percona Monitoring and Management (PMM) is an open-source platform for managing and monitoring MySQL and MongoDB performance. Percona developed it in collaboration with experts in the field of managed database services, support and consulting.

PMM is a free and open-source solution that you can run in your own environment for maximum security and reliability. It provides thorough time-based analysis for MySQL and MongoDB servers to ensure that your data works as efficiently as possible.

A live demo of PMM is available at pmmdemo.percona.com.

We’re always happy to help! Please provide your feedback and questions on the PMM forum.

If you would like to report a bug or submit a feature request, please use the PMM project in JIRA.

by Borys Belinsky at August 16, 2017 05:31 PM

Upcoming Webinar Thursday, August 17: Efficient CRUD Queries in MongoDB

Join Percona’s Senior Technical Operations Architect, Tim Vaillancourt, as he presents “Efficient CRUD Queries in MongoDB” on Thursday, August 17, 2017, at 10:00 am PDT / 1:00 pm EDT (UTC-7).

MongoDB has its own commands and function structures that ask the database to do work. In this talk, we will discuss how queries, updates, deletes and inserts work. However, we will go beyond these actions and also review what operators you should and shouldn’t use, and how they might actually drive your schema choices. Then we will talk about operationally sound ways for bulk deleting and inserting when you want to limit the impact on production (if other tools are too aggressive).

Register for the webinar here.

Timothy Vaillancourt, Sr. Technical Operations Architect for MongoDB

Tim joined Percona in 2016 as Sr. Technical Operations Architect for MongoDB, with the goal of making MongoDB operations as smooth as possible. With experience operating infrastructures in industries such as government, online marketing/publishing, SaaS and gaming, combined with experience tuning systems from the hard disk all the way up to the end-user, Tim has spent time in nearly every area of the modern IT stack (with many lessons learned). Tim is based in Amsterdam, NL and enjoys traveling, coding and music. Before Percona, Tim was the Lead MySQL DBA of Electronic Arts’ DICE studios, helping launch and operate some of the largest games in the world (“Battlefield” series, “Mirrors Edge” series, “Star Wars: Battlefront”) smoothly. At the same time, he also led the automation of MongoDB deployments for EA systems. Before the role of DBA at EA’s DICE studio, Tim served as a subject matter expert in NoSQL databases, queues and search on the Online Operations team at EA SPORTS. Prior to moving to the gaming industry, Tim served as a Database/Systems Admin operating a large MySQL-based SaaS infrastructure at AbeBooks/Amazon Inc.

by Emily Ikuta at August 16, 2017 04:02 PM

August 15, 2017

Peter Zaitsev

Upcoming Webinar Wednesday August 16: Lock, Stock and Backup – Data Guaranteed

Backup

Join Percona’s Technical Services Manager, Jervin Real, as he presents Lock, Stock and Backup: Data Guaranteed on Wednesday, August 16, 2017, at 7:00 am PDT / 10:00 am EDT (UTC-7).

Backups are crucial in a world where data is digital and uptime is revenue. Environments are no longer bound to traditional data centers, and span multiple cloud providers and many heterogeneous environments. We need bulletproof backups and impeccable recovery processes. This talk aims to answer the question “How should I back up my MySQL databases?” by providing 3-2-1 backup designs, best practices and real-world solutions leveraging key technologies, automation techniques and major cloud provider services.

Register for the webinar here.

Jervin Real

As Technical Services Manager, Jervin partners with Percona’s customers on building reliable and highly performant MySQL infrastructures while also doing other fun stuff like watching cat videos on the internet. Jervin joined Percona in April 2010. Starting as a PHP programmer, Jervin quickly learned the LAMP stack. He has worked on several high-traffic sites and a number of specialized web applications (such as mobile content distribution). Before joining Percona, Jervin also worked with several hosting companies, providing care for customer hosted services and data on both Linux and Windows.

by Emily Ikuta at August 15, 2017 04:45 PM

Jean-Jerome Schmidt

What’s New With MySQL Replication in MySQL 8.0

Replication in MySQL has been around for a long time, and has been steadily improving over the years. It has been more evolution than revolution. This is perfectly understandable, as replication is an important feature that many depend on - it has to work.

In recent MySQL versions, we’ve seen improvements in replication performance through support for applying transactions in parallel. In MySQL 5.6, parallelization was done at the schema level - all transactions executed in separate schemas could be applied at once. This was a nice improvement for workloads with multiple schemas on a single server, where the load was distributed more or less evenly across the schemas.

In MySQL 5.7, another parallelization method was added, the so-called “logical clock”. It allows some level of concurrency on a slave even if all your data is stored in a single schema. It is based, in short, on the fact that some transactions commit together because of latency added by hardware. You can even add that latency manually, using binlog_group_commit_sync_delay, to achieve better parallelization on the slaves.

This solution was really nice but not without drawbacks. Every delay in committing a transaction could eventually affect user-facing parts of the application. Sure, you can set delays within a range of several milliseconds, but even then, it’s additional latency which slows down the app.
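
As an illustration, the delay is specified in microseconds, so a sketch of adding 5 ms of group commit delay would look like the statement below (a tuning example only - remember the latency trade-off just described):

SET GLOBAL binlog_group_commit_sync_delay = 5000;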

Replication performance improvements in MySQL 8.0

MySQL 8.0, which as of now (August 2017) is still in beta state, brings some nice improvements to replication. Originally, they were developed for Group Replication (GR), but as GR uses regular replication under the hood, “normal” MySQL replication benefited from them. The improvement we mentioned is dependency tracking information stored in the binary log. MySQL 8.0 now has a way to store information about which rows were affected by a given transaction (the so-called writeset), and it compares writesets from different transactions. This makes it possible to identify transactions that did not work on the same subset of rows and that may therefore be applied in parallel. This can increase the parallelization level several times compared to the implementation in MySQL 5.7. What you need to keep in mind is that a slave may pass through a view of the data that never appeared on the master, because transactions may be applied in a different order than on the master. This should not be a problem though. The current implementation of multithreaded replication in MySQL 5.7 may also cause this issue unless you explicitly enable slave-preserve-commit-order.

To control this new behavior, a variable binlog_transaction_dependency_tracking has been introduced. It can take three values:

  • COMMIT_ORDER: this is the default; it uses the mechanism available in MySQL 5.7.
  • WRITESET: this enables better parallelization, and the master starts to store writeset data in the binary log.
  • WRITESET_SESSION: this ensures that transactions are executed on the slave in order, eliminating the issue of a slave seeing a state of the database that was never seen on the master. It reduces parallelization, but it can still provide better throughput than the default setting.
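
As a minimal sketch of enabling the new mode (variable names as documented for MySQL 8.0; the worker count is only an example, adjust it to your hardware and workload):

-- On the master: extract writesets and store dependency information in the binary log
SET GLOBAL transaction_write_set_extraction = 'XXHASH64';
SET GLOBAL binlog_transaction_dependency_tracking = 'WRITESET';

-- On the slave: let the multi-threaded applier use the extra parallelism
STOP SLAVE SQL_THREAD;
SET GLOBAL slave_parallel_type = 'LOGICAL_CLOCK';
SET GLOBAL slave_parallel_workers = 8;
START SLAVE SQL_THREAD;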

Benchmark

In July, on mysqlhighavailability.com, Vitor Oliveira wrote a post where he tried to measure the performance of the new modes. He used the best-case scenario - no durability whatsoever - to showcase the difference between the old and new modes. We decided to use the same approach, this time in a more real-world setup: binary log enabled with log_slave_updates. Durability settings were left at their defaults (so sync_binlog=1 - the new default in MySQL 8.0 - doublewrite buffer enabled, InnoDB checksums enabled, etc.). The only exception was innodb_flush_log_at_trx_commit, set to 2.

We used m4.2xl instances with 32 GB of memory and 8 cores (so slave_parallel_workers was set to 8). We also used sysbench with the oltp_read_write.lua script. 16 million rows in 32 tables were stored on a 1000 GB gp2 volume (that’s 3000 IOPS). We tested the performance of all of the modes for 1, 2, 4, 8, 16 and 32 concurrent sysbench connections. The process was as follows: stop the slave, execute 100k transactions, start the slave and measure how long it takes to clear the slave lag.

First of all, we don’t really know what happened when sysbench was executed using only one thread. Each test was executed five times after a warmup run. This particular configuration was tested twice, and the results are stable: the single-threaded workload was the fastest. We will be looking into it further to understand what happened.

Other than that, the rest of the results are in line with what we expected. COMMIT_ORDER is the slowest one, especially for low traffic (2-8 threads). WRITESET_SESSION typically performs better than COMMIT_ORDER, but it’s slower than WRITESET for low-concurrency traffic.

How can it help me?

The first advantage is obvious: if your workload is on the slow side yet your slaves have a tendency to fall behind in replication, they can benefit from improved replication performance as soon as the master is upgraded to 8.0. Two notes here: first, this feature is backward compatible and 5.7 slaves can also benefit from it. Second, a reminder that 8.0 is still in beta state - we don’t encourage you to use beta software in production, although in dire need this is an option to test. This feature can help you not only when your slaves are lagging. They may be fully caught up, but when you create a new slave or reprovision an existing one, that slave will be lagging. Having the ability to use the “WRITESET” mode will make the process of provisioning a new host much faster.

All in all, this feature will have a much bigger impact than you may think. Given all of the benchmarks showing regressions in performance when MySQL handles low-concurrency traffic, anything that can help speed up replication in such environments is a huge improvement.

If you use intermediate masters, this is also a feature to look for. Any intermediate master adds some serialization into how transactions are handled and executed - in the real world, the workload on an intermediate master will almost always be less parallel than on the master. Utilizing writesets to allow better parallelization not only improves parallelization on the intermediate master, it can also improve parallelization on all of its slaves. It is even possible (although it would require serious testing to verify that all the pieces fit correctly) to use an 8.0 intermediate master to improve the replication performance of your slaves (please keep in mind that a MySQL 5.7 slave can understand writeset data and use it, even though it cannot generate it on its own). Of course, replicating from 8.0 to 5.7 sounds quite tricky (and not only because 8.0 is still beta). Under some circumstances this may work, and it can improve CPU utilization on your 5.7 slaves.

Other changes in MySQL replication

Introducing writesets, while the most interesting change, is not the only one that happened to MySQL replication in MySQL 8.0. Let’s go through some other, also important, changes. If you happen to use a master older than MySQL 5.0, 8.0 won’t support its binary log format. We don’t expect to see many such setups, but if you use some very old MySQL with replication, it’s definitely time to upgrade.

Default values have changed to make sure that replication is as crash-safe as possible: master_info_repository and relay_log_info_repository are set to TABLE. expire_logs_days has also been changed - now the default value is 30. In addition to expire_logs_days, a new variable has been added, binlog_expire_logs_seconds, which allows for a more fine-grained binlog rotation policy. Some additional timestamps have been added to the binary log to improve observability of replication lag, introducing microsecond granularity.
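
For reference, here is a quick sketch of setting these explicitly (the first two simply match the new 8.0 defaults and require the slave threads to be stopped; binlog_expire_logs_seconds exists only in 8.0):

STOP SLAVE;
SET GLOBAL master_info_repository = 'TABLE';
SET GLOBAL relay_log_info_repository = 'TABLE';
START SLAVE;

-- Finer-grained binlog rotation than expire_logs_days: keep 7 days of binary logs
SET GLOBAL binlog_expire_logs_seconds = 604800;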

By no means is this a full list of changes and features related to MySQL replication. If you’d like to learn more, you can check the MySQL changelogs. Make sure you review all of them - so far, features have been added in all 8.0 versions.

As you can see, MySQL replication is still changing and becoming better. As we said at the beginning, it has to be a slow-paced process but it’s really great to see what is ahead. It’s also nice to see the work for Group Replication trickling down and reused in the “regular” MySQL replication.

by krzysztof at August 15, 2017 09:29 AM

August 14, 2017

MariaDB AB

Sensitive Data Masking with MariaDB MaxScale

Sensitive Data Masking with MariaDB MaxScale Dipti Joshi Mon, 08/14/2017 - 09:11

Protecting personal and sensitive data, and complying with security and privacy regulations, is high priority for organizations. This includes personally identifiable information (PII), protected health information (PHI), payment card information (subject to PCI-DSS regulation) and intellectual property (subject to ITAR and EAR regulations). In many cases, if not most, it needs to be redacted or masked when accessed (internally and/or externally).

Data redaction obfuscates all or part of the data, reducing unnecessary exposure of sensitive data while at the same time maintaining its usability. Various terms such as data masking, data obfuscation and data anonymization are used to describe this functionality in databases. Data redaction allows an organization to:

  • Meet regulations

  • Protect against insider threat

  • Use production data for non-production use cases (e.g., testing and training)

MariaDB TX and MariaDB AX are complete solutions for high performance transactional and analytical workloads, respectively. Included in both of these solutions is MariaDB MaxScale, a next generation database proxy. In addition to load balancing and query routing, this database proxy provides enhanced security, like encryption of data in flight and masking of sensitive end user data.  

In this blog, we show you how to redact data using the masking filter in MariaDB MaxScale.

Data Masking

With the masking filter, the value of a particular column returned by a query can be obfuscated. For instance, suppose there is a table patients that, among other columns, contains the column ssn where the social security number or national identification number of a patient is stored. With the masking filter, it is possible to specify that when the ssn field is queried, a masked value is returned instead of the actual ssn.

For example:

> SELECT name, ssn FROM patients;

Without masking of the SSN column, the query result would be:

+-------+-------------+
| name  | ssn         |
+-------+-------------+
| Alice | 721-07-4426 |
| Bob   | 435-22-3267 |
…

And with masking of the SSN column, the query result would be:

+-------+-------------+
| name  | ssn         |
+-------+-------------+
| Alice | XXXXXXXXXXX |
| Bob   | XXXXXXXXXXX |
...

An example masking filter configuration in the MaxScale configuration file looks like the following:

[MyMasking]
type=filter
module=masking
warn_type_mismatch=always
large_payload=abort
rules=masking_rules.json

[MyService]
type=service
...
filters=MyMasking

And masking_rules.json will look like this:

{
   "rules": [
       {
           "replace": {
               "column": "ssn"
           },
           "with": {
               "fill": "X"
           }
       }
   ]
}

Now, running the following queries will show masked data for the ssn column:

> select name, ssn from patients;

+---------------+--------------+
| name          | ssn          |
+---------------+--------------+
| John Doe      | XXXXXXXXXXX  |
| Jack Smith    | XXXXXXXXXXX  |
| Jane Richards | XXXXXXXXXXX  |
+---------------+--------------+

> select * from patients;

+------+---------------+-------------+--------+------+
| id   | name          | ssn         | gender | age  |
+------+---------------+-------------+--------+------+
|    1 | John Doe      | XXXXXXXXXXX | M      |   55 |
|    2 | Jack Smith    | XXXXXXXXXXX | M      |   38 |
|    3 | Jane Richards | XXXXXXXXXXX | F      |   48 |
+------+---------------+-------------+--------+------+

> select name, ssn as xyz from patients;

+---------------+--------------+
| name          | xyz          |
+---------------+--------------+
| John Doe      | XXXXXXXXXXX  |
| Jack Smith    | XXXXXXXXXXX  |
| Jane Richards | XXXXXXXXXXX  |
+---------------+--------------+

We have additional enhancements planned for the masking filter in the next release (MariaDB MaxScale 2.2), including partial data masking. 

For details on how to setup the rules, please see MariaDB MaxScale masking filter guide.

If you would like to prevent users from using functions on the columns to be masked, you may use the database firewall filter to block such queries. For details on how to configure black-listing and white-listing rules with the firewall filter, please see the MariaDB MaxScale Database Firewall filter guide.
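
To see why this matters, consider a hypothetical query against the patients table above that wraps the masked column in a function. The result set no longer returns a plain ssn column, so this is exactly the kind of statement you would want the firewall filter to reject:

> SELECT name, CONCAT(ssn, '') AS ssn_copy FROM patients;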

To learn more about all the security features of MariaDB TX solution, attend this webinar to learn about advanced security features for data protection.


Login or Register to post comments

by Dipti Joshi at August 14, 2017 01:11 PM

August 12, 2017

Valeriy Kravchuk

More on Studying MySQL Hashes in gdb, and How P_S Code May Help

I have to get back to the topic of checking user variables in gdb to clarify a few more details. In his comment, Shane Bester kindly noted that calling functions defined in MySQL code is not going to work when a core dump is studied. So I ended up needing to check what the my_hash_element() function I used really does, to be ready to repeat that step by step manually. Surely I could skip that and use Python, as Shane himself did, but structures of the HASH type are widely used in MySQL, so I'd better know how to investigate them manually than blindly use existing code.

Quick search with grep for my_hash_element shows:
[root@centos mysql-server]# grep -rn my_hash_element *
include/hash.h:94:uchar *my_hash_element(HASH *hash, ulong idx);
mysys/hash.c:734:uchar *my_hash_element(HASH *hash, ulong idx)
plugin/keyring/hash_to_buffer_serializer.cc:34:      if(store_key_in_buffer(reinterpret_cast<const IKey *>(my_hash_element(keys_hash, i)),
plugin/version_token/version_token.cc:135:  while ((token_obj= (version_token_st *) my_hash_element(&version_tokens_hash, i)))
plugin/version_token/version_token.cc:879:    while ((token_obj= (version_token_st *) my_hash_element(&version_tokens_hash, i)))
sql/sql_base.cc:1051:    TABLE_SHARE *share= (TABLE_SHARE *)my_hash_element(&table_def_cache, idx);
sql/sql_base.cc:1262:        share= (TABLE_SHARE*) my_hash_element(&table_def_cache, idx);
sql/sql_udf.cc:277:    udf_func *udf=(udf_func*) my_hash_element(&udf_hash,idx);
...
sql/table_cache.cc:180:      (Table_cache_element*) my_hash_element(&m_cache, idx);
sql/rpl_gtid.h:2123:        Node *node= (Node *)my_hash_element(hash, i);
sql/rpl_gtid.h:2274:          node= (Node *)my_hash_element(hash, node_index);
sql/rpl_tblmap.cc:168:    entry *e= (entry *)my_hash_element(&m_table_ids, i);
sql/rpl_master.cc:238:    SLAVE_INFO* si = (SLAVE_INFO*) my_hash_element(&slave_list, i);
sql-common/client.c:3245:        LEX_STRING *attr= (LEX_STRING *) my_hash_element(attrs, idx);
storage/perfschema/table_uvar_by_thread.cc:76:    sql_uvar= reinterpret_cast<user_var_entry*> (my_hash_element(& thd->user_vars, index));
storage/ndb/include/util/HashMap.hpp:155:    Entry* entry = (Entry*)my_hash_element(&m_hash, (ulong)i);
storage/ndb/include/util/HashMap.hpp:169:    Entry* entry = (Entry*)my_hash_element((HASH*)&m_hash, (ulong)i);
[root@centos mysql-server]#
That is, the HASH structure is used everywhere in MySQL, from the keyring to UDFs and the table cache, to replication and NDB Cluster, with everything in between. If I can navigate to each HASH element and dump/print it, I can better understand a lot of code, if needed. If anyone cares, HASH is defined in a very simple way in include/hash.h:
typedef struct st_hash {
  size_t key_offset,key_length;         /* Length of key if const length */
  size_t blength;
  ulong records;
  uint flags;
  DYNAMIC_ARRAY array;                          /* Place for hash_keys */
  my_hash_get_key get_key;
  void (*free)(void *);
  CHARSET_INFO *charset;
  my_hash_function hash_function;
  PSI_memory_key m_psi_key;
} HASH;
It relies on DYNAMIC_ARRAY to store keys.

The code of the my_hash_element function in mysys/hash.c is very simple:
uchar *my_hash_element(HASH *hash, ulong idx)
{
  if (idx < hash->records)
    return dynamic_element(&hash->array,idx,HASH_LINK*)->data;
  return 0;
}
Quick search for dynamic_element shows that it's actually a macro:
[root@centos mysql-server]# grep -rn dynamic_element *
client/mysqldump.c:1608:    my_err= dynamic_element(&ignore_error, i, uint *);
extra/comp_err.c:471:      tmp= dynamic_element(&tmp_error->msg, i, struct message*);
extra/comp_err.c:692:    tmp= dynamic_element(&err->msg, i, struct message*);
extra/comp_err.c:803:  first= dynamic_element(&err->msg, 0, struct message*);
include/my_sys.h:769:#define dynamic_element(array,array_index,type) \
mysys/hash.c:126:  HASH_LINK *data= dynamic_element(&hash->array, 0, HASH_LINK*);
...
that is defined in include/my_sys.h as follows:
#define dynamic_element(array,array_index,type) \
  ((type)((array)->buffer) +(array_index))
So now it's clear what to do in gdb, keeping in mind which array we use. Let me start the session, find the thread I am interested in and try to check the elements one by one:
(gdb) thread 2
[Switching to thread 2 (Thread 0x7fc3a037b700 (LWP 3061))]#0  0x00007fc3d2cb3383 in poll () from /lib64/libc.so.6
(gdb) p do_command::thd->m_thread_id
$1 = 5
(gdb) p do_command::thd->user_vars
$2 = {key_offset = 0, key_length = 0, blength = 4, records = 3, flags = 0,
  array = {buffer = 0x7fc3ba2b3560 "\377\377\377\377", elements = 3,
    max_element = 16, alloc_increment = 32, size_of_element = 16,
    m_psi_key = 38},
  get_key = 0xc63630 <get_var_key(user_var_entry*, size_t*, my_bool)>,
  free = 0xc636c0 <free_user_var(user_var_entry*)>, charset = 0x1ded740,
  hash_function = 0xeb6990 <cset_hash_sort_adapter>, m_psi_key = 38}
(gdb) set $uvars=&(do_command::thd->user_vars)
(gdb) p $uvars
$3 = (HASH *) 0x7fc3b9fad280
...
(gdb) p &($uvars->array)
$5 = (DYNAMIC_ARRAY *) 0x7fc3b9fad2a8
(gdb) p ((HASH_LINK*)((&($uvars->array))->buffer) + (0))
$6 = (HASH_LINK *) 0x7fc3ba2b3560
(gdb) p ((HASH_LINK*)((&($uvars->array))->buffer) + (0))->data
$7 = (uchar *) 0x7fc3b9fc80e0 "H\201\374\271\303\177"
(gdb) p (user_var_entry *)(((HASH_LINK*)((&($uvars->array))->buffer) + (0))->data)
$8 = (user_var_entry *) 0x7fc3b9fc80e0
(gdb) p *(user_var_entry *)(((HASH_LINK*)((&($uvars->array))->buffer) + (0))->data)
$9 = {static extra_size = 8, m_ptr = 0x7fc3b9fc8148 "bbb", m_length = 3,
  m_type = STRING_RESULT, m_owner = 0x7fc3b9fad000, m_catalog = {
    str = 0x100000000 <Address 0x100000000 out of bounds>,
    length = 416611827727}, entry_name = {m_str = 0x7fc3b9fc8150 "a",
    m_length = 1}, collation = {collation = 0x1ded740,
    derivation = DERIVATION_IMPLICIT, repertoire = 3}, update_query_id = 25,
  used_query_id = 25, unsigned_flag = false}
...
(gdb) p *(user_var_entry *)(((HASH_LINK*)((&($uvars->array))->buffer) + (2))->data)
$11 = {static extra_size = 8, m_ptr = 0x7fc3b9e6e220 "\002", m_length = 64,
  m_type = DECIMAL_RESULT, m_owner = 0x7fc3b9fad000, m_catalog = {str = 0x0,
    length = 0}, entry_name = {m_str = 0x7fc3b9fc8290 "c", m_length = 1},
  collation = {collation = 0x1ded740, derivation = DERIVATION_IMPLICIT,
    repertoire = 3}, update_query_id = 25, used_query_id = 25,
  unsigned_flag = false}
(gdb)
I tried to highlight the important details above. With gdb variables it's a matter of proper type casts and dereferencing. In general, I was printing the content of the item (a user variable in this case) with index N as *(user_var_entry *)(((HASH_LINK*)((&($uvars->array))->buffer) + (N))->data).

Now, back to printing the variables.  Let's see how this is done in performance_schema, in storage/perfschema/table_uvar_by_thread.cc:
     74   for (;;)
     75   {
     76     sql_uvar= reinterpret_cast<user_var_entry*> (my_hash_element(& thd->user_vars, index));
     77     if (sql_uvar == NULL)
     78       break;
...
     98     /* Copy VARIABLE_NAME */
     99     const char *name= sql_uvar->entry_name.ptr();
    100     size_t name_length= sql_uvar->entry_name.length();
    101     DBUG_ASSERT(name_length <= sizeof(pfs_uvar.m_name));
    102     pfs_uvar.m_name.make_row(name, name_length);
    103
    104     /* Copy VARIABLE_VALUE */
    105     my_bool null_value;
    106     String *str_value;
    107     String str_buffer;
    108     uint decimals= 0;
    109     str_value= sql_uvar->val_str(& null_value, & str_buffer, decimals);
    110     if (str_value != NULL)
    111     {
    112       pfs_uvar.m_value.make_row(str_value->ptr(), str_value->length());
    113     }
    114     else
    115     {
    116       pfs_uvar.m_value.make_row(NULL, 0);
    117     }
    118
    119     index++;
    120   }

So, there we check elements by index until there is no element with such an index, and apply the val_str() function of the class. While debugging a live server we can do the same, but if we care to see how it works step by step, here is the code from sql/item_func.cc:
String *user_var_entry::val_str(my_bool *null_value, String *str,
                                uint decimals) const
{
  if ((*null_value= (m_ptr == 0)))
    return (String*) 0;

  switch (m_type) {
  case REAL_RESULT:
    str->set_real(*(double*) m_ptr, decimals, collation.collation);
    break;
  case INT_RESULT:
    if (!unsigned_flag)
      str->set(*(longlong*) m_ptr, collation.collation);
    else
      str->set(*(ulonglong*) m_ptr, collation.collation);
    break;
  case DECIMAL_RESULT:
    str_set_decimal((my_decimal *) m_ptr, str, collation.collation);
    break;
  case STRING_RESULT:
    if (str->copy(m_ptr, m_length, collation.collation))
      str= 0;                                   // EOM error
  case ROW_RESULT:
    DBUG_ASSERT(1);                             // Impossible
    break;
  }
  return(str);
}
For INT_RESULT and REAL_RESULT it's all clear, and Shane did essentially the same in his Python code. For strings we have to copy the proper bytes into a zero-terminated string, or use methods of the String class if we debug a live server, to get the entire string data. For DECIMAL_RESULT I checked the implementation of str_set_decimal(), which eventually relies on decimal2string(), and that... looks somewhat complicated (check yourself in strings/decimal.c). So, for any practical purpose, I'd better just print the my_decimal structure in gdb instead of re-implementing this function in Python.
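
On a live MySQL 5.7 server, of course, there is no need for gdb at all: the Performance Schema code quoted above backs the performance_schema.user_variables_by_thread table, so a quick query shows the same data. A sketch (the variable value here is made up, just mirroring the session above):

mysql> SET @a = 'bbb';
mysql> SELECT thread_id, variable_name, variable_value
    ->   FROM performance_schema.user_variables_by_thread;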

To summarize, the HASH structure is widely used in MySQL, and it is easy to dump any of these hashes in gdb, item by item, in the same way as, for example, Performance Schema in MySQL 5.7 does for user variables. Getting a string representation of my_decimal "manually" is complicated.

by Valeriy Kravchuk (noreply@blogger.com) at August 12, 2017 04:45 PM

August 11, 2017

MariaDB AB

Extending the Power of MariaDB ColumnStore with User Defined Functions

Extending the Power of MariaDB ColumnStore with User Defined Functions david_thompson_g Fri, 08/11/2017 - 18:40

 

Introduction

MariaDB ColumnStore 1.0 supports User Defined Functions (UDF) for query extensibility. This allows you to create custom filters and transformations to suit any need. This blog outlines adding support for distributed JSON query filtering. 

An important MariaDB ColumnStore concept to grasp is that there are distributed and non-distributed functions. Distributed functions are executed on the PM nodes, supporting query execution scale-out. Non-distributed functions are MariaDB Server functions that are executed within the UM node. As a result, MariaDB ColumnStore requires two distinct implementations of any function.

The next release, MariaDB ColumnStore 1.1, will bring support for User Defined Aggregate Functions and User Defined Window Functions.

Getting Started

Developing a User Defined Function requires familiarity with C, C++ and Linux. At a high level the process involves:

  • Setting up a development environment to build MariaDB ColumnStore from source
  • Implementing the MariaDB Server UDF interface
  • Implementing the MariaDB ColumnStore Distributed UDF interface

You can find more information in the MariaDB knowledge base: User Defined Functions.

Example - JSON Pointer Query

In this example, I will create a simple UDF for JSON pointer queries using the RapidJSON library from Tencent. It will have the following syntax:

json_ptr(<json>, <path>)

The first argument is the JSON string. The second argument is the JSON pointer string.

Build Files

RapidJSON is a C++ header only library, so it must be included with the source code. I’ve chosen to add the RapidJSON source code as a submodule in the mariadb-columnstore-engine source code repository:

git submodule add https://github.com/miloyip/rapidjson utils/udfsdk/rapidjson

This will create a submodule rapidjson under the utils/udfsdk directory where the UDF SDK code is defined. 

The rapidjson include directory is added to the cmake file utils/udfsdk/CMakeLists.txt to make it available for use (last line below)

include_directories( ${ENGINE_COMMON_INCLUDES}
                     ../../dbcon/mysql
                     rapidjson/include)


MariaDB UDF SDK 

To implement the MariaDB Server UDF SDK, three procedures must be implemented in utils/udfsdk/udfmysql.cpp:

  • <name>_init : initialization / parameter validation
  • <name> : the function implementation
  • <name>_deinit : any clean up needed, for example freeing of allocated memory

The init function, in this example, will validate the argument count:

my_bool json_ptr_init(UDF_INIT* initid, UDF_ARGS* args, char* message)
{
    if (args->arg_count != 2)
    {
        strcpy(message,"json_ptr() requires two arguments: expression, path");
        return 1;
    }

    return 0;
}

The function implementation will take the JSON string, parse it into an object structure, evaluate the JSON pointer path and return a string representation of the result:

 1 string json_ptr(UDF_INIT *initid, UDF_ARGS *args, char *is_null, char *error)
 2 {
 3     string json = cvtArgToString(args->arg_type[0], args->args[0]);
 4     string path = cvtArgToString(args->arg_type[1], args->args[1]);
 5     Document d;
 6     d.Parse(json.c_str());
 7    if (Value *v = Pointer(path.c_str()).Get(d)) {
 8         rapidjson::StringBuffer sb;
 9         Writer<StringBuffer> writer(sb);
10         v->Accept(writer);
11         return string(sb.GetString());
12     }
13     else {
14         return string();
15     }
16 }

Here is a brief explanation of the code:

  • Lines 3-4 : read the 2 arguments and convert to string.
  • Lines 5-6 : Parse the JSON string into a RapidJSON Document object .
  • Line 7 : Execute the JSON pointer path against the JSON document into the Value object.
  • Lines 8-11 : If the Value object is not null then serialize Value to a string. The Value class also offers strongly typed accessors that may also be used. Note that string values will be serialized with surrounding double quotes.
  • Line 14: If the Value object is null then return the empty string.

The following RapidJSON includes are required at the top of the file:

#include "rapidjson/document.h"
#include "rapidjson/pointer.h"
#include "rapidjson/stringbuffer.h"
#include "rapidjson/writer.h"

using namespace rapidjson;


The json_ptr_deinit function must be declared, but it is empty – there is nothing we need to do for clean up.

void json_ptr_deinit(UDF_INIT* initid)
{
}

MariaDB ColumnStore Distributed UDF SDK

The MariaDB ColumnStore distributed UDF SDK requires defining a class and registering an instance of it for the function to be usable.

First the class must be declared in utils/udfsdk/udfsdk.h. The simplest approach is to clone one of the existing reference implementations:

class json_ptr : public funcexp::Func
{
  public:
    json_ptr() : Func("json_ptr") {}

    virtual ~json_ptr() {}

    execplan::CalpontSystemCatalog::ColType operationType(          
      funcexp::FunctionParm& fp, 
      execplan::CalpontSystemCatalog::ColType& resultType);

    virtual int64_t getIntVal(
      rowgroup::Row& row,
      funcexp::FunctionParm& fp,
      bool& isNull,
      execplan::CalpontSystemCatalog::ColType& op_ct);

    virtual double getDoubleVal(
      rowgroup::Row& row,
      funcexp::FunctionParm& fp,
      bool& isNull,
      execplan::CalpontSystemCatalog::ColType& op_ct);

     virtual float getFloatVal(
       rowgroup::Row& row,
       funcexp::FunctionParm& fp,
       bool& isNull,
       execplan::CalpontSystemCatalog::ColType& op_ct);

     virtual std::string getStrVal(
       rowgroup::Row& row,
       funcexp::FunctionParm& fp,
       bool& isNull,
       execplan::CalpontSystemCatalog::ColType& op_ct);

     virtual bool getBoolVal(
       rowgroup::Row& row,
       funcexp::FunctionParm& fp,
       bool& isNull,
       execplan::CalpontSystemCatalog::ColType& op_ct);

      virtual execplan::IDB_Decimal getDecimalVal(
        rowgroup::Row& row,
        funcexp::FunctionParm& fp,
        bool& isNull,                                                    
        execplan::CalpontSystemCatalog::ColType& op_ct);
   
      virtual int32_t getDateIntVal(
        rowgroup::Row& row,
        funcexp::FunctionParm& fp,
        bool& isNull,
        execplan::CalpontSystemCatalog::ColType& op_ct);

      virtual int64_t getDatetimeIntVal(
        rowgroup::Row& row,
        funcexp::FunctionParm& fp,
        bool& isNull,
        execplan::CalpontSystemCatalog::ColType& op_ct);

  private:
    void log_debug(std::string arg1, std::string arg2);

  };
}

It can be seen that the following methods are defined:

  • operationType which is used to indicate the return type of the function, which could be dynamic.
  • A number of getVal methods, which perform the operation for a given return type.
  • An optional private log_debug method, which could be used for debug logs in your implementation.

Next, the class is implemented in utils/udfsdk/udfsdk.cpp. First, an entry must be added to register the class by name (in lower case). This is added to the UDFMap function:

FuncMap UDFSDK::UDFMap() const
{
    FuncMap fm;
    fm["mcs_add"] = new MCS_add();
    fm["mcs_isnull"] = new MCS_isnull();
    fm["json_ptr"] = new json_ptr();  // new entry for the json_ptr function
    return fm;
}

The class is implemented; for brevity, only a subset of the methods is shown below:

    CalpontSystemCatalog::ColType json_ptr::operationType (
      FunctionParm& fp,                                                           
      CalpontSystemCatalog::ColType& resultType) {
        assert (fp.size() == 2);
        return fp[0]->data()->resultType();
    }

    void json_ptr::log_debug(string arg1, string arg2) {
        logging::LoggingID lid(28); // 28 = primproc
        logging::MessageLog ml(lid);

        logging::Message::Args args;
        logging::Message message(2);
        args.add(arg1);
        args.add(arg2);
        message.format( args );
        ml.logDebugMessage( message );
    }

    string json_ptr::getStrVal(Row& row,
                               FunctionParm& parm,
                               bool& isNull,
                               CalpontSystemCatalog::ColType& op_ct) {
        string json = parm[0]->data()->getStrVal(row, isNull);
        string path = parm[1]->data()->getStrVal(row, isNull);

        Document d;
        d.Parse(json.c_str());
        if (Value *v = Pointer(path.c_str()).Get(d)) {
            StringBuffer sb;
            Writer<StringBuffer> writer(sb);
            v->Accept(writer);
            return string(sb.GetString());
        }
        else {
            return string();
        }
    }

    double json_ptr::getDoubleVal(Row& row,
                                 FunctionParm& parm,
                                 bool& isNull,
                                 CalpontSystemCatalog::ColType& op_ct)
    {
        throw logic_error("Invalid API called json_ptr::getDoubleVal");
    }


The following methods are implemented:

  • operationType which validates that there are exactly 2 arguments and specifies the return type as a string.
  • Optional log_debug method which illustrates how to log to the debug log.
  • getStrVal which performs the JSON evaluation. This is similar to the MariaDB UDF implementation, with the exception that it is strongly typed and the arguments are retrieved differently.
  • getDoubleVal illustrates throwing an error stating that this is not a supported operation. The method could be implemented of course but for simplicity this was not done.

The complete set of changes can be seen in github.

Building and Using the Example

This example can be built from a branch created from the 1.0.10 code. Once you have cloned the mariadb-columnstore-engine source tree, use the following instructions to build and install it on the same server:

git checkout json_ptr_udf
git submodule update --init
cmake .
make -j4
cd utils/udfsdk/
sudo cp libudf_mysql.so.1.0.0 libudfsdk.so.1.0.0 /usr/local/mariadb/columnstore/lib

For a multi server setup, the library files should be copied to the same location on each server. After this, restart the MariaDB ColumnStore instance to start using the functions.

mcsadmin restartSystem

The function is registered using the create function syntax:

create function json_ptr returns string soname 'libudf_mysql.so';

Now a simple example will show the user defined function in action:

create table animal(
id int not null, 
creature varchar(30) not null, 
name varchar(30), 
age decimal(18) ,
attributes varchar(1000)
) engine=columnstore;

insert into animal(id, creature, name, age, attributes) 
values (1, 'tiger', 'roger', 10, '{"fierce": true, "colors": ["white", "orange", "black"]}'), 
       (2, 'tiger', 'sally', 2, '{"fierce": true, "colors": ["white", "orange", "black"]}'), 
       (3, 'lion', 'michael', 56, '{"fierce": false, "colors": ["grey"], "weight" : 9000}');

select name, json_ptr(attributes, '/fierce') is_fierce 
from animal 
where json_ptr(attributes, '/weight') = '9000';
+---------+-----------+
| name    | is_fierce |
+---------+-----------+
| michael | false     |
+---------+-----------+
1 row in set (0.05 sec)

In the example, you can see that the where clause uses the json_ptr function to filter on the weight element of the attributes JSON column, while the select clause retrieves the fierce element.
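
The JSON Pointer syntax can also address array elements by index. As a small, hypothetical follow-up against the same table (not part of the original example), something like the following should return the first entry of the colors array; because the matched value is serialized back as JSON, string results come back with their surrounding quotes:

-- Hypothetical query: /colors/0 addresses the first element of the colors array
select name, json_ptr(attributes, '/colors/0') first_color
from animal
where creature = 'tiger';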

I hope this inspires you to further enhance this example or come up with your own user defined functions to extend the query capabilities of MariaDB ColumnStore.

Getting started? Download MariaDB ColumnStore today and learn how you can get started with MariaDB ColumnStore in 10 minutes.
 


by david_thompson_g at August 11, 2017 10:40 PM

Peter Zaitsev

Learning MySQL 5.7: Q & A

MySQL 5.7

In this post I’ll answer questions I received in my Wednesday, July 19, 2017, webinar Learning MySQL 5.7!

First, thank you to everyone who attended the webinar. The link to the slides and the webinar recording can be found here.

I received a number of interesting questions in the webinar that I’ve followed up with below.

Would there be a big difference on passing from 5.1 to 5.6 before going to 5.7 or, at this point, would it be roughly the same?

The biggest risk of jumping between versions, in this case 5.1 to 5.6, is reverting in case of problems. Rollbacks don’t happen often, but they do happen, and you have to make sure you have the infrastructure in place whenever you decide to execute one. These upgrade steps are not officially supported by Oracle, nor even recommended here at Percona. Having said that, as long as your tests (checksums, pt-upgrade) and rollback plan work, this shouldn’t be a problem.

One unforgettable issue I have personally encountered is an upgrade from 5.1 to 5.6 via dump and reload. The 5.6 instance ran with the ROW binlog format, preventing replication back to 5.1 because of a limitation with TIMESTAMP columns. Similarly, downgrading without replication means you have to deal with changes to the MySQL system schema, which obviously require some form of downtime.

Additionally, replication from 5.7 to 5.5 will not work because of the additional metadata that 5.7 writes to the binary log (i.e., GTID events, even when GTID is disabled).

After an in-place upgrade of a Percona XtraDB Cluster from 5.5 to 5.7 (through 5.6), innodb_file_per_table is enabled by default and the database is now almost twice the size. It was a 40 GB DB; now it’s 80 GB because every table has its own file, yet ibdata1 is still 40 GB. Is there any solution for this (that doesn’t involve mysqldump and drop tables), and how can this be avoided in future upgrades?

The reason this might be the case is that after upgrading, a number (or possibly all) of tables were [re]created. This would obviously create separate tablespaces for each. One way I can think of reclaiming that disk space is through a familiar upgrade path:

  1. Detach one of the nodes and make it an async replica of the remaining nodes in the cluster
  2. Dump and reload the data on this node, then resume replication
  3. Join the other nodes from the original cluster as additional nodes of a new cluster built around the async replica
  4. Once there is only one node remaining in the original cluster, you can switch production over to the new cluster
  5. Rejoin the last node from the original cluster into the new cluster to complete the process

Depending on the semantics of your switch, it may or may not involve downtime. For example, if you use ProxySQL, this should be a transparent operation.

One way to avoid this problem is by testing. Testing the upgrade process in a lab will expose this kind of information even before deploying the new version into production, allowing you to adjust your process accordingly.
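
Before committing to a rebuild like the one above, it can also help to confirm where the space actually went. A minimal sketch using the data dictionary (the figures are estimates, not exact file sizes):

-- Approximate InnoDB data + index footprint per schema, in GB
SELECT table_schema,
       ROUND(SUM(data_length + index_length) / 1024 / 1024 / 1024, 1) AS approx_gb
FROM information_schema.tables
WHERE engine = 'InnoDB'
GROUP BY table_schema;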

What is a possible impact on upgrades going from the old table format to Barracuda?

So far I am not aware of any negative impact, except if you upgrade and later need to downgrade but have since created indexes with prefixes larger than what was supported on the previous version (see the innodb_large_prefix and Barracuda documentation).

Upgrading to Barracuda and one of its supported row formats (DYNAMIC or COMPRESSED) specifically allows memory-constrained systems to save a little more. With BLOB/TEXT columns stored off-page, they will not fill the buffer pool unless they are needed.
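
As a minimal sketch of what that change looks like (customer_notes is a hypothetical table; in 5.7 innodb_file_format already defaults to Barracuda, and innodb_large_prefix controls the longer index prefixes mentioned above):

-- Convert a table to the Barracuda DYNAMIC row format so that long BLOB/TEXT
-- values are stored fully off-page instead of keeping a 768-byte prefix in-page
ALTER TABLE customer_notes ROW_FORMAT=DYNAMIC;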

How do you run mysql_upgrade in parallel?

Good question; I actually wrote about it here.

Can you elaborate on ALTER progress features, and is it also applicable to an “Optimization” query?

I was not able to get more details on the “Optimization” part of this question. I can only assume this too was meant to be a table rebuild via OPTIMIZE TABLE. First, I would like to point out that OPTIMIZE has been an online DDL operation since 5.6 (with a few limitations). As such, there is almost no point in monitoring it. Also, for the cases where online DDL does not apply to OPTIMIZE, under the hood this is ALTER TABLE .. FORCE, a full table rebuild.

Now, for the actual ALTER process doing a table copy/rebuild, MySQL 5.7 provides some form of progress indication as to how much work has been done. However, it does not necessarily provide an estimate of the actual time it would take to complete. Each ALTER process has different phases which can vary under different conditions. Alternatively, you can also employ other ways of monitoring progress as described in the post.
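
For reference, this is roughly how the 5.7 progress indication is queried through the Performance Schema stage instruments (a sketch using the standard instrument and consumer names):

-- Enable the InnoDB ALTER TABLE stage instruments and the stage consumers
UPDATE performance_schema.setup_instruments
SET ENABLED = 'YES', TIMED = 'YES'
WHERE NAME LIKE 'stage/innodb/alter%';

UPDATE performance_schema.setup_consumers
SET ENABLED = 'YES'
WHERE NAME LIKE 'events_stages_%';

-- While the ALTER runs in another session, compare completed vs. estimated work
SELECT EVENT_NAME, WORK_COMPLETED, WORK_ESTIMATED
FROM performance_schema.events_stages_current;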

We migrated from Percona Server 5.7.11 to 5.7.17 and are facing “Column 1 of table 'x.x' cannot be converted from type 'varchar(100)' to type 'varchar(100)'”.

This is interesting: what we have seen so far are errors with different data types or sizes, which most likely means an inconsistency between the table structures if the error is coming from replication. We will need more information on what steps were taken during the upgrade to tell what happened here. Our forums would be the best place to continue this conversation. To begin with, perhaps slave_type_conversions might help if the table structures in replication really are the same.
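
If the table structures really are the same and the error comes from the replication applier, a hedged sketch of relaxing the type checks on the replica looks like this (slave_type_conversions is a dynamic global variable in 5.7):

-- Allow non-lossy attribute promotion/demotion on the replica, then restart the applier
STOP SLAVE SQL_THREAD;
SET GLOBAL slave_type_conversions = 'ALL_NON_LOSSY';
START SLAVE SQL_THREAD;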

Is the Boost Geometry almost on par with Postgres GIS functions?

I cannot answer this with authority or certainty. I’ve used GIS functions in MySQL, but have not developed code for them. Although Boost::Geometry was chosen for its well-designed API, rapid development and license compatibility, that does not necessarily mean it is more mature than PostGIS (which is widely adopted).

What is the best bulk insert method for MySQL 5.7?

The best option can be different in many situations, so we have to put it in context. For this reason, let me give some example scenarios and what might work best:

  • On an upgrade process where you are doing a full dump and reload, parallelizing the process with mydumper/myloader or mysqlpump will save a lot of time, depending on the hardware resources available.
  • For bulk INSERTs from your application that happen at regular intervals, multi-row inserts are always ideal to reduce disk writes per insert. LOAD DATA INFILE is also a popular option if you can use it (see the sketch below).
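
To make those two options concrete, here is a minimal sketch (the events table and the file path are hypothetical):

-- Multi-row insert: one statement and one commit instead of three
INSERT INTO events (id, payload)
VALUES (1, 'a'), (2, 'b'), (3, 'c');

-- LOAD DATA INFILE: usually the fastest way to load a flat file
-- (the server must be able to read the path, subject to secure_file_priv)
LOAD DATA INFILE '/tmp/events.csv'
INTO TABLE events
FIELDS TERMINATED BY ',' ENCLOSED BY '"'
LINES TERMINATED BY '\n';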

Again, thank you for attending the webinar – if you have additional questions head on out to the Percona Forums!

by Jervin Real at August 11, 2017 06:00 PM

August 10, 2017

MariaDB Foundation

MariaDB 10.1.26 now available

The MariaDB project is pleased to announce the immediate availability of MariaDB 10.1.26. See the release notes and changelogs for details.

Download MariaDB 10.1.26
Release Notes
Changelog
What is MariaDB 10.1?
MariaDB APT and YUM Repository Configuration Generator

Thanks, and enjoy MariaDB!

The post MariaDB 10.1.26 now available appeared first on MariaDB.org.

by Ian Gilfillan at August 10, 2017 07:21 PM

MariaDB AB

MariaDB Server 10.1.26 now available


The MariaDB project is pleased to announce the immediate availability of MariaDB Server 10.1.26. See the release notes and changelog for details and visit mariadb.com/downloads to download.

Download MariaDB Server 10.1.26

Release Notes
Changelog
What is MariaDB Server 10.1?




by dbart at August 10, 2017 03:52 PM