Planet MariaDB

June 27, 2016

Peter Zaitsev

Webinar Wednesday June 29: Percona XtraDB Cluster Reference Architecture

Percona XtraDB Cluster Reference Architecture

Please join Jay Janssen for the webinar Percona XtraDB Cluster Reference Architecture on Wednesday, June 29 at 10:00 AM PDT (UTC-7).

A reference architecture shows a typical, common, best-practice deployment of a system with all the surrounding infrastructure. In the case of database clusters, this can include the hosting platform, load balancing, monitoring, backups, etc.

Percona published a commonly referenced Percona XtraDB Cluster reference architecture on the Percona blog in 2012 (which is included in the current manual). However, this architecture is now out of date.

This talk will present a revised and updated Percona XtraDB Cluster reference architecture for 2016, including:

  • Load balancing
  • Read/Write splitting
  • Monitoring
  • Backups

This will include some variants, such as:

  • AWS hosting
  • WAN deployments
  • Async slaves

Register now.

Jay Janssen, Managing Principal Architect

Jay came to Percona in 2011 after working seven years for Yahoo! He has worked in a variety of fields including High Availability architectures, MySQL training, tool building, global server load balancing, multi-datacenter environments, operationalization, and monitoring. Jay holds a B.S. in Computer Science from Rochester Institute of Technology. He and his wife live with their four children in upstate New York.

by Dave Avery at June 27, 2016 06:56 PM

The need for parallel crash recovery in MySQL

parallel crash recovery in MySQL

In this blog, I will discuss how parallel crash recovery in MySQL benefits several processes.

I recently filed an Oracle feature request to make crash recovery faster by running in multiple threads.

This might not seem very important, because MySQL does not crash that often. When it does crash, however, crash recovery can take 45 mins – as I showed in this post:

What is a big innodb_log_file_size?

Even in that case, it still might not be a big issue as you often failover to a slave.

However, crash recovery plays an important part in the following processes:

  • Backups with Percona XtraBackup (and MySQL Enterprise Backups) and backups with filesystem snapshots.
    • Crash recovery is part of the backup process, and it is important to make the backup task faster.
  • State Snapshot Transfer in Percona XtraDB Cluster.
    • SST, whether XtraBackup- or rsync-based, also relies on the crash recovery process – so the faster it is done, the faster a new node joins the cluster.
    • It might seem that Oracle shouldn’t care about Percona XtraDB Cluster, but they are working on MySQL Group Replication. I suspect that when Group Replication copies data to a new node, it will also rely on some kind of snapshot technique (unless they aren’t serious about this feature and will recommend mysqldump/mysqlpump for data copying).
  • My recent proof of concept for Automatic Slave propagation in Docker environment also uses Percona XtraBackup, and therefore crash recovery for new slaves.

In general, any process that involves MySQL/InnoDB data transfer will benefit from faster crash recovery. In its current state, crash recovery uses just one thread to read and process data. This limits performance on modern hardware, which uses multiple CPU cores and fast SSD drives.
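
As an illustration (my addition, not from the original post), the same crash recovery code path is what runs during the prepare phase of a Percona XtraBackup copy, so its single-threaded nature is easy to observe there. A minimal sketch, assuming a full backup already exists in a hypothetical /data/backups/full directory:

xtrabackup --prepare --target-dir=/data/backups/full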

It is also important to consider that crash recovery time affects how big the log files can be. If we improve the crash recovery time, we can use very large InnoDB log files (which positively affects performance in general).
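
For context (my addition), the redo log size is controlled by innodb_log_file_size and innodb_log_files_in_group; a quick way to check the current values:

mysql -e "SHOW GLOBAL VARIABLES LIKE 'innodb_log_file%';"

Larger values reduce checkpoint pressure but lengthen worst-case crash recovery, which is exactly the trade-off that parallel recovery would relax.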

Percona is working on ways to make it faster. However, if faster recovery times are important to your environment, I encourage you to let Oracle know that you want to see parallel crash recovery in MySQL.

by Vadim Tkachenko at June 27, 2016 06:14 PM

Mr. D vs crayon plugin

this is a common sql query highlighted by crayon:

select * from blog where user like '%anyone%'

let’s try to cheat crayon

select * from blog where user like '%anyone%'

by Leonardo Erpi at June 27, 2016 05:00 PM

Jean-Jerome Schmidt

MySQL on Docker: Building the Container Image

Building a docker image for MySQL is essential if you’d like to customize MySQL to suit your needs. The image can then be used to quickly start running MySQL containers, which can be operated individually. In our previous post, we covered the basics of running MySQL as container. For that purpose, we used the MySQL image from the official Docker repository. In this blog post, we’ll show you how to build your own MySQL image for Docker.

What is a container image?

A container requires an image to run. A container image is like a virtual machine template. It has all the required stuff to run the container. That includes operating system, software packages, drivers, configuration files and helper scripts packed in one bundle.

When running MySQL on a physical host, here is what you would normally do:

  1. Prepare a host with proper networking
  2. Install operating system of your choice
  3. Install MySQL packages via package repository
  4. Modify the MySQL configuration to suit your needs
  5. Start the MySQL service

Running a MySQL Docker image would look like this:

  1. Install Docker engine on the physical host
  2. Download a MySQL image from public (Docker Hub) or private repository, or build your own MySQL image
  3. Run the MySQL container based on the image, which is similar to starting the MySQL service

As you can see, the Docker approach involves fewer deployment steps to get MySQL up and running. The MySQL service running in a container will work in almost any kind of environment, as long as you have the Docker engine running. Building a MySQL container image requires some understanding of the process flow, since Docker expects only one process per container.

Consider the following actions:

  1. The image is pulled from Docker Hub on the machine host by using:

    $ docker pull mysql
  2. Spin up two MySQL containers and map them with their respective volume:

    $ docker run -d --name=mysql1 -e MYSQL_ROOT_PASSWORD='mypassword' -v /storage/mysql1/mysql-datadir:/var/lib/mysql mysql
    $ docker run -d --name=mysql2 -e MYSQL_ROOT_PASSWORD='mypassword' -v /storage/mysql2/mysql-datadir:/var/lib/mysql mysql

How to build a MySQL image?

Take note that in this exercise, we are going to extend an existing MySQL image by adding Percona Xtrabackup to it. We will then publish our image to Docker Hub and set up an automated build.

Base Image

To build a MySQL container image, we’ll start by pulling a base image. You can pull an image which contains a vanilla operating system of your choice, and start building the MySQL image from scratch:

$ docker pull debian
Using default tag: latest
Trying to pull repository docker.io/library/debian ... latest: Pulling from library/debian

17bd2058e0c6: Pull complete
f854eed3f31f: Pull complete
Digest: sha256:ff779f80153d8220904ec3ec6016ac6dd51bcc77e217587689feffcd7acf96a0
Status: Downloaded newer image for docker.io/debian:latest

However, this is perhaps not best practice. There are tons of MySQL container images available on Docker Hub that we can re-use and enhance with more functionality. For example, the MySQL image created by the Docker team may not contain things that we need, e.g., Percona Xtrabackup (PXB). PXB needs to have access to the local file system in order to perform hot backups. Thus, we have to install PXB on top of the MySQL container images created by Docker. Inheriting this MySQL image allows us to leverage the work done by the Docker team. We do not need to maintain the MySQL server parts, especially when a new version is released.

For now, let’s pull the MySQL image of our choice to the machine host. We are going to use MySQL 5.6 as the base image:

$ docker pull mysql:5.6
Trying to pull repository docker.io/library/mysql ... 5.6: Pulling from library/mysql

3153a44fc5c3: Pull complete
ac82a224055d: Pull complete
e3ce3c9ce67d: Pull complete
57c790472a9d: Pull complete
49772bf40877: Pull complete
73f07a1d187e: Pull complete
3446fa8ab4bb: Pull complete
70c40ffe6275: Pull complete
54672d2ddb6f: Pull complete
Digest: sha256:03646869dfecf96f443640f8b9040fbde78a96269aaf47bbfbb505a4c1adcad9
Status: Downloaded newer image for docker.io/mysql:5.6

Let’s verify what images we do have now:

$ docker images
REPOSITORY                    TAG                 IMAGE ID            CREATED             VIRTUAL SIZE
docker.io/debian              latest              f854eed3f31f        5 days ago          125.1 MB
docker.io/mysql               latest              b0e2c14c7e92        3 weeks ago         378.4 MB
docker.io/mysql               5.6                 54672d2ddb6f        3 weeks ago         329 MB

There are three Docker images available in the host, debian:latest, mysql:latest (MySQL 5.7) and mysql:5.6 (MySQL 5.6).

There are two different ways to build a new image, we’ll cover this in the next section.

Building the image

Essentially, there are two ways to build the image:

  1. Make changes to the base image and commit
  2. Use Dockerfile - A text file that contains all the commands to build an image

1) Change & Commit

By using this approach, you make the changes directly into the container image and commit. The commit operation will not include any data contained in volumes mounted inside the container. By default, the container being committed and its processes will be paused while the image is committed. This reduces the likelihood of encountering data corruption during the process of creating the commit.

Let’s run a MySQL container with a volume. Create a volume directory on the machine host and spin a MySQL instance based on the MySQL 5.6 image we have downloaded:

$ mkdir -p /storage/test-mysql/datadir
$ docker run -d --name=test-mysql -e MYSQL_ROOT_PASSWORD='mypassword' -v /storage/test-mysql/datadir:/var/lib/mysql mysql:5.6

Verify if the container is running:

$ docker ps
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS               NAMES
4e75117289ea        mysql:5.6           "docker-entrypoint.sh"   5 seconds ago       Up 5 seconds        3306/tcp            test-mysql

Enter the container’s interactive shell:

$ docker exec -it test-mysql /bin/bash
root@4e75117289ea:/#

This is a Debian 8 (Jessie) image with minimal installation. Some common commands like wget are not available and we need to install these beforehand:

$ apt-get update && apt-get install wget

Install the appropriate Percona apt repository and install the latest PXB 2.3:

$ wget https://repo.percona.com/apt/percona-release_0.1-3.jessie_all.deb
$ dpkg -i percona-release_0.1-3.jessie_all.deb
$ apt-get update
$ apt-get install percona-xtrabackup-23
$ mkdir -p /backup/xtrabackups
$ exit

That’s it. Those are the changes we’ve made to the container. Let’s commit this container as another image so we can use it later. First, retrieve the container ID:

$ docker ps
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS               NAMES
4e75117289ea        mysql:5.6           "docker-entrypoint.sh"   12 minutes ago      Up 12 minutes       3306/tcp            test-mysql

Then, commit the changes to a new image called “local/mysql-pxb:5.6”:

$ docker commit 4e75117289ea local/mysql-pxb:5.6
830fea426cfb27d4a520c25f90de60517b711c607dda576fca93ff3a5b03c48f

We can now see that our new image is ready:

$ docker images
REPOSITORY                    TAG                 IMAGE ID            CREATED             VIRTUAL SIZE
local/mysql-pxb               5.6                 830fea426cfb        9 seconds ago       589.7 MB
docker.io/debian              latest              f854eed3f31f        5 days ago          125.1 MB
docker.io/mysql               latest              b0e2c14c7e92        3 weeks ago         378.4 MB
docker.io/mysql               5.6                 54672d2ddb6f        3 weeks ago         329 MB

This is just an example to show you how to create an image. We are not going to test this image further, nor upload it to the Docker Hub. Instead, we’ll use Dockerfile as described in the next section.

2) Dockerfile

Contrary to the change-and-commit approach, another way is to compile all the steps we used above into a text file called a Dockerfile. Generally, it is better to use Dockerfiles to manage your images in a documented and maintainable way. We are not going to cover every piece of Dockerfile syntax here; you can refer to the Docker documentation for details.

Let’s start by creating a directory to place the Dockerfile:

$ mkdir -p ~/docker/severalnines/mysql-pxb
$ cd ~/docker/severalnines/mysql-pxb

Create a new file called Dockerfile:

$ vim Dockerfile

And add the following lines:

## MySQL 5.6 with Percona Xtrabackup

## Pull the mysql:5.6 image
FROM mysql:5.6

## The maintainer name and email
MAINTAINER Ashraf Sharif <ashraf@s9s.com>

## List all packages that we want to install
ENV PACKAGE percona-xtrabackup-23

# Install requirement (wget)
RUN apt-get update && apt-get install -y wget

# Install Percona apt repository and Percona Xtrabackup
RUN wget https://repo.percona.com/apt/percona-release_0.1-3.jessie_all.deb && \
    dpkg -i percona-release_0.1-3.jessie_all.deb && \
    apt-get update && \
    apt-get install -y $PACKAGE

# Create the backup destination
RUN mkdir -p /backup/xtrabackups

# Allow mountable backup path
VOLUME ["/backup/xtrabackups"]

Save the file. What we are doing here is exactly the same as in the “change and commit” approach: we extend the functionality of the existing MySQL image downloaded from Docker Hub by installing the Percona apt repository and Percona Xtrabackup into it.

We can now build a new image from this Dockerfile:

$ docker build --rm=true -t severalnines/mysql-pxb:5.6 .

We saved the image with a proper naming format “username/image_name:tag”, which is required by Docker Hub if you would like to push and store the image there.
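
As a side note (my addition, not part of the original walkthrough), if you had kept the committed image from the previous section, you would not need to rebuild it just to get a pushable name – docker tag can add the “username/image_name:tag” name on top of an existing image:

$ docker tag local/mysql-pxb:5.6 severalnines/mysql-pxb:5.6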

Let’s take a look on what we have now:

$ docker images
REPOSITORY                    TAG                 IMAGE ID            CREATED             VIRTUAL SIZE
severalnines/mysql-pxb        5.6                 c619042c5b91        18 minutes ago      591.4 MB
local/mysql-pxb               5.6                 830fea426cfb        15 hours ago        589.7 MB
docker.io/debian              latest              f854eed3f31f        6 days ago          125.1 MB
docker.io/mysql               latest              b0e2c14c7e92        3 weeks ago         378.4 MB
docker.io/mysql               5.6                 54672d2ddb6f        3 weeks ago         329 MB

There are now two equivalent images, “local/mysql-pxb” and “severalnines/mysql-pxb”, where the latter was built from the Dockerfile. We are going to push it to Docker Hub.

This is what we have now in our machine host:

Next, we are going to test the image and make further modifications if necessary.

Testing

Let’s run a MySQL container and perform a backup using innobackupex. Create volume path directories on the machine host beforehand:

$ mkdir -p /storage/mysql-server/datadir
$ mkdir -p /storage/backups
$ docker run -d \
--name mysql-server \
-v /storage/mysql-server/datadir:/var/lib/mysql \
-e MYSQL_ROOT_PASSWORD=mypassword \
severalnines/mysql-pxb:5.6

The above command runs a MySQL container called “mysql-server” from the newly built image, severalnines/mysql-pxb:5.6, which can be illustrated as below:

From Percona Xtrabackup documentation, the simplest innobackupex command is:

$ innobackupex --user="[user]" --password="[password]" [backup path]

Based on the above, we can execute the backup command in another container (mysql-run-backup), link it to the running MySQL container (mysql-server) and take advantage of the environment variables available for linked containers. By doing this, we don’t have to specify the credentials like host, port, username and password when running the innobackupex command.
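
To see what the legacy --link option actually injects, you can dump the environment of a throwaway linked container; a quick check (my own example, reusing the image and container names from above):

$ docker run --rm --link mysql-server:mysql severalnines/mysql-pxb:5.6 env | grep MYSQL_

You should see variables such as MYSQL_PORT_3306_TCP_ADDR, MYSQL_PORT_3306_TCP_PORT and MYSQL_ENV_MYSQL_ROOT_PASSWORD – exactly the ones referenced in the command below.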

Let’s run a linked container and perform the backup by using Docker’s environment variable:

$ docker run -it \
--link mysql-server:mysql \
--name=mysql-run-backup \
-v /storage/mysql-server/datadir:/var/lib/mysql \
-v /storage/backups:/backups \
--rm=true \
severalnines/mysql-pxb:5.6 \
sh -c 'exec innobackupex --host="$MYSQL_PORT_3306_TCP_ADDR" --port="$MYSQL_PORT_3306_TCP_PORT" --user=root --password="$MYSQL_ENV_MYSQL_ROOT_PASSWORD" /backups'

Both of our containers are now working together like this:

However, the “run” command was pretty long and not very user-friendly. We can simplify it by using a bash script. Create a bash script under the same path as the Dockerfile:

$ cd ~/docker/severalnines/mysql-pxb
$ vim run_backup.sh

And add the following lines:

#!/bin/bash
# Run innobackupex
BACKUP_PATH=/backups

innobackupex --host="$MYSQL_PORT_3306_TCP_ADDR" \
--port="$MYSQL_PORT_3306_TCP_PORT" \
--user=root \
--password="$MYSQL_ENV_MYSQL_ROOT_PASSWORD" \
$BACKUP_PATH

Give the script an executable permission:

$ chmod 755 run_backup.sh

Then, use the ADD command to copy the bash script into the image when building it. Our final version of Dockerfile is now:

## MySQL 5.6 with Percona Xtrabackup

## Pull the mysql:5.6 image
FROM mysql:5.6

## The maintainer name and email
MAINTAINER Your Name <email@domain.xyz>

## List all packages that we want to install
ENV PACKAGE percona-xtrabackup-23

# Install requirement (wget)
RUN apt-get update && apt-get install -y wget

# Install Percona apt repository and Percona Xtrabackup
RUN wget https://repo.percona.com/apt/percona-release_0.1-3.jessie_all.deb && \
        dpkg -i percona-release_0.1-3.jessie_all.deb && \
        apt-get update && \
        apt-get install -y $PACKAGE

# Create backup directory
RUN mkdir -p /backups

# Copy the script to simplify backup command
ADD run_backup.sh /run_backup.sh

# Mountable backup path
VOLUME ["/backups"]

Rebuild the image:

$ docker build --rm=true -t severalnines/mysql-pxb:5.6 .

Run a new container with the new simplified command “/run_backup.sh”:

$ docker run -it \
--link mysql-server:mysql \
--name=mysql-run-backup \
-v /storage/mysql-server/datadir:/var/lib/mysql \
-v /storage/backups:/backups \
--rm=true \
severalnines/mysql-pxb:5.6 \
sh -c 'exec /run_backup.sh'

You should see Xtrabackup output on the screen. Ensure you get the “completed OK” line indicating the backup is successfully created:

...
innobackupex: Backup created in directory '/backups/2016-06-17_17-07-54'
160617 17:07:57  innobackupex: Connection to database server closed
160617 17:07:57  innobackupex: completed OK!

The container will then exit (we ran the image in the foreground without the --detach parameter) and be automatically removed by Docker, since we specified “--rm=true” in the “run” command line. On the machine host, we can see the backups are there:

$ ls -1 /storage/backups/
2016-06-17_13-01-58
2016-06-17_13-07-28
2016-06-17_14-02-50
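
As a side note (my addition, not part of the original post), a raw XtraBackup copy must be prepared – that is, have the redo log applied – before it can be restored. A minimal sketch reusing the same image, assuming one of the backup directories listed above:

$ docker run -it --rm -v /storage/backups:/backups severalnines/mysql-pxb:5.6 \
sh -c 'exec innobackupex --apply-log /backups/2016-06-17_13-01-58'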

Now our Dockerfile and the image are ready to be published.

Upload to Docker Hub

Docker Hub greatly facilitates Docker image distribution and collaboration. It comes with a bunch of great features like image storage, automatic build and test, link to code repositories and integration with Docker Cloud. It also supports private image repository for those who don’t want to publish the images to the public.

To share images on Docker Hub, you must first register. After an account is created, run the following command on the machine host to authenticate to the site:

$ docker login
Username: severalnines
Password:
Email: ashraf@domain.com
WARNING: login credentials saved in /root/.docker/config.json
Login Succeeded

Now you can push this repository to the registry designated by its name or tag:

$ docker push severalnines/mysql-pxb:5.6

Do you really want to push to public registry? [y/n]: y
The push refers to a repository [docker.io/severalnines/mysql-pxb] (len: 1)
c619042c5b91: Pushed
cb679f373840: Pushed
29db1ab3b7c2: Pushed
eadfe1149bcf: Pushed
73f07a1d187e: Pushed
d574478a62d8: Pushed
9acd57cae939: Pushed
b14dd0099b51: Pushed
c03269bf1687: Pushed
f50b60617e9c: Pushed
ed751d9dbe3b: Pushed
23286f48d129: Pushed
5.6: digest: sha256:5cf2d7c339a9902ac3059b9ddac11beb4abac63e50fc24e6dfe0f6687cdfa712 size: 20156

Once completed, the image will be available publicly in Docker Hub for user “severalnines”:

Great! Our image is now live and hosted on the Docker Hub where the community can start pulling it.

Automated Build

It’s great to have Docker Hub store the image for you, for free. But what if a newer version of Percona Xtrabackup is available and you want the image to have it? Or what if we just want to make a slight modification to our Dockerfile? Should we repeat the building steps over and over again? The answer is no – if you make use of automated builds. Just tell Docker Hub where the code repository is (GitHub or Bitbucket), and it will watch for changes in the repository’s revision control and trigger the build process automatically.

Automated builds have several advantages:

  • Images built in this way are built exactly as specified.
  • The Dockerfile is available to anyone with access to your Docker Hub repository.
  • Your repository is kept up-to-date with code changes automatically.

In this example, we use GitHub. First, create a GitHub repository, clone it onto the machine host, and then push the Dockerfile and the run_backup.sh script to the GitHub repository:

$ git clone https://github.com/severalnines/docker-mysql-pxb
$ cd docker-mysql-pxb
$ git add *
$ git commit -m 'first release'
$ git push origin master

Next, link your Docker account with the Github repository. Refer to the instructions here. Once linked, go to Docker Hub > Create Automated Build > Create Auto-build Github > choose the repository and then specify the Docker repository name “mysql-pxb” and add a short description of the repository, similar to the following screenshot:

Then, configure the Build Settings according to the code branch, Dockerfile location and image tag:

In the above example, we also created a Dockerfile for MySQL 5.7 and set it as the default using the “latest” tag. Click on the “Trigger” button to immediately trigger a build job. Docker Hub will then put it in a queue and build the image accordingly.

Here is the final look of the Docker Hub page once you have linked the account and configured automated build:

That’s all. Now what you have to do is just make a change and commit to the Github repository, then Docker will re-build the image for you automatically. If you have a README.md file inside the Github repository, Docker Hub will also pull the content of it into the “Full Description” section of the Docker repository page. Here is what the build status should look like under “Build Details” tab:

We strongly suggest you have a look at the Best Practices for Writing Dockerfiles. In fact, if you’re creating an official image, you must stick to those practices.

That concludes this blog post. In the next post, we will look into Docker internetworking for MySQL containers.

by Severalnines at June 27, 2016 10:51 AM

June 24, 2016

MariaDB Foundation

MariaDB 10.0.26 now available

The MariaDB project is pleased to announce the immediate availability of MariaDB 10.0.26. See the release notes and changelog for details on this release.

  • Download MariaDB 10.0.26
  • Release Notes
  • Changelog
  • What is MariaDB 10.0?
  • MariaDB APT and YUM Repository Configuration Generator

Thanks, and enjoy MariaDB!

The post MariaDB 10.0.26 now available appeared first on MariaDB.org.

by Daniel Bartholomew at June 24, 2016 05:28 PM

Peter Zaitsev

Percona Toolkit 2.2.18 is now available

Percona Toolkit

Percona announces the availability of Percona Toolkit 2.2.18, released on June 24, 2016.

Percona Toolkit is a collection of advanced command-line tools to perform a variety of MySQL server and system tasks that are too difficult or complex for DBAs to perform manually. Percona Toolkit, like all Percona software, is free and open source.

This release is the current stable release in the 2.2 series. It includes new features and bug fixes as well as improved MySQL 5.7 compatibility. Full details are below. Downloads are available here and from the Percona Software Repositories.

New features:

  • 1537416: pt-stalk now sorts the output of transactions by id
  • 1553340: Added “Shared” memory info to pt-summary
  • PT-24: Added the --no-vertical-format option for pt-query-digest, allowing compatibility with non-standard MySQL clients that don’t support the \G directive at the end of a statement (a usage sketch follows this list)
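
For illustration only (my addition, not part of the release notes), the new option is passed like any other pt-query-digest flag; a hypothetical invocation against a slow query log might look like this:

pt-query-digest --no-vertical-format /var/log/mysql/mysql-slow.log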

Bug fixes:

  • 1402776: Fixed error when parsing tcpdump capture with pt-query-digest
  • 1521880: Improved pt-online-schema-change plugin documentation
  • 1547225: Clarified the description of the --attribute-value-limit option for pt-query-digest
  • 1569564: Fixed all PERL-based tools to return a zero exit status when run with the --version option
  • 1576036: Fixed an error that sometimes prevented choosing the primary key as the index when using the --where option for pt-table-checksum
  • 1585412: Fixed the inability of pt-query-digest to parse the general log generated by a MySQL (or Percona Server) 5.7 instance
  • PT-36: Clarified the description of the --verbose option for pt-slave-restart

You can find release details in the release notes and the 2.2.18 milestone on Launchpad. Bugs can be reported on the Percona Toolkit launchpad bug tracker.

by Alexey Zhebel at June 24, 2016 04:18 PM

Percona Server for MongoDB 3.0.12-1.7 is now available


Percona announces the release of Percona Server for MongoDB 3.0.12-1.7 on June 24, 2016. Download the latest version from the Percona web site or the Percona Software Repositories.

Percona Server for MongoDB 3.0.12-1.7 is an enhanced, open source, fully compatible, highly scalable, zero-maintenance downtime database supporting the MongoDB v3.0 protocol and drivers. Based on MongoDB 3.0.12, it extends MongoDB with MongoRocks and PerconaFT storage engines, as well as features like external authentication and audit logging. Percona Server for MongoDB requires no changes to MongoDB applications or code.

NOTE: The MongoRocks storage engine is still under development. There is currently no officially released version of MongoRocks that can be recommended for production.


This release includes all changes from MongoDB 3.0.12, and the following known issue that will be fixed in a future release:

  • The --version option does not correctly report the software version. The affected binaries are:
    • bsondump
    • mongodump
    • mongoexport
    • mongofiles
    • mongoimport
    • mongooplog
    • mongorestore
    • mongostat
    • mongotop

The release notes are available in the official documentation.


by Alexey Zhebel at June 24, 2016 04:14 PM

June 23, 2016

Peter Zaitsev

Migrate from MS SQL Server to MySQL

Migrate from MS SQL Server to MySQL

In this blog series, I will share my experiences as I migrate commercial databases (i.e., Microsoft SQL or Oracle) to open source (MySQL). More specifically, we will look at how you can migrate from MS SQL Server to MySQL.

For this first blog post, I’ve chosen the Jira database as an example, and used three different tools to migrate the Jira database from Microsoft SQL Server to MySQL:

  1. MySQL Workbench (opensource)
  2. Amazon DMS (cloud tool)
  3. Ispirer MnMTK 2015 (commercial tool)

When I started my research, I was under the impression that the Jira database would be easy to migrate (no stored procedures, no triggers, etc.). It turned out that there were some problems, which I was able to fix.

One of the reasons I chose Jira as opposed to some standard MS SQL database (such as AdventureWorks2014) is that it is a non-standard choice. Most of the software vendors use standard databases to test their software, and it works perfectly on those standard databases. Jira is not a usual choice and will be closer to real life.

MySQL Workbench

MySQL Workbench supports Microsoft SQL Server migration. The migration is straightforward, except for issues with character sets. I experienced the error “Could not successfully convert UCS-2 string to UTF-8”.

workbench_errors

It turns out (with the help of Martin Brennan’s blog) that we need to use the “ODBC (FreeTDS)” driver for MS SQL, and enable sending Unicode data as UTF8:

workbench_drive_params

After changing those settings, I was able to successfully migrate the Jira database from MS SQL to MySQL.

Advantages and disadvantages:

  • Plus: free and open source tool, multi-platform
  • Plus: successful migration for Jira
  • Plus: supports multi-threaded migrations (increase worker tasks if needed, default value is 2)
  • Minus: needed some tweaks to work with character sets
  • Minus: not very easy to debug errors

Amazon DMS

AWS Database Migration Service supports migrating from MS SQL to MySQL, but the actual migration method is different from other tools. It uses the source database (MS SQL server in this case) replication feature to stream the data to the target database (MySQL). Amazon DMS starts a temporary “migration” instance that is used to stream data. Both the source and destination database can be in AWS (EC2 or RDS) or outside AWS (no restriction).

amazon_dms

The important limitation for MS SQL migration: it only works with MS SQL Server versions that support replication (subscription service). It doesn’t work with MS SQL Express edition. Also, if the subscription service is not enabled, DMS can’t even see the schema(s) to migrate (see the full list of limitations for MS SQL Server migration).

I’ve also gotten errors around the “constraint” name being too long:

2016-04-02T18:20:23 [TARGET_LOAD ]E: Failed to execute statement: 'ALTER TABLE `dbo`.`AO_38321B_CUSTOM_CONTENT_LINK`
ADD CONSTRAINT `AO_38321B_CUSTOM_CONTENT_LINK_pk_AO_38321B_CUSTOM_CONTENT_LINK_ID` PRIMARY KEY ( `ID` )'
[122502] ODBC general error. (ar_odbc_stmt.c:4048)

The problem here is that “AO_38321B_CUSTOM_CONTENT_LINK_pk_AO_38321B_CUSTOM_CONTENT_LINK_ID” is too long a string for MySQL (identifiers are limited to 64 characters). At the same time, this name does not really matter, as this is the PRIMARY KEY.

After changing the “constraint” names in MS SQL to shorter strings, I could migrate all tables to MySQL.
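
For illustration (my addition, not part of the original post), the rename on the MS SQL side can be done with sp_rename, here invoked through the sqlcmd client with hypothetical connection details and a hypothetical shorter constraint name:

sqlcmd -S sqlserverhost -U user -P pass -d jira \
  -Q "EXEC sp_rename 'dbo.AO_38321B_CUSTOM_CONTENT_LINK_pk_AO_38321B_CUSTOM_CONTENT_LINK_ID', 'pk_AO_38321B_CCL_ID', 'OBJECT';"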

Amazon DMS notes: Amazon DMS lets you migrate from a database located anywhere (not necessarily in AWS) to another database located anywhere (not necessarily in AWS) – however, the traffic will go through AWS. Thus the migration path is fastest and most beneficial if either the source or target (or both) instances are in AWS (for example, EC2 or RDS instances).

Advantages and disadvantages:

  • Plus: successful migration for Jira
  • Plus: Multithreaded execution (this is a huge advantage for migrating large databases)
  • Plus: In addition to migration, you can also use a replication link between the SQL Server and the new MySQL to fetch the new changes. This is a huge advantage when migrating a large database with lots of traffic and tight downtime requirements.
  • Minus: replication must be enabled to perform a migration, which means that migrating from SQL Server Express isn’t supported.
  • (Can be plus and minus): All traffic is going through a cloud environment.

Potential issues

(This section has been updated) I’ve figured out that the table structures generated by Workbench, Amazon DMS and SQLWays are different. For example:

Workbench generated:

CREATE TABLE `AO_38321B_CUSTOM_CONTENT_LINK` (
  `CONTENT_KEY` varchar(255) DEFAULT NULL,
  `ID` int(11) NOT NULL AUTO_INCREMENT,
  `LINK_LABEL` varchar(255) DEFAULT NULL,
  `LINK_URL` varchar(255) DEFAULT NULL,
  `SEQUENCE` int(11) DEFAULT '0',
  PRIMARY KEY (`ID`),
  KEY `index_ao_38321b_cus1828044926` (`CONTENT_KEY`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

AWS DMS generated:

CREATE TABLE `AO_38321B_CUSTOM_CONTENT_LINK` (
  `CONTENT_KEY` varchar(255) CHARACTER SET ucs2 DEFAULT NULL,
  `ID` int(11) NOT NULL,
  `LINK_LABEL` varchar(255) CHARACTER SET ucs2 DEFAULT NULL,
  `LINK_URL` varchar(255) CHARACTER SET ucs2 DEFAULT NULL,
  `SEQUENCE` int(11) DEFAULT NULL,
  PRIMARY KEY (`ID`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

SQLWays wizard generated:

CREATE TABLE `AO_38321B_CUSTOM_CONTENT_LINK` (
  `CONTENT_KEY` varchar(255) CHARACTER SET utf8 DEFAULT NULL,
  `ID` int(11) NOT NULL AUTO_INCREMENT,
  `LINK_LABEL` varchar(255) CHARACTER SET utf8 DEFAULT NULL,
  `LINK_URL` varchar(255) CHARACTER SET utf8 DEFAULT NULL,
  `SEQUENCE` int(11) DEFAULT '0',
  PRIMARY KEY (`ID`),
  KEY `index_ao_38321b_cus1828044926` (`CONTENT_KEY`(191))
) ENGINE=InnoDB DEFAULT CHARSET=latin1

In the AWS DMS version, the index on CONTENT_KEY is missing and ID is not declared as AUTO_INCREMENT. The Workbench version, on the other hand, “lost” the character set. The SQLWays wizard created a partial key (191 characters).
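
If you end up with the AWS DMS version of the schema, the missing pieces can be added back manually; a hedged sketch (my addition, assuming a hypothetical "jira" database name and default client credentials):

mysql -D jira -e "ALTER TABLE AO_38321B_CUSTOM_CONTENT_LINK
  MODIFY ID int(11) NOT NULL AUTO_INCREMENT,
  ADD KEY index_ao_38321b_cus1828044926 (CONTENT_KEY(191));"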

At the same time, I was able to start Jira on top of both versions of the MySQL database (migrated using Workbench and Amazon DMS).

Ispirer MnMTK 2015

Ispirer MnMTK 2015 toolkit is a commercial (not open-source) software application that lets you migrate from MS SQL Server to MySQL (among other databases). Ispirer has provided me with a demo license so I can test the migration.

I was able to migrate the Jira database from MS SQL to MySQL with the Ispirer SQLWays Wizard:

sql_ways_wizard

One issue with this process is that SQL Ways Wizard relies on the MySQL command line utility (“mysql”), which should be in the path. If you do not have MySQL installed on the migration machine, or it is not in the path, the migration will fail:

sql_ways_no_path

To fix this, simply add the MySQL “bin” directory to the path. Alternatively, you can use the SQL Ways Wizard to generate scripts and run those scripts on the destination host where the utilities are installed.

Advantages and disadvantages:

  • Plus: successful migration for Jira
  • Plus: support from SQL ways: can work on fixing potential migration issues (requires paid license)
  • Plus: can convert stored procedures, triggers, foreign key constraints
  • Minus: commercial, not open source software.
  • Minus: only runs on Windows (however, target machine for the database migration can be Linux)
  • Minus: no multi-threaded migration support (you can manually run multiple instances of SQL Ways)

Conclusion

All tools I tried finally worked, but at the same time I was surprised with the number of issues I found. Migrating a very simple database (no stored procedures, no triggers, no foreign key constraints) should be easier.

Another surprise was that all the tools focus on a nice GUI with “next” buttons. For migrating one database to another, I would prefer a command-line tool interface (perhaps similar to Percona Toolkit or iconv):

# dbmigrate --source user:pass@sqlserverhost:1433 --target user:pass@mysqlhost:3309 --parallel 8 --verbose --overwrite

Actually, Ispirer MnMTK does have a command line migration utility included, but it only works on Windows.

Until somebody develops a better command line tool, any of the above solutions will help you migrate from MS SQL Server to MySQL.

by Alexander Rubin at June 23, 2016 10:56 PM

Erkan Yanar

Another awesome Conference: IT-Tage 2016

Ahoi,
There will be the IT-Tage 2016 conference, where I'm going to give two talks.
One will be about Docker being more than a technical revolution.
The other one is about best practices for Docker + MySQL/MariaDB \o/

As always: fun first!
Erkan

by erkan at June 23, 2016 11:49 AM

Jean-Jerome Schmidt

Planets9s - How to monitor MongoDB, what to test when upgrading to MySQL 5.7 and more

Welcome to this week’s Planets9s, covering all the latest resources and technologies we create around automation and management of open source database infrastructures.

New webinar on how to monitor MongoDB (if you’re really a MySQL DBA)

As you may know, MongoDB offers many metrics through various status overviews and commands, but which ones really matter to you? How do you trend and alert on them? What is the meaning behind the metrics? In this new webinar on July 12th, we’ll discuss the most important ones and describe them in ordinary plain MySQL DBA language. And we’ll have a look at the (open source) solutions available for MongoDB monitoring and trending, including how to leverage ClusterControl’s MongoDB metrics, dashboards, custom alerting and other features to track and optimize the performance of your system.

Sign up for the webinar

What to test before upgrading to MySQL 5.7

In this blog post, we look at some of the important things to keep in mind when preparing and performing tests for an upgrade to MySQL 5.7. As we saw in a previous post, there are some important changes between MySQL 5.6 and 5.7. Since the behaviour of some existing features has been altered, detailed testing of the upgrade is in order. This new blog post (and associated whitepaper) shows you how.

Read the blog

Deploy & monitor MySQL Galera clusters on Digital Ocean with NinesControl

Designed with the needs of developers in mind, NinesControl enables users to easily deploy and monitor (MySQL) Galera clusters on DigitalOcean. Droplets are launched and managed using your own DigitalOcean account. We’d love to get your feedback on this new solution if you haven’t tested it yet, so please check it out and let us know what you think.

Try NinesControl

Become a PostgreSQL DBA: Provisioning & Deployment

In this blog post, we address the following questions from the MySQL DBA standpoint: why use both MySQL and PostgreSQL in one environment? Is there any value in having a replicated PostgreSQL setup running alongside a Galera Cluster? We also discuss different methods of deploying PostgreSQL.

Read the blog

That’s it for this week! Feel free to share these resources with your colleagues and follow us in our social media channels.

Have a good end of the week,

Jean-Jérôme Schmidt
Planets9s Editor
Severalnines AB

by Severalnines at June 23, 2016 11:01 AM

June 22, 2016

Peter Zaitsev

Troubleshooting configuration issues: Q & A

Troubleshooting configuration issues

In this blog, I will provide answers to the Q & A for the Troubleshooting configuration issues webinar.

First, I want to thank you for attending the June 9 webinar. The recording and slides for the webinar are available here. Below is the list of questions that I wasn’t able to answer during the webinar, with responses:

Q: What are the predefined roles installed per default?

A: MySQL does not support roles by default. If you need roles, you can either use MariaDB or emulate roles with the Percona PAM Plugin and proxy users. MySQL does, however, have pre-built user accounts. One of them is root@localhost, which has all available privileges, and another is the anonymous account that can access the test database without a password. See the user manual for more information.
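
As a quick illustration (my addition), you can list the pre-built accounts and drop the anonymous ones if you don’t need them; a sketch assuming a client with sufficient privileges:

mysql -uroot -p -e "SELECT user, host FROM mysql.user;"
mysql -uroot -p -e "DROP USER ''@'localhost';"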

Q: How do you determine the root cause for a query that just hangs?

A: There are several possible reasons for a hanging query. They include:

  1. Poorly optimized query – it doesn’t really “hang,” but just requires a lot of time to execute. Dealing with these types of queries was discussed in the Troubleshooting Slow Queries webinar.
  2. The query is waiting for a lock set by another query or transaction. These issues were discussed in the Troubleshooting locking issues webinar
  3. A bug in MySQL.

When I see hanging queries I look into all the above possibilities. But I prefer to start at query optimization.

Q: Can we get the powerpoint slides?

A: I do not use PowerPoint to create slides. Slides are available in PDF format only.

Q: 1) Is it safe to use SQL_LOG_BIN = 0 for specific connections (statements), for example for DELETE, when we need to keep old data on a slave but delete it from the master? What are the side effects? Can it break replication?

A: Using SQL_LOG_BIN = 0 itself is safe, but you need to understand what you are doing. For example, if you delete data in a table that has a unique key on the master, and then insert a row with the same unique value that existed in one of the rows you deleted, replication will fail with a “Duplicate key” error.
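
To make the pattern concrete (my example, not from the webinar), sql_log_bin is session-scoped and requires the SUPER privilege, so only the statements issued in that connection skip the binary log:

mysql -uroot -p -e "SET SESSION sql_log_bin = 0;
  DELETE FROM archive_table WHERE created < '2015-01-01';  -- hypothetical table
  SET SESSION sql_log_bin = 1;"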

Q: Is it reasonable to disable query_cache_type (set it to 0) on MySQL instances with a very big (huge) amount of data?

A: Yes. I would recommend it.

Q: 3) How does the number of innodb_buffer_pool_instances affect performance? Does a lot of innodb_buffer_pool_instances = high performance?

A: InnoDB introduced buffer pool instances to reduce contention “as different threads read and write to cached pages“. However, they improve performance only if you have many concurrent threads inside InnoDB.
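
For reference (my addition), innodb_buffer_pool_instances is a startup option: it has to be set in the configuration file (for example, innodb_buffer_pool_instances = 8 in the [mysqld] section) and takes effect after a restart. To check the current value:

mysql -e "SHOW GLOBAL VARIABLES LIKE 'innodb_buffer_pool_instances';"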

Q: I have a question: where can I download the thread pool plugin from Percona? I checked your download page and couldn’t find it. Is it bundled in the official Percona release? We have 5.6.28, 29 and 30, and there is no thread.so in the plugin directory. Can you let me know how to get it?

A: Percona built thread pool into Percona Server; a separate download isn’t necessary. See the user manual for instructions.
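
For illustration (my addition, not part of the original answer), in Percona Server the thread pool is built into the server binary and enabled at startup via the thread_handling variable (thread_handling = pool-of-threads in the [mysqld] section). To verify which model is active:

mysql -e "SHOW GLOBAL VARIABLES LIKE 'thread_handling';"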


by Sveta Smirnova at June 22, 2016 11:18 PM

Sneak peek at the Percona Live Europe Amsterdam 2016 talks

Percona Live Europe Amsterdam 2016 talks

On behalf of the Percona Live Conference Committee, I am excited to announce the sneak peek schedule for the Percona Live Europe Amsterdam 2016 talks!

Percona Live Europe will feature a variety of formal tracks and sessions related to MySQL, MongoDB and ODBMS from members of the open source community. With many slots to fill, there will be no shortage of great content this year. Though we won’t finalize the entire conference schedule until mid-July, this preview list of talks is sure to whet your appetite! So without further ado, here is the SNEAK PEEK SCHEDULE!

Want to Present at Percona Live Europe Amsterdam?

We are still looking for people to give talks! The committee has begun their work rating talks, but there is still time to submit. We are looking for a range of interesting talks and subjects around MySQL, MongoDB, and ODBMS. Some topics of interest include:

  • Database Virtualization
  • Integration or coexistence between SQL and NoSQL
  • Automating service life-cycle
  • Load balancing
  • Cluster control
  • Multiple Data Store Technologies and Management Solutions

But hurry, the call for papers closes July 18th! If selected, you get a free pass, and the chance to dazzle your open source community peers. Apply now!

Become a Percona Live Europe Amsterdam Sponsor

Sponsor the Percona Live Europe Amsterdam conference. There are still sponsorship opportunities at Percona Live Europe. Being a sponsor allows you to demonstrate thought leadership, promote brand awareness and support the open source community. Need a reason? Here are ten reasons to become a sponsor.

Sponsor Percona Live Europe now.

by Kortney Runyan at June 22, 2016 05:46 PM

Jean-Jerome Schmidt

Webinar: Become a MongoDB DBA - What to Monitor (if you’re really a MySQLer)

To operate MongoDB efficiently, you need to have insight into database performance. And with that in mind, we’ll dive into monitoring in this second webinar in the ‘Become a MongoDB DBA’ series.

MongoDB offers many metrics through various status overviews and commands, but which ones really matter to you? How do you trend and alert on them? What is the meaning behind the metrics?

We’ll discuss the most important ones and describe them in ordinary plain MySQL DBA language. And we’ll have a look at the open source tools available for MongoDB monitoring and trending.

Finally, we’ll show you how to leverage ClusterControl’s MongoDB metrics, dashboards, custom alerting and other features to track and optimize the performance of your system.

Date, Time & Registration

Europe/MEA/APAC

Tuesday, July 12th at 09:00 BST / 10:00 CEST (Germany, France, Sweden)
Register Now

North America/LatAm

Tuesday, July 12th at 09:00 Pacific Time (US) / 12:00 Eastern Time (US)
Register Now

Agenda

  • How does MongoDB monitoring compare to MySQL
  • Key MongoDB metrics to know about
  • Trending or alerting?
  • Available open source MongoDB monitoring tools
  • How to monitor MongoDB using ClusterControl
  • Demo

Speaker

Art van Scheppingen is a Senior Support Engineer at Severalnines. He’s a pragmatic MySQL and Database expert with over 16 years experience in web development. He previously worked at Spil Games as Head of Database Engineering, where he kept a broad vision upon the whole database environment: from MySQL to Couchbase, Vertica to Hadoop and from Sphinx Search to SOLR. He regularly presents his work and projects at various conferences (Percona Live, FOSDEM) and related meetups.

We look forward to “seeing” you there!

This session is based upon the experience we have using MongoDB and implementing it for our database infrastructure management solution, ClusterControl. For more details, read through our ‘Become a MongoDB DBA’ blog series.

by Severalnines at June 22, 2016 01:44 PM

June 21, 2016

Peter Zaitsev

Docker automatic MySQL slave propagation

Docker automatic MySQL slave propagation

In this post, we’ll discuss Docker automatic MySQL slave propagation for help with scaling.

In my previous posts on the Docker environment, I covered Percona XtraDB Cluster. Percona XtraDB Cluster can automatically scale by conveniently adding new nodes using the highly automated State Snapshot Transfer. State Snapshot Transfer allows a new node to copy data from an existing node (I still want to see how this is possible with MySQL Group Replication).

This is not the case with regular MySQL replication. With MySQL replication, the slave setup still requires manual steps (well, unless you’ve already scripted it for your environment). At least these are “simple” steps (ha!). Percona XtraBackup can set up replication with less work (see this link for details: https://www.percona.com/doc/percona-xtrabackup/2.4/howtos/setting_up_replication.html), but it still requires poking around and switching between servers.

However, nothing stops us from making it more automatic (similar to SST in Percona XtraDB Cluster), especially with Docker images. Why Docker? Because Docker provides a highly-controlled environment where we can orchestrate how scripts get executed. Severalnines provides a great intro into MySQL with Docker.

There are a few more components for this setup:

Before jumping to my solution, I should point to some work in this area by Joyent: https://www.joyent.com/blog/dbaas-simplicity-no-lock-in.

I propose my image https://hub.docker.com/r/perconalab/ps-master-slave/, with sources on GitHub https://github.com/percona/percona-docker/tree/master/percona-server-master-slave.

First, we need to start a master node:

docker run -d -p 3306:3306 --net=replicaset_net \
 --name=replicaset_master \
 -e MYSQL_ROOT_PASSWORD=Theistareyk \
 perconalab/ps-master-slave --innodb-buffer-pool-size=2G

I assume that we’ve created the network replicaset_net already, either bridge or overlay.
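
If the network does not exist yet, creating a bridge network is a one-liner (my example; an overlay network would additionally require a multi-host Swarm/key-value store setup):

docker network create --driver bridge replicaset_net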

You can create a slave by pointing to the master container:

docker run -d -p 3306 --net=replicaset_net \
 --name=replicaset_slave1 \
 -e MYSQL_ROOT_PASSWORD=Theistareyk \
 -e MASTER_HOST=replicaset_master \
 perconalab/ps-master-slave --innodb-buffer-pool-size=2G

The started node will automatically connect to MASTER_HOST, copy the data and perform all the steps needed to start the slave.
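
Once the provisioning finishes, you can verify replication the usual way; a quick check (my example, using the container name and root password from above):

docker exec -it replicaset_slave1 mysql -uroot -pTheistareyk -e 'SHOW SLAVE STATUS\G'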

You can even copy data from a running slave, instead of the master, like this:

docker run -d -p 3306 --net=replicaset_net \
 --name=replicaset_slave2 \
 -e MYSQL_ROOT_PASSWORD=Theistareyk \
 -e MASTER_HOST=replicaset_master \
 -e SLAVE_HOST=replicaset_slave1 \
 perconalab/ps-master-slave --innodb-buffer-pool-size=2G

This node will copy data from SLAVE_HOST, but then will point itself to MASTER_HOST.

Docker Network lets you use container names "replicaset_master" and "replicaset_slave1" instead of IP addresses, which is very convenient.

As a result of the above, we have one master and two slaves running. We can start as many slave nodes as needed.

Please remember that these are more proof-of-concept than “production ready” images, but they give a good direction for implementation.

by Vadim Tkachenko at June 21, 2016 11:42 PM

Jean-Jerome Schmidt

Severalnines Launches #Galera CrowdChat

Today we launch our live CrowdChat on everything #galera!

This CrowdChat is brought to you by Severalnines and is hosted by a community of subject matter experts. CrowdChat is a community platform that works across Facebook, Twitter, and LinkedIn to allow users to discuss a topic using a specific #hashtag. This crowdchat focuses on the hashtag #galera. So if you’re a DBA, architect, CTO, or a database novice register to join and become part of the conversation!

Join this online community to interact with experts on Galera clusters. Get your questions answered and join the conversation around everything #galera.

Register free

Meet the experts

Art van Scheppingen is a Senior Support Engineer at Severalnines. He’s a pragmatic MySQL and Database expert with over 15 years experience in web development. He previously worked at Spil Games as Head of Database Engineering, where he kept a broad vision upon the whole database environment: from MySQL to Couchbase, Vertica to Hadoop and from Sphinx Search to SOLR. He regularly presents his work and projects at various conferences (Percona Live, FOSDEM) and related meetups.

Krzysztof Książek is a Senior Support Engineer at Severalnines, is a MySQL DBA with experience managing complex database environments for companies like Zendesk, Chegg, Pinterest and Flipboard.

Ashraf Sharif is a System Support Engineer at Severalnines. He has previously worked as principal consultant and head of support team and delivered clustering solutions for big websites in the South East Asia region. His professional interests focus on system scalability and high availability.

Vinay Joosery is a passionate advocate and builder of concepts and businesses around Big Data computing infrastructures. Prior to co-founding Severalnines, Vinay held the post of Vice-President EMEA at Pentaho Corporation - the Open Source BI leader. He has also held senior management roles at MySQL / Sun Microsystems / Oracle, where he headed the Global MySQL Telecoms Unit, and built the business around MySQL's High Availability and Clustering product lines. Prior to that, Vinay served as Director of Sales & Marketing at Ericsson Alzato, an Ericsson-owned venture focused on large scale real-time databases.

by Severalnines at June 21, 2016 02:50 PM

June 20, 2016

Peter Zaitsev

Running Percona XtraDB Cluster nodes with Linux Network namespaces on the same host


This post is a continuation of my Docker series, and examines running Percona XtraDB Cluster nodes with Linux Network namespaces on the same host.

In this blog I want to look into a lower-level building block: Linux Network Namespace.

As with cgroups, Docker uses Linux Network Namespaces for resource isolation. I looked into cgroups a year ago, and now I want to understand more about Network Namespaces.

The goal is to both understand a bit more about Docker internals, and to see how we can provide network isolation for different processes within the same host. You might need to isolate processes when running several MySQL or MongoDB instances on the same server (which might come in handy during testing). In this case, I needed to test ProxySQL without Docker.

We can always use different ports for different MySQL instances (such as 3306, 3307, 3308), but it quickly gets complicated.

We could also use IP address aliases for an existing network interface, and use bind=<IP.ADD.RE.SS> for each instance. But since Percona XtraDB Cluster can use three different IP ports and network channels for communications, this also quickly gets complicated.

Linux Network Namespaces provide greater network isolation for resources, so they can be a better fit for Percona XtraDB Cluster nodes. Setting up network namespaces by itself can be confusing; my recommendation is: if you can use Docker, use Docker instead. It provides isolation of process IDs and mount points, and takes care of all the script plumbing to create and destroy networks. As you will see in our scripts, we still need to handle the datadir locations ourselves.

Let’s create a network for Percona XtraDB Cluster with Network Namespaces.

I will try to do the following:

  • Start four nodes of Percona XtraDB Cluster
  • For each node, create separate network namespace so the nodes will be able to allocate network ports 3306, 4567, 4568 without conflicts
  • Assign the nodes IP addresses: 10.200.10.2-10.200.10.5
  • Create a “bridge interface” for the nodes to communicate, using IP address 10.200.10.1.

For reference, I took ideas from this post: Linux Switching – Interconnecting Namespaces

First, we must create the bridge interface on the host:

BRIDGE=br-pxc
brctl addbr $BRIDGE
brctl stp $BRIDGE off
ip addr add 10.200.10.1/24 dev $BRIDGE
ip link set dev $BRIDGE up

Next, we create four namespaces (one per Percona XtraDB Cluster node) using the following logic:

for i in 1 2 3 4
do
 ip netns add pxc_ns$i
 ip link add pxc-veth$i type veth peer name br-pxc-veth$i
 brctl addif $BRIDGE br-pxc-veth$i
 ip link set pxc-veth$i netns pxc_ns$i
 ip netns exec pxc_ns$i ip addr add 10.200.10.$((i+1))/24 dev pxc-veth$i
 ip netns exec pxc_ns$i ip link set dev pxc-veth$i up
 ip link set dev br-pxc-veth$i up
 ip netns exec pxc_ns$i ip link set lo up
 ip netns exec pxc_ns$i ip route add default via 10.200.10.1
done

We see the following interfaces on the host:

1153: br-pxc: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
 link/ether 32:32:4c:36:22:87 brd ff:ff:ff:ff:ff:ff
 inet 10.200.10.1/24 scope global br-pxc
 valid_lft forever preferred_lft forever
 inet6 fe80::2ccd:6ff:fe04:c7d5/64 scope link
 valid_lft forever preferred_lft forever
1154: br-pxc-veth1@if1155: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master br-pxc state UP qlen 1000
 link/ether c6:28:2d:23:3b:a4 brd ff:ff:ff:ff:ff:ff link-netnsid 8
 inet6 fe80::c428:2dff:fe23:3ba4/64 scope link
 valid_lft forever preferred_lft forever
1156: br-pxc-veth2@if1157: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master br-pxc state UP qlen 1000
 link/ether 32:32:4c:36:22:87 brd ff:ff:ff:ff:ff:ff link-netnsid 12
 inet6 fe80::3032:4cff:fe36:2287/64 scope link
 valid_lft forever preferred_lft forever
1158: br-pxc-veth3@if1159: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master br-pxc state UP qlen 1000
 link/ether 8a:3a:c1:e0:8a:67 brd ff:ff:ff:ff:ff:ff link-netnsid 13
 inet6 fe80::883a:c1ff:fee0:8a67/64 scope link
 valid_lft forever preferred_lft forever
1160: br-pxc-veth4@if1161: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master br-pxc state UP qlen 1000
 link/ether aa:56:7f:41:1d:c3 brd ff:ff:ff:ff:ff:ff link-netnsid 11
 inet6 fe80::a856:7fff:fe41:1dc3/64 scope link
 valid_lft forever preferred_lft forever

We also see the following network namespaces:

# ip netns
pxc_ns4 (id: 11)
pxc_ns3 (id: 13)
pxc_ns2 (id: 12)
pxc_ns1 (id: 8)

After that, we can check the namespace IP address:

# ip netns exec pxc_ns3 bash
# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
 inet 127.0.0.1/8 scope host lo
 valid_lft forever preferred_lft forever
 inet6 ::1/128 scope host
 valid_lft forever preferred_lft forever
1159: pxc-veth3@if1158: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
 link/ether 4a:ad:be:6a:aa:c6 brd ff:ff:ff:ff:ff:ff link-netnsid 0
 inet 10.200.10.4/24 scope global pxc-veth3
 valid_lft forever preferred_lft forever
 inet6 fe80::48ad:beff:fe6a:aac6/64 scope link
 valid_lft forever preferred_lft forever

To enable communication from inside the network namespace to the external world, we should add some iptables rules, e.g.:

iptables -t nat -A POSTROUTING -s 10.200.10.0/255.255.255.0 -o enp2s0f0 -j MASQUERADE
iptables -A FORWARD -i enp2s0f0 -o $BRIDGE -j ACCEPT
iptables -A FORWARD -o enp2s0f0 -i $BRIDGE -j ACCEPT

where enp2s0f0 is an interface that has an external IP address (for some reason, modern Linux distros decided to use names like enp2s0f0 for network interfaces instead of the good old "eth0").

To start a node (or mysqld instance) inside a network namespace, we prefix the command with ip netns exec <namespace>.

For example, to start the first Percona XtraDB Cluster node in the namespace pxc_ns1, with the IP address 10.200.10.2, we use:

ip netns exec pxc_ns1 mysqld --defaults-file=node.cnf --datadir=/data/datadir/node1 --socket=/tmp/node1_mysql.sock --user=root --wsrep_cluster_name=cluster1

To start the following nodes:

NODE=2; ip netns exec pxc_ns${NODE} mysqld --defaults-file=node${NODE}.cnf --datadir=/data/datadir/node${NODE} --socket=/tmp/node${NODE}_mysql.sock --user=root --wsrep_cluster_address="gcomm://10.200.10.2" --wsrep_cluster_name=cluster1
NODE=3; ip netns exec pxc_ns${NODE} mysqld --defaults-file=node${NODE}.cnf --datadir=/data/datadir/node${NODE} --socket=/tmp/node${NODE}_mysql.sock --user=root --wsrep_cluster_address="gcomm://10.200.10.2" --wsrep_cluster_name=cluster1
and so on for the remaining nodes (the semicolon after NODE=N matters: a plain NODE=N prefix would not expand ${NODE} on the same command line).

As a result of this procedure, we have four Percona XtraDB Cluster nodes, each running in its own network namespace, without worrying about IP address and port conflicts. We also allocated a dedicated IP range for our cluster.

This procedure isn't trivial, but it is easy to script. I also think it provides a good understanding of what Docker, LXC or other containerization technologies do behind the scenes with networking.
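
For the scripting part, a matching teardown could look roughly like this sketch (it assumes the same br-pxc bridge and pxc_ns1..pxc_ns4 namespace names used above, and that the mysqld processes have already been stopped):

BRIDGE=br-pxc
for i in 1 2 3 4
do
 ip netns delete pxc_ns$i   # destroying the namespace also destroys its veth pair
done
ip link set dev $BRIDGE down
brctl delbr $BRIDGE          # remove the bridge we created earlier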

 

by Vadim Tkachenko at June 20, 2016 10:52 PM

Webinar Thursday June 23: Choosing a MySQL High Availability Solution Today

MySQL High Availability

Please join Percona Technical Account Manager Michael Patrick on Thursday, June 23, 2016 at 10 AM PDT (UTC-7) as he presents “Choosing a MySQL High Availability Solution Today.”

High availability (HA) is one of the solutions to improve performance, avoid data outages, and recover quickly from disasters. An HA environment helps guarantee that your database doesn’t have a single point of failure, accommodates rapid growth and exponentially increasing database size, and enables the applications that power your business.

Michael will discuss various topologies for achieving High Availability with MySQL.

Topics include:

  • Percona XtraDB Cluster
  • DRBD
  • MHA
  • MySQL Orchestrator

Each solution has advantages and challenges. Attendees will gain a deeper understanding of how to choose the best solution for their needs while avoiding some of the pitfalls of making the wrong choices. Avoid the costly mistakes that commonly cause outages and lost revenue. Plus get the latest and greatest developments in the technologies!

Register now.

Michael Patrick
Technical Account Manager

Mike came to Percona in 2015 after working for a variety of large corporations running hundreds of MySQL and Percona XtraDB Clusters in production environments. He is skilled in performance tuning, server auditing, high availability, multi-data center replication, migration, and other MySQL-related activities. Mike holds a B.S. in Computer Science from East Tennessee State University. In his off time, he enjoys Martial Arts and Cave Exploration. He lives in East Tennessee with his wife and he has four children.

by Dave Avery at June 20, 2016 07:28 PM

Shlomi Noach

Solving the non-atomic table swap, Take II

Following up and improving on Solving the Facebook-OSC non-atomic table swap problem, we present a better, safe solution.

Quick, quickest recap:

We are working on a triggerless online schema migration solution. It is based on an asynchronous approach, similarly to the FB osc and as opposed to the synchronous solution as used by pt-online-schema-change.

We asynchronously synchronize (is that even a valid statement?) between some table tbl and a ghost table ghost, and at some time we want to cut-over: swap the two; kick out tbl and put ghost in its place and under its name.

However, we cannot use the single statement rename tbl to tbl_old, ghost to tbl, because we use the asynchronous approach, where at the time we lock tbl for writes, we still have some events we need to process and apply onto ghost before swapping the two.

And MySQL does not allow a lock tables tbl write; ... ; rename tbl to tbl_old, ghost to tbl.

In Solving the Facebook-OSC non-atomic table swap problem we suggested a way that works, unless when it doesn't work. Read the caveat at the end of the post. Premature death of a connection that participates in the algorithm causes a chain reaction that leads to the premature execution of the rename statement, potentially before we've applied those remaining events. This leads to data inconsistency between the old table and the new table, and is unacceptable.

To that effect, we were more inclined to go with the Facebook solution, which makes a two-step: lock tables tbl write; alter table tbl rename to tbl_old; ... ; alter table ghost rename to tbl;

This two-step solution is guaranteed not to have data inconsistency. Alas, it also implies an outage. There's a brief moment in between the two renames, during which we apply those last changes, where the table tbl is simply not there.

Not all applications will fail gracefully on such a scenario.

UDF

We looked at a solution based on UDFs, where we would create global wait conditions, that are not connection based.

We don't like UDFs. You need to compile them for every new version. Puppetizing their setup is not fun. We wouldn't like maintaining this. We wouldn't like doing the operations for this. Neither would the community.

We want to make this a community solution. Can we do without UDF?

Rewriting MySQL

We wish to avoid forking our own version of MySQL. It's not what we do and it's a pain.

A pure MySQL solution?

We found a solution to embrace; it is optimistic, and safe. What optimistic means is explained further on, but let's discuss safe:

The previous solution we came up with was unsafe because breakage of a single component in the algorithm would lead to inconsistent data. The algorithm itself was fine, as long as no one would break it from the outside. This is the concern: what if some crazy cronjob that cleans up connections (kills idle connections, kills long running transactions) or some unfortunate user command kills one of the connections involved in the cut-over phase? This is not something that would happen every day, but can we protect against it? Our priority is to keep our data intact.

The solution allows breakage. Even in the face of death of connections, data is not lost/corrupted, and at worst -- causes a FB-like, recoverable outage scenario.

A step towards the solution, a flawed one

I wish to illustrate something that looks like it would work, but in fact has a hidden flaw. We will later improve on that solution.

Let's assume we have the tbl and ghost tables. We execute the following by multiple connections; we call them C1, C2, C3, ...:

  • C1: lock tables tbl write;
  • C2, C3, ..., C17: normal app connections, issuing insert, delete, update on tbl. Because of the lock, they are naturally blocked.
  • We apply those last events we need to apply onto ghost. No new events are coming our way because tbl is blocked.
  • C18: rename table tbl to tbl_old, ghost to tbl; (blocked as well)
  • C1: unlock tables (everything gets released)

Let's consider the above, and see why it is flawed. But first, why it would typically work in the first place.

  • Connections C2, ..., C17 came first, and C18 came later. Nevertheless MySQL prioritizes C18 and moves it up the queue of waiting queries on tbl. When we unlock, C18 is the first to execute.
  • We only issue the rename once we're satisfied we've applied those changes. We only unlock once we're satisfied that the rename has been executed.
  • If for some reason C1 disconnects before we issue the rename - no problem, we just retry from scratch.

What's the flaw?

We rename when C1 holds the lock. We check with C1 that it is alive and kicking. Yep, it's connected and holding the lock. Are you sure? Yep, I'm good! Really really sure? Yep! OK then, let's rename!

"Oh darn", says C1, "now that you went ahead to rename, but just before you actually sent the request, I decided to take time off and terminate". Or, more realistically, some job would kill C1.

What happens now? The rename is not there yet. All those queries get released, and are immediately applied onto tbl, and then the rename applies, kicks all those changes into oblivion, and puts ghost in place, where it immediately receives further writes.

Those blocked queries were committed, but their changes were never to be seen again.

So here's another way to look at the problem: the rename made it through even though the connection C1 died just prior to that, whereas we would have loved the rename to abort upon such case.

Is there a way in MySQL to cause an operation to fail or block when another connection dies? It's the other way around! Connections hold locks, and those get released when they die!

But there's a way...

Three step, safe, optimistic solution

Here are the steps to a safe solution:

  • C1: lock tables tbl write;
  • C2, C3, ..., C17: normal app connections, issuing insert, delete, update on tbl. Because of the lock, they are naturally blocked.
  • We apply those last events we need to apply onto ghost. No new events are coming our way because tbl is blocked.
  • C18: checking that C1 is still alive, then rename table tbl to tbl_old
  • C19: checking to see that C18's rename is in place (via show processlist), and that C1 is still alive; then issues: rename table ghost to tbl
  • (meanwhile more queries approach tbl, it doesn't matter, they all get deprioritized, same as C2...C17)
  • C1: unlock tables
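
Spelled out as the actual statements (a rough sketch of the happy path, using the table names from this post), the cut-over looks like this:

-- C1:
lock tables tbl write;

-- ... apply the remaining events onto ghost ...

-- C18 (stays blocked on tbl's metadata lock):
rename table tbl to tbl_old;

-- C19 (also blocked, queued behind C18):
rename table ghost to tbl;

-- C1, once show processlist confirms both renames are waiting:
unlock tables;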

What just happened? Let's first explain some stuff:

  • C18's rename gets prioritized over the DMLs, even though it came later. That is how MySQL prioritizes queries on metadata-locked tables.
  • C18 checks C1 is still alive, but as before, there's always the chance C1 will die just at the wrong time -- we're going to address that.
  • C19 is interested to see that C18 began execution, but potentially C18 will crash by the time C19 actually issues its own rename -- we're going to address that
  • C19's query sounds weird. At that time tbl still exists. You'd expect it to fail immediately -- but it does not. It's valid. This is because tbl's metadata lock is in use.
  • C19 gets prioritized over all the DMLs, but is known to be behind C18. The two stay in same order of arrival. So, C18 is known to execute before C19.
  • When C1 unlocks, C18 executes first.
  • Metadata lock is still in place on tbl even though it doesn't actually exist, because of C19.
  • C19 operates next.
  • Finally all the DMLs execute.

What happens on failures?

  • If C1 dies just as C18 is about to issue the rename, we get an outage: tbl is renamed to tbl_old, and the queries get released and complain the table is just not there.
    • C19 will not initiate because it is executed after C18 and checks that C1 is alive -- which turns to be untrue.
    • So we know we have outage, and we quickly rename tbl_old to tbl; and go drink coffee, then begin it all again.
    • The outage is unfortunate, but does not put our data in danger.
  • If C1 happens to die just as C19 is about to issue its rename, there's no data integrity issue: at this point we've already asserted the tables are in sync. As C1 dies, C18 will immediately rename tbl to tbl_old. An outage will occur, but not for long, because C19 will next issue rename ghost to tbl, and close the gap. We suffered a minor outage, but no rollback. We roll forward.
  • If C18 happens to die just as C19 is about to issue its rename, nothing bad happens: C19 is still blocking for as long as C1 is running. We find out C18 died, and release C1. C19 attempts to rename ghost onto tbl, but tbl exists and the query fails. The metadata lock is released and all the queries resume operation on the original tbl. The operation failed, but caused no harm. We will need to try the entire cycle again.
  • If both C1 and C18 fail at the time C19 is about to begin its rename, same as above.
  • If C18 fails as C19 is already in place, same as above.
  • If C1 fails as C19 is already in place, it's as good as having it issue the unlock tables. We're happy.
  • If C19 fails at any given point, we suffer outage. We revert by rename tbl_old to tbl

This solution relies on the notion that if a previous connection failed, we would not be able to rename ghost to tbl because the table would still be there. That's what we were looking for; but instead of looking at locks, which get released when a connection terminates, we used a persistent entity: a table.

Conclusion

The algorithm above is optimistic: if no connections get weirdly killed, it's a valid locking solution, and queries & app are unaware that anything happened (granted, app will notice write latency). If connections do get weirdly killed, we get table-outage at worst case -- an outage that is already considered to be a valid solution anyhow. The algorithm will not allow data corruption.

by shlomi at June 20, 2016 09:26 AM

Jean-Jerome Schmidt

What to test before upgrading to MySQL 5.7

As we saw in a previous post, there are some important changes between MySQL 5.6 and 5.7. Since the behaviour of some existing features has been altered, some detailed testing of the upgrade is in order. In this blog post, we will look at some of the important things to keep in mind when preparing and performing these tests.

How to design a test environment?

Your test environment has to be as similar to your production environment as possible. We are talking here about using the same hardware, using the same dataset and running the same query mix. If you have a complex replication topology, try to replicate it in test as well. Of course, sometimes it’s not possible to use real data or real queries because of  security concerns. But keep in mind that the results get less and less reliable as you have more differences between test and production.

Collecting queries for regression tests

To run real-world queries, you need to collect them first. There are many ways to do that - you can enable the slow query log and log all queries there (long_query_time=0), or you can use tcpdump and capture the traffic somewhere between MySQL and the application. It's very important to keep in mind that you need to collect all types of queries. If your query mix is stable all the time, probably 20-30 minutes of traffic will suffice. If you have some variation in it, though, things will be different. Maybe your application runs different queries at different times of the day. Maybe there's some ETL process that gets kicked off at a certain scheduled time. Maybe you run some reporting. So on and so forth - the bottom line is: you have to collect all types of queries in order to make the tests as realistic as possible.

Performance regression tests

Some simple tests can be executed by a DBA - tools like Percona Playback or pt-upgrade can help here. pt-upgrade is especially useful, as it can collect performance data about queries executed on different MySQL hosts and prepare nice reports about them. When running pt-upgrade, you need to make sure your test nodes are in the exact same state - ideally cold, with no data in caches or buffers. Keep in mind that query performance can be affected by network latency, so it's better to execute pt-upgrade locally.
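
For example, a pt-upgrade run against two test hosts could look roughly like this (the hostnames, credentials and log file name below are placeholders; check the tool's documentation for the options available in your version):

# compare how the same production queries behave on a 5.6 and a 5.7 test host
pt-upgrade slow-production.log h=test-mysql56,u=ptuser,p=ptpass h=test-mysql57,u=ptuser,p=ptpass > pt-upgrade-report.txt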

Tests on the application side

Simple tests like pt-upgrade won't replace detailed acceptance tests executed from the application. This step is a must for any upgrade runbook. The more data you can collect while running tests on the application, the better. The exact process will differ between applications, but the idea is simple - check everything you can and make sure all aspects of your application work correctly with the new version.

The above is a brief overview of the testing process.  You can find more details in our ebook “Upgrading to MySQL 5.7”.

by Severalnines at June 20, 2016 07:45 AM

June 18, 2016

Henrik Ingo

Creating Impress.js presentations in colored JSON with Highlight.js

Last month I wrote about impress.js, and how I've started using it for my presentations. This has been going well, and during the past month I've actually given 2 more presentations using impress.js:

Dreams and fears of a database consultant
MongoDB and GIS

(You really have to click those links, embedding these presentations wouldn't do them justice!)

read more

by hingo at June 18, 2016 02:54 PM

June 17, 2016

Peter Zaitsev

InnoDB locks and transaction isolation level

What is the difference between InnoDB locks and transaction isolation level? We’ll discuss it in this post.

Recently I received a question from a user about one of my earlier blog posts. Since it wasn’t sent as a comment, I will answer it here. The question:

> I am reading your article:
> https://www.percona.com/resources/technical-presentations/troubleshooting-locking-issues-percona-mysql-webinar

> Full table scan locks whole table.

> Some bad select (read) query can do full table scan on InnoDB, does it lock whole table please?

> My understanding was that SELECT (read) blocks another DML only in MyISAM.

To answer this question, we need to understand two different concepts: locking and transaction isolation level.

In the default transaction isolation mode for InnoDB, REPEATABLE READ, and the lower transaction isolation levels, SELECT doesn't block any DML unless it uses SELECT ... FOR UPDATE or SELECT ... LOCK IN SHARE MODE. On slide 20 (31) of my presentation, I use the SELECT ... FOR UPDATE statement (this is why a lock is seen).

However, if the transaction isolation mode is SERIALIZABLE, then SELECT can block updates. You can see this in the example below:

mysql1> set transaction isolation level serializable;
Query OK, 0 rows affected (0,00 sec)
mysql1> begin;
Query OK, 0 rows affected (0,00 sec)
mysql1> select * from employees join titles using(emp_no);
c936e6fc4c6cbaf77679ba5013339dff  -
443308 rows in set (29,69 sec)
mysql2> begin;
Query OK, 0 rows affected (0,00 sec)
mysql2> update titles set title='Engineer' where emp_no=15504;
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
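
While the second session is waiting, one way to see who blocks whom (assuming MySQL 5.7 with the sys schema installed; on 5.6 the INFORMATION_SCHEMA.INNODB_LOCK_WAITS table exposes similar data) is:

mysql> select waiting_pid, blocking_pid, locked_table, locked_index from sys.innodb_lock_waits;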

Regarding the different level locks set by InnoDB, I recommend that you follow the links on slide 37 (67) of the presentation, especially “InnoDB Locking Explained With Stick Figures” by Bill Karwin. Innodb table locks are also explained very well in this blog post by Peter Zaitsev.

by Sveta Smirnova at June 17, 2016 06:49 PM

MariaDB Foundation

MariaDB 5.5.50 and updated connectors now available

The MariaDB project is pleased to announce the immediate availability of MariaDB 5.5.50, MariaDB Connector/J 1.4.6, and MariaDB Connector/ODBC 2.0.11. See the release notes and changelogs for details on these releases. Download MariaDB 5.5.50 Release Notes Changelog What is MariaDB 5.5? MariaDB APT and YUM Repository Configuration Generator Download MariaDB Connector/J 1.4.6 Release Notes Changelog […]

The post MariaDB 5.5.50 and updated connectors now available appeared first on MariaDB.org.

by Daniel Bartholomew at June 17, 2016 05:32 PM

Peter Zaitsev

MongoDB Security: Why pay for Enterprise when Open Source has you covered?

MongoDB Security

Does ensuring MongoDB security justify the cost of the Enterprise version? In my opinion, the answer is no.

MongoDB Inc© blasted an email with a study showing that the average cost of a data breach can be $5.9M. You can find the key finding in IBM’s 2015 report here:

NH Learning Solutions

Key findings:

Data breaches cost the most in the US and Germany and the lowest in Brazil and India. The average per capita cost of a data breach is $217 in the US and $211 in Germany. The lowest cost is in Brazil ($78) and India ($56). The average total organizational cost in the US is $6.5 million and in Germany $4.9 million. The lowest organizational cost is in Brazil ($1.8 million) and India ($1.5 million).

So it might be even worse than you thought! The study presented a clear per-record cost, which helps companies determine the real cost to them. This brings us to the recently-published MongoDB Security Architecture white paper. While being a great document, it raises some questions that should be addressed. We should dissect exactly what paying for an enterprise build gets you that is not already available in the open source community version. 

The key security features in above white paper are:

  • Authentication. LDAP Authentication centralizes things with your company directory (for PCI)
  • Authorization. What role-based access controls the database provides
  • Encryption. Broken into “At-Rest” and “In-Transit” as part of regular PCI requirements
  • Governance. Document validation and even checking for sensitive data such as an SSN or birth data
  • Auditing. The ability to see who did what in the database (also required for PCI).

That list lets us break down each into why they are important, and is it something that should be free in the MongoDB ecosystem.

Authentication

MongoDB has built-in users (off by default). It misses things, however, like password complexity, age-based rotation, centralization, and identification of user roles versus service functions. These are essential to passing PCI. PCI requires that people don’t use old passwords, easy-to-break passwords, and that user access gets revoked when there is a change in status (such as leaving a department or the company). Thankfully LDAP is an open-source project of its own. Many connectors allow the use of Windows Active Directory (AD) systems to talk with LDAP.

Using LDAP and AD, you can tie users in with your corporate directory. When they change roles or leave the company, they can be removed by HR from your database group. Practically, this means there are automated systems in place to ensure only those you want to access the data manually can do so, without accidentally missing something. It is important to note that the MongoDB Community Edition© does not have LDAP support. For this you need MongoDB Inc's© Enterprise build. You can also use Percona Server© for MongoDB. Percona's build is open source and free, and we offer a support contract so that if you require support or help it is available.

Authorization

Role-based authorization (RBAC) is core to MongoDB Community©, MongoDB Enterprise© and Percona Server for MongoDB. In Mongo 2.6+ you can use built-in roles, or even craft your own down to what actions someone might be able to do – only exposing exactly what you want users to be able to do with razor precision. This is such a core MongoDB feature that it is everywhere and in every build regardless of vendor.

Encryption

As I mentioned before, this is broken into two areas of discussion: At-Rest and In-Transit

At-Rest: This is defined as on-disk, and typically refers to data saved to an encrypted storage location. This is to prevent theft by physical means and create backups that are stored in a fashion not easily read by any third party. There are practical limits to this. The biggest are trusting your sys-admins and assuming that a hacker has not been able to get access to the system to pretend they are one of your admins. This is not an issue unique to MongoDB or Percona. Such systems used more globally work here as well. They might include things like LUKS/cryptfs, or might go into even more secure methods such as signing encryption keys with LDAP, Smart Cards, and RSA type tokens.

When doing this level of encryption, you need to consider things like automounting and decrypting of drives. However, the general point is this is not something new to your system administrators, and they can manage this requirement in the same way they manage it in other parts of the system. The added benefit is a single procedure for storage encryption, not one per whatever technology a particular function uses.

In-Transit: To keep this simple, we are just talking about using SSL on all connections (it can be as simple as that in the end). Depending on your internal requirements, MongoDB Community©, MongoDB Enterprise© and Percona Server for MongoDB all additionally support custom Signing Authorities (CA), x509 client and member certificates. The x509 system is very useful when you want to make sure only authorized machines are allowed to talk to your system, before they can even attempt to send a user/password to the system.

Governance

Put simply, this is the ability to enforce complex standards on the system by using Document Validation. This is an important feature that is available to MongoDB Community©, MongoDB Enterprise© and Percona Server for MongoDB. Governance is about the insertion and updating of data. It is also useful for checking if a field name like bday, birthday, ssn, social, etc. is defined. We are not limited to those: you could also do string regexes on things like user_id to check for a $regex such as "^\d{3}-\d{2}-\d{4}$" (which would be a Social Security Number), or check for a credit card number. These examples are ways your DBAs and security architects can help prevent developers from exposing the company to added risk.

You can also ensure schema changes only occur when properly vetted by your DBA staff, as the developer code could fail if they change the format of what you’re storing in the database. This brings an additional layer of control to MongoDB’s dynamic production schema (allowing itself to store anything even if it should not).
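
As a rough illustration of that idea (the collection and field names here are made up, not taken from the white paper), a MongoDB 3.2+ validator that rejects anything shaped like an SSN in a free-text field could be added like this:

db.runCommand({
    collMod: "customers",                                   // existing collection to protect
    validator: { notes: { $not: /^\d{3}-\d{2}-\d{4}$/ } },  // reject values that look like an SSN
    validationLevel: "strict",
    validationAction: "error"
})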

Auditing

Central to any good security design – and required by PCI – is being able to track what user did what action in the database (very similar to how you need to do it on your actual servers). At this time, MongoDB's Community© build can't track this. However, both MongoDB Enterprise© and Percona Server for MongoDB support this feature. Both work in similar ways, allowing you to filter output to a particular user, database, collection, or source location. This gives you a log to review in any security incident and, more importantly, shows your PCI auditor that you've taken the correct steps to both protect your database from an intrusion and understand an incursion's depth (should one occur).

Hopefully, this has been a good overview of the security options in MongoDB Community© and Enterprise© versions and Percona Server for MongoDB. Even without an enterprise-style contract, you can fulfill all your PCI compliance needs and protect your company using reasonable and known methods. Please note Percona strives to bring enterprise features to the community, but not to the point of wanting to lock you into a non-open source build. If you need support with anything MongoDB, we have the community and its users as our first priority!

by David Murphy at June 17, 2016 04:55 PM

June 16, 2016

Peter Zaitsev

Scaling Percona XtraDB Cluster with ProxySQL in Kubernetes

How do you scale Percona XtraDB Cluster with ProxySQL in Kubernetes?

In my previous post I looked at how to run Percona XtraDB Cluster in a Docker Swarm orchestration system, and today I want to review how we can do it in the more advanced Kubernetes environment.

There are already some existing posts from Patrick Galbraith (https://github.com/kubernetes/kubernetes/tree/release-1.2/examples/mysql-galera) and Raghavendra Prabhu (https://github.com/ronin13/pxc-kubernetes) on this topic. For this post, I will show how to run as many nodes as I want, see what happens if we add/remove nodes dynamically and handle incoming traffic with ProxySQL (which routes queries to one of the working nodes). I also want to see if we can reuse the ReplicationController infrastructure from Kubernetes to scale nodes to a given number.

These goals should be easy to accomplish using our existing Docker images for Percona XtraDB Cluster (https://hub.docker.com/r/percona/percona-xtradb-cluster/), and I will again rely on the running service discovery (right now the images only work with etcd).

The process of setting up Kubernetes can be pretty involved (but it can be done; check out the Kubernetes documentation to see how: http://kubernetes.io/docs/getting-started-guides/ubuntu/). It is much more convenient to use a cloud that supports it already (Google Cloud, for example). I will use Microsoft Azure, and follow this guide: http://kubernetes.io/docs/getting-started-guides/coreos/azure/. Unfortunately, the scripts from the guide install a previous version of Kubernetes (1.1.2), which does not allow me to use ConfigMaps. To compensate, I will duplicate the ENVIRONMENT variable definitions for the Percona XtraDB Cluster and ProxySQL pods. This can be done more optimally in recent versions of Kubernetes.

After getting Kubernetes running, starting Percona XtraDB Cluster with ProxySQL is easy using the following pxc.yaml file (which you can also find with our Docker sources https://github.com/percona/percona-docker/tree/master/pxc-57/kubernetes):

apiVersion: v1
kind: ReplicationController
metadata:
  name: pxc-rc
  app: pxc-app
spec:
  replicas: 3 # tells deployment to run N pods matching the template
  selector:
    app: pxc-app
  template: # create pods using pod definition in this template
    metadata:
      name: pxc
      labels:
        app: pxc-app
    spec:
      containers:
      - name: percona-xtradb-cluster
        image: perconalab/percona-xtradb-cluster:5.6test
        imagePullPolicy: Always
        ports:
        - containerPort: 3306
        - containerPort: 4567
        - containerPort: 4568
        env:
        - name: MYSQL_ROOT_PASSWORD
          value: "Theistareyk"
        - name: DISCOVERY_SERVICE
          value: "172.18.0.4:4001"
        - name: CLUSTER_NAME
          value: "k8scluster2"
        - name: XTRABACKUP_PASSWORD
          value: "Theistare"
        volumeMounts:
        - name: mysql-persistent-storage
          mountPath: /var/lib/mysql
      volumes:
      - name: mysql-persistent-storage
        emptyDir: {}
---
apiVersion: v1
kind: ReplicationController
metadata:
  name: proxysql-rc
  app: proxysql-app
spec:
  replicas: 1 # tells deployment to run N pods matching the template
  selector:
    front: proxysql
  template: # create pods using pod definition in this template
    metadata:
      name: proxysql
      labels:
        app: pxc-app
        front: proxysql
    spec:
      containers:
      - name: proxysql
        image: perconalab/proxysql
        ports:
        - containerPort: 3306
        - containerPort: 6032
        env:
        - name: MYSQL_ROOT_PASSWORD
          value: "Theistareyk"
        - name: DISCOVERY_SERVICE
          value: "172.18.0.4:4001"
        - name: CLUSTER_NAME
          value: "k8scluster2"
        - name: MYSQL_PROXY_USER
          value: "proxyuser"
        - name: MYSQL_PROXY_PASSWORD
          value: "s3cret"
---
apiVersion: v1
kind: Service
metadata:
  name: pxc-service
  labels:
    app: pxc-app
spec:
  ports:
  # the port that this service should serve on
  - port: 3306
    targetPort: 3306
    name: "mysql"
  - port: 6032
    targetPort: 6032
    name: "proxyadm"
  # label keys and values that must match in order to receive traffic for this service
  selector:
    front: proxysql

Here is the command to start the cluster:

kubectl create -f pxc.yaml

The command will start three pods with Percona XtraDB Cluster and one pod with ProxySQL.

Percona XtraDB Cluster nodes will register themselves in the discovery service and we will need to add them to ProxySQL (it can be done automatically with scripting, for now it is a manual task):

kubectl exec -it proxysql-rc-4e936 add_cluster_nodes.sh

Increasing the cluster size can be done with the scale command:

kubectl scale --replicas=6 -f pxc.yaml
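
One way to watch the new pods come up (using the app=pxc-app label from the definition above) is:

kubectl get pods -l app=pxc-app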

You can connect to the cluster through a single connection point provided by ProxySQL. You can find it this way:

kubectl describe -f pxc.yaml
Name: pxc-service
Namespace: default
Labels: app=pxc-app
Selector: front=proxysql
Type: ClusterIP
IP: 10.23.123.236
Port: mysql 3306/TCP
Endpoints: <none>
Port: proxyadm 6032/TCP
Endpoints: <none>
Session Affinity: None

It exposes the endpoint IP address 10.23.123.236 and two ports: 3306 for the MySQL connection and 6032 for the ProxySQL admin connection.
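
For example, with the proxy user defined in the pod spec above (proxyuser/s3cret), a quick connectivity check through the service could look like this (the service IP will differ in your deployment):

mysql -h 10.23.123.236 -P 3306 -u proxyuser -ps3cret -e "select @@hostname;"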

So you can see that scaling Percona XtraDB Cluster with ProxySQL in Kubernetes is pretty easy. In the next post, I want to run benchmarks in the different Docker network environments.

by Vadim Tkachenko at June 16, 2016 06:38 PM

Why MongoRocks: Deprecating PerconaFT and MongoDB Optimistic locking

deprecating PerconaFT

In this post, we’ll discuss the rationale behind deprecating PerconaFT and embracing RocksDB.

Why is Percona deprecating PerconaFT in favor of RocksDB?

Many of you may have seen Peter Zaitsev's recent post about Percona embracing RocksDB and deprecating PerconaFT. I'm going to shed a bit more light on the issues between the locking models of PerconaFT and MongoDB's core server. When making this decision, we looked at how the differences between the engines measure up and impact other improvements we could make. In the end, we can do more for the community by focusing on engines that are in line with the assumptions the core server makes every second in your daily operations. Then we have more resources available for improving the users' experience by adding new tools and features, and improving the core server.

What is pessimistic locking?

Pessimistic locking locks an entity in the database for the entire time that it is actively used in application memory. A lock either limits or prevents other users from working with the entity in the database. A write lock indicates that the holder of the lock intends to update the entity and disallows anyone from reading, updating, or deleting the object. A read lock means that the owner of the lock does not want the object to change while it holds the lock, allowing others to read the entity but not update or delete it. The scope of a lock might be the entire database, a table, a collection of ranges of documents or a single document.

You can order pessimistic locks as follows (from broad to granular):

  1. Database locks
  2. Collection locks
  3. Range locks
  4. Document locks

The advantage of pessimistic locking is that changes to the database get made consistently and safely. The primary disadvantage is that this approach isn’t as scalable. The chance of waiting for a lock to be released increases when:

  • A system has a lot of users
  • The transactions (in MongoDB, there are transactions in the engine but not at the user level) involve a greater number of entities
  • When transactions are long-lived

Therefore, pessimistic locks limit the practical number of simultaneous users that your system can support.

What is optimistic locking?

Most database systems (NoSQL and RDBMS) expect collisions to be relatively uncommon. For example, although two clients are working with user objects, one might be working with the Bob Vader object while another works with the Luke Vader object. These won't collide. In this case, optimistic locking becomes the most viable concurrency control strategy. If you accept the premise that collisions infrequently occur, instead of trying to prevent them you can choose to detect and then resolve them when they do occur.

MongoDB has something called a Write Conflict Exception (WCE). A WCE is like an engine-level deadlock. If a record inside the engine changes due to thread #1, thread #2 must wait until it is safe to change the record, and then retry. Typically this occurs when a single document gets updated frequently. It can also occur when there are many updates, or there are ranges of locks happening concurrently. This is a perfect case for optimistic locking, preferring to resolve or retry operations when conflicts occur rather than trying to prevent them from happening.

Can you make these play well while limiting the amount of development resources needed?

These views are as polar opposite as you can get in the database world. In one view you lock as much as possible, preventing anyone else from making a change. In the other view you let things be as parallel as possible, and accept you will retry if two clients are updating the same document. With the nature of how many documents fit in a single block of memory, this has some real-world concerns. When you have more than one document in a memory block, you could have a situation where locking one document means 400% more documents get affected. For example, if we have an update using the IN operator with 25 entries, you could be blocking 125 documents (not 25 documents)!

That escalated rather quickly, don’t you think? Using optimistic locking in the same situation, you at most would have to retry five document write locks as the data changed. The challenge for optimistic locking is that if I have five clients that are all updating all documents, you get a flurry of updates. WCE’s come in and eventually resolve things. If you use pessimistic locking, everybody waits their turn, and each one would finish before the next could run.

Much of Percona’s engineering effort goes into what types of systems we should put in place to simulate cheap latches or locks in optimistic locking to allow pessimistic locking to work (without killing performance). This requires an enormous amount of work just to get on-par behavior from the system – specifically in update type workloads, given delete/inserts are very similar in the systems. As a result, we’ve spent more time improving the engine rather than adding additional variables and instrumentation.

Looking forward, MongoRocks aligns more closely to WiredTiger in its locking structure (they both run as log-structured merge trees, or LSMs), and this means more time working on new optimizer patterns, building things to improve diagnostics, or tuning the engine/system to your specific needs. We think you will be excited to see some of the progress we have been discussing for Mongo 3.4 (some of which might even make it directly into 3.2).

What is the MongoRocks anyhow and how does it compare to PerconaFT?

The last concept I want to cover is what RocksDB is exactly, what its future is and how it stacks up to PerconaFT. The most important news is Facebook is working on the core engine, which is used both by MyRocks and MongoRocks (you might have seen some of their talks on the engine). This means Percona can leverage some of the brilliant people working on RocksDB inside Facebook and focus instead on the API linking the engine into place, as well as optimizing how it uses the engine – rather than building the engine completely. Facebook is making some very significant bets on the backend use of RocksDB in several parts of the system, and potentially some user parts of the system (which have historically used InnoDB).

So what is RocksDB, and how does it work? Its core is an LSM system, which means it puts new data into the newest files as an append. Over time, the files get merged into five different levels (not covered here). As part of this process, when you have an update a new copy of the data is saved at the end of the latest file, and a memory table points a record to that location for the latest “version” of that data. In the background, the older records are marked for deletion by something called a “tombstone.” There is a background thread merging still-relevant data from old files into new files so that empty old files get unlinked from the filesystem.

This streamlines the process better than B-Tree’s constant rebalancing and empty blocks in files that need to be compacted or re-used over time. Being write-optimized means that, like PerconaFT previously, it will be faster for write-based operations than something like WiredTiger. (WiredTiger in some cases can be faster on reads, with MMAP being the fastest possible approach for reads.) This also means things like TTL’s can work pretty well in an LSM since all the items that were inserted in time order age out, and the engine can just delete the unneeded file. This solves some of the reasons people needed partitions to begin with, and it still allows sharding to work well.

We are also very excited about creating additional tools that let you query your backups in MongoRocks, as well as some of the very simple ways it will take binary-based backups quickly, consistently (even when sharded) and continually.

I hope this explains more about lock types and what their implications mean as a follow up to Peter’s blog post about the direction Percona is moving regarding PerconaFT and MongoRocks. If you want to ask any more questions, or would like another blog that covers some areas of the system more deeply, please let us know via this blog, email, twitter or even pigeon!

by David Murphy at June 16, 2016 01:51 PM

Jean-Jerome Schmidt

Planets9s - MySQL Performance Tuning, Upgrading to 5.7, Docker Containers & more

Welcome to this week’s Planets9s, covering all the latest resources and technologies we create around automation and management of open source database infrastructures.

Watch the replay of our MySQL Database Performance Tuning webinar

Thanks to everyone who participated in this week’s popular webinar on MySQL Database Performance Tuning, which discussed the following topics: database tuning overview, principles of the tuning process, tuning the Operating System configuration and the MySQL configuration, useful tools such as pt-summary and pt-mysql-summary and what to avoid when tuning OS and MySQL configuration. The replay of this webinar is now available online!

Watch the replay

Upgrading to MySQL 5.7 - The Database Upgrade Guide

During this week’s webinar on MySQL database performance tuning, a lot of participants expressed their interest in upgrading to MySQL 5.7. If that’s the case for you as well, then our new whitepaper on how to do so will hopefully be of help. Upgrading to a new major version involves risk, and it is important to plan the whole process carefully. In this whitepaper, we look at the important new changes in MySQL 5.7 and how to plan the test process, do a live system upgrade without downtime, avoid connection failures during slave restarts and switchover, or how to leverage ProxySQL to achieve a graceful upgrade process.

Download the whitepaper

Try NinesControl: deploy and monitor MySQL Galera clusters on Digital Ocean

We recently announced the public availability of NinesControl (beta), our database infrastructure management solution for the cloud. Designed with the needs of developers in mind, NinesControl enables users to easily deploy and monitor (MySQL) Galera clusters on DigitalOcean. Droplets are launched and managed using your own DigitalOcean account. We’d love to get your feedback on this new solution if you haven’t tested it yet, so please check it out and let us know what you think.

Try NinesControl

MySQL Docker containers: understanding the basics

Welcome to our new blog series - “MySQL on Docker” - in which we will touch upon swarms, shared volumes, data-only-containers, security and configuration management, multi-host networking, service discovery and implications on monitoring when we move from host-centric to role-centric services with shorter life cycles. In this initial post, we cover some basics around running MySQL in a container.

Read the blog

That’s it for this week! Feel free to share these resources with your colleagues and follow us in our social media channels.

Have a good end of the week,

Jean-Jérôme Schmidt
Planets9s Editor
Severalnines AB

by Severalnines at June 16, 2016 12:49 PM

June 15, 2016

Peter Zaitsev

Troubleshooting hardware resource usage webinar: Q & A

In this blog, I provide answers to the Q & A for the Troubleshooting hardware resource usage webinar.

First, I want to thank everybody who attended the May 26 webinar. The recording and slides for the webinar are available here. Below is the list of your questions that I wasn’t able to answer during the webinar, with responses:

Q: How did you find the memory IO LEAK?

A: Do you mean the replication bug I was talking about in the webinar? I wrote about this bug here. See also comments in the bug report itself.

Q: Do you have common formulas you use to tune MySQL?

A: There are best practices: relate thread concurrency to the number of CPU cores you have, set the InnoDB buffer pool size large enough that it can contain all of your working dataset (which is not always possible), and do not set the Query Cache size larger than 512MB (or even better, turn it off) to avoid issues with the global lock that is set when it needs to be de-fragmented. I prefer not to call them "formulas," because all options need to be adjusted to match the workload. If this weren't the case, MySQL Server would have an automatic configuration. There is also a separate webinar on configuration (Troubleshooting configuration issues) where I discuss these practices.

Q: Slide 11: is this real time? Can we get this info for a job that has already finished?

A: Yes, this is real time. No, it is not possible to get this info for a thread that does not exist.

Q: Slide 11: what do negative numbers mean?

A: Numbers are taken from the CURRENT_NUMBER_OF_BYTES_USED field of the memory_summary_by_thread_by_event_name table in Performance Schema. These values, in turn, are calculated as (memory allocated by thread) – (memory freed by thread). Negative numbers here mean either a memory leak or an incorrect calculation of memory used by the thread. I reported this behavior in the MySQL Bugs database. Please subscribe to the bug report and wait to see how the InnoDB and Performance Schema developers answer.
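
For reference, the same per-thread numbers are exposed in a friendlier form through the sys schema (assuming MySQL 5.7 with memory instrumentation enabled); for example:

mysql> select thread_id, user, current_allocated from sys.memory_by_thread_by_current_bytes limit 10;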

Q: Are TokuDB memory usage stats recorded in the sys.memory_by_thread_by_current_bytes table also? Do we have to set something to enable this collection? I ran the query, but it shows 0 for everything.

A: TokuDB currently does not support Performance Schema, thus its memory statistics are not instrumented. See the user manual on how memory instrumentation works.

Q: With disk what we will check for disk I/O?

A: I don't quite understand the question. Are you asking on which disk we should check I/O statistics? For the datadir and other disks, look at the locations where MySQL stores data and log files (if you set custom locations).

Q: How can we put CPU in parallel to process multiple client requests? Put multiple requests In memory by parallel way. By defining transaction. Or there any query hints?

A: We cannot directly put CPUs in parallel, but we can affect it indirectly by tuning the InnoDB thread-related options (innodb_thread_concurrency, innodb_read_io_threads, innodb_write_io_threads) and using the thread pool.

Q: Is there any information the Performance Schema that is not found in the SYS schema?

A: Yes. For example, sys schema does not have a view for statistics about prepared statements, while Performance Schema does, because sys schema takes its statement statistics from digest tables (which make no sense for prepared statements).

Q: What is your favorite tool to investigate a memory issue with a task/job that has already finished?

A: I don’t know that there is such a tool suitable for use in production. In a test environment, you can use valgrind or similar tools. You can also make core dumps of the mysqld process and investigate them after the issue is gone.

by Sveta Smirnova at June 15, 2016 10:12 PM

Installing MongoDB 3.2 and upgrading MongoDB replica set

Upgrading MongoDB replica set

In this post, we’ll examine a couple of ways for upgrading a MongoDB replica set.

With the release of MongoDB 3.2, comes a rash of new features and improvements. One of these enhancements is improved replica sets. From MongoDB: “A replica set in MongoDB is a group of mongod processes that maintain the same data set. Replica sets provide redundancy and high availability, and are the basis for all production deployments.”

Config servers are replica sets!

This is HUGE. It signals a significant advancement in backups, metadata stability and overall maturity. It is a very long-awaited feature that shows MongoDB is maturing. It means:

  • Mongos’ can retry connection vs error
  • Unified and consistent backups!
  • Up to 50 secondaries
    • Remove bugs with Mongos’ not near config servers!

How do we activate all these new awesome features? Let’s do it!

Upgrading to 3.2

  • Replace binaries and restart one secondary at a time
  • Then primaries as well
  • Restart configs in reverse order
    • If configdb=con1, con2, con3
      • Restart con3, con2, and then finally con1 with 3.2
      • Do con1 as FAST as possible, while the balancer is also disabled
    • You no longer need to restart a mongos with --upgrade (as of 3.2)
    • Restart all mongos, this will reset ALL connections at some point (whether you do at once or space it out).

Upgrading the replset to the new protocol

This is by far the easiest upgrade bit, but DON'T do it until you know you're stable on 3.2. Log into each primary and run:

>cfg = rs.conf();
 {
     "_id" : "r1",
     "version" : 2,
     "members" : [
         {
             "_id" : 0,
             "host" : "localhost:17001"
         },
         {
             "_id" : 1,
             "host" : "localhost:17002"
         },
         {
            "_id" : 2,
            "host" : "localhost:17003",
         }
     ]
}
>cfg.protocolVersion=1;
>rs.reconfig(cfg);
{
    "ok" : 1,
    "$gleStats" : {
        "lastOpTime" : Timestamp(1464947003, 1),
        "electionId" : ObjectId("7fffffff0000000000000018")
    }
}

Or:

> db.getSiblingDB('config').shards.forEach(function(shard){
    x = new Mongo(shard.host); /* Assumes no auth needed */
    conf = x.getDB("local").system.replset.findOne();
    conf.protocolVersion = 1; conf.version++;
    x.getDB('admin').runCommand({ replSetReconfig: conf });
});

The quick upgrade scripts

  • Upgrade_all_to_3.2.4.sh

Does what it says: kills every process and launches them on 3.2 binaries with no other changes.

  • Upgrade_replica_proto_version.sh

Simply runs the quick rs.reconfig() on each primary, adding the new settings to enable the new replication features.

Let’s upgrade the configs the right way!

This is not included as part of a normal upgrade so only do this AFTER you’re stable and don’t do it before upgrading the protocolVersion we just talked about. (I mean it! Disregard this advice and your life will seriously not be awesome!)

Upgrading to a Config ReplicaSet ( the official way)

  1. Run rs.initiate on the first config in the list (must be 3.2.4+)
    • Must be a fully configured document with configsvr: true defined (see the sketch after this list).
  2. Restart the same config server, adding
    • configsvrMode = sccc
    • replSet = <name used in rs.initiate()>
    • storageEngine = wiredTiger
  3. Start the new config servers for the other two nodes (should be a new dbpath and port)
  4. Add those nodes to the replSet and check their status
  5. Remove the second original config server from the running
  6. Restart the 1st node you set “sccc” on to not have that setting
  7. At this point, the 1st node will transition to removed if using MMAP.
  8. Restart a mongos with a new configdb line
    • --configdb <replSetName>/node1:port,node2:port,…
    • Only replset members should be listed
  9. Verify you can work and query through mongos
  10. Repeat on all mongos
  11. Remove the 1st node with rs.remove
  12. Shutdown final original config and enable balancer
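
For step 1, the rs.initiate() document might look roughly like this (the replica set name and hostname are placeholders):

> rs.initiate({
      _id: "csReplSet",            // replica set name you will later put in the configdb line
      configsvr: true,
      members: [ { _id: 0, host: "cfg1.example.net:27019" } ]
  })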

There is also an easy way.

The easy way, with a small maintenance window, which lets you just restore a good backup and have a nice and simple rollback plan:

  1. Stop all mongos after backing up the config directory
  2. Run rs.initiate on first config server
  3. Stop the 2nd, then the 3rd, restarting them with an empty dbpath directory
  4. Check the rs.status now
  5. Stop the 1st config server and restart with an empty dbpath directory
  6. Check Status
  7. Restart all mongos, adding <replSetName>/ to the front of the configdb line.
  8. Done!

Oh look there is a quick script we have made for you:

  • Upgrade_config_to_repliaset.sh
    • Kill all  config and mongos processes
    • Restart the first config server on non-standard port
    • Mongodump config database
    • Restart c1 as WiredTiger, clearing that data path
    • Import dump back into first config server
    • Restart on normal port
    • Initialize Replica Set
    • Restart second and third config server after clearing dbpath folder
    • After the initial sync, start all the mongos.
    • Done and script exits!

 

by David Murphy at June 15, 2016 08:55 PM

Jean-Jerome Schmidt

The Holy Grail Webinar Replay: MySQL Database Performance Tuning

Thanks to everyone who participated in this week’s popular webinar on MySQL Database Performance Tuning, which looked at answering the following questions:

  • You’re running MySQL as backend database, how do you tune it to make best use of the hardware?
  • How do you optimize the Operating System?
  • How do you best configure MySQL for a specific database workload?

The replay of this webinar is now available online:

Watch the replay

In this webinar, Krzysztof Książek, Senior Support Engineer at Severalnines, discussed some of the settings that are most often tweaked and which can bring you significant improvement in the performance of your MySQL database. Krzysztof also covered some of the variables which are frequently modified even though they should not be. Performance tuning is not easy, but you can go a surprisingly long way with a few basic guidelines.

Agenda

  • Database tuning - the what and why
  • Principles of the tuning process
  • Tuning the Operating System configuration
  • Tuning the MySQL configuration
  • Useful tools
    • pt-summary
    • pt-mysql-summary
  • What to avoid when tuning OS and MySQL configuration

Speaker

Krzysztof Książek, Senior Support Engineer at Severalnines, is a MySQL DBA with experience managing complex database environments for companies like Zendesk, Chegg, Pinterest and Flipboard. This webinar builds upon recent blog posts and related webinar series by Krzysztof on how to become a MySQL DBA.

To view all the blogs of the ‘Become a MySQL DBA’ series visit: http://www.severalnines.com/blog-categories/db-ops

by Severalnines at June 15, 2016 03:11 PM

June 14, 2016

Peter Zaitsev

MongoDB and non-existent collections

MongoDB and non-existent collections

In this blog, I will discuss how I found some of my basic SQL assumptions that don’t hold true when dealing with MongoDB and non-existent collections.

Coming from a MySQL background, I have some assumptions about databases that don’t apply to MongoDB (or other kinds of databases that are neither SQL-based nor relationally-inspired).

An example of this is the assumption that data is organized in rows that are part of a table, with all tables having a strict schema (i.e., a single type of row). When working with MongoDB, this assumption must be transformed into the idea that data is organized in documents that are part of a collection and have a flexible schema so that different types of documents can reside in the same collection.

That’s an easy adjustment to make because a dynamic schema is one of the defining features of MongoDB. There are other less-obvious assumptions that need to be adjusted or redefined as you get familiar with a new product like MongoDB (for example, MySQL does not currently support built-in sharding, while MongoDB does).

There is a more fundamental kind of assumption, and by “fundamental” I mean an assumption that is deeply ingrained because you rely on it so often it’s automatic (i.e., unconscious). We’re usually hit by these when changing programming languages, especially in dynamic ones (“Will I be able to add a number to a string? If so, how will it behave?”). These can make it hard to adjust to a new database (or programming language, operating system, etc.) because we don’t consciously think about them and so we may forget to verify if they hold in the new system. This can happen in “real” life too: try going to a country where cars drive on the other side of the road from yours!

While working on a MongoDB benchmark recently, I was hit by one of these assumptions. I thought sharing my mistake may help others who are also coming to MongoDB from an SQL background.

One of my computing assumptions can be summarized as “reading from a non-existent source will fail with an error.”

Sure enough, it seems to be true for my operating system:

telecaster:~ fernandoipar$ cat notexists
cat: notexists: No such file or directory

And for MySQL:

mysql> select 1 from notexists;
ERROR 1146 (42S02): Table 'test.notexists' doesn't exist

But what happens in MongoDB?

> db.notexists.find()
> db.notexists.find().count()
0

No errors. You get no results. Interestingly, you can even count the number of documents in a cursor that is associated with a non-existent collection.

As I said, I hit this while working on a benchmark. How? I was comparing the throughput for different engines and various configurations, and after preparing the graphs they all showed the same behavior. While it’s not impossible for this to be accurate, it was very unlikely given what I was trying to measure. Some investigation led me to discover a mistake in the preparation phase of my benchmarks. To save time and to use the same data baseline for all tests, I was only running sysbench prepare once, backing up the data directory for each engine, and then restoring this backup before each experiment. The error was that I was restoring one subdirectory below MongoDB’s expected dbpath (i.e., to /data/db/db instead of /data/db), and so my scripts were reading from non-existent collections.

On a MySQL experiment, this would have immediately blown up in my face; with MongoDB, that is not the case.

On reflection, this behavior makes sense for MongoDB in that it is consistent with the write behavior. You don’t need to create a new collection, or even a new database. It’s enough that you write a document to it, and it gets created for you. If writing to a non-existent collection produces no errors, reading from one shouldn’t either.
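
For example, a quick sketch from the shell (the collection name newcoll is hypothetical): writing a single document is enough to make the collection appear.

mongo --eval 'db.newcoll.insert({x: 1}); printjson(db.getCollectionNames())'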

Still, sometimes an application needs to know if a collection exists. How can you do this? There are multiple ways, and I think the best approach is to verify collection existence during the application initialization stage. Here are a couple of examples:

> db.notexists.stats()
{ "ok" : 0, "errmsg" : "Collection [test.notexists] not found." }
> db.getCollectionNames().indexOf("notexists") >= 0
false

I hope you find this useful, and if you’ve been hit by similar problems (such as MongoDB and non-existent collections) due to broken assumptions when moving to MongoDB with an SQL background, I’d love to read about them in the comments!

by Fernando Ipar at June 14, 2016 10:29 PM

Scaling Percona XtraDB Cluster with ProxySQL in Docker Swarm


In this post, we’ll look at scaling Percona XtraDB Cluster with ProxySQL in Docker Swarm.

In my previous post, I showed how to employ Percona XtraDB Cluster on multiple nodes in a Docker network.

The intention is to be able to start/stop nodes and increase/decrease the cluster size dynamically. This means that we need to track running nodes, and also have an easy way to connect to the cluster.

So there are two components we need: service discovery to register nodes and ProxySQL to handle incoming traffic.

The work with service discovery is already bundled with Percona XtraDB Cluster Docker images, and I have experimental images for ProxySQL https://hub.docker.com/r/perconalab/proxysql/.

For multi-node management, we also need an orchestration tool; a good starting point is Docker Swarm. Docker Swarm is simple and only provides basic functionality, but that is enough to get going. (For more complicated setups, consider Kubernetes.)

I assume you have Docker Swarm running, but if not, here is some good material on how to get it rolling. You also need to have service discovery running (see http://chunqi.li/2015/11/09/docker-multi-host-networking/ and my previous post).

To start a cluster with ProxySQL, we need a docker-compose definition file, docker-compose.yml:

version: '2'
services:
 proxy:
   image: perconalab/proxysql
   networks:
    - front
    - Theistareykjarbunga
   ports:
    - "3306:3306"
    - "6032:6032"
   env_file: .env
 percona-xtradb-cluster:
   image: percona/percona-xtradb-cluster:5.6
   networks:
    - Theistareykjarbunga
   ports:
    - "3306"
   env_file: .env
networks:
 Theistareykjarbunga:
  driver: overlay
 front:
  driver: overlay

For convenience, both proxy and percona-xtradb-cluster share the same environment files (.env):

MYSQL_ROOT_PASSWORD=secret
DISCOVERY_SERVICE=10.20.2.4:2379
CLUSTER_NAME=cluster15
MYSQL_PROXY_USER=proxyuser
MYSQL_PROXY_PASSWORD=s3cret

You can also get both files from https://github.com/percona/percona-docker/tree/master/pxc-56/swarm.

To start both the cluster node and proxy:

docker-compose up -d

We can start as many Percona XtraDB Cluster nodes as we want:

docker-compose scale percona-xtradb-cluster=5

The command above will make sure that five nodes are running.

We can check it with docker ps:

docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
725f5f2699cc percona/percona-xtradb-cluster:5.6 "/entrypoint.sh " 34 minutes ago Up 38 minutes 4567-4568/tcp, 10.20.2.66:10284->3306/tcp smblade04/swarm_percona-xtradb-cluster_5
1c85ea1367e8 percona/percona-xtradb-cluster:5.6 "/entrypoint.sh " 34 minutes ago Up 38 minutes 4567-4568/tcp, 10.20.2.66:10285->3306/tcp smblade04/swarm_percona-xtradb-cluster_2
df87e9c1342e percona/percona-xtradb-cluster:5.6 "/entrypoint.sh " 34 minutes ago Up 38 minutes 4567-4568/tcp, 10.20.2.66:10283->3306/tcp smblade04/swarm_percona-xtradb-cluster_4
cbb82f7a9789 perconalab/proxysql "/entrypoint.sh " 36 minutes ago Up 40 minutes 10.20.2.66:3306->3306/tcp, 10.20.2.66:6032->6032/tcp smblade04/swarm_proxy_1
59e049fe22a9 percona/percona-xtradb-cluster:5.6 "/entrypoint.sh " 36 minutes ago Up 40 minutes 4567-4568/tcp, 10.20.2.66:10282->3306/tcp smblade04/swarm_percona-xtradb-cluster_1
0921a2611c3c percona/percona-xtradb-cluster:5.6 "/entrypoint.sh " 37 minutes ago Up 42 minutes 4567-4568/tcp, 10.20.2.5:32774->3306/tcp centos/swarm_percona-xtradb-cluster_3

We can see that Docker schedules containers on two different nodes. The ProxySQL container is smblade04/swarm_proxy_1, and the connection point is 10.20.2.66:6032.

To register Percona XtraDB Cluster in ProxySQL we can just execute the following:

docker exec -it smblade04/swarm_proxy_1 add_cluster_nodes.sh

The script will connect to the service discovery DISCOVERY_SERVICE (defined in .env file) and register nodes in ProxySQL.
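
Under the hood, registration boils down to inserts into ProxySQL’s admin interface. As a rough sketch of what happens for each discovered node (the IP address below is one of the examples from this setup, and the exact statements used by the script may differ):

mysql -h10.20.2.66 -P6032 -uadmin -padmin -e "
INSERT INTO mysql_servers (hostgroup_id, hostname, port) VALUES (0, '10.0.14.2', 3306);
LOAD MYSQL SERVERS TO RUNTIME;
SAVE MYSQL SERVERS TO DISK;"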

To check that they are all running:

mysql -h10.20.2.66 -P6032 -uadmin -padmin
MySQL [(none)]> select * from stats.stats_mysql_connection_pool;
+-----------+-----------+----------+--------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+
| hostgroup | srv_host  | srv_port | status | ConnUsed | ConnFree | ConnOK | ConnERR | Queries | Bytes_data_sent | Bytes_data_recv | Latency_ms |
+-----------+-----------+----------+--------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+
| 0         | 10.0.14.2 | 3306     | ONLINE | 0        | 0        | 0      | 0       | 0       | 0               | 0               | 212        |
| 0         | 10.0.14.4 | 3306     | ONLINE | 0        | 0        | 0      | 0       | 0       | 0               | 0               | 155        |
| 0         | 10.0.14.5 | 3306     | ONLINE | 0        | 0        | 0      | 0       | 0       | 0               | 0               | 136        |
| 0         | 10.0.14.6 | 3306     | ONLINE | 0        | 0        | 0      | 0       | 0       | 0               | 0               | 123        |
| 0         | 10.0.14.7 | 3306     | ONLINE | 0        | 0        | 0      | 0       | 0       | 0               | 0               | 287        |
+-----------+-----------+----------+--------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+

We can connect to a cluster using a ProxySQL endpoint:

mysql -h10.20.2.66 -uproxyuser -ps3cret
 mysql -h10.20.2.66 -P3306 -uproxyuser -ps3cret -e "SELECT @@hostname"
 +--------------+
 | @@hostname |
 +--------------+
 | 59e049fe22a9 |
 +--------------+
 mysql -h10.20.2.66 -P3306 -uproxyuser -ps3cret -e "SELECT @@hostname"
 +--------------+
 | @@hostname |
 +--------------+
 | 725f5f2699cc |
 +--------------+

We can see that we connect to a different node every time.

Now if we want to get crazy and make sure we have ten Percona XtraDB Cluster nodes running, we can execute the following:

docker-compose scale percona-xtradb-cluster=10
Creating and starting swarm_percona-xtradb-cluster_6 ...
Creating and starting swarm_percona-xtradb-cluster_7 ...
Creating and starting swarm_percona-xtradb-cluster_8 ...
Creating and starting swarm_percona-xtradb-cluster_9 ...
Creating and starting swarm_percona-xtradb-cluster_10 ...

And Docker Swarm will make sure ten nodes are running.

I hope this demonstrates that you can easily start playing with multi-nodes using Percona XtraDB Cluster. In the next post, I will show how to use Percona XtraDB Cluster with Kubernetes.

by Vadim Tkachenko at June 14, 2016 05:07 PM

Jean-Jerome Schmidt

Understand what has changed between MySQL 5.6 and 5.7 before upgrading

MySQL 5.7 has been GA since October 2015. At the time of writing, it is still a very new release. But more and more companies are looking into upgrading, as it has a list of great new features. Schema changes can be performed with less downtime, with more online configuration options. Multi-source and parallel replication improvements make replication more flexible and scalable. Native support for JSON data type allows for storage, search and manipulation of schema-less data.

An upgrade, however, is a complex process - no matter which major MySQL version you are upgrading to. There are a few things you need to keep in mind when planning this, such as important changes between versions 5.6 and 5.7 as well as detailed testing that needs to precede any upgrade process. This is especially important if you would like to maintain availability for the duration of the upgrade.

Which version are you using?

With MySQL 5.7, and actually with any MySQL version, you should only upgrade from the version directly prior to it. It means that while you can easily upgrade from MySQL 5.6 to 5.7, if you are on MySQL 5.5 the upgrade has to be executed in two steps - first 5.5 -> 5.6 and then 5.6 -> 5.7. The upgrade process may also differ in such a case. While a straight upgrade from MySQL 5.6 to 5.7 can be executed as a binary upgrade, the first step (5.5 -> 5.6) should be performed by dumping and then reloading the data.

Changes between MySQL 5.6 and 5.7

As usual with new MySQL versions, many new features have been added and some existing behaviors have been altered. This may cause potential issues with your setup if you neglect the importance of research and testing - new behavior may not be compatible with your application.

One of the main changes is the way that internal metrics are made available. In MySQL 5.6, you could query the information_schema.global_status table and get some data about MySQL internals. In MySQL 5.7, this data is available through the performance_schema - a change that may render your monitoring and trending software useless unless you enable ‘compatibility’ mode.
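
As a rough illustration of how this affects monitoring queries (the status variable below is just an example; show_compatibility_56 is the deprecated ‘compatibility’ switch mentioned above):

# MySQL 5.6: status counters come from information_schema
mysql -e "SELECT * FROM information_schema.GLOBAL_STATUS WHERE VARIABLE_NAME = 'Threads_connected'"
# MySQL 5.7: the same counters live in performance_schema
mysql -e "SELECT * FROM performance_schema.global_status WHERE VARIABLE_NAME = 'Threads_connected'"
# Temporary escape hatch for legacy monitoring tools (deprecated)
mysql -e "SET GLOBAL show_compatibility_56 = ON"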

Some changes have been introduced in the default SQL mode - right now MySQL uses STRICT_TRANS_TABLES by default. This is a great change, but it can break compatibility with older applications.
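
A quick way to check the effective mode and, if necessary, relax it as a stop-gap while the application is being fixed (the relaxed value below is only an example):

mysql -e "SELECT @@GLOBAL.sql_mode"
mysql -e "SET GLOBAL sql_mode = 'NO_ENGINE_SUBSTITUTION'"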

Another important change is related to the authentication mechanism. Pre-4.1 passwords (‘old passwords’) have been removed and a new authentication system has been added. Among other things, a password expiration policy has been introduced - this can become a serious issue, as the default settings may not be suitable for systems upgraded from older MySQL versions. It’s better to set policies manually instead of relying on defaults, which may change in the future.
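
For example, the expiration policy can be made explicit instead of relying on the defaults (the account name below is hypothetical):

# disable automatic expiration globally
mysql -e "SET GLOBAL default_password_lifetime = 0"
# or set an explicit policy per account
mysql -e "ALTER USER 'app'@'%' PASSWORD EXPIRE INTERVAL 90 DAY"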

Those are definitely not the only changes which may pose a problem in an upgrade process. We’ve covered these in more detail in our database upgrade guide, “Migrating to MySQL 5.7”.

by Severalnines at June 14, 2016 01:18 PM

June 13, 2016

Peter Zaitsev

RocksDB from Docker containers


This post will discuss how to get RocksDB from Docker containers to use with Percona Server for MongoDB.

With our Percona Server for MongoDB 3.2 release, we made RocksDB a first class citizen. With this newly-available engine, we want to make it easy for everybody interested to try it. So it is now available in docker images from https://hub.docker.com/r/percona/percona-server-mongodb/.

If you have docker running, starting RocksDB is very easy:

docker run -d -p 27017:27017 percona/percona-server-mongodb --storageEngine=rocksdb

Then run:

mongo --eval "printjson(db.serverStatus())"

You should see this as part of the output:

"storageEngine" : {
 "name" : "rocksdb",
 "supportsCommittedReads" : true,
 "persistent" : true
 },
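
If you want the data to survive container removal, you can mount a host directory as the data directory. A minimal sketch, assuming the image uses MongoDB’s default dbpath of /data/db (the host path is arbitrary):

mkdir -p /storage/docker/psmdb-rocksdb
docker run -d -p 27017:27017 -v /storage/docker/psmdb-rocksdb:/data/db percona/percona-server-mongodb --storageEngine=rocksdb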

Give it a try, and let us know how RocksDB works for you!

by Vadim Tkachenko at June 13, 2016 06:14 PM

Webinar Thursday, June 16: MongoDB Schema Design


Please join Jon Tobin, Director of Solutions Engineering at Percona, on Thursday, June 16, 2016 at 10:00 AM PDT (UTC-7) for a webinar on “MongoDB® Schema Design.”

Jon will discuss the most common misconception when evaluating the use of MongoDB: that it is “schemaless.” THIS IS NOT TRUE. MongoDB has a document structure, and thus, a schema. While the structure is much more dynamic than that of most relational database models, choices that you make can and will pay themselves forward (or haunt you forever).

In this webinar, we’ll cover what a document is, how they can be structured, and what structures work (and don’t work) for a particular use case. We will also touch on design decisions and how they affect the ability of the cluster to scale in the future. Some of the topics that will be covered are:

  • Document Structure
  • Embedding vs Referencing
  • Normalization vs De-Normalization
  • Atomicity
  • MongoDB Sharding

Register here.

Jon Tobin, Director of Solutions Engineering

When not saving kittens from sequoias or helping the elderly across busy intersections, Jon Tobin is Percona’s Director of Solutions Engineering. He has spent over 15 years in the IT industry. For the last 6 years, Jon has been helping innovative IT companies assess and address customer’s business needs through well-designed solutions.

 

by Dave Avery at June 13, 2016 03:49 PM

Jean-Jerome Schmidt

MySQL Docker Containers: Understanding the basics

Docker is quickly becoming mainstream, as a method to package and deploy self-sufficient applications in primarily stateless Linux containers. But for a stateful service like a database, this might be a bit of a headache. How do we best configure MySQL in a container environment? What can go wrong? Should we even run our databases in a container environment? How does performance compare with e.g. running on virtual machines or bare-metal servers? How do we manage replicated or clustered setups, where multiple containers need to be created, upgraded and made highly available?

So, welcome to our new blog series - “MySQL on Docker”. We will touch upon swarms, shared volumes, data-only-containers, security and configuration management, multi-host networking, service discovery and implications on monitoring when we move from host-centric to role-centric services with shorter life cycles.

In our first blog post, we are going to cover some basics around running MySQL in a container. We are going to use the term ‘Docker’ as the container platform throughout the blog series.

MySQL Docker Containers

Think of a container as a “lightweight virtual machine”. Unlike virtual machines, though, containers do not require an entire operating system - just the required libraries and the actual application binaries. The same Linux kernel and libraries can be shared between multiple containers running on the host. Docker makes it easy to package Linux software in self-contained images, where all software dependencies are bundled and deployed in a repeatable manner. An image will have exactly the same software installed, whether we run it on a laptop or on a server. The key benefit of Docker is that it allows users to package an application with all of its dependencies into a standardized unit (container). Running many containers allows each one to focus on a specific task; multiple containers then work in concert to implement a distributed system.

The traditional way to run a MySQL database is to install the MySQL packages on a host (bare-metal, virtual machine, cloud instance), and applications would just have to connect to the listening port. Most of the management tasks, for example, configuration tuning, backup, restore, database upgrade, performance tweaking, troubleshooting and so on have to be executed on the database host itself. You would expect to have several ports accessible for connection, for example port TCP 22 for SSH, TCP 3306 for MySQL or UDP 514 for syslog.

In a container, think of MySQL as a single unit that only serves MySQL-related traffic on port 3306. Most operations should be performed over this single channel. Docker works great for packaging your application/software into one single unit, which you can then deploy anywhere as long as the Docker engine is installed. It expects the package, or image, to be run as a single process per container. With Docker, the flow is that you (or someone) build a MySQL image for a specific version and vendor, package it, and distribute it to anybody who wants to quickly fire up a MySQL instance.

Let’s get it running

Let’s familiarize ourselves with a MySQL container running on Docker. We’ll take a ‘break/fix’ approach, so expect to see some errors pop up here and there. We’ll look at the errors and see why they happen. We are going to use the official MySQL image created and maintained by Docker.

To begin with, we must have a host. It can be any type of host (physical or virtual) running Linux, Mac OS X or Windows. Please refer to Docker’s installation guide for details. You can also use docker-machine to provision hosts on a supported cloud provider like DigitalOcean and AWS EC2 Container Service, but we will cover that in another blog post. Here, we are going to use Ubuntu 14.04 as our machine host and use the standard command line for deployment and management.

Next, find a container image that you want from the Docker registry. It can be a public registry like Docker Hub or a private registry, where you host the containers’ image on-premises, within your own network. If you can’t find the image that fits you, you can build your own.

There are many MySQL container images available in the Docker Hub registry. The following screenshot shows some examples:

Firing up a MySQL container

First, you have to install Docker. In the Linux box:

$ sudo apt-get install docker.io   # Ubuntu/Debian
$ sudo yum install docker          # RedHat/CentOS

Then, use the following basic command to run a MySQL container:

$ docker run --name=test-mysql mysql

Yep, that’s it - just two steps. Here is what the second command line does:

  • run - Run a command in a new container.
  • --name - Give a name to the container. If you don’t specify this, Docker will generate a random name.
  • mysql - The image name as stated on the Docker Hub page. This is the simplest image name. The standard is “username/image_name:tag”, for example “severalnines/mysql:5.6”. In this case, we specified “mysql”, which means it has no username (the image is built and maintained by Docker, therefore no username), the image name is “mysql” and the tag is latest (default). If the image does not exist, it will pull it first from Docker Hub into the host, and then run the container.

You should then see the following lines:

Status: Downloaded newer image for mysql:latest
error: database is uninitialized and password option is not specified
  You need to specify one of MYSQL_ROOT_PASSWORD, MYSQL_ALLOW_EMPTY_PASSWORD and MYSQL_RANDOM_ROOT_PASSWORD

It looks like the container deployment failed. Let’s verify with the following command if there is any running container:

$ docker ps

There is no running container. Let’s show all containers (including the non-running ones):

$ docker ps -a
CONTAINER ID        IMAGE                                     COMMAND                CREATED             STATUS                     PORTS               NAMES
80b4914976a2        mysql:latest                              "docker-entrypoint.s   6 minutes ago       Exited (1) 6 minutes ago                       test-mysql

Under the ‘STATUS’ column, you can see the status was “Exited (1) 6 minutes ago”. If a program ends by returning a non-zero value, it means that the program terminated with some kind of error. So, what happened? The MySQL image was successfully downloaded, but Docker failed to run it as a container because the environment was not properly set up, as stated in the error lines.

Let’s try to fix this by specifying one of the environment variables:

$ docker run --name=test-mysql --env="MYSQL_ROOT_PASSWORD=mypassword" mysql
FATA[0000] Error response from daemon: Conflict. The name "test-mysql" is already in use by container 80b4914976a2. You have to delete (or rename) that container to be able to reuse that name.

Oops, another error occurred. We were trying to run a new container with the same name as an existing container. Let’s remove the created container and run the command again:

$ docker rm test-mysql
$ docker run --name=test-mysql --env="MYSQL_ROOT_PASSWORD=mypassword" mysql

You should see lots of lines appear. That’s MySQL initialization when starting up as newly installed software. You should see MySQL is ready to accept connections:

2016-06-01T12:06:59.543352Z 0 [Note] mysqld: ready for connections.
Version: '5.7.12'  socket: '/var/run/mysqld/mysqld.sock'  port: 3306  MySQL Community Server (GPL)

Looks good. Our MySQL container is now running. However, you are now stuck in the terminal and can’t do anything because the container is running in attached mode (in the foreground). This is inconvenient - we would expect MySQL to run as a service instead. Let’s consider this a failed deployment and stop the current container. In another terminal, stop the running container and run it again in detached mode (in the background):

$ docker stop test-mysql
$ docker rm test-mysql
$ docker run --detach --name=test-mysql --env="MYSQL_ROOT_PASSWORD=mypassword" mysql
a6b09a8d332a16e008cb3035ffd36bcd664886b79c9d2533c3dc1d47026a33a0

You will get an output of the container ID, indicating the container is successfully running in the background. Let’s verify the status of the container:

$ docker ps
CONTAINER ID        IMAGE               COMMAND                CREATED             STATUS              PORTS               NAMES
83285aa548ba        mysql:latest        "docker-entrypoint.s   4 minutes ago       Up 4 minutes        3306/tcp            test-mysql

The MySQL container is now running and accessible on port 3306 of that container. Since it is running in the background, we could not see what happened during the MySQL startup. Use the following command to see what happened during the container startup:

$ docker logs test-mysql

Connecting to the Container

Next, we retrieve the IP address of that container in order to access it. Run the inspect command:

$ docker inspect test-mysql

We can see lots of low-level information about that container. Look up the “IPAddress” line:

        "IPAddress": "172.17.0.20",

From the physical host, we can now access the MySQL server. Ensure the MySQL client package is installed beforehand:

$ apt-get install mysql-client
$ mysql -uroot -pmypassword -h 172.17.0.20 -P 3306
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.7.12 MySQL Community Server (GPL)

Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql>

Voila! We now have a MySQL instance running in a container. However, this port is only accessible within the Docker network. If you have another Docker container for your application, you can connect with them directly via IP address 172.17.0.20 on port 3306, as illustrated in the following diagram:

Docker allocates a dynamic IP address to every running container. Whenever a container is restarted, it will get a new IP address. You can get the IP address range from the Docker network interface in the Linux box. Run the following command to see Docker’s network range:

$ ip a | grep docker | grep inet
    inet 172.17.42.1/16 scope global docker0

Our container’s IP address is 172.17.0.20, which is within the 172.17.42.1/16 range. Let’s restart the container, and Docker should assign a new IP address:

$ docker stop test-mysql
$ docker start test-mysql
$ docker inspect test-mysql | grep IPAddress
        "IPAddress": "172.17.0.21",

Our IP address just changed to 172.17.0.21. If an application connected to this container via the old IP address, it would no longer be able to connect. Docker provides a way to link one container with another, ensuring that whatever IP address is assigned to it gets updated in the linked container. Let’s say we deploy a WordPress application (whose image has no MySQL installed), and want to link it with our existing MySQL container, test-mysql. Here is what you should do:

$ docker run --detach --name test-wordpress --link test-mysql:mysql wordpress

In a couple of minutes, the container “test-wordpress” will be up and running and linked to our test-mysql container:

$ docker ps
CONTAINER ID        IMAGE               COMMAND                CREATED             STATUS              PORTS               NAMES
0cb9f4152022        wordpress:latest    "/entrypoint.sh apac   15 seconds ago      Up 15 seconds       80/tcp              test-wordpress
0a7aa1cf196e        mysql:latest        "docker-entrypoint.s   16 minutes ago      Up 16 minutes       3306/tcp            test-mysql

To verify if it’s linked correctly, enter the test-wordpress container and look at the content of /etc/hosts:

$ docker exec -it test-wordpress bash
root@0cb9f4152022:/var/www/html# cat /etc/hosts
172.17.0.22    0cb9f4152022
127.0.0.1    localhost
::1    localhost ip6-localhost ip6-loopback
fe00::0    ip6-localnet
ff00::0    ip6-mcastprefix
ff02::1    ip6-allnodes
ff02::2    ip6-allrouters
172.17.0.21    mysql 0a7aa1cf196e test-mysql

The application can now see an entry with the IP address and hostname of the linked MySQL container. If you restart the MySQL container and it gets another IP address, Docker will update the entry accordingly.
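
If the application has to be told explicitly where the database is, you can pass the link alias instead of a hard-coded IP address. A sketch using the official wordpress image (the WORDPRESS_DB_* environment variables belong to that image and are not taken from this article):

$ docker rm -f test-wordpress
$ docker run --detach --name test-wordpress \
--link test-mysql:mysql \
--env WORDPRESS_DB_HOST=mysql:3306 \
--env WORDPRESS_DB_PASSWORD=mypassword \
wordpress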

You can also expose the MySQL container to the outside world by mapping the container’s MySQL port to the host machine port using the publish flag (as illustrated in the above diagram). Let’s re-initiate our container and run it again with an exposed port:

$ docker rm -f test-mysql
$ docker run --detach --name=test-mysql --env="MYSQL_ROOT_PASSWORD=mypassword" --publish 6603:3306 mysql

Verify if the container is correctly mapped:

CONTAINER ID        IMAGE               COMMAND                CREATED             STATUS              PORTS                    NAMES
8d97b70658a9        mysql:latest        "docker-entrypoint.s   3 seconds ago       Up 3 seconds        0.0.0.0:6603->3306/tcp   test-mysql
0cb9f4152022        wordpress:latest    "/entrypoint.sh apac   15 minutes ago      Up 15 minutes       80/tcp                   test-wordpress

At this point, we can now access the MySQL container directly from the machine’s port 6603.
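
As a quick check, you can connect from the host through the published port instead of the container IP:

$ mysql -uroot -pmypassword -h127.0.0.1 -P6603 -e 'SELECT @@hostname, @@version'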

Configuration management

The container comes with standard MySQL 5.7 configuration options inside /etc/mysql/my.cnf. Let’s say the application that connects to this MySQL server requires more max_connections (the default is 151), so we need to update the MySQL configuration. The best way to do this in a container is to create alternative configuration files in a directory on the host machine and then mount that directory as /etc/mysql/conf.d inside the mysql container.

On the host machine, create a directory and a custom MySQL configuration file:

$ mkdir -p /root/docker/test-mysql/conf.d
$ vim /root/docker/test-mysql/conf.d/my-custom.cnf

And add the following lines:

[mysqld]
max_connections=200

Then, we have to re-initiate the MySQL container (remove and run) by mapping the volume path as shown in the following command (the long command is split across lines to make it more readable):

$ docker run \
--detach \
--name=test-mysql \
--env="MYSQL_ROOT_PASSWORD=mypassword" \
--publish 6603:3306 \
--volume=/root/docker/test-mysql/conf.d:/etc/mysql/conf.d \
mysql

This will start a new container test-mysql where the MySQL instance uses the combined startup settings from the default /etc/mysql/my.cnf and /etc/mysql/conf.d/my-custom.cnf, with settings from the latter taking precedence.

Verify the new setting is loaded from the machine host:

$ mysql -uroot -pmypassword -h127.0.0.1 -P6603 -e 'show global variables like "max_connections"';
+-----------------+-------+
| Variable_name   | Value |
+-----------------+-------+
| max_connections | 200   |
+-----------------+-------+

Many configuration options can also be passed as flags to mysqld. This gives you the flexibility to customize the container without needing a custom .cnf file. For example, if you want to change the maximum number of connections as above and set the collation for all tables to UTF-8 (utf8mb4), just run the following:

$ docker run \
--detach \
--name=test-mysql \
--env="MYSQL_ROOT_PASSWORD=mypassword" \
--publish 6603:3306 \
mysql \
--max-connections=200 \
--character-set-server=utf8mb4 \
--collation-server=utf8mb4_unicode_ci
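
A quick sketch to verify that the flags took effect:

$ mysql -uroot -pmypassword -h127.0.0.1 -P6603 -e 'SHOW GLOBAL VARIABLES WHERE Variable_name IN ("max_connections", "collation_server")'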

Data Storage

There are several ways to store the data used by MySQL when it runs in Docker containers. Docker can manage the storage of your database’s data by writing the database files to disk on the host system, using its own internal volume management. If you run the inspect command and look at the “Volumes” directive, you should notice that by default the MySQL data directory (/var/lib/mysql) is mounted into Docker’s internal volume:

$ docker inspect test-mysql
    ...
    "Volumes": {
        "/etc/mysql/conf.d": "/root/docker/test-mysql/conf.d",
        "/var/lib/mysql": "/var/lib/docker/vfs/dir/4d437e2fe5ab2f71ffeea2590d72a417a9ca88607c130b46f5ad819d0a5b68cd"
    }

This is the easiest way and fairly transparent to the user. The downside is that the files may be hard to locate for tools and applications that run directly on the host system, i.e. outside containers.

The other way is to create a data directory on the host system (outside the container) and mount this to a directory visible from inside the container. This places the database files in a known location on the host system, and makes it easy for tools and applications on the host system to access the files. The downside is that the user needs to make sure that the directory exists, and that e.g. directory permissions and other security mechanisms on the host system are correctly set up.

Create a data directory on a suitable volume on your host system, e.g. /storage/docker/mysql-datadir:

$ mkdir -p /storage/docker/mysql-datadir

Start your mysql container like this:

$ docker run \
--detach \
--name=test-mysql \
--env="MYSQL_ROOT_PASSWORD=mypassword" \
--publish 6603:3306 \
--volume=/root/docker/test-mysql/conf.d:/etc/mysql/conf.d \
--volume=/storage/docker/mysql-datadir:/var/lib/mysql \
mysql

The ”--volume=/storage/docker/mysql-datadir:/var/lib/mysql“ part of the command mounts the /storage/docker/mysql-datadir directory from the underlying host system as /var/lib/mysql inside the container, where MySQL by default will write its data files, as illustrated in the following diagram:

When inspecting the container, you should see the following lines:

$ docker inspect test-mysql
    ...
    "Volumes": {
        "/etc/mysql/conf.d": "/root/docker/test-mysql/conf.d",
        "/var/lib/mysql": "/storage/docker/mysql-datadir"
    }

It is now easier to see the directory and files created by this container on the host machine:

$ ls -al /storage/docker/mysql-datadir/
total 188452
drwxr-xr-x 5  999  999     4096 Jun  3 10:13 .
drwxr-xr-x 3 root root     4096 Jun  3 10:06 ..
-rw-r----- 1  999  999       56 Jun  3 10:13 auto.cnf
-rw-r----- 1  999  999     1329 Jun  3 10:13 ib_buffer_pool
-rw-r----- 1  999  999 79691776 Jun  3 10:13 ibdata1
-rw-r----- 1  999  999 50331648 Jun  3 10:13 ib_logfile0
-rw-r----- 1  999  999 50331648 Jun  3 10:13 ib_logfile1
-rw-r----- 1  999  999 12582912 Jun  3 10:13 ibtmp1
drwxr-x--- 2  999  999     4096 Jun  3 10:13 mysql
drwxr-x--- 2  999  999     4096 Jun  3 10:13 performance_schema
drwxr-x--- 2  999  999    12288 Jun  3 10:13 sys

Note that restarting or removing the container does not remove the MySQL data directory. When you restart a MySQL container using the “stop” and “start” commands, it is similar to restarting the MySQL service in a standard installation:

$ docker stop test-mysql
$ docker start test-mysql

If you remove the MySQL container, the data in the mounted volumes will still be intact and you can run a new instance, mounting the same volume as data directory:

$ docker rm -f test-mysql
$ docker run -d --name=new-mysql -p 6604:3306 -v /storage/docker/mysql-datadir:/var/lib/mysql mysql

*If a MySQL container runs on top of an existing MySQL datadir, the $MYSQL_ROOT_PASSWORD variable should be omitted from the “run” command line; it will in any case be ignored, and the pre-existing database will not be changed in any way.

However, only one running (active) container is allowed to access the MySQL data directory at a time. Running another container that mounts the same datadir volume will cause a MySQL startup error in the latter container:

$ docker run -d --name=another-new-mysql -p 6605:3306 -v /storage/docker/mysql-datadir:/var/lib/mysql mysql
$ docker logs another-new-mysql
2016-06-10T07:16:59.538635Z 0 [Note] InnoDB: Completed initialization of buffer pool
2016-06-10T07:16:59.540373Z 0 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
2016-06-10T07:16:59.551646Z 0 [ERROR] InnoDB: Unable to lock ./ibdata1 error: 11
2016-06-10T07:16:59.551656Z 0 [Note] InnoDB: Check that you do not already have another mysqld process using the same InnoDB data or log files.
2016-06-10T07:16:59.551659Z 0 [Note] InnoDB: Retrying to lock the first data file
2016-06-10T07:17:00.552294Z 0 [ERROR] InnoDB: Unable to lock ./ibdata1 error: 11
2016-06-10T07:17:00.552364Z 0 [Note] InnoDB: Check that you do not already have another mysqld process using the same InnoDB data or log files.

This is expected since a MySQL process must exclusively own the MySQL data directory to avoid any potential conflicts. Having more than one mysqld process connecting to the same data directory is just not feasible. That’s why MySQL horizontal scaling can only be achieved via replication.

That concludes this blog. Do share your experience in managing MySQL containers in the comments section below.

by Severalnines at June 13, 2016 12:01 PM

Henrik Ingo

My son responded to the EU consultation on Freedom of Panorama

Yesterday I helped my 8-year-old son submit a response (PDF) to the EU Commission's consultation on whether it is a good idea to require permission/payment for the right to photograph buildings and statues in public spaces (aka Freedom of Panorama). Our kids had heard about this issue from a discussion at the dinner table, and quickly became interested:

read more

by hingo at June 13, 2016 10:06 AM

MariaDB AB

Table and tablespace encryption on MariaDB 10.1

Jan Lindstrom

MariaDB has a wide set of security features to protect data (see the MariaDB Enterprise Security Webinar). To encrypt the data in a MariaDB 10.1 database, you can enable data-at-rest encryption. MariaDB lets you select the most suitable level of encryption: temporary files, Aria tables, InnoDB tablespaces, InnoDB tables, InnoDB log files and binlogs. In this article I will explain how to turn on encryption for InnoDB and discuss how encryption affects performance.

Encryption Plugins

Encryption plugins in MariaDB are needed to use the data-at-rest encryption feature. They're responsible both for key management and for the actual encryption and decryption of data. MariaDB currently supports two encryption plugins for real usage: file_key_management and aws_key_management.

Creating Encryption Keys

There are different methods for creating encryption keys, depending on the encryption plugin used. Keys for the file_key_management plugin can be generated using OpenSSL with the following command:

openssl enc -aes-256-ctr -k mylong2016secret@keyfor35fun -P -md sha1
salt=9265402E0907A5D4
key=55101B33D507041805AF49B95BBAE995B4218C5054D2DA2852C5F0042B837FC3
iv =C7040FF9DB066043D16ADBEC4F18053F

The key file is a text file containing a key identifier and the hex-encoded key. For example, keys.txt using the generated key above looks like this:

1;55101B33D507041805AF49B95BBAE995B4218C5054D2DA2852C5F0042B837FC3
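
Each line of the key file follows the "<id>;<hex-encoded key>" format, so additional keys can be appended under their own identifiers. A sketch that generates a fresh 256-bit key with identifier 2 (which can later be referenced as encryption_key_id=2; the file path matches the configuration used below):

echo "2;$(openssl rand -hex 32)" >> /mnt/usb/secret.txt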

For information on how to create keys for the AWS Key Management Service, see the MariaDB AWS KMS Encryption Setup Guide. For more advanced information, see our AWS KMS Encryption Plugin Advanced Usage documentation.

InnoDB Specified Table Encryption

Specified table encryption means that the administrator chooses which tables to encrypt, allowing you to balance security with speed. To use table encryption, you have to load the file_key_management plugin, define the location of the key file, and define the AES method used. To do this, add a few lines like these to the MariaDB configuration file:

[mariadb]
plugin-load-add=file_key_management
file_key_management_filename=/mnt/usb/secret.txt
file_key_management_encryption_algorithm=AES_CTR

We recommend that you place the encryption key file on an external storage device (e.g., a USB drive). This external storage can be unmounted after the MariaDB server has started and kept in a secure location. After this, the database developer may select which tables containing sensitive data to encrypt. Encryption can be enabled when a table is created or later using the ALTER TABLE statement, as shown in the examples below:

CREATE TABLE table1
(col1 INT NOT NULL PRIMARY KEY, secret CHAR(200))
ENGINE=InnoDB ENCRYPTED=YES;

CREATE TABLE table2
(col1 INT NOT NULL PRIMARY KEY, secret CHAR(200))
ENGINE=InnoDB;

ALTER TABLE table2
ENCRYPTED=YES encryption_key_id=2;
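
Assuming the INNODB_TABLESPACES_ENCRYPTION information schema plugin is enabled (see the Monitoring section below), a quick sanity check that the tables ended up encrypted could look like this:

mysql -e "SELECT NAME, ENCRYPTION_SCHEME, CURRENT_KEY_ID FROM information_schema.INNODB_TABLESPACES_ENCRYPTION"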

Note that the InnoDB redo log is not encrypted by default, even when the tables are. Therefore, you should consider also encrypting the redo log. InnoDB redo logs can be encrypted by adding this one line to the MariaDB configuration file:

[mariadb]
innodb-encrypt-log

InnoDB Transparent Tablespace Encryption

With tablespace encryption, all InnoDB tables and tablespaces are encrypted, including the system tablespace. When configuring the server for this type of encryption, we recommend that you also enable InnoDB redo-log encryption. Below is an example of the configuration settings required:

[mariadb]
innodb-encrypt-tables
innodb-encrypt-log
innodb-encryption-threads = 4
plugin-load-add=file_key_management
file_key_management_filename=/mnt/usb/secret.txt
file_key_management_encryption_algorithm=AES_CTR
# for monitoring
innodb-tablespaces-encryption

After adding the above settings and restarting the server to apply them, all existing and new tables will be encrypted unless specified otherwise for a particular table. MariaDB does allow encryption to be disabled for tables that don't require it; this can be done when the table is created or by altering it later with the ALTER TABLE statement. Here's an example of both scenarios:

CREATE TABLE table3
(col1 INT NOT NULL PRIMARY KEY, notsecret VARCHAR(150))
ENGINE=InnoDB ENCRYPTED=NO;

CREATE TABLE table4
(col1 INT NOT NULL PRIMARY KEY, notsecret VARCHAR(150))
ENGINE=InnoDB;

ALTER TABLE table4 ENCRYPTED=NO;

If you don't want users to be able to create tables without encryption, you can set the server to force encryption. Just add the following line to the MariaDB configuration file:

innodb-encrypt-tables=FORCE

Key Rotation

Of the encryption plugins intended for production use, currently only the Amazon AWS KMS plugin supports key rotation. Keys provided by the plugin can be rotated using the SET statement, like so:

SET global aws_key_management_rotate_key=key_id;

Setting key_id to -1 rotates all keys.

Tablespace key rotation is based on the age of the key in use. A key's age is its key_version, and the age limit is defined using the innodb_encryption_rotate_key_age parameter, which indicates how old a key may be before it must be rotated. For example, if innodb_encryption_rotate_key_age is set to 10 and the current key_version is 20, all tablespaces using a key_version older than 10 will be rotated to the new key_version.

Currently, this key rotation does not happen immediately when the tablespace key_version becomes obsolete. The need for tablespace rotation is checked only when an encrypted tablespace is changed by an INSERT or an UPDATE statement. At that point, a new key_version is requested from the plugin. Below is an example showing this:

SELECT * FROM information_schema.innodb_tablespaces_encryption \G

*************************** 1. row ***************************
                       SPACE: 0
                        NAME: NULL
           ENCRYPTION_SCHEME: 1
          KEYSERVER_REQUESTS: 1
             MIN_KEY_VERSION: 1
         CURRENT_KEY_VERSION: 1
    KEY_ROTATION_PAGE_NUMBER: NULL
KEY_ROTATION_MAX_PAGE_NUMBER: NULL
              CURRENT_KEY_ID: 1
*************************** 2. row ***************************
                       SPACE: 1
                        NAME: sbtest/sbtest1
           ENCRYPTION_SCHEME: 1
          KEYSERVER_REQUESTS: 1
             MIN_KEY_VERSION: 1
         CURRENT_KEY_VERSION: 1
    KEY_ROTATION_PAGE_NUMBER: NULL
KEY_ROTATION_MAX_PAGE_NUMBER: NULL
              CURRENT_KEY_ID: 1


SET GLOBAL aws_key_management_rotate_key = -1;


SELECT * FROM information_schema.innodb_tablespaces_encryption \G

*************************** 1. row ***************************
                       SPACE: 0
                        NAME: NULL
           ENCRYPTION_SCHEME: 1
          KEYSERVER_REQUESTS: 2
             MIN_KEY_VERSION: 11
         CURRENT_KEY_VERSION: 12
    KEY_ROTATION_PAGE_NUMBER: NULL
KEY_ROTATION_MAX_PAGE_NUMBER: NULL
              CURRENT_KEY_ID: 1
*************************** 2. row ***************************
                       SPACE: 1
                        NAME: sbtest/sbtest1
           ENCRYPTION_SCHEME: 1
          KEYSERVER_REQUESTS: 3
             MIN_KEY_VERSION: 11
         CURRENT_KEY_VERSION: 12
    KEY_ROTATION_PAGE_NUMBER: NULL
KEY_ROTATION_MAX_PAGE_NUMBER: NULL
              CURRENT_KEY_ID: 1

Monitoring

The Information Schema INNODB_TABLESPACES_ENCRYPTION table contains the current encryption status of every tablespace. For example, if key rotation is occurring, the KEY_ROTATION_PAGE_NUMBER and KEY_ROTATION_MAX_PAGE_NUMBER fields indicate that background threads are currently working on encrypting or decrypting pages. An example of the output can be seen above. The tablespace with NULL in the NAME field is the system tablespace (ibdata1).

Performance

We used CentOS Linux release 7.1.1503 (Core) with the 3.10.0-229.el7.x86_64 Linux kernel, an ioMemory SX300-1600 with VSL driver 4.2.1 build 1137, and NVMFS 1.1.1 as the filesystem. The benchmark used was Sysbench 0.5 with the following command:

./sysbench --test=tests/db/oltp.lua --mysql-table-engine=innodb --oltp-test-mode=complex --oltp-read-only=off --oltp-table-size=100000 --max-requests=100000000 --num-threads=128 --max-time=60 --mysql-socket=/mnt/dfs/db/mysql.sock --mysql-user=root run

Figures

Figure 1 shows the Sysbench OLTP benchmark transactions-per-second results. The number of threads used is varied, and the default unencrypted tables are compared to encrypted tables:

Figure 2 shows the Sysbench OLTP 95th percentile response time, again varying the number of threads and comparing default unencrypted tables to encrypted tables:

Limitations

There are some limitations to the implementation of data-at-rest encryption in MariaDB 10.1.14. Below is a summary of these:

  • Only data at rest is encrypted. Metadata (e.g., .frm files) and data sent to a client are not encrypted (see Secure Connections).

  • Only the MariaDB server can decrypt the data.
    • mysqlbinlog cannot read encrypted binary logs (MDEV-8813);
    • Percona XtraBackup cannot backup instances that use encrypted InnoDB log files (ME-478, PR-200); and
    • Percona XtraBackup cannot prepare instances that use encrypted InnoDB log files.
  • The disk-based Galera gcache is not encrypted (MDEV-9639). On Galera installations you could decrease your gcache size—at the expense of more SST in case of restarts—to have less unencrypted data on your disk.
  • The Audit plugin cannot create encrypted output. Instead, send it to syslog and configure the protection there.
  • File-based general query log and slow query log cannot be encrypted (MDEV-9639).
  • The Aria log is not encrypted (MDEV-9639). This affects only non-temporary Aria tables, though.
  • The MariaDB error log is not encrypted. The error log can contain query text and data in some cases, including crashes, assertion failures, and cases in which InnoDB/XtraDB write monitor outputs to the log to aid in debugging. It can also be sent to syslog, if needed.

Conclusions

MariaDB 10.1 provides a mature data-at-rest encryption solution that allows users to select a suitable level of security, using either per-table encryption or fully transparent tablespace encryption.

by Jan Lindstrom at June 13, 2016 08:15 AM

June 10, 2016

Peter Zaitsev

Percona XtraDB Cluster 5.6.30-25.16 is now available


Percona is glad to announce the new release of Percona XtraDB Cluster 5.6 on June 10, 2016. Binaries are available from the downloads area or our software repositories.

Percona XtraDB Cluster 5.6.30-25.16 is now the current release, based on the following:

All of Percona software is open-source and free, and all the details of the release can be found in the 5.6.30-25.16 milestone on Launchpad.

For more information about relevant Codership releases, see this announcement.

New Features:

  • PXC now uses wsrep_desync_count introduced in Galera 3.16 by Codership, instead of the concept that was previously implemented by Percona. The following logic applies:
    • If a node is explicitly desynced, then implicitly desyncing a node using RSU/FTWRL is allowed.
    • If a node is implicitly desynced using RSU/FTWRL, then explicitly desyncing a node is blocked until implicit desync is complete.
    • If a node is explicitly desynced and then implicitly desynced using RSU/FTWRL, then any request for another implicit desync is blocked until the former implicit desync is complete.

Bugs Fixed:

  • Changing wsrep_provider while the node is paused or desynced is not allowed.
  • TOI now checks that a node is ready to process DDL and DML before starting execution, to prevent a node from crashing if it becomes non-primary.
  • The wsrep_row_upd_check_foreign_constraints function now checks that fk-reference-table is open before marking it open.

Help us improve our software quality by reporting any bugs you encounter using our bug tracking system. As always, thanks for your continued support of Percona!

by Alexey Zhebel at June 10, 2016 10:49 PM

Running Percona XtraDB Cluster in a multi-host Docker network


In this post, I’ll discuss how to run Percona XtraDB Cluster in a multi-host Docker network.

With our release of Percona XtraDB Cluster 5.7 beta, we’ve also decided to provide Docker images for both Percona XtraDB Cluster 5.6 and Percona XtraDB Cluster 5.7.

Starting one node is very easy, and not that different from starting a Percona Server image. The only extra requirement is to have the CLUSTER_NAME variable defined. The startup command might look like this:

docker run -d -p 3306:3306
 -e MYSQL_ROOT_PASSWORD=Theistareyk
 -e CLUSTER_NAME=Theistareykjarbunga
 -e XTRABACKUP_PASSWORD=Theistare
 percona/percona-xtradb-cluster

You might also notice that we can optionally define an XTRABACKUP_PASSWORD password, which the xtrabackup@localhost user will employ for the xtrabackup SST method.

Running Percona XtraDB Cluster in single-node mode kind of defeats the purpose of having a cluster. With our Docker images, we tried to address the following tasks:

  1. Run in multiple-host environment (followed by running in Docker Swarm and Kubernetes)
  2. Start as many nodes in the cluster as we want
  3. Register all nodes in the service discovery, so that the client can see how many nodes are running and their status
  4. Integrate with ProxySQL

Let’s review these points one by one.

A multi-host environment is where a Docker network becomes helpful. Recent Docker versions come with a network overlay driver, which we will use to run a virtual network over multiple boxes. Starting a Docker overlay network is out of scope for this post, but check out this great introduction material on how to get it working.

With the network running, we can create an overlay network for our cluster:

docker network create -d overlay cluster1_net

Then we can start containers:

docker run -d -p 3306 --net=cluster1_net
 -e MYSQL_ROOT_PASSWORD=Theistareyk
 -e CLUSTER_NAME=cluster1
 ...
 -e XTRABACKUP_PASSWORD=Theistare
 percona/percona-xtradb-cluster

The cool bit is that we can start Percona XtraDB Cluster nodes on any host in the network, and they will communicate over the virtual network.

If you want to stay within a single Docker host (for example, during testing), you can still create a bridge network and use it in a single-host environment.
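
For example, a single-host test setup could use (a sketch):

docker network create -d bridge cluster1_net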

The script above will run . . . almost. The problem is that every additional node needs to know the address of the running cluster.

To address this (if you prefer a manual process), we introduced the CLUSTER_JOIN variable, which should point to the IP address of one of the running nodes (or be left empty to start a new cluster).

In this case, getting the command above to work might look like this:

docker run -d -p 3306 --net=cluster1_net
 -e MYSQL_ROOT_PASSWORD=Theistareyk
 -e CLUSTER_NAME=cluster1
 -e CLUSTER_JOIN=10.0.5.5
 -e XTRABACKUP_PASSWORD=Theistare
 percona/percona-xtradb-cluster

I think manually tracking IP addresses requires unnecessary extra work, especially if we want to start and stop nodes on the fly. So we also decided to use service discovery - especially since you already need it to run the Docker overlay network. Right now we support the etcd discovery service, but it isn’t a problem to add more (such as Consul).

Starting etcd is also out of the scope of this post, but you can read about the procedure in the manual.

When you run etcd service discovery (on the host 10.20.2.4:2379, for example) you can start the nodes:

docker run -d -p 3306 --net=cluster1_net
 -e MYSQL_ROOT_PASSWORD=Theistareyk
 -e CLUSTER_NAME=cluster1
 -e DISCOVERY_SERVICE=10.20.2.4:2379
 -e XTRABACKUP_PASSWORD=Theistare
 percona/percona-xtradb-cluster

The node will register itself in the service discovery and will join existing $CLUSTER_NAME.

There is a convenient way to check all registered nodes:

curl http://$ETCD_HOST/v2/keys/pxc-cluster/$CLUSTER_NAME/?recursive=true | jq
{
  "action": "get",
  "node": {
    "key": "/pxc-cluster/cluster4",
    "dir": true,
    "nodes": [
      {
        "key": "/pxc-cluster/cluster4/10.0.5.2",
        "dir": true,
        "nodes": [
          {
            "key": "/pxc-cluster/cluster4/10.0.5.2/ipaddr",
            "value": "10.0.5.2",
            "modifiedIndex": 19600,
            "createdIndex": 19600
          },
          {
            "key": "/pxc-cluster/cluster4/10.0.5.2/hostname",
            "value": "2af0a75ce0cb",
            "modifiedIndex": 19601,
            "createdIndex": 19601
          }
        ],
        "modifiedIndex": 19600,
        "createdIndex": 19600
      },
      {
        "key": "/pxc-cluster/cluster4/10.0.5.3",
        "dir": true,
        "nodes": [
          {
            "key": "/pxc-cluster/cluster4/10.0.5.3/ipaddr",
            "value": "10.0.5.3",
            "modifiedIndex": 26420,
            "createdIndex": 26420
          },
          {
            "key": "/pxc-cluster/cluster4/10.0.5.3/hostname",
            "value": "cfb29833f1d6",
            "modifiedIndex": 26421,
            "createdIndex": 26421
          }
        ],
        "modifiedIndex": 26420,
        "createdIndex": 26420
      }
    ],
    "modifiedIndex": 19600,
    "createdIndex": 19600
  }
}

With this, you can start as many cluster nodes as you want, on any host in the Docker network. Now it is convenient to use an SQL proxy in front of the cluster; in this case, we will use ProxySQL (I will show that in a follow-up post).

In later posts, we will also review how to run Percona XtraDB Cluster nodes in an orchestration environment (like Docker Swarm and Kubernetes).

by Vadim Tkachenko at June 10, 2016 08:32 PM

Percona Monitoring and Management 1.0.1 Beta


Percona is glad to announce the release of Percona Monitoring and Management 1.0.1 Beta on June 10, 2016.

Like prior versions, PMM is distributed through Docker Hub and is free to download. Full instructions for download and installation of the server and client are available in the documentation.

Notable changes to the tool include:

  • Grafana 3.0
  • Replaced custom web server with NGINX
  • Eliminated most of the ports for PMM server container (now only two – 9001 and configurable 80)
  • Updated to the latest versions of Prometheus, exporters, QAN agent
  • Added mongodb_exporter
  • Added MongoDB dashboards
  • Replaced prom-config-api with Consul
  • Improvements to pmm-admin and ability to set server address with the port
  • Added “Server Summary” with aggregated query metrics to QAN app
  • MySQL dashboard updates, added “MySQL InnoDB Metrics Advanced” dashboard
The new server summary in PMM Beta 1.0.1

Metric rates in Query Analytics

Available Dashboards in Metrics

Full documentation is available and includes details on installation and architecture. A demonstration of the tool has been set up at pmmdemo.percona.com.

We have also implemented forums for the discussion of PMM.

Help us improve our software quality by reporting any bugs you encounter using our bug tracking system. As always, thanks for your continued support of Percona!

by Bob Davis at June 10, 2016 08:05 PM

June 09, 2016

Peter Zaitsev

Percona XtraDB Cluster 5.7 beta is now available


Percona is glad to announce the release of Percona XtraDB Cluster 5.7.11-4beta-25.14.2 on June 9, 2016. Binaries are available from the downloads area or our software repositories.

NOTE: This beta release is only available from the testing repository. It is not meant for upgrade from Percona XtraDB Cluster 5.6 and earlier versions. Only a fresh installation is supported.

Percona XtraDB Cluster 5.7.11-4beta-25.14.2 is based on the following:

This is the first beta release in the Percona XtraDB Cluster 5.7 series. It includes all changes from upstream releases and the following changes:

  • Percona XtraDB Cluster 5.7 does not include wsrep_sst_xtrabackup. It has been replaced by wsrep_sst_xtrabackup_v2.
  • The wsrep_mysql_replication_bundle variable has been removed.

Help us improve our software quality by reporting any bugs you encounter using our bug tracking system. As always, thanks for your continued support of Percona!

by Alexey Zhebel at June 09, 2016 04:04 PM

Jean-Jerome Schmidt

Planets9s - Become a PostgreSQL and MongoDB DBA, New NinesControl & more

Welcome to this week’s Planets9s, covering all the latest resources and technologies we create around automation and management of open source database infrastructures.

New blog series: Become a PostgreSQL DBA (if you’re really a MySQL DBA)

MySQL and PostgreSQL - two open source relational database management systems, each with their own set of use cases and users. You may be asking - why use both of these databases in one environment? Is there any value in having a replicated PostgreSQL setup running alongside a MySQL / MariaDB Galera cluster? In this blog post, which is the first in this new series, we try to answer these questions from the MySQL DBA standpoint and discuss different methods of deploying PostgreSQL.

Read the blog

Try the new NinesControl: deploy and monitor MySQL Galera clusters on Digital Ocean

This week we’re happy to announce the public availability of NinesControl (beta), our new database infrastructure management solution for the cloud. Designed with the needs of developers in mind, NinesControl enables users to easily deploy and monitor (MySQL) Galera clusters on DigitalOcean. Droplets are launched and managed using your own DigitalOcean account. We’d love to get your feedback on this new solution, so please do check it out and let us know what you think.

Try the new NinesControl

Become a MongoDB DBA: the basics of configuration

After covering the deployment of MongoDB in our previous blog post in this series, we now move on to configuration basics. MongoDB is configured both through its config file (/etc/mongod.conf) and at runtime. We discussed some of the configurables last month, and now we go more in depth on each of them.

Read the blog

Sign up for our MySQL Database Performance Tuning webinar next week

There are still a few “seats” left for this popular new webinar on MySQL Database Performance Tuning, which takes place next Tuesday, June 14th. We’ll discuss some of the settings that can bring you significant improvement in the performance of your MySQL database as well as some of the variables which are frequently modified even though they should not. Performance tuning is not easy, but you can go a surprisingly long way with a few basic guidelines.

Sign up for the webinar

That’s it for this week! Feel free to share these resources with your colleagues and follow us in our social media channels.

Have a good end of the week,

Jean-Jérôme Schmidt
Planets9s Editor
Severalnines AB

by Severalnines at June 09, 2016 09:39 AM

June 08, 2016

Peter Zaitsev

Using MySQL 5.7 Document Store with Internet of Things (IoT)

MySQL 5.7 Document Store

In this blog post, I'll discuss how to use MySQL 5.7 Document Store to track data from Internet of Things (IoT) devices.

Using JSON in MySQL 5.7

In my previous blog post, I looked into the MySQL 5.7.12 Document Store. This is a brand new feature in MySQL 5.7, and many people are asking when they need or want to use the JSON or Document Store interface.

Storing data in JSON may be quite useful in some cases, for example:

  • You already have JSON (e.g., from external feeds) and need to store it anyway. Using the JSON datatype will be more convenient and more efficient.
  • For the Internet of Things, specifically, when storing events from sensors: some sensors may send only temperature data, some may send temperature, humidity and light (but light information is only recorded during the day), etc. Storing it in JSON format may be more convenient in that you don’t have to declare all possible fields in advance, and do not have to run “alter table” if a new sensor starts sending new types of data.

Internet of Things

In this blog post, I will show an example of storing an event stream from Particle Photon. Last time I created a device to measure light and temperature and stored the results in MySQL. Particle.io provides the ability to use its own MQTT server and publish events with:

Spark.publish("temperature", String(temperature));
Spark.publish("humidity", String(humidity));
Spark.publish("light", String(light));

Then, I wanted to “subscribe” to my events and insert those into MySQL (for further analysis). As we have three different metrics for the same device, we have two basic options:

  1. Use a field per metric and create something like this: device_id int, temperature double, humidity double, light double
  2. Use a record per metric and have something like this: device_id int, event_name varchar(255), event_data text (please see this Internet of Things, Messaging and MySQL blog post for more details)

The first option above is not flexible. If my device starts measuring the soil temperature, I will have to “alter table add column”.

Option two is better in this regard, but I may significantly increase the table size as I have to store the name as a string for each measurement. In addition, some devices may send more complex metrics (i.e., latitude and longitude).

In this case, using JSON for storing metrics can be a better option, and I've also decided to try the Document Store interface.
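
For comparison, a plain SQL table with a JSON column might look like the sketch below (the table and column names are hypothetical; the Document Store collection used later creates its backing table automatically):

CREATE TABLE sensor_events (
  id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  device_id INT NOT NULL,
  event JSON NOT NULL,          -- whatever the sensor sent, stored as-is
  received_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
);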

First, we will need to enable X Plugin and setup the NodeJS / connector. Here are the steps required:

  1. Enable X Plugin in MySQL 5.7.12+, which uses a different port (33060 by default); see the command sketch right after these steps
  2. Download and install NodeJS (>4.2) and mysql-connector-nodejs-1.0.2.tar.gz (follow the Getting Started with Connector/Node.JS guide).
    # node --version
    v4.4.4
    # wget https://dev.mysql.com/get/Downloads/Connector-Nodejs/mysql-connector-nodejs-1.0.2.tar.gz
    # npm install mysql-connector-nodejs-1.0.2.tar.gz

    Please note: on older systems you will probably need to upgrade the nodejs version (follow the Installing Node.js via package manager guide).
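
For step 1, the X Plugin can be loaded at runtime with the command documented for MySQL 5.7 (use mysqlx.dll instead of mysqlx.so on Windows):

mysql> INSTALL PLUGIN mysqlx SONAME 'mysqlx.so';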

Storing Events from Sensors

Particle.io provides you with an API that allows you to subscribe to all public events (“events” are what sensors send). The API is for NodeJS, which is really convenient as we can use NodeJS for MySQL 5.7.12 Document Store as well.

To use the Particle API, install the particle-api-js module:

$ npm install particle-api-js

I’ve created the following NodeJS code to subscribe to all public events, and then add the data (in JSON format) to a document store:

var mysqlx = require('mysqlx');
var Particle = require('particle-api-js');
var particle = new Particle();
var token = '<place your token here>';

// Open an X Protocol session to MySQL (X Plugin listens on 33060 by default)
var mySession = mysqlx.getSession({
    host: 'localhost',
    port: 33060,
    dbUser: 'root',
    dbPassword: '<place your pass here>'
});

process.on('SIGINT', function() {
    console.log("Caught interrupt signal. Exiting...");
    process.exit();
});

// Subscribe to the public event stream and store each event as a JSON document
particle.getEventStream({ auth: token }).then(function(stream) {
    stream.on('event', function(data) {
        console.log(data);
        mySession.then(session => {
            session.getSchema("iot").getCollection("event_stream")
                .add(data)
                .execute(function (row) {
                    // can log something here
                }).catch(err => {
                    console.log(err);
                })
                .then(function (notices) {
                    console.log("Wrote to MySQL: " + JSON.stringify(notices));
                });
        }).catch(function (err) {
            console.log(err);
            process.exit();
        });
    });
}).catch(function (err) {
    console.log(err.stack);
    process.exit();
});

How it works:

  • particle.getEventStream({ auth: token}) gives me the stream of events. From there I can subscribe to specific event names, or to all public events using the generic name "events": stream.on('event', function(data) {...}).
  • function(data) is a callback function fired when a new event is ready. The event payload, data, is JSON, so I can simply insert it into the document store: .add( data ).execute() inserts the JSON data into the event_stream collection.

One of the reasons I use a document store here is that I do not have to know what is inside the event data. I do not have to parse it; I simply throw it at MySQL and analyze it later. If the format of the data changes in the future, my application will not break.

Inside the data stream

Here is the example of running the above code:

{ data: 'Humid: 49.40 Temp: 25.00 *C Dew: 13.66 *C HeatI: 25.88 *C',
  ttl: '60',
  published_at: '2016-05-20T19:30:51.433Z',
  coreid: '2b0034000947343337373738',
  name: 'log' }
Wrote to MySQL: {"_state":{"rows_affected":1,"doc_ids":["a3058c16-15db-0dab-f349-99c91a00"]}}
{ data: 'null',
  ttl: '60',
  published_at: '2016-05-20T19:30:51.418Z',
  coreid: '50ff72...',
  name: 'registerdev' }
Wrote to MySQL: {"_state":{"rows_affected":1,"doc_ids":["eff0de02-726e-34bd-c443-6ecbccdd"]}}
{ data: '24.900000',
  ttl: '60',
  published_at: '2016-05-20T19:30:51.480Z',
  coreid: '2d0024...',
  name: 'Humid 2' }
{ data: '[{"currentTemp":19.25},{"currentTemp":19.19},{"currentTemp":100.00}]',
  ttl: '60',
  published_at: '2016-05-20T19:30:52.896Z',
  coreid: '2d002c...',
  name: 'getTempData' }
Wrote to MySQL: {"_state":{"rows_affected":1,"doc_ids":["5f1de278-05e0-6193-6e30-0ebd78f7"]}}
{ data: '{"pump":0,"salt":0}',
  ttl: '60',
  published_at: '2016-05-20T19:30:51.491Z',
  coreid: '55ff6...',
  name: 'status' }
Wrote to MySQL: {"_state":{"rows_affected":1,"doc_ids":["d6fcf85f-4cba-fd59-a5ec-2bd78d4e"]}}

(Please note: although the stream is public, I’ve tried to anonymize the results a little.)

As we can see, the "data" payload is JSON and every event shares the same envelope (data, ttl, published_at, coreid, name). I could have implemented it as a MySQL table structure (adding published_at, name, ttl and coreid as separate fields). However, I would then depend on those specific fields and have to change my application if they changed. We also see examples of how devices send data back: it can be just a number, a string or another JSON document.

Analyzing the results

Now I can go to MySQL and use SQL (which I’ve used for >15 years) to find out what I’ve collected. First, I want to know how many device names I have:

mysql -A iot
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 3289
Server version: 5.7.12 MySQL Community Server (GPL)
Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> select count(distinct json_unquote(doc->'$.name')) from event_stream;
+---------------------------------------------+
| count(distinct json_unquote(doc->'$.name')) |
+---------------------------------------------+
|                                        1887 |
+---------------------------------------------+
1 row in set (5.47 sec)

That is slow! As described in my previous post, I can create a virtual column and index for doc->’$.name’ to make it faster:

mysql> alter table event_stream add column name varchar(255)
    -> generated always as (json_unquote(doc->'$.name')) virtual;
Query OK, 0 rows affected (0.17 sec)
Records: 0  Duplicates: 0  Warnings: 0
mysql> alter table event_stream add key (name);
Query OK, 0 rows affected (3.47 sec)
Records: 0  Duplicates: 0  Warnings: 0
mysql> show create table event_stream\G
*************************** 1. row ***************************
       Table: event_stream
Create Table: CREATE TABLE `event_stream` (
  `doc` json DEFAULT NULL,
  `_id` varchar(32) GENERATED ALWAYS AS (json_unquote(json_extract(`doc`,'$._id'))) STORED NOT NULL,
  `name` varchar(255) GENERATED ALWAYS AS (json_unquote(json_extract(`doc`,'$.name'))) VIRTUAL,
  UNIQUE KEY `_id` (`_id`),
  KEY `name` (`name`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4
1 row in set (0.00 sec)
mysql> select count(distinct name) from event_stream;
+----------------------+
| count(distinct name) |
+----------------------+
|                 1820 |
+----------------------+
1 row in set (0.67 sec)

How many beers left?

Eric Joyce has published a Keg Inventory Counter that uses a Particle Photon device to measure the amount of beer left in a keg in 12oz pours. I want to see what the average and the lowest amount of beer per day were:

mysql> select date(json_unquote(doc->'$.published_at')) as day,
    ->        avg(json_unquote(doc->'$.data')) as avg_beer_left,
    ->    min(json_unquote(doc->'$.data')) as min_beer_left
    -> from event_stream
    -> where name = 'Beers_left'
    -> group by date(json_unquote(doc->'$.published_at'));
+------------+--------------------+---------------+
| day        | avg_beer_left      | min_beer_left |
+------------+--------------------+---------------+
| 2016-05-13 |  53.21008358996988 | 53.2          |
| 2016-05-18 |  52.89973045822105 | 52.8          |
| 2016-05-19 | 52.669233854792694 | 52.6          |
| 2016-05-20 |  52.60644257702987 | 52.6          |
+------------+--------------------+---------------+
4 rows in set (0.44 sec)

Conclusion

Document Store can be very beneficial if an application works with a JSON field and does not know or does not care about its structure. In this post, I've used the "save to MySQL and analyze later" approach. We can then add virtual columns and indexes if needed.

by Alexander Rubin at June 08, 2016 05:17 PM

Jean-Jerome Schmidt

Announcing NinesControl: deploy and monitor MySQL Galera clusters on DigitalOcean

Following the recent completion of our testers programme for NinesControl, we’re now happy to announce the public availability of NinesControl (beta), the database infrastructure management solution for the cloud.

Many thanks to our testers for participating in the NinesControl feedback programme. With the collective insight gained, we were able to fine-tune this initial public release and present it to you today.

With NinesControl, users can now easily and quickly deploy and monitor (MySQL) Galera clusters on DigitalOcean. Droplets are launched and managed using your own DigitalOcean account.

We encourage you to sign up on ninescontrol.com and try this new service out.

NinesControl is designed with developers in mind. Spend time developing your applications and let the service provision, monitor and manage your databases.

It is currently in beta for DigitalOcean and Galera cluster users, before we expand the service to other public cloud providers and databases.

With a couple of simple steps, you can deploy and manage MySQL Galera with Percona, MariaDB or Codership. We are currently underway of adding support for MongoDB and Amazon EC2 as a cloud provider.

Overtime more monitoring and management features will be added to the service such as:

  • Automated backups, Schema and DB user management
  • Health, query and performance monitoring
  • Configuration management
  • Scaling out and deploying load balancers like HAProxy and MaxScale
  • Notifications through email and incident management services
  • and more ...

Try it out today!

by Severalnines at June 08, 2016 11:55 AM

June 07, 2016

Peter Zaitsev

Choosing MySQL High Availability Solutions

MySQL High Availability Solutions

In this blog post we’ll look at various MySQL high availability solutions, and examine their pluses and minuses.

High availability environments provide substantial benefit for databases that must remain available. A high availability database environment co-locates a database across multiple machines, any one of which can assume the functions of the database. In this way, a database doesn’t have a “single point of failure.”

There are many HA strategies and solutions, so how do you choose the best solution among a myriad of options? The first question to ask is "what is the problem you are trying to solve?" The answers boil down to redundancy versus scaling versus high availability. These are not necessarily all the same!

  • Need multiple copies of data in event of a disaster
  • Need to increase read and/or write throughput
  • Need to minimize outage duration

When you are planning your database environment, it's important to remember that the CAP Theorem applies. The CAP Theorem breaks problems into three categories: consistency, availability, and partition tolerance. You can pick any two from those three, at the expense of the third.

  • Consistency. All nodes see the same data at the same time
  • Availability. Every request receives a response about whether it succeeded or not
  • Partition Tolerance. The system continues to operate despite arbitrary partitioning due to network failures

Whatever solution you choose, it should maximize consistency. The problem is that although MySQL replication is great, it alone does not guarantee consistency across all nodes. There is always the potential that data is out of sync, since transactions can be lost during failover and other reasons. Galera-based clusters such as Percona XtraDB Cluster are certification-based to prevent this!

Data loss

The first question you should ask yourself is “Can I afford to lose data?”

This often depends on the application. Apps should check status codes on transactions to be sure they were committed. Many do not! It is also possible to lose transactions during a failover: simple replication schemes have the possibility of losing data.

Inconsistent nodes are another problem. Without conflict detection and resolution, inconsistent nodes are unavoidable. One solution is to run pt-table-checksum often to check for inconsistent data across replication nodes. Another option is using a Galera-based Distributed Cluster, such as Percona XtraDB Cluster, with a certification process.
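
A minimal pt-table-checksum invocation, run on the master, might look like the following (the host, user and password are placeholders; adjust to your environment):

pt-table-checksum --replicate=percona.checksums \
  h=master.example.com,u=checksum_user,p=secret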

Avoiding a single point of failure

What is watching your system? Or is anything standing ready to intervene in a failure? For replication, take a look at MHA and MySQL Orchestrator.  Both are great tools to perform failover of a Replica.  There are others.

For Percona XtraDB Cluster, failover is typically much faster, but it is not the perfect solution in every case.

Can I afford lost transactions?

Many MySQL DBAs worry about setting innodb_flush_log_at_trx_commit to 1 for ACID compliance, and about sync_binlog, but then use replication with no consistency checks! Is this logically consistent? Percona XtraDB Cluster maintains consistency through certification.
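
For reference, the usual durability settings look like this in my.cnf (a sketch of commonly recommended values, not a one-size-fits-all prescription):

[mysqld]
innodb_flush_log_at_trx_commit = 1   # flush and sync the InnoDB redo log at every commit
sync_binlog = 1                      # sync the binary log at every commit (the default in 5.7.7+)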

Conflict detection and resolution

All solutions must have some means of conflict detection and resolutions. Galera’s certification process follows the following method:

  • Transaction continues on a node as normal until it reaches COMMIT stage
  • Changes are collected into a writeset
  • Writeset is sent to all nodes for certification
  • PKs are used to determine if the writeset can be applied
  • If certification fails, the writeset is dropped and the transaction is rolled back.
  • If it succeeds, the transaction commits and the writesets are applied to all of the nodes.
  • All nodes reach the same decision on every transaction, so the process is deterministic.

Do I want Failover or a Distributed System?

Another important consideration is whether you should have a failover or a distributed system. A failover system runs one instance at a time, and “fails over” to a different instance when an issue occurs. A distributed system runs several instances at one time, all handling different data.

  • Failover pitfalls:
    • Failover systems have a monitor which detects failed nodes and moves services elsewhere if available
    • Failover takes time!
  • Distributed systems:
    • Distributed systems minimize failover time

Another question is whether your failover should be automatic or manual.

  • Advantage of Manual Failover
    • The primary advantage to failing over manually is that a human usually can make a better decision as to whether failover is necessary.
    • Systems rarely get it perfect, but they can be close!
  • Advantage of Automatic Failover
    • More Nines due to minimized outages
    • No need to wait on a DBA to perform

A further question is how fast does failover have to occur? Obviously, the faster it happens, the less time there is for potential data loss.

  • Replication / MHA / MMM
    • Depends on how long it takes for pending Replica transactions to complete before failover can occur
    • Typically around 30 seconds
  • DRBD
    • Typically between 15 and 30 seconds
  • Percona XtraDB Cluster / MySQL Cluster
    • VERY fast failover. Typically less than 1 second depending upon Load Balancer

How many 9’s do you really need?

The “9” measure of accuracy is a standard for how perfect a system is. When it comes to “how many 9s,” each 9 is an order of magnitude more accurate. 99.99 is four nines, while 99.999 is five nines.

Every manager's response to the question of how many nines is always "As many as I can get." That sounds great, but the reality is that tradeoffs are required! Many applications can tolerate a few minutes of downtime with minimal impact. The following table shows downtime as correlated to each "9":

Downtime per "9" of availability

Do I need to scale reads and/or writes?

When looking at your environment, it's important to understand your workload. Is your workload heavy on reads, writes, or both? Knowing whether you're going to need to scale reads or writes is important to choosing your HA solution:

  • Scaling reads
    • Most solutions offer ability to read from multiple nodes or replicas
    • MHA, Percona XtraDB Cluster, MySQL Cluster, and others are well suited for this
  • Scaling writes
    • Many people wrongly try to scale writes by writing to multiple nodes in Percona XtraDB Cluster leading to conflicts
    • Others try it with Master-Master Replication, which is also problematic
    • Possibly the best solution in this regard is MySQL Cluster

What about provisioning new nodes?

  • Replication
    • Largely, this is a manual process
    • MySQL Utilities makes this easier than ever
  • Distributed Clusters
    • Percona XtraDB Cluster automates provisioning through State Snapshot Transfer (SST): a new node pulls a full copy from a donor and joins on its own

The rule of threes

With Percona XtraDB Cluster, try to have three of everything. If you span a data center, have three data centers. If your nodes are on a switch, try to have three switches.

Percona XtraDB Cluster needs at least three nodes in the cluster.  An odd number is preferred for voting reasons. Forget about trying to keep a cluster alive during failure with only two data centers.  You are better off making one a DR site. Forget about custom weighting to try to get by on two data centers.  The 51% rule will get you anyway!

How many data centers do I have?

Knowing how many data centers are involved in your environment is a critical factor. Running multiple data centers has implications for the HA solution you adopt.

What if I only have one data center? You can gain protection against a single failed node or more, depending on cluster size. If you have two data centers, you should probably be considering the second data center as a DR solution. Having three or more data centers is the most robust solution when using Galera-based clusters such as Percona XtraDB Cluster.

How do I plan for disaster recovery?

Planning for disaster recovery is crucial in your HA environment. Make sure the DR node(s) can handle the traffic, if even at a minimized performance level.

  • Replicating from a Percona XtraDB Cluster to a DR site
    • Asynchronous Replication from Percona XtraDB Cluster to a single node
    • Asynchronous Replication from Percona XtraDB Cluster to a replication topology
    • Asynchronous Replication from Percona XtraDB Cluster to another Percona XtraDB Cluster

What storage engine(s) do I need?

Nowadays especially, there is a multitude of storage engines available for a database environment. Which one should you use for your HA solution? Your solution will help determine which storage engine you can employ.

  • Replication. Not storage engine dependent; works with all storage engines
  • Percona XtraDB Cluster. Requires InnoDB. Support for MyISAM is experimental and should not be used in Production
  • MySQL Cluster. Requires NDB Storage Engine

Load balancer options

Load balancers provide a means to distribute your workload across your environment resources so as not to create a bottleneck at any one particular point. The following are some load balancing options:

  • HAProxy
    • Open-source software solution
    • Cannot split reads and writes. If that is a requirement, the app will need to do it!
  • F5 BigIP
    • Typical hardware solution
  • MaxScale
    • Can do read/write splitting
  • Elastic Load Balancer (ELB)
    • Amazon solution

What happens if the cluster reboots?

Some changes require that the cluster be rebooted for the changes to be applied. For example, changing a parameter value in a parameter group is only applied to the cluster after the cluster is rebooted. A cluster could also reboot due to power interruption or other technology failures.

  • A power outage in a single data center could lead to issues
    • Percona XtraDB cluster can be configured to auto bootstrap
    • May not always work when all nodes lose power simultaneously. While the server is running, the grastate.dat file shows -1 for seqno (see the example below)
  • Surviving a Reboot
    • Helpful if nodes are shutdown by a System Administrator for a reboot or other such process
    • Normal shutdown sets seqno properly
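
For illustration, a grastate.dat from a node that lost power mid-flight might look like this (the UUID is a made-up example, and field names vary slightly between Galera versions):

# GALERA saved state
version: 2.4
uuid:    6b2b3f2a-30d1-11e6-9d60-1b3c4e5f6a7b
seqno:   -1
cert_index: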

Do I need to be able to read after writing?

Asynchronous replication does not guarantee consistent views of data across nodes. Percona XtraDB Cluster offers causal reads: the replica will wait for the event to be applied before processing additional queries, guaranteeing a consistent read state across nodes.
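
Causal reads can be enabled per session; depending on the Percona XtraDB Cluster version, the setting is exposed as wsrep_causal_reads or the newer wsrep_sync_wait, for example:

SET SESSION wsrep_causal_reads = ON;  -- older variable name
SET SESSION wsrep_sync_wait = 1;      -- newer equivalent covering read causality checks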

What if I do a lot of data loading?

In the recent past, it was conventional wisdom to use replication in such scenarios rather than Percona XtraDB Cluster. MTS does help if data is distributed over multiple schemas, but it is not a fit for all situations. Percona XtraDB Cluster is now a viable option, since a bug in Galera that did not properly split large transactions has been identified and addressed.

Have I taken precautions against split brain?

Split Brain occurs when a cluster has its nodes divided from one another, most often due to a network blip, and the nodes form two or more new and independent (and thus divergent) clusters. Percona XtraDB Cluster is configured to go into a non-primary state and refuse to take traffic. A newer setting in Percona XtraDB Cluster allows dirty reads from non-primary nodes.

Does my application require high concurrency?

Newer approaches to replication allow for parallel applier threads (Percona XtraDB Cluster has had this from the beginning), such as Multi-Threaded Slaves (MTS). MTS allows a replica to have multiple SQL threads, each with its own relay log. With MTS, GTID is needed to make backups via Percona XtraBackup safer, because SHOW SLAVE STATUS cannot be trusted to give the correct relay log position.

Am I limited on RAM?

Some Distributed solutions such as MySQL Cluster require a lot of RAM, even with file-based tables.  Be sure to plan appropriately. XtraDB Cluster works much more like a stand-alone node.

How stable is my network?

Networks are never really 100% reliable. Some “Network Problems” are due to outside factors such as system resource contention (especially on virtual machines). Network problems cause inappropriate failover issues. Use LAN segments with Percona XtraDB Cluster to minimize network traffic across the WAN.
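
Segments are assigned through the wsrep provider options; for example, nodes in a second data center might be configured with something like the following (the segment number is illustrative, and remember to append it to any provider options you already set):

wsrep_provider_options = "gmcast.segment=1"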

Conclusion

Making the right choice depends on:

  • Knowing what you really need!
  • Knowing your options.
  • Knowing your constraints!
  • Understanding the pros/cons of each solution
  • Setting expectations properly!

For more information on how to plan your HA environment, and what tools are available, sign up today for my webinar Choosing a MySQL® High Availability Solution on June 23, 2016 at 10:00 am. You can also get some great insights by watching these videos on our high availability video playlist.

by Michael Patrick at June 07, 2016 08:25 PM

Severe performance regression in MySQL 5.7 crash recovery

MySQL 5.7 Crash Recovery

In this post, we’ll discuss some insight I’ve gained regarding severe performance regression in MySQL 5.7 crash recovery.

Working on different InnoDB log file sizes in my previous post:

What is a big innodb_log_file_size?

I tried to understand how we can make InnoDB crash recovery faster, but found a rather surprising 5.7 crash recovery regression.

Basically, crash recovery in MySQL 5.7 is two times slower, due to this issue: https://bugs.mysql.com/bug.php?id=80788. InnoDB now performs the log scan twice, compared to a single scan in MySQL 5.6 (no surprise that there is performance degradation).

Fortunately, there is a proposed patch for MySQL 5.7, so I hope it will be improved soon.

As for general crash recovery improvement, my opinion is that it would be much improved by making it multi-threaded. Right now it is significantly limited by the single thread that reads and processes log entries one-by-one. With the current hardware, consisting of tens of cores and fast SSD, we can improve crash recovery by utilizing all the resources we have.

One small improvement that can be made is to disable PERFORMANCE_SCHEMA during recovery (these stats are not needed anyway).
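
If you want to try that, a quick way is to start the instance that performs recovery with Performance Schema disabled in my.cnf (or on the command line):

[mysqld]
performance_schema = OFF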

by Vadim Tkachenko at June 07, 2016 01:01 PM

June 06, 2016

Peter Zaitsev

Webinar Thursday, June 9: Troubleshooting MySQL configuration issues

Please join us on Thursday, June 9, at 10:00 am PDT (UTC-7) for the webinar Troubleshooting MySQL configuration issues.

MySQL Server is highly tunable. It has hundreds of configuration options which provide great tuning abilities and, at the same time, can be the source of various issues.

In this webinar you will learn which types of options MySQL Server supports, when they take effect and how to modify configuration safely. I will demonstrate best practices and tricks, used by Support engineers when they work with bug reports and customer issues which highly depend on configuration.

Register now.

Sveta Smirnova, Principal Technical Services Engineer
Sveta joined Percona in 2015. Her main professional interests are problem-solving, working with tricky issues and bugs, finding patterns that help solve typical issues quicker, and teaching others how to deal with MySQL issues, bugs and gotchas effectively. Before joining Percona, Sveta worked as a Support Engineer in the MySQL Bugs Analysis Support Group at MySQL AB (later Sun and Oracle). She is the author of the book "MySQL Troubleshooting" and of the JSON UDF functions for MySQL.

by Dave Avery at June 06, 2016 08:24 PM

Percona Server 5.7.12-5 is now available

Percona is glad to announce the GA release of Percona Server 5.7.12-5 on June 6, 2016. Download the latest version from the Percona web site or from the Percona Software Repositories.

Based on MySQL 5.7.12, including all the bug fixes in it, Percona Server 5.7.12-5 is the current GA release in the Percona Server 5.7 series. All of Percona’s software is open-source and free, all the details of the release can be found in the 5.7.12-5 milestone at Launchpad.

Bugs Fixed:

  • MEMORY storage engine did not support JSON columns. Bug fixed #1536469.
  • When Read Free Replication was enabled for TokuDB and there was no explicit primary key for the replicated TokuDB table there could be duplicated records in the table on update operation. The fix disables Read Free Replication for tables without explicit primary key and does rows lookup for UPDATE and DELETE binary log events and issues warning. Bug fixed #1536663 (#950).
  • Attempting to execute a non-existing prepared statement with Response Time Distribution plugin enabled could lead to a server crash. Bug fixed #1538019.
  • TokuDB was using different memory allocators, which caused safemalloc warnings in debug builds and crashes because memory accounting didn't add up. Bug fixed #1546538 (#962).
  • Adding an index to an InnoDB temporary table while expand_fast_index_creation was enabled could lead to server assertion. Bug fixed #1554622.
  • Percona Server was missing the innodb_numa_interleave server variable. Bug fixed #1561091 (upstream #80288).
  • Running SHOW STATUS in parallel to online buffer pool resizing could lead to server crash. Bug fixed #1577282.
  • InnoDB crash recovery might fail if innodb_flush_method was set to ALL_O_DIRECT. Bug fixed #1529885.
  • Fixed heap allocator/deallocator mismatch in Metrics for scalability measurement. Bug fixed #1581051.
  • Percona Server is now built with system zlib library instead of the older bundled one. Bug fixed #1108016.
  • CMake would fail if TokuDB tests passed. Bug fixed #1521566.
  • Reduced the memory overhead per page in the InnoDB buffer pool. The fix was based on Facebook patch #91e979e. Bug fixed #1536693 (upstream #72466).
  • CREATE TABLE ... LIKE ... could create a system table with an unsupported enforced engine. Bug fixed #1540338.
  • Change buffer merge could throttle to 5% of I/O capacity on an idle server. Bug fixed #1547525.
  • Parallel doublewrite memory was not freed when innodb_fast_shutdown was set to 2. Bug fixed #1578139.
  • The server will now show a more descriptive error message when Percona Server fails with errno == 22 "Invalid argument" if innodb_flush_method is set to ALL_O_DIRECT. Bug fixed #1578604.
  • The error log warning Too many connections was only printed for connection attempts after max_connections plus one SUPER user had connected. If the extra SUPER user was not connected, the warning was not printed for a non-SUPER connection attempt. Bug fixed #1583553.
  • apt-cache show command for percona-server-client was showing innotop included as part of the package. Bug fixed #1201074.
  • A replication slave would fail to connect to a master running 5.5. Bug fixed #1566642 (upstream #80962).
  • Upgrade logic for figuring if TokuDB upgrade can be performed from the version on disk to the current version was broken due to regression introduced when fixing bug #684 in Percona Server 5.7.11-4. Bug fixed #717.
  • Fixed jemalloc version parsing error. Bug fixed #528.
  • If ALTER TABLE was run while tokudb_auto_analyze variable was enabled it would trigger auto-analysis, which could lead to a server crash if ALTER TABLE DROP KEY was used because it would be operating on the old table/key meta-data. Bug fixed #945.
  • The tokudb_pk_insert_mode session variable has been deprecated and the behavior will be that of the former tokudb_pk_insert_mode set to 1. The optimization will be used where safe and not used where not safe. Bug fixed #952.
  • Bug in TokuDB Index Condition Pushdown was causing ORDER BY DESC to reverse the scan outside of the WHERE bounds. This would cause query to hang in a sending data state for several minutes in some environments with large amounts of data (3 billion records) if the ORDER BY DESC statement was used. Bugs fixed #988, #233, and #534.

Other bugs fixed: #1510564 (upstream #78981), #1533482 (upstream #79999), #1553166, #1496282 (#964), #1496786 (#956), #1566790, #718, #914, #937, #954, #955, #970, #971, #972, #976, #977, #981, #982, #637, and #982.

NOTE: X Plugin is a part of the MySQL community server so it’s part of the Percona Server 5.7.12-5 release as well. But users will need to install MySQL Shell from official MySQL repositories.

Release notes for Percona Server 5.7.12-5 are available in the online documentation. Please report any bugs on the Launchpad bug tracker.

by Hrvoje Matijakovic at June 06, 2016 07:23 PM

EL5 and why we’ve had to enable TLSv1.0 again

We have had to revert back to TLSv1.0.

If you saw my previous post on TLSv1.0 (https://www.percona.com/blog/2016/05/23/percona-disabling-tlsv1-0-may-31st-2016/), you’ll know I  wanted to deprecate TLSv1.0 well ahead of PCI’s changes. We made the changes May 31st.

Unfortunately, it has become apparent that EL 5, which is in the final phases of End Of Life, does not support TLSv1.1 or TLSv1.2. As such, I have had to re-enable TLSv1.0 support so that these users employing EL 5 can still receive updates from our repositories.

If you are running EL 5 (RHEL 5 / CentOS 5 / Scientific Linux 5 / etc …), I encourage you to update as soon as possible. As of March 31st, 2017 there will be no more updates at all, and at present EL 5 is effectively receiving very few updates. It also has known vulnerabilities.

Removal of TLSv1.0 support will now take place March 31st, 2017. If there are any EL 5 backports that bring support for TLSv1.1 / TLSv1.2 in the interim, I will seek to remove support earlier.

 

by David Busby at June 06, 2016 01:52 PM

June 05, 2016

Valeriy Kravchuk

Fun with Bugs #43 - Bugs Fixed in MySQL 5.7.13

For some reason I do not see numerous public announcements about the recent MySQL 5.7.13 release, even though it formally happened a couple of days ago. Maybe that's because we do not have any really "big" new features in this release (a new ->> operator was introduced, though, for those who, unlike me, care about JSON). Still, there are many fixes for bugs previously reported by the Community (or by Oracle engineers, but still in the public bugs database), and I'd like to discuss some of them.

Let me start with the bug that was NOT fixed, though: Bug #81093, "mysqld_multi not included in mysql-community-server package", reported by Georgi Iovchev and verified by Umesh. For this bug we did not get a fix, but an explanation of this kind:
"On platforms for which systemd support is installed, systemd has the capability of managing multiple MySQL instances. For details, see Configuring Multiple MySQL Instances Using systemd. Consequently, mysqld_multi and mysqld_multi.server are not installed because they are unnecessary."
I still think that removing familiar scripts like mysqld_safe or mysqld_multi from systems where they are not used by default to start services upon OS startup is a mistake that has to be corrected.

I'd like to highlight Bug #80299, "DEFAULT is not deterministic and should not be allowed in generated columns", as well. It was reported by Roy Lyseng and verified by Umesh. The fix is not what the bug's synopsis suggested. Instead, ALTER TABLE operations that change the definition of a column now cause a table rebuild if any generated column expression uses DEFAULT().

Now let's concentrate on InnoDB bugs fixed:
  • Bug #80898 - "Replication stops after transaction is rolled back asynchronously in master". It was reported by Debarun Banerjee (who probably works on group replication feature in Oracle). I wonder how this fix may influence Galera-based cluster, especially PXC 5.7 from Percona that may become available as GA some day. Something to check later.
  • Bug #80772 - "Excessive memory used in memory/innodb/os0file starting 5.7.8". This regression bug was reported by Morgan Tocker and verified by Umesh.
  • Bug #80761 - "Server crashed in mutex_create due to invalid option". It was reported by Allen Lai. It's not clear who could verify it, but the problem is fixed now.
  • Bug #80708 - "Problem importing encrypted tablespace". It was reported by my former colleague Sergei Glushchenko from Percona and verified by Umesh.
  • Bug #80669 - "Failing assert: fil_space_get(table->space) != __null in row0quiesce.cc line 724", was also reported by Percona employee and great, hard working QA engineer, Ramesh Sivaraman. As most of the bugs from Community, it was immediately (in 44 minutes!) verified by Umesh.
  • Bug #80667 - "innobase_get_computed_value(dtuple_t const*, dict_v_col_t const*, dict_index_t c". It was reported based on failing test case runs by Viswanatham Gudipati, who works for Oracle.
  • Bug #80327 - "InnoDB: Failing assertion: !(rec_get_info_bits(rec2, comp) & 0x10UL)", was reported by Erlend Dahl from Oracle based on numerous failures of the innodb_gis.rtree_purge test. I am really happy to see Oracle engineers reporting test failures in public, even when they happen in 5.8 (a tree that we, the Community, may not have access to at the moment). This is exactly what I was fighting for since I started this blog in 2012.
  • Bug #80261 - "Invalid InnoDB FTS Doc ID during INSERT". This bug was reported by Monty Solomon and promptly verified by Umesh. The bug is fixed, but release notes do not mention it explicitly. There are bugs even in release notes, we should live with that.
  • Bug #80182 - "Unsupported extension error for new features while 'create table like'". This bug was reported by yet another great QA engineer and my former colleague in Percona, Shahriyar Rzayev. As usual, it was promptly verified by Umesh.
  • Bug #79330 - "DROP TABLESPACE fails for missing general tablespace *.ibd file", was reported by Daniel Price who works for Oracle and verified by Marko Mäkelä.
  • Bug #78761 - "Unexpected deadlock with innodb_autoinc_lock_mode=0". It was reported by Yiftach Kaplan and verified by Miguel Solorzano.
  • Bug #77011 - Hanging "System Lock" when executing "flush table ... for export", was reported by Vincent Meng and verified by Umesh. It is also fixed in MySQL 5.6.31.
Now let's proceed with replication bugs that were fixed:
  • Bug #78999 - "Injected query log events do not set explicit_defaults_ts, read it on applying", was reported by great Laurynas Biveinis from Percona and verified by Umesh. It was a regression introduced by the fix for Bug #72794.
  • Bug #78995 - "Binlog_sender::read_event may access freed event buffer". Yet another bug reported by Laurynas Biveinis and immediately verified by Umesh. Percona engineers did a great QA job while working on Percona Server 5.7 GA release, and we see that Oracle fixes their bugs reported back then...
  • Bug #78467 - "mysqldump --dump-slave does not work with multi-source". It was reported by Sven Sandberg. I think that bugs of this kind should not even appear with proper design for new features and, in any case, they have to be fixed before GA release, so that it is at least feature complete. But what do I know...
  • Bug #77496 - "Replication position lost after crash on MTS configured slave". It was reported by Simon Mudd almost a year ago for MySQL 5.6, but is fixed only recently (also in 5.6.31). The bug was verified by Umesh.
Honestly, I am scared by the numerous problems we still see in the new replication features that appeared in MySQL 5.6, and thinking about Oracle's group replication development on top of that makes me even more scared. It will be us, Support engineers, and poor production DBAs who will be forced to deal with numerous features not working well together or breaking usual work habits here and there...
There were not so many bugs reported in public that were fixed in other areas. I'd like to highlight the following:
  • Bug #78512 - "/var/log/mysqld.log has incorrect permissions after installing server from repo", Bug #81390 - "mariadb galera not replaced with community packages during upgrade on fc24", and Bug #79377 - "service start failed to create database if it doesn't exists(ubuntu1510)". They were all reported by Ramana Yeruva, who seems to work for Oracle.
  • Bug #81139 - "Missing break statement in mysql_read_default_options()". It was reported by Georgi Kodinov from Oracle.
  • Bug #81014 - "DELETE from joined tables with WHERE using derived table fails with error 1093". This regression bug in the optimizer was reported by Jim Parks, who seems to work for Oracle.
  • Bug #80935 - "Dev library header files not on default include path". It was reported by Lars Tangvald, who works for Oracle.
  • Bug #80748 - "mysql5.6 does not build on solaris12". This bug was reported by Tor Didriksen from Oracle. Fixed in MySQL 5.6.31 also. I have to check if this fix really helps my builds on the OpenIndiana VM one day...
  • Bug #80526 - "LEFT OUTER JOIN returns incorrect results on the outer side". It was reported by Attila Rózsár and verified by Miguel Solorzano
  • Bug #80461 - "Accessing uninitialized memory inside strings/decimal.c ull2dec line ~1071". This bug was reported by Sergey Sprogis and verified by Umesh.
  • Bug #80451 - "mysqld --initialize does not support the keyring_file_data option". It was reported and verified by Jesper Krogh from Oracle.
  • Bug #80333 - "Valgrind: Conditional jump or move in Rewriter::do_refresh", was reported by Roel Van de Paar from Percona and verified by Umesh.
  • Bug #80231 - "Column name in having, which is contained in select list '*', produces error", was reported by Su Dylan and formally verified by Sinisa Milivojevic.
  • Bug #80089 - "Memory leak from open files limit", was reported by Manuel Ung and verified by Umesh. This bug is also fixed in MySQL 5.6.31.
  • Bug #79591 - "select distinct not returning distinct rows in 5.7...". This regression bug was reported by Shane Bester and probably verified (and fixed) by Øystein Grøvlen.
  • Bug #79396 - "null pointer dereference", was reported by Павел Гусев and verified by Sinisa Milivojevic. It is fixed in MySQL 5.6.31 also. It would be nice to see Oracle using AppChecker static analyzer for MySQL code...
  • Bug #78254 - "After running mysql_upgrade proxies_priv user columns are not updated to 32". It was reported by Robert Gołębiowski and verified by Umesh well before 5.7 became GA, but fixed only recently. I really wonder what prevented fixing this bug in 5.7.9...
  • Bug #74977 - "Cryptic error when failing to unload a dynamic library", was reported by  Martin Hansson back in 2014 and fixed only recently (also in MySQL 5.6.31). 
  • Bug #74636 - "mysqld_multi misleading when my_print_defaults is not found", was reported by Andrii Nikitin from Oracle back in 2014 and fixed also in MySQL 5.6.31 recently.
  • Bug #72230 - "Undesirable MySQL yum repo packaging dependencies". Yet another bug reported by Simon Mudd more than 2 years ago and verified by Umesh. it is fixed only in 5.7.13.
  • The last but not the least, Bug #71783, "mysqldump silently quits when it encounters an error.", was reported by Shane Bester in 2014, and was recently fixed in MySQL 5.7.13 and 5.6.31.            
To summarize, there are many bug fixes of all kinds in MySQL 5.7.13. Tomorrow I'll surely try to build it from sources on all VMs and platforms I have.

What I consider suspicious is the fact that aside from Oracle engineers, my former Percona colleagues (who had released their own versions of Percona Server and Xtrabackup based on MySQL 5.7) and few well known community members like Simon Mudd, there are very few bug reports from Community users for MySQL 5.7 that are fixed. Not sure what it means...

by Valeriy Kravchuk (noreply@blogger.com) at June 05, 2016 07:11 PM

June 03, 2016

Peter Zaitsev

Percona Server for MongoDB 3.2.6-1.0 is now available

Percona is pleased to announce the release of Percona Server for MongoDB 3.2.6-1.0 on June 3, 2016. Download the latest version from the Percona web site or the Percona Software Repositories.

Percona Server for MongoDB 3.2.6-1.0 is an enhanced, open-source, fully compatible, highly scalable, zero-maintenance downtime database supporting the MongoDB v3.2 protocol and drivers. Based on MongoDB 3.2.6, it extends MongoDB with MongoRocks and PerconaFT storage engines, as well as enterprise-grade features like external authentication and audit logging at no extra cost. Percona Server for MongoDB requires no changes to MongoDB applications or code.

NOTE: The PerconaFT storage engine has been deprecated and will not be available in future releases.


This release includes all changes from MongoDB 3.2.6 as well as the following:

  • PerconaFT has been deprecated. It is still available, but is no longer recommended for production use. PerconaFT will be removed in future releases.
  • MongoRocks is no longer considered experimental and is now recommended for production.

For more information, see the Embracing MongoRocks blog entry.

The release notes are available in the official documentation.

 

by Alexey Zhebel at June 03, 2016 07:49 PM

MariaDB AB

MariaDB 10.1 Now Supports Amazon RDS

Jessica Taylor

Starting today, customers can launch MariaDB version 10.1 instances on Amazon RDS. They can also upgrade their existing Amazon RDS for MariaDB database instances from version 10.0 to 10.1 using either the console or API.
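
Using the AWS CLI, an in-place upgrade might look roughly like this (the instance identifier and exact engine version string are placeholders; check which 10.1 versions RDS offers in your region):

aws rds modify-db-instance \
    --db-instance-identifier my-mariadb-instance \
    --engine-version 10.1.14 \
    --apply-immediately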

Since the launch of support for the open source MariaDB database in Amazon RDS in October 2015, thousands of customers have leveraged RDS to make it easy to set up, operate, and scale their MariaDB servers in the cloud.

MariaDB 10.1 is the latest major version release and offers a number of enhancements for better performance and scalability.

Some of the key new features in MariaDB 10.1 are:

  • XtraDB/InnoDB page compression
  • XtraDB/InnoDB data scrubbing
  • XtraDB/InnoDB defragmentation
  • Optimistic in-order parallel replication
  • ORDER BY optimization
  • WebScale SQL patches

Amazon RDS for MariaDB 10.1 is available in all AWS regions. To learn more about Amazon RDS for MariaDB, please refer to the RDS documentation.

Learn more on our MariaDB AWS Partner Page

 

by Jessica Taylor at June 03, 2016 05:49 PM

Peter Zaitsev

MySQL 5.7 By Default 1/3rd Slower Than 5.6 When Using Binary Logs

binary logs make MySQL 5.7 slower

Researching a performance issue, we came to a startling discovery:

MySQL 5.7 + binlogs is by default 37-45% slower than MySQL 5.6 + binlogs when otherwise using the default MySQL settings

Test server: i7, 8 threads, SSD, CentOS 7.2.1511
MySQL versions used:
mysql-5.6.30-linux-glibc2.5-x86_64
mysql-5.7.12-linux-glibc2.5-x86_64

mysqld --options:

--no-defaults --log-bin=mysql-bin --server-id=2

Run details:
Sysbench version 0.5, 4 threads, socket file connection

Sysbench Prepare: 

sysbench --test=/usr/share/doc/sysbench/tests/db/parallel_prepare.lua --oltp-auto-inc=off --mysql-engine-trx=yes --mysql-table-engine=innodb --oltp_table_size=1000000 --oltp_tables_count=1 --mysql-db=test --mysql-user=root --db-driver=mysql --mysql-socket=/path_to_socket_file/your_socket_file.sock prepare

Sysbench Run:

sysbench --report-interval=10 --oltp-auto-inc=off --max-time=50 --max-requests=0 --mysql-engine-trx=yes --test=/usr/share/doc/sysbench/tests/db/oltp.lua --init-rng=on --oltp_index_updates=10 --oltp_non_index_updates=10 --oltp_distinct_ranges=15 --oltp_order_ranges=15 --oltp_tables_count=1 --num-threads=4 --oltp_table_size=1000000 --mysql-db=test --mysql-user=root --db-driver=mysql --mysql-socket=/path_to_socket_file/your_socket_file.sock run

Results:

5.6.30: transactions: 7483 (149.60 per sec.)
5.7.12: transactions: 4689 (93.71 per sec.)  — That is a 37.36% decrease!

Note: on high-end systems with premium IO (think Fusion-IO, memory-only, high-end SSD with good caching throughput), the difference would be much smaller or negligible.

The reason?

A helpful comment from Shane Bester on a related bug report made me realize what was happening. Note the following in the MySQL Manual:

“Prior to MySQL 5.7.7, the default value of sync_binlog was 0, which configures no synchronizing to disk—in this case, the server relies on the operating system to flush the binary log’s contents from time to time as for any other file. MySQL 5.7.7 and later use a default value of 1, which is the safest choice, but as noted above can impact performance.” — https://dev.mysql.com/doc/refman/5.7/en/replication-options-binary-log.html#sysvar_sync_binlog

The culprit is thus the --sync_binlog=1 change which was made in 5.7.7 (in 5.6 it is 0 by default). While this may indeed be "the safest choice," one has to wonder why Oracle chose to implement this default change in 5.7.7. After all, there are many other options to aid crash safety.
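
To see which default your server is running with, or to revert to the 5.6 behavior (accepting the crash-safety trade-off), something like this works:

mysql> SHOW GLOBAL VARIABLES LIKE 'sync_binlog';
mysql> SET GLOBAL sync_binlog = 0;   -- or set sync_binlog=0 in my.cnf to make it persistent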

A related blog post from the MySQL HA team states:

“Indeed, [with sync_binlog=1,] it increases the total number of fsyncs called, but since MySQL 5.6, the server groups transactions and fsync’s them together, which minimizes greatly a potential performance hit.” — http://mysqlhighavailability.com/replication-defaults-in-mysql-5-7-7/ (ref item #4)

This seems incorrect given our findings, unless perhaps it requires tuning some other option.

This raises some action points/questions for Oracle's team: why change this now? Was 5.6 never crash-safe in terms of binary logging? How about other options that aid crash safety? Is anything [before 5.7.7] really ACID compliant by default?

In 2009 my colleague Peter Zaitsev had already posted on performance matters in connection with sync_binlog issues. More than seven years later, the questions asked in his post may still be valid today;

“May be opening binlog with O_DSYNC flag if sync_binlog=1 instead of using fsync will help? Or may be binlog pre-allocation would be good solution.” — PZ

Testing the same setup again, but this time with sync_binlog explicitly set to the same value (first 0, then 1) on both servers, we see:

Results for sync_binlog=0:

5.6.30: transactions: 7472 (149.38 per sec.)
5.7.12: transactions: 6594 (131.86 per sec.)  — A 11.73% decrease

Results for sync_binlog=1:

5.6.30: transactions: 3854 (77.03 per sec.)
5.7.12: transactions: 4597 (91.89 per sec.)  — A 19.29% increase

Note: the increase here is to some extent negated by the fact that enabling sync_binlog overall still causes a significant performance drop (30% on 5.7 and 48% on 5.6). Also interesting is that this could be the effect of "tuning the defaults" in 5.7, and it also makes one think about the possibility of further defaults tuning/optimization in this area.

Results for sync_binlog=100:

5.6.30: transactions: 7564 (151.12 per sec.)
5.7.12: transactions: 6515 (130.22 per sec.) — A 13.83% decrease

Thus, while 5.7.12 made some improvements when it comes to --sync_binlog=1, when --sync_binlog is turned off or is set to 100, we still see a ~11% decrease in performance. This is the same when not using binary logging at all, as a test with only --no-defaults (i.e. 100% vanilla out-of-the-box MySQL 5.6.30 versus MySQL 5.7.12) shows:

Results without binlogs enabled:

5.6.30: transactions: 7891 (157.77 per sec.)
5.7.12: transactions: 6963 (139.22 per sec.)  — A 11.76% decrease

This raises another question for Oracle's team: why, with four threads, is there a ~11% decrease in performance for 5.7.12 versus 5.6.30 (both vanilla)?

Discussing this internally, we were interested to see whether the arbitrary low number of four threads skewed the results and perhaps only showed a less realistic use case. However, testing with more threads, the numbers became worse still:

Results with 100 threads:

5.6.30. transactions: 20216 (398.89 per sec.)
5.7.12. transactions: 11097 (218.43 per sec.) — A 45.24% decrease

Results with 150 threads:

5.6.30. transactions: 11852 (233.01 per sec.)
5.7.12. transactions: 6606 (129.80 per sec.) — A 44.29% decrease

The findings in this article were compiled from a group effort.

by Roel Van de Paar at June 03, 2016 04:11 PM

June 02, 2016

Peter Zaitsev

Galera warning “last inactive check”

Galera warning "last inactive check"

In this post, we'll discuss the Galera warning "last inactive check" and what it means.

Problem

I’ve been working with Percona XtraDB Cluster quite a bit recently, and have been investigating various warnings. I came across this one today:

[Warning] WSREP: last inactive check more than PT1.5S ago (PT1.51811S), skipping check

This warning is related to the evs.inactive_check_period option. This option controls the poll period for the group communication response time. If a node is delayed, it is added to a delay list and it can lead to the cluster evicting the node.

Possible Cause

While some troubleshooting tips seem to associate the warning with VMWare snapshots, this isn’t the case here, as we see the warning on a physical machine.

I checked for backups or desynced nodes, and this also wasn’t the case. The warning was not accompanied by any errors or other information, so there was nothing critical happening.

In the troubleshooting link above, Galera developers said:

This can be seen on bare metal as well — with poorly configured mysqld, O/S, or simply being overloaded. All it means is that this thread could not get CPU time for 7.1 seconds. You can imagine that access to resources in virtual machines is even harder (especially I/O) than on bare metal, so you will see this in virtual machines more often.

This is not a Galera specific issue (it just reports being stuck, other mysqld threads are equally stuck) so there is no configuration options for that. You simply must make sure that your system and mysqld are properly configured, that there is enough RAM (buffer pool not over provisioned), that there is swap, that there are proper I/O drivers installed on guest and so on.

Basically, Galera runs in virtual machines as well as the virtual machines approximates bare metal.

It could also be an indication of unstable network or just higher average network latency than expected by the default configuration. In addition to checking network, do check I/O, swap and memory when you do see this warning.

Our graphs and counters otherwise look healthy. If this is the case, this is most likely nothing to worry about.

It is also a good idea to ensure your nodes are desynced before backups. Look for spikes in your workload. A further thing to check is that swappiness is set to 1 on modern kernels.

If all of this looks good, ensure the servers are all talking to the same NTP server, have the same time zone and the times and dates are in sync. While this warning could be a sign of an overloaded system, if everything else looks good this warning isn’t something to worry about.

Source

The warning comes from evs_proto.cpp in the Galera code:

if (last_inactive_check_ + inactive_check_period_*3 < now)
{
log_warn << "last inactive check more than " << inactive_check_period_*3
<< " ago (" << (now - last_inactive_check_)
<< "), skipping check";
last_inactive_check_ = now;
return;
}

Since the default for inactive_check_period is one second according to the Galera documentation, if it is now later than three seconds after the last check, it skips the rest of the above routine and adds the node to the delay list and does some other logic. The reason it does this is that it doesn’t want to rely on stale counters before making decisions. The message is really just letting you know that.

In Percona XtraDB Cluster, this setting defaults to 0.5s. This warning could simply mean that your inactive_check_period is too low, and the delay is not high enough to add the node to the delay list. So you could consider increasing evs.inactive_check_period to resolve the warnings. (Apparently in Galera it may also now be 0.5s, but the documentation is stale.)

Possible Solution

To find a sane value my colleague David Bennett came up with this command line, which gives you an idea of when your check warnings are happening:

$ cat mysqld.log | grep 'last inactive check more than' | perl -ne 'm/(PT(.*)S)/; print $2."\n"' | sort -n | uniq -c
1 1.55228
1 1.5523
1 1.55257
1 1.55345
1 1.55363
1 1.5543
1 1.55436
1 1.55483
1 1.5552
1 1.55582

Therefore, in this case, it may be a good idea to set inactive_check_period to 1 or 1.5 seconds to make the warnings go away.
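As a sketch of how that change might be applied (hedged: depending on your Galera version, evs.inactive_check_period may only be picked up from the configuration file at startup, so verify on a test node first):

-- raise the poll period to 1 second; the value uses the ISO-8601 duration format
SET GLOBAL wsrep_provider_options = 'evs.inactive_check_period = PT1S';
-- alternatively, add it to wsrep_provider_options in my.cnf and restart the node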

Conclusion

Each node in the cluster keeps its own local copy of how it sees the topology of the entire cluster. check_inactive is a node event that is triggered every inactive_check_period seconds to help the node update its view of the whole cluster, and ensure it is accurate. Service messages can be broadcast to the cluster informing nodes of changes to the topology. For example, if a cluster node is going down it will broadcast a service message telling each node in the cluster to remove it. The action is queued but the actual view of the cluster is updated with check_inactive. This is why it adds nodes to its local copy of inactive, suspect and delayed nodes.

If a node thinks it might be looking at stale data, it doesn’t make these decisions and waits until the next time for a fresh queue. Unfortunately, if inactive_check_period is too low, it will keep giving you warnings.

by Manjot Singh at June 02, 2016 06:51 PM

Percona Live Europe call for papers is now open!

The Percona Live Europe 2016 call for papers is now officially open! We’re looking forward to seeing you in Amsterdam this October 3-5, and hearing you speak! Ask yourself “Do I have…”:

  • Fresh ideas?
  • Enlightening case studies?
  • Insight on best practices?
  • In-depth technical knowledge?

If the answer to any of these is “YES,” then you need to submit your proposal for a chance to speak at Percona Live Europe 2016. Speaking is a great way to broaden not only the awareness of your company with an intelligent and engaged audience of software architects, senior engineers and developers, but also your own!

The deadline to submit is July 18th, 2016.

Tracks/Topics: 

Percona Live Europe 2016 is looking for topics for Breakout Sessions, Tutorial Sessions, and Lightning Talks:

  • Breakout Session. Submissions should be detailed and clearly indicate the topic and content of your proposal for the Conference Committee. Sessions should either be 25 minutes or 50 minutes in length, including Q&A.
  • Tutorial Session. Submissions should be detailed and include an agenda for review by the Conference Committee. Tutorial sessions should present immediate and practical applications of in-depth knowledge of MySQL, NoSQL, or Data in the Cloud technologies. They should be presented at a level between a training class and a conference breakout session. Attendees are expected to have their laptops to work through detailed and potentially hands-on presentations. Tutorials will be 3 hours in length including Q&A. If you would like to submit your proposal as a full day, six-hour tutorial, please indicate this in your submission.
  • Lightning Talks. Lightning talks are five-minute presentations focusing on one key point that will be of interest to the community. Talks can be technical, lighthearted, fun or otherwise entertaining submissions. These can include new ideas, a successful project, a cautionary story, quick tip or demonstration. This session is an opportunity for ideas to get the attention they deserve. The rules for this session are easy: five minutes and only five minutes. Use this time wisely to present the pertinent message of the subject matter and have fun doing so!

This year, the conference will feature a variety of formal tracks. We are particularly interested in proposals that fit into the areas of MySQL, MongoDB, NoSQL, ODBMS and Data in the Cloud. Our conference tracks are:

  • Analytics
  • Architecture/Design
  • Big Data
  • Case Stories
  • Development
  • New and Trending Topics
  • Operations and Management
  • Scalability/Performance
  • Security

Action Items:

With the call for papers ending July 18th, 2016, the time to submit is now! Here’s what you need to submit:

1) Pull together your proposal information:

-Talk title
-Abstract
-Speaker bio, headshot, video

2) Submit your proposal here before July 18th: SUBMIT

3) After submitting, you will be given a direct link and badge to advertise to the community. Tweet, Facebook, blog – get the word out about your submission and get your followers to vote!

That’s all it takes! We’re looking forward to your submissions!

Sponsoring / Super Saver Tickets! 

Interested in sponsoring? Take advantage of early sponsorship opportunities to reach the influential Percona Live audience. Sponsorship Information

Want to attend at the cheapest rate possible? Super Saver tickets are on sale NOW! Register today.

by Kortney Runyan at June 02, 2016 06:13 PM

Jean-Jerome Schmidt

Planets9s - Sign up for our MySQL Database Performance Tuning webinar on June 14th

Welcome to this week’s Planets9s, covering all the latest resources and technologies we create around automation and management of open source database infrastructures.

Sign up for our MySQL Database Performance Tuning webinar on June 14th

You’re running MySQL as backend database, how do you tune it to make best use of the hardware? How do you optimize the Operating System? How do you best configure MySQL for a specific database workload? If these questions sound familiar, then this webinar is for you. We’ll discuss some of the settings that can bring you significant improvement in the performance of your MySQL database as well as some of the variables which are frequently modified even though they should not. Performance tuning is not easy, but you can go a surprisingly long way with a few basic guidelines.

Sign up for the webinar

Join us for our European Polyglot Persistence Meetups Tour

We’re delighted to announce the first dates of our European Polyglot Persistence Meetups this summer. We’re starting off with Amsterdam today, then moving on to Dublin next week, Paris, Berlin, Stockholm and London. We’ll cover MySQL, PostgreSQL and MongoDB storage backends with the topics of deployment, scaling, configuration, management, backups and monitoring. And with time and as members suggest topics, the scope is likely to broaden.

Join the meetups

Severalnines helps IIL handle over 100 million transactions per day

This week we’re happy to announce our latest customer, IIL, a UK-based data and technology company. IIL specialises in providing pricing, fraud and compliance technology for insurance and credit providers. In their own words “ClusterControl gives us detailed insight into our database clusters in an intuitive format. Now we can see what is really going on under the hood quickly without a lot of time to go through the logs.”

Read the press release

Become a ClusterControl DBA - SSL Key Management and Encryption of MySQL Data in Transit

Anybody who’s been through the process of generating a self-signed certificate will probably agree that it is not the most straightforward process - most of the time, you end up searching through the internet to find howto’s and instructions on how to do this. This is especially true if you are a DBA and only go through this process every few months or even years. This is why we recently added a new ClusterControl feature designed to help you manage SSL keys across your database cluster. This new blog explains how.

Read the blog

That’s it for this week! Feel free to share these resources with your colleagues and follow us in our social media channels.

Have a good end of the week,

Jean-Jérôme Schmidt
Planets9s Editor
Severalnines AB

by Severalnines at June 02, 2016 10:58 AM

June 01, 2016

Peter Zaitsev

Why use provisioned IOPS volumes for AWS databases?

In this blog, we’ll use some test results to look at the rationale for using provisioned IOPS volumes for AWS databases.

One piece of advice you often hear when running MySQL, MongoDB or other databases in the AWS EC2 environment is that you should use volumes with provisioned IOPS. This kind of makes sense on the “marketing” level, where provisioned IOPS (io1) volumes are designed for IO-intensive database workloads, while General Purpose (gp2) volumes are not. But if you go to the AWS volume type description, you will find that gp2s are shown to have pretty good IO performance. So where do all these supposed database performance problems for Amazon Elastic Block Store (EBS), with no provisioned IOPS, come from?

Here is what I found out running experiments with a beta of Percona Monitoring and Management.

I ran a typical database instance workload, where the OLTP workload uses around 20% of the system capacity, and periodically I have a single user IO intensive batch job hitting the same system. Even if you do not have batch jobs running, your backup is likely to show this same IO pattern.

What would happen in this case if you have conventional local storage? Some queueing happens on the storage level, but as there is only one user with intensive IO, the impact is typically not very significant. What do we see from the AWS gp2 volume?

provisioned IOPS volumes for AWS

At first, reads spike to more than 1.5K IOPS, and while latency increases from the normal 1-2ms, it remains below 10ms on average. However, after a couple of minutes IOPS drop to around 500, and read latency spikes to over 100ms (note the log scale on the graph).

What is happening here? The gp2 volumes behave differently than your conventional storage by allowing IO bursts for short periods of time – after a short period of time, however, the IOs are throttled (in this case to only 500/sec). How does the throttling work? By adding delay to IO completion so that only the required IOs are completed per second, and the more concurrency we add to such throttled devices, the higher the average IO response latency is!

What does this mean from an application point of view? Let’s say you have a database transaction that requires 100 reads from the disk. If you have an average of 1ms latency, this transaction takes about 100ms reading from the disk, and will likely be seen as very good user experience. If you have an average IO latency of 100ms, the same transaction spends ten seconds reading from the disk – well above the tolerance for many users.   

As a DBA, you can see how putting an extra (small) load on the database system (such as running batch job or backup) can cause your boss to come screaming that the website is down ten minutes later.

There is another key difference between conventional local storage, such as RAID or SSD, and an EBS volume. Not all local storage IO is created equal, whereas an EBS General Purpose volume seems to inject latency into IO operations regardless of what the IO is.

Transactional log flushes are one of the most latency-critical IO operations databases perform. These are very small (often just 1 page) sequential writes. RAID controllers and SSDs can handle these very quickly by only writing to memory (battery or capacitor backed up), at a fraction of the cost of other operations. This is not the case for EBS gp2: log writes come with high latency.

We can see this latency in Performance Schema graphs, where such batch jobs correlate to a huge amount of time spent writing to the InnoDB Transactional Log file or Binary Log File:

provisioned IOPS volumes for AWS

We can also see the main InnoDB thread spending up to 30% of its time flushing the log – the number is drastically lower for typical storage configuration:

provisioned IOPS volumes for AWS

Another way AWS EBS storage is different from the typical local storage is that size directly buys you performance. GP2 volumes provide 3 IOPS/GB, up to 10000 IOPS (99 percentile figure),  which means that larger storage will have higher performance – though if anything, this means you’re getting better performance from your larger production volumes than your smaller test ones.

A final note: EBS storage is essentially connected to a network, which means both slightly higher latencies and limited throughput. According to the documentation, there is a 160MiB/s throughput limit per volume, which is a lot less than even an inexpensive SATA SSD. SSDs can often provide 500MB/sec or more, and are generally limited by the SATA bus capacity.

My takeaways from these results:

  • EBS General Purpose volumes have decent performance for light-duty workloads – if you don’t demand a lot of IOPS from your storage for prolonged periods of time. If you do, storage with provisioned IOPS is a better choice
  • Whenever you’re using Amazon or other environments with multi-tenant virtualized storage, I would highly suggest running some benchmark on how it behaves for the above scenarios. The assumptions you have about your conventional RAID or SSD storage might not apply.

Want to play around with live graphs? Check out our PMM Demo, which is currently running the stated workload on Amazon EC2. You can also install the beta version to use with your own system.

 

by Peter Zaitsev at June 01, 2016 10:05 PM

Embracing MongoRocks

This blog post discusses Percona’s future plans for TokuMX, PerconaFT, and MongoRocks for MongoDB 3.2 and later releases.

At Percona, we focus on providing the best solution for our customers, rather than falling into “big ego,” “not invented here” or “sunk cost fallacy” messages. We have limited engineering resources and want to concentrate on the things that matter the most for our community and our customers – the things that make a positive difference.

When we acquired Tokutek, TokuMX was a product that truly made a difference for a lot of MongoDB users. It allowed them to do what they could not do with MongoDB. However, things in the technology world change rapidly, as they have in the year since the acquisition.   

During the last year, the WiredTiger storage engine became generally available and demonstrated great performance for many read-focused and in-memory workloads. In addition, the RocksDB-based MongoRocks storage engine is well-optimized for handling large data write-intensive workloads. 

(You can see some of the benchmarking we’ve done to verify the performance of these engines here.)

With these developments, Tokutek’s Fractal Tree storage engine (now named PerconaFT) lost major performance advantages for the most common workloads. Additionally, the fact that MongoDB internally uses an optimistic locking pattern, while PerconaFT was designed with a pessimistic locking pattern, means that it would take a lot of work to make PerconaFT the best solution in a MongoDB environment.

As a result, we have decided to deprecate PerconaFT in Percona Server for MongoDB 3.2, with plans of removing it in Percona Server for MongoDB 3.4. At the same time, we will move our focus towards working together with Facebook on making MongoRocks the leading write-optimized storage engine for the MongoDB ecosystem.

We will focus on fixing any remaining feature and performance gaps to ensure moving from PerconaFT to MongoRocks with Percona Server for MongoDB is as simple as possible – and results in even better performance.

With the release of Percona Server for MongoDB 3.2, we’re fully embracing MongoRocks and lifting its experimental status to be fully supported for all production environments.

You may ask, what does this mean for TokuDB? This means our Fractal Tree Indexing engineering team can focus on MySQL workloads rather than supporting two different database servers that have very different expectations from the underlying storage engine, and will allow us to improve TokuDB faster than before.

by Peter Zaitsev at June 01, 2016 04:18 PM

Jean-Jerome Schmidt

The Holy Grail Webinar: Become a MySQL DBA - Database Performance Tuning

You’re running MySQL as backend database, how do you tune it to make best use of the hardware? How do you optimize the Operating System? How do you best configure MySQL for a specific database workload?

Do these questions sound familiar to you? Maybe you’re having to deal with that type of situation yourself?

A database server needs CPU, memory, disk and network in order to function. Understanding these resources is important for a DBA, as any resource that is weak or overloaded can become a limiting factor and cause the database server to perform poorly. A main task of the DBA is to tune operating system and database configurations and avoid overutilization or underutilization of the available resources.

In this webinar, we’ll discuss some of the settings that are most often tweaked and which can bring you significant improvement in the performance of your MySQL database. We will also cover some of the variables which are frequently modified even though they should not. Performance tuning is not easy, but you can go a surprisingly long way with a few basic guidelines.

Date, Time & Registration

Europe/MEA/APAC

Tuesday, June 14th at 09:00 BST / 10:00 CEST (Germany, France, Sweden)
Register Now

North America/LatAm

Tuesday, June 14th at 09:00 Pacific Time (US) / 12:00 Eastern Time (US)
Register Now

Agenda

  • Database tuning - the what and why
  • Principles of the tuning process
  • Tuning the Operating System configuration
  • Tuning the MySQL configuration
  • Useful tools
    • pt-summary
    • pt-mysql-summary
  • What to avoid when tuning OS and MySQL configuration

Speaker

Krzysztof Książek, Senior Support Engineer at Severalnines, is a MySQL DBA with experience managing complex database environments for companies like Zendesk, Chegg, Pinterest and Flipboard. This webinar builds upon recent blog posts and related webinar series by Krzysztof on how to become a MySQL DBA.

This webinar is based on our popular blog series ‘Become a MySQL DBA’.

We look forward to “seeing” you there!

To view all the blogs of the ‘Become a MySQL DBA’ series visit: http://www.severalnines.com/blog-categories/db-ops.

by Severalnines at June 01, 2016 12:31 PM

Oli Sennhauser

MySQL spatial functionality - points of interest around me

This week I was preparing the exercises for our MySQL/MariaDB for Beginners training. One of the exercises of the training is about MySQL spatial (GIS) features. I always tell customers: "With these features you can answer questions like: Give me all points of interest around me!"

Now I wanted to try out how it really works and if it is that easy at all...

To get myself an idea of what I want to do I did a little sketch first:

poi.png

   My position
   Shops
   Restaurants
   Cafes

To do this I needed a table and some data:

CREATE TABLE poi (
  id INT UNSIGNED NOT NULL AUTO_INCREMENT
, name VARCHAR(40)
, type VARCHAR(20)
, sub_type VARCHAR(20)
, pt POINT NOT NULL
, PRIMARY KEY (id)
, SPATIAL INDEX(pt)
) ENGINE=InnoDB;

INSERT INTO poi (name, type, sub_type, pt) VALUES
  ('Shop 1', 'Shop', 'Cloth', Point(2,2))
, ('Cafe 1', 'Cafe', '',  Point(11,2))
, ('Shop 2', 'Shop', 'Cloth',  Point(5,4))
, ('Restaurant 1', 'Restaurant', 'Portugies',  Point(8,7))
, ('Cafe 2', 'Cafe', '',  Point(3,9))
, ('Shop 3', 'Shop', 'Hardware',  Point(11,9))
;

This looks as follows:

SELECT id, CONCAT(ST_X(pt), '/', ST_Y(pt)) AS "X/Y", name, type, sub_type
  FROM poi;
+----+-----------+--------------+------------+-----------+
| id | X/Y       | name         | type       | sub_type  |
+----+-----------+--------------+------------+-----------+
|  1 | 2/2       | Shop 1       | Shop       | Cloth     |
|  2 | 11/2      | Cafe 1       | Cafe       |           |
|  3 | 5/4       | Shop 2       | Shop       | Cloth     |
|  4 | 8/7       | Restaurant 1 | Restaurant | Portugies |
|  5 | 3/9       | Cafe 2       | Cafe       |           |
|  6 | 11/9      | Shop 3       | Shop       | Hardware  |
+----+-----------+--------------+------------+-----------+

Now the question: "Give me all shops in a distance of 4.5 units around me":

SET @hereami = POINT(9,4);

SELECT id, ST_AsText(pt) AS point, name, ROUND(ST_Distance(@hereami, pt), 2) AS distance
  FROM poi
 WHERE ST_Distance(@hereami, pt) < 4.5
   AND type = 'Shop'
 ORDER BY distance ASC
;
+----+------------+--------+----------+
| id | point      | name   | distance |
+----+------------+--------+----------+
|  3 | POINT(5 4) | Shop 2 |     4.00 |
+----+------------+--------+----------+
1 row in set (0.37 sec)

The query execution plan looks like this:

           id: 1
  select_type: SIMPLE
        table: poi
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 650361
     filtered: 10.00
        Extra: Using where; Using filesort

So no use of the spatial index yet. :-(

Reading the MySQL documentation Using Spatial Indexes provides some more information:

The optimizer investigates whether available spatial indexes can be involved in the search for queries that use a function such as MBRContains() or MBRWithin() in the WHERE clause.

So it looks like the optimizer CAN evaluate function covered fields in this specific case. But not with the function ST_Distance I have chosen.

So my WHERE clause must look like: "Give me all points within a polygon spanned 4.5 units around my position..."

I did not find any such function in the short run. So I created a hexagon which is not too far from a circle...

With this hexagon I tried again:

SET @hereami = POINT(9,4);
SET @hexagon = 'POLYGON((9 8.5, 12.897 6.25, 12.897 1.75, 9 -0.5, 5.103 1.75, 5.103 6.25, 9 8.5))';

SELECT id, ST_AsText(pt) AS point, name, ROUND(ST_Distance(@hereami, pt), 2) AS distance
  FROM poi
 WHERE MBRContains(ST_GeomFromText(@hexagon), pt)
   AND ST_Distance(@hereami, pt) < 4.5
   AND type = 'Shop'
 ORDER BY distance ASC
;
Empty set (0.03 sec)

And tadaaah: Damned fast, but the result is not the same! :-( When you look at the graph above it is obvious why: The missing shop is 0.103 units outside of our hexagon search range but within our circle range. So an octagon would have been the better approach...

At least the index is considered now! :-)

           id: 1
  select_type: SIMPLE
        table: poi
   partitions: NULL
         type: range
possible_keys: pt
          key: pt
      key_len: 34
          ref: NULL
         rows: 31356
     filtered: 10.00
        Extra: Using where; Using filesort

For specifying an "outer" hexagon I was too lazy, so I specified a square instead:

SET @hereami = POINT(9,4);
SET @square = 'POLYGON((4.5 8.5, 13.5 8.5, 13.5 -0.5, 4.5 -0.5, 4.5 8.5))';

SELECT id, ST_AsText(pt) AS point, name, ROUND(ST_Distance(@hereami, pt), 2) AS distance
  FROM poi
 WHERE MBRContains(ST_GeomFromText(@square), pt)
   AND ST_Distance(@hereami, pt) < 4.5
   AND type = 'Shop'
 ORDER BY distance ASC
;
+----+------------+--------+----------+
| id | point      | name   | distance |
+----+------------+--------+----------+
|  3 | POINT(5 4) | Shop 2 |     4.00 |
+----+------------+--------+----------+
1 row in set (0.02 sec)

So, my shop is in the result again now. And even a bit faster!

Now I wanted to find out if these results are any faster than the conventional method with an index on (x) and (y) or (x, y):

SELECT id, ST_AsText(pt) AS point, name, ROUND(ST_Distance(@hereami, pt), 2) AS distance
  FROM poi
 WHERE x >=  4.5 AND x <= 13.5
   AND y >= -0.5 AND y <=  8.5
   AND ST_Distance(@hereami, pt) < 4.5
   AND type = 'Shop'
 ORDER BY distance ASC
;
1 row in set (0.15 sec)

Here the optimizer chooses the index on x. But I think it could do better. So I forced the optimizer to use the index on (x, y):

SELECT id, ST_AsText(pt) AS point, name, ROUND(ST_Distance(@hereami, pt), 2) AS distance
  FROM poi FORCE INDEX (xy)
 WHERE x >=  4.5 AND x <= 13.5
   AND y >= -0.5 AND y <=  8.5
   AND ST_Distance(@hereami, pt) < 4.5
   AND type = 'Shop'
 ORDER BY distance ASC
;
1 row in set (0.03 sec)

           id: 1
  select_type: SIMPLE
        table: poi
   partitions: NULL
         type: range
possible_keys: xy
          key: xy
      key_len: 10
          ref: NULL
         rows: 115592
     filtered: 1.11
        Extra: Using index condition; Using where; Using filesort

Same performance as with the spatial index. So it looks like for this simple task, with my data distribution, conventional methods do well enough.
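Note that the poi table as created above has no x and y columns, so for this conventional comparison they have to be added first. The post does not show how that was done; one possible way, assuming MySQL 5.7 generated columns (the column and index names are chosen to match the queries above), is:

ALTER TABLE poi
  ADD COLUMN x DOUBLE AS (ST_X(pt)) STORED,
  ADD COLUMN y DOUBLE AS (ST_Y(pt)) STORED,
  ADD INDEX x (x),
  ADD INDEX xy (x, y);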

Now I wanted to try a polygon which comes as close as possible to a circle. I solved this with a MySQL stored function which returns a polygon:

DROP FUNCTION polygon_circle;

delimiter //

CREATE FUNCTION polygon_circle(pX DOUBLE, pY DOUBLE, pDiameter DOUBLE, pPoints SMALLINT UNSIGNED)
-- RETURNS VARCHAR(4096) DETERMINISTIC
RETURNS POLYGON DETERMINISTIC
BEGIN

  DECLARE i SMALLINT UNSIGNED DEFAULT 0;
  DECLARE vSteps SMALLINT UNSIGNED;
  DECLARE vPolygon VARCHAR(4096) DEFAULT '';

  -- Input validation

  IF pPoints < 3 THEN
    RETURN NULL;
  END IF;
  IF pPoints > 360 THEN
    RETURN NULL;
  END IF;
  IF pPoints > 90 THEN
    RETURN NULL;
  END IF;
  if (360 % pPoints) != 0 THEN
    RETURN NULL;
  END IF;

  -- Start

  SET vSteps = 360 / pPoints;

  WHILE i < 360 DO
    SET vPolygon = CONCAT(vPolygon, (pX + (SIN(i * 2 * PI() / 360) * pDiameter)), ' ', (pY + (COS(i * 2 * PI() / 360) * pDiameter)), ', ');
    SET i = i + vSteps;
  END WHILE;

  -- Add first point again
  SET vPolygon = CONCAT("POLYGON((", vPolygon, (pX + (SIN(0 * 2 * PI() / 360) * pDiameter)), " ",  (pY + (COS(0 * 2 * PI() / 360) * pDiameter)), "))");

  -- RETURN vPolygon;
  RETURN ST_GeomFromText(vPolygon);
END;
//

delimiter ;

SELECT ST_AsText(polygon_circle(9, 4, 4.5, 6));
-- SELECT polygon_circle(9, 4, 4.5, 8);

Then calling the query in the same way:

SET @hereami = POINT(9,4);
SELECT id, ST_AsText(pt) AS point, name, ROUND(ST_Distance(@hereami, pt), 2) AS distance
  FROM poi
 WHERE MBRContains(polygon_circle(9, 4, 4.5, 90), pt)
   AND ST_Distance(@hereami, pt) < 4.5
   AND type = 'Shop'
 ORDER BY distance ASC
;
+----+------------+--------+----------+
| id | point      | name   | distance |
+----+------------+--------+----------+
|  3 | POINT(5 4) | Shop 2 |     4.00 |
+----+------------+--------+----------+
1 row in set (0.03 sec)

This seems not to have any significant negative impact on performance.

Results

Test                               | # rows | operation        | latency
Total                              | 655360 | FTS              | 1300 ms
Spatial exact Circle               |   4128 | FTS              |  520 ms
Spatial inner Hexagon              |   3916 | range (pt)       |   20 ms
Spatial outer Square               |   4128 | range (pt)       |   30 ms
Conventional outer Square on (x)   |   4128 | range (x) or (y) |  150 ms
Conventional outer Square on (xy)  |   4128 | range (x,y)      |   30 ms
Spatial good Polygon               |   4128 | range (pt)       |   30 ms
by Shinguz at June 01, 2016 08:13 AM

May 31, 2016

Peter Zaitsev

Troubleshooting locking issues webinar: Q & A

In this blog, I will provide answers to the Q & A for the Troubleshooting locking issues webinar.

First, I want to thank you for attending the May, 12 webinar. The recording and slides for the webinar are available here. Below is the list of your questions that I wasn’t able to answer during the webinar, with responses:

Q: Do you have the links to those other info sources?

A: Yes, they are listed in the “More Information” slide. In the PDF, all the links are active. If you speak Russian, you can also check this presentation by Dmitry Lenev. He also did a similar presentation in English for MySQL Connect, but now all the content is gone from the official website, so the only chance to find his slides in English is to search web archives.

Q: Are you going to discuss metadata locks?

A: Yes. I discuss them in slides 11-16.

Q: Why do row locking when table level lock is already set by InnoDB. My question was table level lock is already set. You update 100 rows in that table, but InnoDB locks these 100 rows. Why? The table is already locked . . .

A: The table lock, which you saw on slide #20, is set by InnoDB only for a short time and almost immediately released. But the transaction is not closed yet, and InnoDB still needs to protect the updated rows from modifications by other transactions. Why can’t this be done with a table-level lock only? Imagine you have a table with 1,000,000 rows, all with an ID from 1 to 1,000,000 (and other fields). Now imagine you need to update the row with ID=1. In the case of a table lock, the whole table is locked while you are performing this one update. If another connection wants to update the row with ID=202, it has to wait. In the case of row-level locks, the two queries do not interfere with each other and can be applied in parallel.
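A minimal sketch of that difference, assuming a hypothetical InnoDB table t with an indexed id column and a val column:

-- session 1
BEGIN;
UPDATE t SET val = val + 1 WHERE id = 1;    -- only the row with id=1 is locked

-- session 2, at the same time
BEGIN;
UPDATE t SET val = val + 1 WHERE id = 202;  -- different row, proceeds immediately
UPDATE t SET val = val + 1 WHERE id = 1;    -- same row, blocks until session 1 commits

-- session 1
COMMIT;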

Q: How do you avoid locks on alter, without resetting that transaction?

A: If you are using version 5.6 and up, many ALTER commands are non-locking. See the overview of online DDL in the user manual. However, if you want to use an ALTER variation that cannot be done online, you can use the pt-online-schema-change utility from Percona Toolkit. Note that it will take longer than the regular “blocking” ALTER, but it will not block your other connections.
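For example, since 5.6 you can tell the server explicitly that a change must be done in place and without blocking, and it will refuse the statement if it cannot comply (a sketch; the table and column names are made up):

ALTER TABLE orders
  ADD COLUMN note VARCHAR(255),
  ALGORITHM=INPLACE, LOCK=NONE;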

Q: This is not a question, but there is a typo on the slide – it should be Intention Locks, not Intension Locks

A: Thank you! I fixed this and the wrong table name in slide #28. Please download updated version of slides.

Q: Why does the ALTER table operation have to wait forever? It should start once the transaction finished, but I know that the lock will remain. Why doesn’t it unlock when the transaction is finished?

A: Of course it doesn’t wait forever! That was just shorthand for “waits a very long time,” which can happen if you have a very busy application with many threads updating the same table, or if you don’t close transactions.

Q: Is the metadata_locks table enabled by default?

A: Yes.
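You can query it directly, for example:

SELECT object_type, object_schema, object_name, lock_type, lock_status, owner_thread_id
  FROM performance_schema.metadata_locks;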

by Sveta Smirnova at May 31, 2016 11:58 PM

What is a big innodb_log_file_size?

In this post, we’ll discuss what constitutes a big innodb_log_file_size, and how it can affect performance.

In the comments for our post on Percona Server 5.7 performance improvements, someone asked why we use innodb_log_file_size=10G, with an indication that it might be too big.

In my previous post (https://www.percona.com/blog/2016/05/17/mysql-5-7-read-write-benchmarks/), the example used innodb_log_file_size=15G. Is that too big? Let’s take a more detailed look at this.

First, let me start by rephrasing my warning: the log file size should be set as big as possible, but not bigger than necessary. A bigger log file size is better for performance, but it has a drawback (a significant one) that you need to worry about: the recovery time after a crash. You need to balance recovery time in the rare event of a crash against maximizing throughput during peak operations. This trade-off can translate to a 20x longer crash recovery process!

But how big is “big enough”? Is it 48MB (the default value), 1-2GB (which I often see in production), or 10-15GB (like we use for benchmarks)?

I wrote about how the innodb_log_file_size is related to background flushing five years ago, and I recommend this post if you are interested in details:

InnoDB Flushing: Theory and solutions

Since that time many improvements have been made both in Percona Server and MySQL, but a small innodb_log_file_size still affects the throughput.

How? Let’s review how writes happen in InnoDB. Practically all data page writes happen in the background. It seems like background writes shouldn’t affect user query performance, but they do. The more intense the background writes are, the more resources are taken away from the user foreground workload. There are three big forces that rule background writes:

  1. How close the checkpoint age is to the async point (again, see the previous material at https://www.percona.com/blog/2011/04/04/innodb-flushing-theory-and-solutions/). This is adaptive flushing.
  2. How close innodb_max_dirty_pages_pct is to the percentage of actual dirty pages. You can see this in the LRU flushing metrics.
  3. The amount of free pages defined by innodb_lru_scan_depth. This is also in the LRU flushing metrics.

So in this equation innodb_log_file_size defines the async point, and how big the checkpoint age can be.

To show a practical application of these forces, I’ve provided some chart data. I will use charts from the Percona Monitoring and Management tool and data from Percona Server 5.7.

Before jumping to the graphs, let me remind you that the max checkpoint age is defined not only by innodb_log_file_size, but also by innodb_log_files_in_group (which is usually “2” by default). So innodb_log_file_size=2GB will give you 4GB of log space, of which MySQL will use about 3.24GB (MySQL makes extra reservations to avoid a situation where we fully run out of log space).
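As a quick sanity check of how much redo log space a server actually has (values are in bytes):

SELECT @@innodb_log_file_size AS log_file_size,
       @@innodb_log_files_in_group AS files_in_group,
       @@innodb_log_file_size * @@innodb_log_files_in_group AS total_log_space;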

Below are graphs from a tpcc-mysql benchmark with 1500 warehouses, which provides about 150GB of data. I used innodb_buffer_pool_size=64GB, and I made two runs:

  1. with innodb_log_file_size=2GB
  2. with innodb_log_file_size=15GB

Other details about my setup:

  • CPU: 56 logical CPU threads, Intel(R) Xeon(R) CPU E5-2683 v3 @ 2.00GHz
  • OS: Ubuntu 16.04
  • Kernel 4.4.0-21-generic
  • The storage device is Samsung SM863 SATA SSD, single device, with ext4 filesystem
  • MySQL versions: Percona Server 5.7.11
  • innodb_io_capacity=5000 / innodb_io_capacity_max=7500

On the first chart, let’s look at the max checkpoint age, current checkpoint age and amount of flushed pages per second:

. . . and also a related graph of how many pages are flushed by different forces (LRU flushing and adaptive flushing). You can obtain this data by enabling innodb_monitor_enable = '%'.

From these charts, we can see that with a 2GB innodb_log_file_size InnoDB is forced by adaptive flushing to flush (write) more pages, because the current checkpoint age (uncheckpointed bytes) is very close to the Max Checkpoint Age. To see the checkpoint age in MySQL, you can use the innodb_metrics table and the metrics recovery_log_lsn_checkpoint_age and recovery_log_max_modified_age_sync.
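For example (a sketch; if the counters show up as disabled, enable them first as mentioned above):

-- enable all InnoDB metrics counters if needed
SET GLOBAL innodb_monitor_enable = '%';

SELECT name, subsystem, `count`, status
  FROM information_schema.innodb_metrics
 WHERE name LIKE '%checkpoint_age%' OR name LIKE '%modified_age%';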

In the case using innodb_log_file_size=15GB, the main flushing is done via LRU flushing (to keep 5000 pages (innodb_lru_scan_depth) free per buffer pool instance). From the first graph we can see that uncheckpointed bytes never reach 12GB, so in this case using innodb_log_file_size=15GB is overkill. We might be fine with innodb_log_file_size=8GB – but we wouldn’t know unless we set innodb_log_file_size big enough. MySQL 5.7 comes with a very convenient improvement: now it is much easier to change innodb_log_file_size, but it still requires a server restart. I wish we could change it online, like we can for innodb_buffer_pool_size (I do not see technical barriers for this).

Let’s also look into the InnoDB buffer pool content:

We can see that there are more modified pages in the case with 15GB log files (which is good, as more modified pages means less work done in the background).

And the most interesting question: how does it affect throughput?

With innodb_log_file_size=2GB, the throughput is about 20% worse. With a 2GB log size, you can see that often zero transactions are processed within one second – this is bad, and says that the flushing algorithm still needs improvements in cases when the checkpoint age is close to or at the async point.

This should make a convincing case that using a big innodb_log_file_size is beneficial. In this particular case, probably 8GB (with innodb_log_files_in_group=2) would be enough.

What about the recovery time? To measure this, I killed mysqld when the checkpoint age (uncheckpointed bytes) was about 10GB. It took about 20 minutes to start mysqld after the crash. In another experiment with 25GB of uncheckpointed bytes, it took 45 minutes. Unfortunately, crash recovery in MySQL is still single-threaded, so it takes a long time to read and apply 10GB worth of changes (even from fast storage).

We can see that recovery is single-threaded from the CPU usage chart during recovery:

cpu_recovery

The system uses 2% of the CPU (which corresponds to a single CPU).

In many cases, crash recovery is not a huge concern. People don’t always have to wait for MySQL recovery – since even one minute of downtime can be too long, often the instance fails over to a slave (especially with async replication), or the crashed node just leaves the cluster (if you use Percona XtraDB Cluster).

I would still like to see improvements in this area. Crash recovery is the biggest showstopper for using a big innodb_log_file_size, and I think it is possible to add parallelism similar to multithreaded slaves into the crash recovery process.

You can find the raw results, scripts and configs here.

 

by Vadim Tkachenko at May 31, 2016 03:45 PM

Jean-Jerome Schmidt

Press Release: Severalnines protects the UK insurance industry from risky bets

Fast, thorough database technology helps IIL handle over 100 million transactions a day

Stockholm, Sweden and anywhere else in the world - 31st May 2016 - Severalnines, the award-winning provider of database infrastructure management software, today announced its latest customer, IIL, a UK-based data and technology company. IIL specialises in providing pricing, fraud and compliance technology for insurance and credit providers.

Insurance providers rely on IIL to authenticate customer details by aggregating private, public and industry data throughout the whole insurance lifecycle from quote generation to claims management. This means IIL processes over 100 million transactions every day in real-time. It is therefore imperative the supporting databases can handle the volume, variety and velocity of data transactions.

To accommodate new customers and meet growth projections, IIL realised its existing service platform needed more capacity without sacrificing performance. To meet the goal of having enough capacity for the next three years, database operations became more complex with the addition of clustering, load-balancing and high-availability. IIL required an effective way to manage and monitor the clusters, as the previous approach was too manual and took hours of valuable time away from the IT team each day that could be spent on other projects.

After being recommended and trialled by the IT team at IIL, Severalnines’ ClusterControl platform was deployed within weeks to resolve the database challenges. ClusterControl now automates the management and monitoring of a number of clustered MySQL systems, which in total handle over 40 Terabytes of data. IIL can meet market expansion without compromising on database reliability and resilience.

Andy McCulloch, IT Manager at IIL, said: “The majority of our customers need to deliver online insurance quotes in real-time and this requires IIL to be as fast in providing customer checks. ClusterControl gives us detailed insight into our database clusters in an intuitive format. Now we can see what is really going on under the hood quickly without a lot of time to go through the logs. Automating database management saves us time in writing new code for checks and maintenance tools. If we need further optimisation in the future, we want to work with Severalnines again.”

Vinay Joosery, Severalnines Founder and CEO said: “There are two forces which are threatening to pull the insurance industry apart. At one end billions of pounds are lost every year due to fraud and non-compliance, whilst at the other end comparison websites are making it easier for customers to compare and buy policies in real-time and this reduces brand loyalty. IIL has a very important job in future-proofing the insurance industry by sharing reliable information and identifying risks such as fraud. Severalnines is glad to support IIL in a fast-moving insurance sector.”

About Severalnines

Severalnines provides automation and management software for database clusters. We help companies deploy their databases in any environment, and manage all operational aspects to achieve high-scale availability.

Severalnines' products are used by developers and administrators of all skills levels to provide the full 'deploy, manage, monitor, scale' database cycle, thus freeing them from the complexity and learning curves that are typically associated with highly available database clusters. The company has enabled over 8,000 deployments to date via its popular online database configurator. Currently counting BT, Orange, Cisco, CNRS, Technicolour, AVG, Ping Identity and Paytrail as customers. Severalnines is a private company headquartered in Stockholm, Sweden with offices in Singapore and Tokyo, Japan. To see who is using Severalnines today visit, http://www.severalnines.com/company.

About IIL

IIL uses data and technology to deliver intelligence at key data-driven decision points throughout the customer lifecycle in a variety of industries. Launched in 2008, IIL’s data solutions enable our clients to deliver next generation products and services to their customers. Find out more.

by Severalnines at May 31, 2016 11:13 AM

Colin Charles

Speaking in June 2016

I have a few upcoming speaking engagements in June 2016:

  • Nerdear.la – June 9-10 2016 – Buenos Aires, Argentina – never been to this event but MariaDB Corporation are sponsors and I’m quite excited to be back in Buenos Aires. I’m going to talk about the MySQL ecosystem in 2016.
  • SouthEast LinuxFest – June 10-12 2016 – Charlotte, NC, USA – I have a few talks here, a bit bummed that I’m going to be missing the speaker dinner, but I expect this to be another great year. Learn about MariaDB Server/MySQL Security Essentials, the MySQL ecosystem in 2016, and about distributions from the view of a package.
  • NYC MySQL Meetup – June 27 2016 – New York, USA – I’m going to give a talk on lessons you can learn from other people’s database failures. I did this at rootconf.in, and it was well received so I’m quite excited to give this again.
  • Community Open House for MongoDB – June 30 2016 – New York, USA – I’m going to give my first MongoDB talk at the Community Open House for MongoDB – My First Moments with MongoDB, from the view of someone who’s been using MySQL for a very long time.

So if you’re in Buenos Aires, Charlotte or New York, I’m looking forward to seeing you to talk all things databases and open source.

by Colin Charles at May 31, 2016 07:52 AM

May 30, 2016

Jean-Jerome Schmidt

Become a ClusterControl DBA - SSL Key Management and Encryption of MySQL Data in Transit

Databases usually work in a secure environment. It may be a datacenter with a dedicated VLAN for database traffic. It may be a VPC in EC2. If your network spreads across multiple datacenters in different regions, you’d usually use some kind of Virtual Private Network or SSH tunneling to connect these locations in a secure manner. With data privacy and security being hot topics these days, you might feel better with an additional layer of security.

MySQL supports SSL as a means to encrypt traffic both between MySQL servers (replication) and between MySQL servers and clients. If you use Galera cluster, similar features are available - both intra-cluster communication and connections with clients can be encrypted using SSL.

A common way of implementing SSL encryption is to use self-signed certificates. Most of the time, it is not necessary to purchase an SSL certificate issued by the Certificate Authority. Anybody who’s been through the process of generating a self-signed certificate will probably agree that it is not the most straightforward process - most of the time, you end up searching through the internet to find howto’s and instructions on how to do this. This is especially true if you are a DBA and only go through this process every few months or even years. This is why we recently added a new ClusterControl feature designed to help you manage SSL keys across your database cluster. We based this blog post on ClusterControl 1.3.1.

Key Management in the ClusterControl

You can enter Key Management by going to Settings->Key Management section.

You will be presented with the following screen:

You can see two certificates generated, one being a CA and the other one a regular certificate. To generate more certificates, switch to the ‘Generate Key’ tab:

A certificate can be generated in two ways - you can first create a self-signed CA and then use it to sign a certificate. Or you can go directly to the ‘Client/Server Certificates and Key’ tab and create a certificate. The required CA will be created for you in the background. Last but not least, you can import an existing certificate (for example a certificate you bought from one of many companies which sell SSL certificates).

To do that, you should upload your certificate, key and CA to your ClusterControl node and store them in /var/lib/cmon/ca directory. Then you fill in the paths to those files and the certificate will be imported.

If you decided to generate a CA or generate a new certificate, there’s another form to fill - you need to pass details about your organization, common name, email, pick the key length and expiration date.

Once you have everything in place, you can start using your new certificates. ClusterControl currently supports deployment of SSL encryption between clients and MySQL databases and SSL encryption of intra-cluster traffic in Galera Cluster. We plan to extend the variety of supported deployments in future releases of the ClusterControl.

Full SSL encryption for Galera Cluster

Now let’s assume we have our SSL keys ready and we have a Galera Cluster, which needs SSL encryption, deployed through our ClusterControl instance. We can easily secure it in two steps.

First - encrypt Galera traffic using SSL. From your cluster view, one of the cluster actions is  Enable Galera SSL Encryption. You’ll be presented with the following options:

If you do not have a certificate, you can generate it here. But if you already generated or imported an SSL certificate, you should be able to see it in the list and use it to encrypt Galera replication traffic. Please keep in mind that this operation requires a cluster restart - all nodes will have to stop at the same time, apply config changes and then restart. Before you proceed here, make sure you are prepared for some downtime while the cluster restarts.

Once intra-cluster traffic has been secured, we want to cover client - server connections. To do that, pick the ‘Enable SSL Encryption’ job and you’ll see the following dialog:

It’s pretty similar - you can either pick an existing certificate or generate new one. The main difference is that to apply client - server encryption, downtime is not required - a rolling restart will suffice.

Of course, enabling SSL on the database is not enough - you have to copy certificates to clients which are supposed to use SSL to connect to the database. All certificates can be found in /var/lib/cmon/ca directory on the ClusterControl node. You also have to remember to change grants for users and make sure you’ve added REQUIRE SSL to them if you want to enforce only secure connections.
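A minimal sketch of that last step, assuming MySQL 5.7 syntax and a hypothetical application account 'app'@'10.0.0.%':

ALTER USER 'app'@'10.0.0.%' REQUIRE SSL;
-- on MySQL 5.6 and earlier, the equivalent is:
-- GRANT USAGE ON *.* TO 'app'@'10.0.0.%' REQUIRE SSL;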

We hope you’ll find those options easy to use and help you secure your MySQL environment. If you have any questions or suggestions regarding this feature, we’d love to hear from you.

by Severalnines at May 30, 2016 10:16 AM

May 29, 2016

Erkan Yanar

systemd: a revolution gets used to itself

New versions of systemd (like in Ubuntu 16.04) are able to configure pid Cgroups (ulimit -u via Cgroups) using TasksMax.

Check the commit implementing TasksMax.

Author: Lennart Poettering <lennart@poettering.net>
Date:   Fri Nov 13 19:28:32 2015 +0100

    core: enable TasksMax= for all services by default, and set it to 512

    Also, enable TasksAccounting= for all services by default, too.

    See:

    http://lists.freedesktop.org/archives/systemd-devel/2015-November/035006.html

But now we’ve got silly defaults:

$ systemctl show -p TasksMax docker
TasksMax=512

$ systemctl show -p TasksMax mysql
TasksMax=512

So if you use Docker or MySQL most likely you are going to have trouble without really putting load on your server.

Just set TasksMax for your service.

TasksMax=infinity

Feel free to configure that setting fitting your load :)

Viel Spaß

Erkan \o/

Update 1: There will be an update for Debian/Ubuntu setting TasksMax=infinity as the default.

Update 2: With version 230 there is a new default \o/ KillUserProcesses=yes. Check it out yourself and cry :(

by erkan at May 29, 2016 11:07 PM

May 27, 2016

Peter Zaitsev

Galera Error Failed to Report Last Committed (Interrupted System Call)

In this blog, we’ll discuss the ramifications of the Galera Error Failed to Report Last Committed (Interrupted System Call).

I have recently seen this error with Percona XtraDB Cluster (or Galera):

[Warning] WSREP: Failed to report last committed 549684236, -4 (Interrupted system call)

It was posted in launchpad as a bug in 2013: https://bugs.launchpad.net/percona-xtradb-cluster/+bug/1434646

My colleague Przemek replied, and explained it as:

Reporting the last committed transaction is just a part of the certification index purge process. In case it fails for some reason (it occasionally does), the cert index purge may be a little delayed. But it does not mean the transaction was not applied successfully. This is a warning after all.

If we look up this error in the source code, we realize it is reusing Linux system errors. Specifically:

#define EINTR 4 /* Interrupted system call */

As there isn’t much documentation regarding this error, and internet searches did not bring up useful information, my colleague David Bennett and I delved into the source code (as we do on occasion).

If we look in the Galera source code gcs_sm.hpp we see:

289  * @retval -EINTR  - was interrupted by another thread

We also see:

317                 /* was interrupted, will be handled by someone else */

This means that the thread was interrupted, but the server will retry on another thread. As it is just a warning, it isn’t anything to be too concerned about – unless they begin to pile up (which could be a sign of concurrency issues).

The specific warning is thrown from galera_service_thd.cpp here:

58                 if (gu_unlikely(ret < 0))
59                 {
60                     log_warn << "Failed to report last committed "
61                              << data.last_committed_ << ", " << ret
62                              << " (" << strerror (-ret) << ')';
63                     // @todo: figure out what to do in this case
64                 }

This warning could be handled better, so as not to clutter the logs or sound cryptic enough to concern administrators.

by Manjot Singh at May 27, 2016 10:00 PM

Asynchronous Query Execution with MySQL 5.7 X Plugin

In this blog, we will discuss MySQL 5.7 asynchronous query execution using the X Plugin.

Overview

MySQL 5.7 supports X Plugin / X Protocol, which allows (if the library supports it) asynchronous query execution. In 2014, I published a blog on how to increase a slow query performance with the parallel query execution. There, I created a prototype in the bash shell. Here, I’ve tried a similar idea with NodeJS + mysqlx library (which uses MySQL X Plugin).

TL;DR version: By using the MySQL X Plugin with NodeJS I was able to increase query performance 10x (some query rewrite required).

X Protocol and NodeJS

Here are the steps required:

  1. First, we will need to enable X Plugin in MySQL 5.7.12+, which will use a different port (33060 by default); see the sketch after this list.
  2. Second, download and install NodeJS (>4.2) and mysql-connector-nodejs-1.0.2.tar.gz (follow Getting Started with Connector/Node.JS guide).
    # node --version
    v4.4.4
    # wget https://dev.mysql.com/get/Downloads/Connector-Nodejs/mysql-connector-nodejs-1.0.2.tar.gz
    # npm install mysql-connector-nodejs-1.0.2.tar.gz

    Please note: on older systems, you will probably need to upgrade the nodejs version. Follow the Installing Node.js via package manager guide.
  3. All set! Now we can use the asynchronous queries feature.
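
Regarding step 1: if the X Plugin is not already loaded, a minimal way to enable it and confirm the port (assuming a Linux build where the plugin library is named mysqlx.so):

INSTALL PLUGIN mysqlx SONAME 'mysqlx.so';
SHOW GLOBAL VARIABLES LIKE 'mysqlx_port';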

Test data 

I’m using the same Wikipedia Page Counts dataset (wikistats) I’ve used for my Apache Spark and MySQL example. Let’s imagine we want to compare the popularity of MySQL versus PostgreSQL in January 2008 (comparing the total page views). Here are the sample queries:

mysql> select sum(tot_visits) from wikistats_by_day_spark where url like '%mysql%';
mysql> select sum(tot_visits) from wikistats_by_day_spark where url like '%postgresql%';

The table only holds data for English Wikipedia for January 2008, but still has ~200M rows and is ~16G in size. Both queries run for ~5 minutes each, and utilize only one CPU core (one connection = one CPU core). The box has 24 CPU cores, Intel(R) Xeon(R) CPU L5639 @ 2.13GHz. Can we run the query in parallel, utilizing all cores?

That is possible now with NodeJS and the X Plugin, but it requires some preparation:

  1. Partition the table using hash, 24 partitions:
    CREATE TABLE `wikistats_by_day_spark_part` (
      `id` int(11) NOT NULL AUTO_INCREMENT,
      `mydate` date NOT NULL,
      `url` text,
      `cnt` bigint(20) NOT NULL,
      `tot_visits` bigint(20) DEFAULT NULL,
      PRIMARY KEY (`id`)
    ) ENGINE=InnoDB AUTO_INCREMENT=239863472 DEFAULT CHARSET=latin1
    /*!50100 PARTITION BY HASH (id)
    PARTITIONS 24 */
  2. Rewrite the query running one connection (= one thread) per each partition, choosing its own partition for each thread:
    select sum(tot_visits) from wikistats_by_day_spark_part partition (p<N>) where url like '%mysql%';
  3. Wrap it up inside the NodeJS Callback functions / Promises.

The code

var mysqlx = require('mysqlx');
var cs_pre = {
    host: 'localhost',
    port: 33060,
    dbUser: 'root',
    dbPassword: 'mysql'
};
var cs = {
    host: 'localhost',
    port: 33060,
    dbUser: 'root',
    dbPassword: 'mysql'
};
var partitions = [];
var res = [];
var total = 0;
mysqlx.getNodeSession( cs_pre ).then(session_pre => {
        var sql="select partition_name from information_schema.partitions where table_name = 'wikistats_by_day_spark_part' and table_schema = 'wikistats' ";
        session_pre.executeSql(sql)
                .execute(function (row) {
                        partitions.push(row);
                }).catch(err => {
                        console.log(err);
                })
                .then( function () {
                        partitions.forEach(function(p) {
                                  mysqlx.getNodeSession( cs ).then(session => {
                                     var sql="select sum(tot_visits) from wikistats.wikistats_by_day_spark_part partition(" + p + ") where url like '%mysql%';"
                                     console.log("Started SQL for partiton: " + p);
                                     return Promise.all([
                                        session.executeSql(sql)
                                                 .execute(function (row) {
                                                        console.log(p + ":" + row);
                                                        res.push(row);
                                                        total = Number(total) + Number(row);
                                                }).catch(err => {
                                                        console.log(err);
                                                }),
                                        session.close()
                                     ]);
                                  }).catch(err => {
                                       console.log(err + "partition: " + p);
                                  }).then(function() {
                                        // All done
                                        if (res.length == partitions.length) {
                                                console.log("All done! Total: " + total);
                                                // can now sort the "res" array if needed and display it
                                        }
                                  });
                        });
                });
        session_pre.close();
});
console.log("Starting...");

The explanation

The idea here is rather simple:

  1. Find all the partitions for the table by using “select partition_name from information_schema.partitions”
  2. For each partition, run the query in parallel: create a connection, run the query with a specific partition name, define the callback function, then close the connection.
  3. As the callback function is used, the code is not blocked, but rather proceeds to the next iteration. When the query is finished, the callback function will be executed.
  4. Inside the callback function, I’m saving the result into an array and also calculating the total (actually, I only need the total in this example).
    .execute(function (row) {
                             console.log(p + ":" + row);
                             res.push(row);
                             total = Number(total) + Number(row); ...

Asynchronous Salad: tomacucumtoes,bersmayonn,aise *

This may blow your mind: because everything is running asynchronously, the callback functions will return when ready. Here is the result of the above script:

$ time node async_wikistats.js
Starting...
Started SQL for partiton: p0
Started SQL for partiton: p1
Started SQL for partiton: p2
Started SQL for partiton: p3
Started SQL for partiton: p4
Started SQL for partiton: p5
Started SQL for partiton: p7
Started SQL for partiton: p8
Started SQL for partiton: p6
Started SQL for partiton: p9
Started SQL for partiton: p10
Started SQL for partiton: p12
Started SQL for partiton: p13
Started SQL for partiton: p11
Started SQL for partiton: p14
Started SQL for partiton: p15
Started SQL for partiton: p16
Started SQL for partiton: p17
Started SQL for partiton: p18
Started SQL for partiton: p19
Started SQL for partiton: p20
Started SQL for partiton: p21
Started SQL for partiton: p22
Started SQL for partiton: p23

… here the script will wait for the async calls to return, and they will return when ready – the order is not defined.

Meanwhile, we can watch MySQL processlist:

+------+------+-----------------+-----------+---------+-------+--------------+-------------------------------------------------------------------------------------------------------------------+
| Id   | User | Host            | db        | Command | Time  | State        | Info                                                                                                              |
+------+------+-----------------+-----------+---------+-------+--------------+-------------------------------------------------------------------------------------------------------------------+
|  186 | root | localhost:44750 | NULL      | Sleep   | 21391 | cleaning up  | PLUGIN                                                                                                            |
| 2290 | root | localhost       | wikistats | Sleep   |  1417 |              | NULL                                                                                                              |
| 2510 | root | localhost:41737 | NULL      | Query   |     2 | Sending data | PLUGIN: select sum(tot_visits) from wikistats.wikistats_by_day_spark_part partition(p0) where url like '%mysql%'  |
| 2511 | root | localhost:41738 | NULL      | Query   |     2 | Sending data | PLUGIN: select sum(tot_visits) from wikistats.wikistats_by_day_spark_part partition(p1) where url like '%mysql%'  |
| 2512 | root | localhost:41739 | NULL      | Query   |     2 | Sending data | PLUGIN: select sum(tot_visits) from wikistats.wikistats_by_day_spark_part partition(p2) where url like '%mysql%'  |
| 2513 | root | localhost:41741 | NULL      | Query   |     2 | Sending data | PLUGIN: select sum(tot_visits) from wikistats.wikistats_by_day_spark_part partition(p4) where url like '%mysql%'  |
| 2514 | root | localhost:41740 | NULL      | Query   |     2 | Sending data | PLUGIN: select sum(tot_visits) from wikistats.wikistats_by_day_spark_part partition(p3) where url like '%mysql%'  |
| 2515 | root | localhost:41742 | NULL      | Query   |     2 | Sending data | PLUGIN: select sum(tot_visits) from wikistats.wikistats_by_day_spark_part partition(p5) where url like '%mysql%'  |
| 2516 | root | localhost:41743 | NULL      | Query   |     2 | Sending data | PLUGIN: select sum(tot_visits) from wikistats.wikistats_by_day_spark_part partition(p6) where url like '%mysql%'  |
| 2517 | root | localhost:41744 | NULL      | Query   |     2 | Sending data | PLUGIN: select sum(tot_visits) from wikistats.wikistats_by_day_spark_part partition(p7) where url like '%mysql%'  |
| 2518 | root | localhost:41745 | NULL      | Query   |     2 | Sending data | PLUGIN: select sum(tot_visits) from wikistats.wikistats_by_day_spark_part partition(p8) where url like '%mysql%'  |
| 2519 | root | localhost:41746 | NULL      | Query   |     2 | Sending data | PLUGIN: select sum(tot_visits) from wikistats.wikistats_by_day_spark_part partition(p9) where url like '%mysql%'  |
| 2520 | root | localhost:41747 | NULL      | Query   |     2 | Sending data | PLUGIN: select sum(tot_visits) from wikistats.wikistats_by_day_spark_part partition(p10) where url like '%mysql%' |
| 2521 | root | localhost:41748 | NULL      | Query   |     2 | Sending data | PLUGIN: select sum(tot_visits) from wikistats.wikistats_by_day_spark_part partition(p11) where url like '%mysql%' |
| 2522 | root | localhost:41749 | NULL      | Query   |     2 | Sending data | PLUGIN: select sum(tot_visits) from wikistats.wikistats_by_day_spark_part partition(p12) where url like '%mysql%' |
| 2523 | root | localhost:41750 | NULL      | Query   |     2 | Sending data | PLUGIN: select sum(tot_visits) from wikistats.wikistats_by_day_spark_part partition(p13) where url like '%mysql%' |
| 2524 | root | localhost:41751 | NULL      | Query   |     2 | Sending data | PLUGIN: select sum(tot_visits) from wikistats.wikistats_by_day_spark_part partition(p14) where url like '%mysql%' |
| 2525 | root | localhost:41752 | NULL      | Query   |     2 | Sending data | PLUGIN: select sum(tot_visits) from wikistats.wikistats_by_day_spark_part partition(p15) where url like '%mysql%' |
| 2526 | root | localhost:41753 | NULL      | Query   |     2 | Sending data | PLUGIN: select sum(tot_visits) from wikistats.wikistats_by_day_spark_part partition(p16) where url like '%mysql%' |
| 2527 | root | localhost:41754 | NULL      | Query   |     2 | Sending data | PLUGIN: select sum(tot_visits) from wikistats.wikistats_by_day_spark_part partition(p17) where url like '%mysql%' |
| 2528 | root | localhost:41755 | NULL      | Query   |     2 | Sending data | PLUGIN: select sum(tot_visits) from wikistats.wikistats_by_day_spark_part partition(p18) where url like '%mysql%' |
| 2529 | root | localhost:41756 | NULL      | Query   |     2 | Sending data | PLUGIN: select sum(tot_visits) from wikistats.wikistats_by_day_spark_part partition(p19) where url like '%mysql%' |
| 2530 | root | localhost:41757 | NULL      | Query   |     2 | Sending data | PLUGIN: select sum(tot_visits) from wikistats.wikistats_by_day_spark_part partition(p20) where url like '%mysql%' |
| 2531 | root | localhost:41758 | NULL      | Query   |     2 | Sending data | PLUGIN: select sum(tot_visits) from wikistats.wikistats_by_day_spark_part partition(p21) where url like '%mysql%' |
| 2532 | root | localhost:41759 | NULL      | Query   |     2 | Sending data | PLUGIN: select sum(tot_visits) from wikistats.wikistats_by_day_spark_part partition(p22) where url like '%mysql%' |
| 2533 | root | localhost:41760 | NULL      | Query   |     2 | Sending data | PLUGIN: select sum(tot_visits) from wikistats.wikistats_by_day_spark_part partition(p23) where url like '%mysql%' |
| 2534 | root | localhost       | NULL      | Query   |     0 | starting     | show full processlist                                                                                             |
+------+------+-----------------+-----------+---------+-------+--------------+-------------------------------------------------------------------------------------------------------------------+

And CPU utilization:

Tasks:  41 total,   1 running,  33 sleeping,   7 stopped,   0 zombie
%Cpu0  : 91.9 us,  1.7 sy,  0.0 ni,  6.4 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu1  : 97.3 us,  2.7 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu2  : 97.0 us,  3.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu3  : 97.7 us,  2.3 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu4  : 95.7 us,  2.7 sy,  0.0 ni,  1.7 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu5  : 98.3 us,  1.7 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu6  : 98.3 us,  1.7 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu7  : 97.7 us,  2.3 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu8  : 96.7 us,  3.0 sy,  0.0 ni,  0.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu9  : 98.3 us,  1.7 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu10 : 95.7 us,  4.3 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu11 : 97.7 us,  2.3 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu12 : 98.0 us,  2.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu13 : 98.0 us,  1.7 sy,  0.0 ni,  0.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu14 : 97.7 us,  2.3 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu15 : 97.3 us,  2.7 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu16 : 98.0 us,  2.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu17 :100.0 us,  0.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu18 : 97.3 us,  2.7 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu19 : 98.7 us,  1.3 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu20 : 99.3 us,  0.7 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu21 : 97.3 us,  2.3 sy,  0.0 ni,  0.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu22 : 97.0 us,  3.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu23 : 96.0 us,  4.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
...
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
18901 mysql     20   0 25.843g 0.017t   7808 S  2386 37.0 295:34.05 mysqld

Now, here is our “salad”:

p1:2499
p23:2366
p2:2297
p0:4735
p12:12349
p14:1412
p3:2045
p16:4157
p20:3160
p18:8717
p17:2967
p13:4519
p15:5462
p10:1312
p5:2815
p7:4644
p9:766
p4:3218
p6:4175
p21:2958
p8:929
p19:4182
p22:3231
p11:4020

As we can see, the partition results come back in random order. If needed, we can even sort the result array (which isn’t necessary for this example, as we only care about the total). Finally, here are our result and timing:

All done! Total: 88935
real    0m30.668s
user    0m0.256s
sys     0m0.028s

Timing and Results

  • Original query, single thread: 5 minutes
  • Modified query, 24 threads in NodeJS: 30 seconds
  • Performance increase: 10x

If you are interested in the original question (MySQL versus PostgreSQL, Jan 2008):

  • MySQL, total visits: 88935
  • PostgreSQL total visits: 17753


PS: Original Asynchronous Salad Joke, by Vlad @Crazy_Owl (in Russian)

by Alexander Rubin at May 27, 2016 08:58 PM

Join me for the Community Open House for MongoDB

MongoDB-Community-Open-House

If you can make it to Manhattan Thursday, June 30, 2016, please join me at the Community Open House for MongoDB. The Community Open House for MongoDB, held from 9 AM to 6 PM at the Park Central Hotel, is a free and open event that will feature technical presentations and sessions from key members of the MongoDB open source community. It’s the day after MongoDB World ends, so many of you will already be in town and ready to talk MongoDB – stay one more day to get an even bigger and broader perspective on your MongoDB needs!

The open source community is a diverse and powerful collection of companies, organizations and individuals that have helped to literally change the world. Percona is proud to call itself a member of the open source community, and we strongly feel that upholding the principles of the community is a key to our success. These principles include an open dialog, an open mind, and a zeal for cooperative interaction. Since Percona and ObjectRocket weren’t allowed to participate in MongoDB World, we felt this event would help foster creative and productive dialogue within the community. Together, we can build amazing things!

A reception will be held from 4:30 PM to 6:00 PM, featuring plenty of food, drink and fun.

Event Speakers:

The Community Open House for MongoDB features expert speakers from Percona, ObjectRocket, Facebook, Appboy, Severalnines and The Washington Post. Here’s a partial lineup – more will be announced soon!

  • Open Source Monitoring for MongoDB – Peter Zaitsev, Percona
  • Real-world Operational Concerns for Scaling MongoDB – Kim Wilkins, ObjectRocket
  • Doing More With Less With RocksDB and MongoRocks at Facebook – Islam AbdelRahman, Facebook
  • Tracking Billions of App Installs with MongoDB – Jon Hyman, Appboy
  • WiredTiger Scalability and Performance Optimization – Vadim Tkachenko, Percona
  • Using MongoDB and NodeJS to Build Better Forms at the Washington Post – Kat Styons, The Washington Post
  • Real-world Operational Concerns for Scaling MongoDB – Tim Banks, ObjectRocket
  • How to Automate, Monitor and Manage Existing MongoDB Servers – Art van Scheppingen, Severalnines
  • Open Source Encryption for MongoDB – Tim Banks, ObjectRocket
  • My First Moments with MongoDB – Colin Charles
  • Open Source Backups for MongoDB – David Murphy – Percona
  • MongoDB Chunks – Distribution, Splitting & Merging – Jason Terpko, ObjectRocket
  • MongoDB for MySQL Developers and DBAs – Alexander Rubin, Percona

This event is free of charge and open to all, but we do ask you to register in advance so we can save you a seat.

I hope to see you Thursday in NYC!

by Peter Zaitsev at May 27, 2016 06:36 PM

May 26, 2016

Peter Zaitsev

Monitoring made easy with Percona App for Grafana

Percona-App-for-Grafana

Percona has released a new Percona App for Grafana!

Are you using Grafana 3.x with Prometheus’ time-series database? Now there is a “Percona App” available on Grafana.net! The app provides a set of dashboards for MySQL performance and system monitoring with the Prometheus datasource, and makes it easy for users to install them. The dashboards rely on the alias label in the Prometheus config and depend on a small patch applied to Grafana.

The dashboards in the app are:

  • Cross Server Graphs
  • Disk Performance
  • Disk Space
  • Galera Graphs
  • MySQL InnoDB Metrics
  • MySQL MyISAM Metrics
  • MySQL Overview
  • MySQL Performance Schema
  • MySQL Query Response Time
  • MySQL Replication
  • MySQL Table Statistics
  • MySQL User Statistics
  • Prometheus
  • Summary Dashboard
  • System Overview
  • TokuDB Graphs
  • Trends Dashboard

Percona App for Grafana

The Grafana and Prometheus teams are doing a fantastic job of bringing monitoring and time-series to the next level. They are making collecting and graphing metrics simpler and more usable.

See my previous blog post for step-by-step instructions on how to install Grafana and Prometheus. Get the Percona App for Grafana today!

by Roman Vynar at May 26, 2016 05:57 PM

AWS Aurora Benchmarking part 2

AWS-Aurora-Benchmarking

AWS Aurora Benchmarking

Some time ago, I published the article on AWS Aurora Benchmarking (AWS Aurora Benchmarking – Blast or Splash?), in which I analyzed the behavior of different solutions using synchronous replication in an AWS environment. This blog follows up on some of the comments and suggestions I received regarding that post from the community and Amazon engineers.

I decided to perform another round of tests, keeping in mind comments and suggestions received.

I presented some of the results during the Percona conference in Santa Clara in April 2016. The following is a transposition of that presentation, with more details.

Not interested in the preliminary descriptions? Go to the results section

Why new tests?

A very good question, with an easy answer.

Aurora is a product that is still under development and refinement: six months of development could present major changes in performance. Not only that, but the initial tests focused on entry-level solutions, meaning I was analyzing the kind of users that are currently starting their business and looking for a flexible solution that allows them to save money and scale.

This time, I put the focus on enterprise solutions by analyzing what an already well-established company would get when looking for a decent scalable solution.

These are two different scenarios.

Why so many (different) tests?

I used many different benchmarking tools, and I am still planning to run others. Why so? Why not simply use one of them?

Again, a simple answer. I used different tools because, in some cases, they provide a different way of accessing and using data. I also do not trust benchmarking tools, not even the ones I developed. I wanted to test the same thing using different tools and compare the results. Only if I see a common pattern do I consider a test valid. Personally, I tend to discard any test that is not consistent, or any analysis performed with a single benchmarking tool. In my opinion, being lazy is not an option when doing these kinds of exercises.

About the tests

It was difficult to compare apples to apples here. And I think that is the main point to keep in mind.

Aurora is not a standard RDS solution, like we are used to. Aurora looks like MySQL, smells like MySQL, but is not vanilla MySQL. To achieve what they have, the engineers had to change many parts. The more you dig in, the more you realize there are significant differences.

Because of that, I had to focus more on identifying what each solution can do and compare the solutions against expectations, rather than comparing the numbers.

I was more interested in seeing what happens if:

  • I have a burst of connections, and my application goes from 4K to 40K connections. Will it crash? Will it slow down?
  • How long should I wait if a node fails?
  • What should I not have in my schema design, to prevent bottlenecks?

Those are relevant questions in my opinion, more so than discovering that solution A writes 3000 rows/sec and solution B writes 3100. Or that I might (might) see some additional page rotation and file -> memory -> flush activity because the amount of memory differs.

That is valuable information, for sure, but less valuable than having a decent understanding of which platform will help my business grow and remain stable.

What is the right tool for the job? This is the question I am addressing.

Tests run

I ran three main kinds of tests:

  • Performance and load stress
  • High availability failover
  • Response time (latency) from the application point of view

Performance and load stress

These tests were the most extensive and demanding.

I analyzed the capacity to serve the load under different conditions, from a light load up to full utilization, and some degree of resource saturation.

  • The first set of tests evaluated a simple load on a single table, causing the table to become a hotspot and showing how each platform would manage the increasing contention.
  • The second set of tests performed a similar load, but distributed it across multiple tables and batched the operations. Parallelization, contention, scalability and distributed hotspots were in the picture.

The two above focused on write operations only, and were done using different tools (comparing the results as they were complementary).

  • The third set of tests, using my own stress tool, focused on read/write usage. The tests were executed against multiple tables, performing CRUD actions, using simple and batch inserts, and reads by PK, by index, by range, and with IN and exact match conditions.
  • The fourth set of tests was performed using a TPC-C like load (OLTP).
  • The fifth set of tests used sysbench in OLTP mode, with 250 tables.

The scope of the last three sets of tests was to identify how the platforms would manage the load, considering the following:

  • Read and write contention on the same tables
  • High level of parallelism (from the application)
  • Possible hot-spots (TPCC district)
  • Increasing utilization (memory, threads, IO)
  • Saturation (connections)

Finally, all tests were run with a fully utilized buffer pool.
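
As a side note, one way to confirm that the buffer pool is fully utilized before a run is to compare the free and total page counters, for example with the statements below (a sketch; in Aurora some InnoDB counters may be reported differently):

SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_pages_total';
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_pages_free';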

The machines

Small boxes (first round of tests):

EIP = 1
VPC = 1
ELB=1
Subnets = 4 (1 public, 3 private)
HAProxy = 6
MHA Monitor (micro ec2) = 1
NAT Instance (EC2) =1 (hosting EIP)
DB Instances (EC2) = 3 (m4.xlarge) 16GB
Application Instances (EC2) = 6 (4)
EBS SSD 3000 PIOS
Aurora RDS node = 3 (db.r3.xlarge) 30GB

Large boxes (latest tests):

EIP = 1
VPC = 1
ELB=1
Subnets = 4 (1 public, 3 private)
HAProxy = 4
MHA Monitor (micro ec2) = 1
NAT Instance (EC2) =1 (hosting EIP)
DB Instances (EC2) = 3 (c3.8xlarge) 60GB
Application Instances (EC2) = 4
EBS SSD 5000 PIOS
Aurora RDS node = 3 (db.r3.8xlarge) 244GB

A note

It was pointed out to me that I deliberately chose to use an EC2 solution for Percona XtraDB Cluster with less memory than the one available in Aurora. This is true, and we must take that into consideration. The reason for this is that the only EC2 instance type matching the memory of a db.r3.8xlarge is the d2.8xlarge.

I did try it, but the level of scalability I got (from the CPU point of view) was less efficient than the one available with c3.8xlarge. I decided to prefer CPU resources to memory, especially because I was going to test concurrency and parallelism in conjunction with the load increase.

From the result, I feel confident that I chose correctly – but I am open to comment.

The layout

This is what the setup looks like:

AWS Aurora Benchmarking

Where you read Java, those are the application nodes running the different test applications.

Two words about Aurora first

Aurora has a few key concepts that we must have clearly in mind, especially how it manages writes across replicas, and how connections are implemented.

The IO activity

To replicate the information across the different storage nodes, Aurora only replicates FRM files and data coming from the InnoDB redo logs (IB_LOGS). This is a significant advantage over other forms of replication, given the limited number of bytes that have to travel over the network (even if they are replicated six times).

AWS Aurora Benchmarking

Another significant advantage is that Aurora does not use a double write buffer, which is obviously another blast (see the recent optimization in Percona Server https://www.percona.com/blog/2016/05/09/percona-server-5-7-parallel-doublewrite/ ).
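
For comparison, on a standard MySQL or Percona Server node you can quickly check whether the doublewrite buffer is enabled with the statement below (a sketch; in Aurora this setting is managed internally and may not be meaningful):

SHOW GLOBAL VARIABLES LIKE 'innodb_doublewrite';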

In other words, writes in Aurora are organized by filling its commit queue and pushing the changes as group commit to the storage.

AWS Aurora Benchmarking

In some presentations, you might have seen that all steps are asynchronous. But it is important to underline that a commit is acknowledged by Aurora only when at least two availability zones (AZs) have received and written the incoming data related to that commit. “Written” here means received in the storage node’s incoming queue, with a quorum of four out of six nodes.

This means that, no matter what, data has to travel on the network to reach the final destination, and ACK signals have to come back before Aurora returns the ACK to the commit operation. The network is in the same region, but it still represents an unknown for performance. No wonder we could have some latency at this stage!

As you can see, what I am reporting is also confirmed in the image below (and in the observations). The point is that the impact of steps 1 – 2 is not obviously clear.

AWS Aurora Benchmarking

Thread pooling

Aurora also uses thread pooling – a lot! That will become very clear later: the more the work is based on parallelism, the more efficient thread pooling seems to be.

In most cases, we are used to seeing CPUs on database servers not fully utilized, unless there is some heavy ordering operation or a bad query. That behavior is also (though not only) a direct consequence of the connection-to-thread model, which implies periods of latency and standby. In Aurora, incoming connections do not follow the same model. Instead, the load of the incoming connections is redistributed to a pool of threads, optimizing the latency period and resulting in higher CPU utilization. Which is what you want from your resource: to be utilized, not waiting for something else to do its job.

AWS Aurora Benchmarking

 

The results

Without wasting more electronic ink, let’s see what comes out of this round of tests (not the final one, by the way). To simplify the results, I will also report the graphs from the first set of tests, but will focus on the latest.

Small Boxes = SB, Large Boxes = LB.

First Test: IIBench

As declared previously, my scope was to verify how the two platforms would react to a simple insert-focused load on a basic single table. The buffer pool was saturated before running the test.

SB

AWS Aurora Benchmarking

LB

AWS Aurora Benchmarking

As we can see, in the presence of a hot spot the solution using Percona XtraDB Cluster outperformed Aurora, in both cases. What is notable, though, is that while XtraDB Cluster remained at approximately the same time/performance, Aurora significantly reduced the time taken. This shows that Aurora was taking advantage of the more powerful platform, while XtraDB Cluster was not able to.

Analyzing the details further, we notice that Aurora performs better atomically: it was able to manage more writes/second, as well as more rows and pages. But it was inconsistent: Aurora had performance hiccups at regular intervals. As such, the final result was that it took more time to process the whole workload.

I was not able to dig too deeply, given that some metrics are not fully available in Aurora. As such, I had to rely on the Aurora engineers, who mentioned to me that hot-spot contention was a possible issue.

Aurora Handler calls:

AWS Aurora Benchmarking

XtraDB Cluster Handlers:

AWS Aurora Benchmarking

The execution in XtraDB Cluster showed fewer calls but constant performance, while Aurora has hiccups.

Aurora page activity write:

AWS Aurora Benchmarking

XtraDB Cluster page activity write:

AWS Aurora Benchmarking

The trend shown by the handlers stayed consistent in the page management and rows insert, as expected.

Second Test: Application Ingest

As mentioned, this test used many threads from different application servers, inserting batches of 50 statements against multiple tables.

The results coming from this test are quite favorable to Aurora, as we can see starting from the time taken to complete the same workload:

LB

AWS Aurora Benchmarking

SB

AWS Aurora Benchmarking

With the small boxes, the situation was inverted.

But here is where the interesting part starts.

Aurora can manage significantly higher numbers of rows, as the picture below shows:

AWS Aurora Benchmarking

The results are also constant, and don’t decrease significantly like the inserts with XtraDB Cluster.

The number of handler commits, however, is significantly lower.

AWS Aurora Benchmarking

Once more they stay the same with the load increase, without impacting performance.

Reviewing all handler calls, we get our first surprise.

XtraDB Cluster handler calls:

AWS Aurora Benchmarking

Aurora handler calls:

AWS Aurora Benchmarking

The gaps/drops in the two graphs correspond to the different tests (with an increasing number of threads).

Two things to notice here: the first is that XtraDB Cluster decreases in performance while processing the load, while Aurora does not. The second (you need to zoom into the image) is that the number of commits fluctuates in XtraDB Cluster, while it stays fixed in Aurora.

An even bigger surprise comes up when reviewing the connections graphs.

As expected, XtraDB Cluster has all my connections open, and the number of threads running is quite close to the number of connected threads.

AWS Aurora Benchmarking

Both of them follow the increasing number of connected threads.

But this is not the case in Aurora.

AWS Aurora Benchmarking

Even though my applications are trying to open ~800 connections, the Aurora node sees only a part of them, and the number of running threads is fixed at 32.

The important things to consider here are that a) my applications don’t connect directly to the Aurora instance, but to a connector (MariaDB), and b) Aurora, in this case, caps the number of running threads to the number of CPUs available on the instance (here 32).

Given that, I expected to have worse performance (but I don’t). The fact that Aurora uses one thread for multiple connections seems to be working quite efficiently.
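
A simple way to observe this behavior from the database side during a test is to compare connected and running threads with the standard status counters (a sketch):

SHOW GLOBAL STATUS LIKE 'Threads_connected';
SHOW GLOBAL STATUS LIKE 'Threads_running';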

The number of rows inserted is also consistent with the handler calls, and shows better performance than XtraDB Cluster.

Aurora rows inserted:

AWS Aurora Benchmarking

XtraDB Cluster rows inserted

AWS Aurora Benchmarking

Again, we have the same trend; only this time, Aurora performs better than XtraDB Cluster.

Third Test: OLTP Application

When run on the small boxes, this test saw XtraDB Cluster performing much better than Aurora. The time taken by Aurora was ~3 times the time taken by XtraDB Cluster.

AWS Aurora Benchmarking

With a large box, I had the inverse result: Aurora is outperforming XtraDB Cluster from 2 to 7 times the speed.

AWS Aurora Benchmarking

Analyzing the number of commands executed with the increasing workload, we can see how XtraDB Cluster can perform better than Aurora with a workload of 128 threads, but starts to have worse performance as the load increases.

On the other hand, Aurora manages the read/write load without significant performance loss, which includes being able to increase the number of commits/sec.

AWS Aurora Benchmarking

Reviewing the handler calls, we see that the handler commit calls are significantly fewer in Aurora (as already noticed in the ingest tests).

AWS Aurora Benchmarking

Another thing to note is that the number of calls for XtraDB Cluster is significantly higher and not scaling, while Aurora has a nice scaling trend.

Fourth Test: TPCC-mysql

The TPCC test is mainly to test OLTP traffic, with the note that some tables (like district) might become a hotspot. The tests I ran were executed against 400 warehouses, and used 128 threads maximum for the small box and 2048 threads for the large box.

During this test, I hit one of the Aurora limitations and I escalated it to the Aurora engineers (who are aware of the problem).

Small boxes:

AWS Aurora Benchmarking

In the case of the small boxes, there is nothing to say: XtraDB Cluster manages the load more efficiently, even if its trend is not optimal and shows significant fluctuation. Aurora is just not able to keep up.

Large boxes:

 

AWS Aurora Benchmarking

The use of large boxes presents a different and more complex scenario. I would like to say that Aurora performs better.

This is true for two of the three tests, and Aurora was also performing better on the third until it got stuck by an internal limitation. But then its performance just collapsed.

With a more in-depth investigation, I noticed that under the hood Aurora was not performing as well as it appeared. This comes out quite clearly by comparing the graphs covering Com_ command execution, open files, handlers and InnoDB row lock time.

In all of them, it is evident how XtraDB Cluster keeps serving the workload with consistent behavior, while Aurora fails starting from the second test (512 threads), not just the third with 2048 threads.

Aurora:

AWS Aurora Benchmarking

XtraDB Cluster:

AWS Aurora Benchmarking

It is clear that Aurora did better during the test with 256 threads, going over 450K com_select commands served (in a 10-second interval), compared with XtraDB Cluster, which was not able to go over 350K.

But in the following tests, while XtraDB Cluster was able to keep going (with decreasing performance), Aurora started to struggle with very inconsistent behavior.

This was also confirmed by the open files graph.

Aurora:

AWS Aurora Benchmarking

XtraDB Cluster:

AWS Aurora Benchmarking

The graphs show the number of files opened during the test, not the ones already open. They reflect the Open_files metric: “The number of files that are open. This count includes regular files opened by the server. It does not include other types of files such as sockets or pipes. Also, the count does not include files that storage engines open using their own internal functions rather than asking the server level to do so.”

I was quite surprised by the number of files open by Aurora.

Handlers reflected the same behavior, as well.

Aurora:

AWS Aurora Benchmarking

XtraDB Cluster:

AWS Aurora Benchmarking

Perfectly in line with the com trend.

So what was increasing instead?

Aurora:

AWS Aurora Benchmarking

XtraDB Cluster:

AWS Aurora Benchmarking

As you can see from the above, the exact same workload generated an increasing row lock time: quite low in the test with 256 threads, up to crazy high values with 2048 threads.
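
The row lock pressure described above can also be sampled directly from the standard InnoDB status counters while the test is running (a sketch; how closely Aurora tracks these counters is an assumption to verify):

SHOW GLOBAL STATUS LIKE 'Innodb_row_lock_time';
SHOW GLOBAL STATUS LIKE 'Innodb_row_lock_waits';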

As mentioned, we know that TPCC has a couple of tables that act as hotspots, and we already saw with IIbench how Aurora is not working efficiently in that case.

I also was getting a lot of 188 errors during the test. This is an Aurora internal error. When I reported it, I was told they know about it, and they are planning to work on it.

I hope they do soon, because if this issue is solved it is very likely that Aurora will not only be able to manage the tested workload, but exceed it by far.

I am saying this because, even with the identified issues, Aurora was able to keep going and deliver a more than decent response time during the second test (with 512 threads).

AWS Aurora Benchmarking

Fifth Test: Sysbench

I added the sysbench tests to check scalability, and to see what happens when the system reaches a saturation point. This test brought up some limitations in the Aurora solution, related more to the connector than to the Aurora engine itself.

Aurora has a limit of 16k connections. I wanted to see what happens if I got to saturation point or close to it. It doesn’t matter if this is a ridiculously high number or not.

What happened was that Aurora managed the traffic up to 4K threads. The closer I got to the limit, however, the more connectivity issues I had. In the end, I had to run the tests with 8K, 12K and 20K threads pointing directly to the Aurora instance, bypassing the connector, which was not able to serve the traffic. After that, I was able to hit up to ~15,500 threads (but with a lot of inconsistent performance). I am defining the limit of a meaningful test at the previous level of 12K threads.

XtraDB Cluster was able to scale up to 16K with no problem.

What is also notable here is that Aurora was able to manage the workload more efficiently regarding transaction handling (i.e., transactions executed and latency).

AWS Aurora Benchmarking

The number of transactions executed by Aurora was ~three times the one executed by XtraDB Cluster.

AWS Aurora Benchmarking

Regarding latency, Aurora showed lower latency than XtraDB Cluster.

Internally, Aurora and XtraDB Cluster operations were once again different regarding how the workload was handled. The most divergent result was the handler calls:

AWS Aurora Benchmarking

Commit calls in Aurora were a fraction of the calls in XtraDB Cluster, while the number of rollbacks was higher.

The read calls showed an even more divergent behavior, with XtraDB Cluster performing a higher number of read_key calls, while Aurora had a very limited number of them. Read_rnd calls are very high in XtraDB Cluster, but totally absent in Aurora (note that in Aurora, read_rnd is reported but seems not to increase). On the other hand, Aurora reported a high number of read_rnd_next calls, while XtraDB Cluster has none.

AWS Aurora Benchmarking

High availability

Fail-over time

Both solutions:

AWS Aurora Benchmarking

In this test, the fail-over time for the solution using Galera and HAProxy was shorter, for both a limited and a mid-level load. One assumption is that, given Aurora has to verify both the status of the data transmitted and its consistency across the six data store nodes in every case, the process is not as fast as it could be.

It could also be that the cluster connector is not as efficient as it should be in redirecting the traffic from one node to another. It would be a very interesting exercise to replace it with some other custom solution.

Note that I was performing the tests following the Amazon recommendation to use the following to simulate a real crash:

ALTER SYSTEM CRASH [INSTANCE|NODE]

As such, I was not doing anything strange or out of the ordinary.

It is worth mentioning that of the eight seconds taken by MySQL/Galera to perform the failover, six were due to the HAProxy settings (which had a 3000 ms interval and two loops in the settings before executing failover).

Execution latency

The purpose of these tests was to identify the latency between the moment the application sends a request and the moment MySQL/Aurora takes charge of the request. The expectation is that the busier the database, the higher the latency.

For this test, I reported both results: the one coming from the old tests with the small box, and the new one with the large box.

Small boxes:

AWS Aurora Benchmarking

Large boxes:

AWS Aurora Benchmarking

It is clear from the graphs that the two tests report different scenarios. In the first, Galera was able to manage the load more efficiently and serve requests with lower latency. For the new tests, I used a higher number of threads than for the small boxes. Nevertheless, in the second test, the CPU utilization and the number of running threads lead me to think that Aurora was finally able to utilize resources more efficiently, resulting in lower latency.

The latency jumped up again when the number of connections rose above 12K, but that was expected given the previous test results.

Conclusions

High Availability

The two platforms were able to manage the failover operation in a limited time frame (below 1 minute). Nevertheless, MySQL/Galera was shown to be more efficient and consistent. This result is a direct consequence of synchronous replication, which by design prevents MySQL/Galera from allowing an active node to fall behind.

In my opinion, the replication method used in Aurora is efficient, and given that data is shared across the read replicas, fail-over should happen faster.

The tests suffered because of the connector, and I have the feeling that having another solution in place may bring some surprises (actually, I would like to test that as well).

Performance

In this run of tests, Aurora was able to invert the results I had in the first test with the small boxes. In almost all cases, Aurora performed as well as or better than XtraDB Cluster. There are still cases where Aurora is penalized, and those are the ones where hotspots are present. The contention in Aurora kills performance and raises errors (188). But I hope we will see a significant evolution soon.

General Comments on Aurora

The product is evolving quickly, and benchmark results may become obsolete in a very short time (this is why it is important to have repeatable and comparable tests). From my point of view, in this set of tests Aurora clearly shows where it’s a better fit: higher-end levels, where high availability and CPU power are the focus (and cost is not the main concern).

There is no reason to use Aurora on small or mid-size boxes: the platform is not going to be as efficient as a standard solution like XtraDB Cluster. But if cost is not an issue, and the applications require a lot of parallelism, Aurora on db.r3.8xlarge is a good solution.

I still see space for improvements (like for cluster connectors, or the time taken to restart a cluster after a full stop, or contention reduction). But I am also confident that the work led by the development team will fix most of my concerns (and more) soon.

Final note: it would be nice to have the code open source, so that the community could contribute (but I understand the business reasons not to).

About Cost

I don’t think this is the right place to discuss the cost of each solution (especially because every need is different).

As such, I am not reporting any specific numbers. You can, however, follow the links below and do the necessary math:

Aurora cost calculator

AWS cost calculator

 

by Marco Tusa at May 26, 2016 12:45 PM

Jean-Jerome Schmidt

Planets9s - ClusterControl 1.3 webinar replay, Polyglot Persistence Meetups, and more!

Welcome to this week’s Planets9s, covering all the latest resources and technologies we create around automation and management of open source database infrastructures.

Watch the replay: ClusterControl 1.3 webinar with new features for MySQL, MariaDB, Percona Server and PostgreSQL

Thanks to everyone who joined our ClusterControl 1.3 release webinar this week. If you weren’t able to attend or would simply like to watch it again, the replay is now available online. Johan Andersson, CTO at Severalnines, gave an overview of ClusterControl’s deployment, monitoring, management and scaling functionalities for MySQL, MariaDB, Percona Server, MongoDB and PostgreSQL, as well as the new features around key management, operational reports and the new deployment tool for MySQL NDB Cluster … and more!

Watch the webinar replay

Sign up for our European Polyglot Persistence Meetups Tour

Polyglot Persistence means that when storing data, it is best to use multiple data storage technologies, chosen based upon the way data is being used by the application. We’re starting off with Amsterdam, then moving on to Dublin, Paris, Berlin, Stockholm and London. And for our kick-off in Amsterdam, we’re lucky to be sponsored by Booking.com. Please sign up for the meetup of your choice, as we’re announcing locations, speakers etc. via the Meetup platform.

Sign up for a meetup

Become a ClusterControl DBA: Operational Reports for MySQL and MariaDB

Performing database health checks is a lot more time-consuming than it should be. As a DBA, you have to gather data from multiple places in order to understand the state of your databases. This is why we introduced operational reports in ClusterControl 1.3, where we gather some of the most important data into a single document, which can be quickly reviewed. This new blog post explains how you can make best use of this new functionality in ClusterControl.

Read the blog

That’s it for this week! Feel free to share these resources with your colleagues and follow us in our social media channels.

Have a good end of the week,

Jean-Jérôme Schmidt
Planets9s Editor
Severalnines AB

by Severalnines at May 26, 2016 11:43 AM

May 25, 2016

Peter Zaitsev

MongoDB 3.2: elections just got better!

MongoDB-replication-4

Introduction

In this blog, we’ll review MongoDB 3.2 elections and how they work, as well as what is really new and different in the election protocol.

MongoDB 3.2 revamped its election protocol for increased stability! Exciting times with smarter and faster elections are here! With this latest release, you will find that replication (and the election protocol) has been improved. Some of the changes include:

  • The addition of electionTimeoutMS
  • WriteConcern now implies “j:true”
    • Old j:true meant just the primary node
    • New j:true means all involved nodes must ACK the journal
    • j:true means your journal MS will be thirded, and synchronization occurs every 10ms (MMAP) or 33ms (WiredTiger) by default
  • Optime in rs.status is now an Object, not a Timestamp

You’ll need to enable the Election Protocol when upgrading MongoDB from an earlier version, while new replSets get it enabled by default.

Election Protocol: what is an election?

Mongo uses a consensus protocol. This means that all nodes must agree who is the most current when handling:

  • Hardware failure
  • Network split
  • Time shifts

New updates allow for faster elections, using a term (electionId) to separate voting rounds. This guarantees there aren’t double (and conflicting) votes, while also reducing the time you have to wait to know that a vote completed.

How does it do it?

Elections now have “term” or “vote” identifiers (IDs). Terms are used to separate voting rounds. Every vote attempt increments the ID. The ID incrementation prevents a node from double voting in the same term, and makes it easier for nodes to know if a re-vote is needed, where before it could take up to 5 minutes!

The protocol timeouts have some new features and behaviors:

  • Now configurable
  • Randomness added to each node
  • Less chance that all nodes time out at the same time

Normal election process

Below I’m going to walk you through a typical replica set operation. The configuration looks like the following:

MongoDB 3.2 elections

In this topology:

  • There are three members
  • All of them are heartbeating to each other
  • There is no arbiter, so you get full high availability (HA)

The following diagram provides a more detailed picture of the interactions:

MongoDB 3.2 elections

Notice how replication pulls from the primary to each secondary – the secondary does all the work. A heartbeat is still shared by all the nodes.

Now let’s see what happens when our primary crashes. It just did!

MongoDB 3.2 elections

Nodes will still try to heartbeat to it until two have failed in a short period.

MongoDB 3.2 elections

After the failure, things happen quickly.

  1. Secondaries give up on heartbeats
  2. They then vote with each other on who is newest in the oplog
  3. If they have > 50% of the total voting population, they select a new winner

A new Primary is selected, and the heartbeat system is cleaned up.

MongoDB 3.2 elections

Replication now gets restarted. If the failed node comes back online, it’s treated as a secondary once it “catches up” via the oplog.

Stepdown Election Process

The stepdown election process is the same as above, with the following caveats:

  • It’s MUCH faster, as the existing primary “starts” an election
  • There is less chance of the old primary not having data replicated
  • It kills writes while doing election
  • The election process doesn’t wait for heartbeat timeouts

Generally speaking, you should always try to use the stepdown election process. Timeouts are for crashes and failures, not general use.

 

by David Murphy at May 25, 2016 07:26 PM

Percona Server 5.6.30-76.3 is now available

percona server 5.6.30-76.3


Percona is glad to announce the release of Percona Server 5.6.30-76.3 on May 25, 2016. Download the latest version from the Percona web site or the Percona Software Repositories.

Based on MySQL 5.6.30, including all the bug fixes in it, Percona Server 5.6.30-76.3 is the current GA release in the Percona Server 5.6 series. Percona Server is open-source and free – this is the latest release of our enhanced, drop-in replacement for MySQL. Complete details of this release can be found in the 5.6.30-76.3 milestone on Launchpad.

Bugs Fixed:

  • When Read Free Replication was enabled for TokuDB, and there was no explicit primary key for the replicated TokuDB table, there could be duplicated records in the table on update operations. The fix disables Read Free Replication for tables without an explicit primary key, does a row lookup for UPDATE and DELETE binary log events, and issues a warning. Bug fixed #1536663 (#950).
  • Attempting to execute a non-existing prepared statement with Response Time Distribution plugin enabled could lead to a server crash. Bug fixed #1538019.
  • TokuDB was using different memory allocators; this was causing safemalloc warnings in debug builds and crashes because memory accounting didn’t add up. Bug fixed #1546538 (#962).
  • Fixed heap allocator/deallocator mismatch in Metrics for scalability measurement. Bug fixed #1581051.
  • Percona Server is now built with system zlib library instead of the older bundled one. Bug fixed #1108016.
  • Reduced the memory overhead per page in the InnoDB buffer pool. The fix was based on Facebook patch #91e979e. Bug fixed #1536693 (upstream #72466).
  • CREATE TABLE ... LIKE ... could create a system table with an unsupported enforced engine. Bug fixed #1540338.
  • Change buffer merge could throttle to 5% of I/O capacity on an idle server. Bug fixed #1547525.
  • Slave_open_temp_tables would fail to decrement on the slave with a disabled binary log if the master was killed. Bug fixed #1567361.
  • The server will now show a more descriptive error message when Percona Server fails with errno == 22 "Invalid argument", if innodb_flush_method was set to ALL_O_DIRECT. Bug fixed #1578604.
  • Killed connection threads could get their sockets closed twice on shutdown. Bug fixed #1580227.
  • AddressSanitizer build with LeakSanitizer enabled was failing at gen_lex_hash invocation. Bug fixed #1580993 (upstream #80014).
  • apt-cache show command for percona-server-client was showing innotop included as part of the package. Bug fixed #1201074.
  • mysql-systemd would fail with PAM authentication and proxies due to a regression introduced when fixing #1534825 in Percona Server 5.6.29-76.2. Bug fixed #1558312.
  • Upgrade logic for figuring if TokuDB upgrade can be performed from the version on disk to the current version was broken due to a regression introduced when fixing bug #684 in Percona Server 5.6.27-75.0. Bug fixed #717.
  • If ALTER TABLE was run while tokudb_auto_analyze variable was enabled it would trigger auto-analysis, which could lead to a server crash if ALTER TABLE DROP KEY was used because it would be operating on the old table/key meta-data. Bug fixed #945.
  • The TokuDB storage engine with tokudb_pk_insert_mode set to 1 is safe to use in all conditions. On INSERT IGNORE or REPLACE INTO, it tests to see if triggers exist on the table, or replication is active with !BINLOG_FORMAT_STMT before it allows the optimization. If either of these conditions is met, then it falls back to the “safe” operation of looking up the target row first. Bug fixed #952.
  • A bug in TokuDB Index Condition Pushdown was causing ORDER BY DESC to reverse the scan outside of the WHERE bounds. This would cause a query to hang in a “sending data” state for several minutes in some environments with large amounts of data (3 billion records) if the ORDER BY DESC statement was used. Bugs fixed #988, #233, and #534.

Other bugs fixed: #1399562 (upstream #75112), #1510564 (upstream #78981), #1496282 (#964), #1496786 (#956), #1566790, #1552673, #1567247, #1567869, #718, #914, #970, #971, #972, #976, #977, #981, #637, and #982.

Release notes for Percona Server 5.6.30-76.3 are available in the online documentation. Please report any bugs on the launchpad bug tracker.

by Hrvoje Matijakovic at May 25, 2016 04:54 PM

Jean-Jerome Schmidt

Watch the replay: ClusterControl 1.3 webinar with new features for MySQL, MariaDB, Percona Server, PostgreSQL and more!

Thanks to everyone who joined us yesterday for our ClusterControl 1.3 release webinar!

Johan Andersson, CTO at Severalnines and creator of ClusterControl, walked us through the latest features of the 1.3 release and demonstrated them live as well. In addition to an overview of ClusterControl’s deployment, monitoring, management and scaling functionalities for MySQL, MariaDB, Percona Server, MongoDB and PostgreSQL, Johan focussed our attention on new features around key management, operational reports and more.

One feature-set that triggered particular interest in yesterday’s audience was the automated deployment of a production setup of NDB / MySQL Cluster: users can create a production setup of NDB/MySQL Cluster from ClusterControl and deploy management, SQL/API and data nodes - all via the ClusterControl interface.

The replay of this webinar and the slides are now available for viewing online:

Sign up for the replay. Read the slides.

To get started with ClusterControl, download it today.

Webinar Agenda

  • ClusterControl overview
  • New features deep-dive
    • Key management and encryption
    • Additional operational reports
    • Improved security
    • Create / mirror repository
    • Create NDB / MySQL Cluster
  • Live Demo
  • Q&A

Speaker

Johan Andersson, CTO, Severalnines - Johan's technical background and interest are in high performance computing as demonstrated by the work he did on main-memory clustered databases at Ericsson as well as his research on parallel Java Virtual Machines at Trinity College Dublin in Ireland. Prior to co-founding Severalnines, Johan was Principal Consultant and lead of the MySQL Clustering & High Availability consulting group at MySQL / Sun Microsystems / Oracle, where he designed and implemented large-scale MySQL systems for key customers. Johan is a regular speaker at MySQL User Conferences as well as other high profile community gatherings with popular talks and tutorials around architecting and tuning MySQL Clusters.

For more information on ClusterControl 1.3:

To get started with ClusterControl, download it today.

by Severalnines at May 25, 2016 02:52 PM

Oli Sennhauser

Why you should take care of MySQL data types

A customer reported last month that MySQL does a full table scan (FTS) if a query is filtered by an INT value on a VARCHAR column. At first I told him that this is not true any more, because MySQL fixed this behaviour a long time ago. He showed me that I was wrong:

CREATE TABLE `test` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `data` varchar(64) DEFAULT NULL,
  `ts` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`id`),
  KEY `data` (`data`)
) ENGINE=InnoDB;
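
The table in this example was filled with random hash strings. For readers who want to reproduce a similar data set, here is a hypothetical way to populate it (the actual values and row counts of the customer's table will differ):

INSERT INTO test (data) VALUES (MD5(RAND()));
-- repeat the following statement to double the row count each time
INSERT INTO test (data) SELECT MD5(RAND()) FROM test;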

EXPLAIN SELECT * FROM test WHERE data = 42\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: test
   partitions: NULL
         type: ALL
possible_keys: data
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 522500
     filtered: 10.00
        Extra: Using where

EXPLAIN SELECT * FROM test WHERE data = '42'\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: test
   partitions: NULL
         type: ref
possible_keys: data
          key: data
      key_len: 67
          ref: const
         rows: 1
     filtered: 100.00
        Extra: NULL

When I executed the query I got some more interesting information:

SELECT * FROM test WHERE data = '42';
Empty set (0.00 sec)

SELECT * FROM test WHERE data = 42;
+--------+----------------------------------+---------------------+
| id     | data                             | ts                  |
+--------+----------------------------------+---------------------+
|   1096 | 42a5cb4a3e76857a3efe7af44ba9f4dd | 2016-05-25 10:26:59 |
...
| 718989 | 42a1921fb2df42126d85f9586532eda4 | 2016-05-25 10:27:12 |
+--------+----------------------------------+---------------------+
767 rows in set, 65535 warnings (0.26 sec)

Looking at the warnings we also find the reason: MySQL does the cast on the column and not on the value, which is a bit odd IMHO:

show warnings;
| Warning | 1292 | Truncated incorrect DOUBLE value: '80f52706c2f9de40472ec29a7f70c992' |

A bit suspicious, I looked at the warnings of the query execution plan again:

show warnings;
+---------+------+---------------------------------------------------------------------------------------------+
| Level   | Code | Message                                                                                     |
+---------+------+---------------------------------------------------------------------------------------------+
| Warning | 1739 | Cannot use ref access on index 'data' due to type or collation conversion on field 'data'   |
| Warning | 1739 | Cannot use range access on index 'data' due to type or collation conversion on field 'data' |
+---------+------+---------------------------------------------------------------------------------------------+

I thought this was fixed, but it seems it is not. The following releases behave like this: MySQL 5.0.96, 5.1.73, 5.5.38, 5.6.25 and 5.7.12, and MariaDB 5.5.41, 10.0.21 and 10.1.9.
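
To keep the index usable until this changes, compare the VARCHAR column against a string literal, or cast the value explicitly (never the column). A minimal sketch against the same test table; the CAST variant is my own assumption and not taken from the original test run:

EXPLAIN SELECT * FROM test WHERE data = '42'\G
-- type: ref, key: data, as already shown above
EXPLAIN SELECT * FROM test WHERE data = CAST(42 AS CHAR)\G
-- casting the value keeps the comparison string to string, so ref access on `data` should still be possible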

The other way around (an INT column compared with a string value) works in both cases:

SELECT * FROM test WHERE id = 42;
+----+----------------------------------+---------------------+
| id | data                             | ts                  |
+----+----------------------------------+---------------------+
| 42 | 81d74057d7be8f20563da404bb1b3ab0 | 2016-05-25 10:26:56 |
+----+----------------------------------+---------------------+

SELECT * FROM test WHERE id = '42';
+----+----------------------------------+---------------------+
| id | data                             | ts                  |
+----+----------------------------------+---------------------+
| 42 | 81d74057d7be8f20563da404bb1b3ab0 | 2016-05-25 10:26:56 |
+----+----------------------------------+---------------------+

EXPLAIN SELECT * FROM test WHERE id = 42\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: test
   partitions: NULL
         type: const
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: const
         rows: 1
     filtered: 100.00
        Extra: NULL

by Shinguz at May 25, 2016 09:42 AM

May 24, 2016

Peter Zaitsev

Looking inside the MySQL 5.7 document store

In this blog, we’ll look at the MySQL 5.7 document store feature, and how it is implemented.

Document Store

MySQL 5.7.12 is a major new release, as it contains quite a number of new features:

  1. Document store and a “MongoDB-like” NoSQL interface to JSON storage
  2. Protocol X / X Plugin, which can be used for asynchronous queries (I will write about it as well)
  3. New MySQL shell

Peter already wrote the document store overview; in this post, I will look deeper into the document store implementation. In my next post, I will demonstrate how to use document store for Internet of Things (IoT) and event logging.

Older MySQL 5.7 versions already have a JSON data type and the ability to create virtual columns that can be indexed. The new document store feature is based on the JSON datatype.

So what is the document store anyway? It is an add-on to a normal MySQL table with a JSON field. Let’s take a deep dive into it and see how it works.

First of all: one can interface with the document store’s collections using the X Plugin (default port: 33060). To do that:

  1. Enable X Plugin (a one-line install, sketched right after this list) and install MySQL shell.
  2. Login to a shell:
    mysqlsh --uri root@localhost
  3. Run commands (JavaScript mode, can be switched to SQL or Python):
    mysqlsh --uri root@localhost
    Creating an X Session to root@localhost:33060
    Enter password:
    No default schema selected.
    Welcome to MySQL Shell 1.0.3 Development Preview
    Copyright (c) 2016, Oracle and/or its affiliates. All rights reserved.
    Oracle is a registered trademark of Oracle Corporation and/or its
    affiliates. Other names may be trademarks of their respective
    owners.
    Type 'help', 'h' or '?' for help.
    Currently in JavaScript mode. Use sql to switch to SQL mode and execute queries.
    mysql-js> db = session.getSchema('world_x')
    <Schema:world_x>
    mysql-js> db.getCollections()
    {
        "CountryInfo": <Collection:CountryInfo>
    }

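Step 1 above is a one-liner on the server side. A sketch, assuming a Linux build where the plugin library is named mysqlx.so (on Windows it would be mysqlx.dll):

INSTALL PLUGIN mysqlx SONAME 'mysqlx.so';
-- the plugin then listens on its own port (33060 by default, as used by mysqlsh above)
SHOW PLUGINS;
-- mysqlx should now be listed as ACTIVE
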
Now, how is the document store’s collection different from a normal table? To find out, I’ve connected to a normal MySQL shell:

mysql world_x
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 2396
Server version: 5.7.12 MySQL Community Server (GPL)
Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> show create table CountryInfo\G
*************************** 1. row ***************************
       Table: CountryInfo
Create Table: CREATE TABLE `CountryInfo` (
  `doc` json DEFAULT NULL,
  `_id` varchar(32) GENERATED ALWAYS AS (json_unquote(json_extract(`doc`,'$._id'))) STORED NOT NULL,
  PRIMARY KEY (`_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
1 row in set (0.00 sec)
mysql> show tables;
+-------------------+
| Tables_in_world_x |
+-------------------+
| City              |
| Country           |
| CountryInfo       |
| CountryLanguage   |
+-------------------+
4 rows in set (0.00 sec)

So the document store “collection” is actually an InnoDB table with a single JSON column (doc) plus a primary key (_id), which is a generated column.

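In other words, a collection is nothing more than a table with that exact shape. A hypothetical sketch (the collection name mycoll is made up) that mirrors the SHOW CREATE TABLE output above:

CREATE TABLE `mycoll` (
  `doc` json DEFAULT NULL,
  `_id` varchar(32) GENERATED ALWAYS AS (json_unquote(json_extract(`doc`,'$._id'))) STORED NOT NULL,
  PRIMARY KEY (`_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
-- a table built like this should also be reported by db.getCollections()
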
As we can also see, there are four tables in the world_x database, but db.getCollections() only shows one. So how does MySQL distinguish between a “normal” table and a “document store” table? To find out, we can enable the general query log and see which query is being executed:

$ mysql -e 'set global general_log=1'
$ tail /var/log/general.log
2016-05-17T20:53:12.772114Z  186 Query  SELECT table_name, COUNT(table_name) c FROM information_schema.columns WHERE ((column_name = 'doc' and data_type = 'json') OR (column_name = '_id' and generation_expression = 'json_unquote(json_extract(`doc`,''$._id''))')) AND table_schema = 'world_x' GROUP BY table_name HAVING c = 2
2016-05-17T20:53:12.773834Z  186 Query  SHOW FULL TABLES FROM `world_x`

As you can see, every table that has this specific structure (a doc JSON column plus the specific generation_expression for _id) is considered to be a JSON store. Now, how does MySQL translate the .find or .add constructs to actual MySQL queries? Let’s run a sample query:

mysql-js> db.getCollection("CountryInfo").find('Name= "United States"').limit(1)
[
    {
        "GNP": 8510700,
        "IndepYear": 1776,
        "Name": "United States",
        "_id": "USA",
        "demographics": {
            "LifeExpectancy": 77.0999984741211,
            "Population": 278357000
        },
        "geography": {
            "Continent": "North America",
            "Region": "North America",
            "SurfaceArea": 9363520
        },
        "government": {
            "GovernmentForm": "Federal Republic",
            "HeadOfState": "George W. Bush",
            "HeadOfState_title": "President"
        }
    }
]
1 document in set (0.02 sec)

and now look at the general query log again:

2016-05-17T21:02:21.213899Z  186 Query  SELECT doc FROM `world_x`.`CountryInfo` WHERE (JSON_EXTRACT(doc,'$.Name') = 'United States') LIMIT 1

We can verify that MySQL translates all document store commands to SQL. That also means that it is 100% transparent to the existing MySQL storage level and will work with other storage engines. Let’s verify that, just for fun:

mysql> alter table CountryInfo engine=MyISAM;
Query OK, 239 rows affected (0.06 sec)
Records: 239  Duplicates: 0  Warnings: 0
mysql-js> db.getCollection("CountryInfo").find('Name= "United States"').limit(1)
[
    {
        "GNP": 8510700,
        "IndepYear": 1776,
        "Name": "United States",
        "_id": "USA",
        "demographics": {
            "LifeExpectancy": 77.0999984741211,
            "Population": 278357000
        },
        "geography": {
            "Continent": "North America",
            "Region": "North America",
            "SurfaceArea": 9363520
        },
        "government": {
            "GovernmentForm": "Federal Republic",
            "HeadOfState": "George W. Bush",
            "HeadOfState_title": "President"
        }
    }
]
1 document in set (0.00 sec)
2016-05-17T21:09:21.074726Z 2399 Query  alter table CountryInfo engine=MyISAM
2016-05-17T21:09:41.037575Z 2399 Quit
2016-05-17T21:09:43.014209Z  186 Query  SELECT doc FROM `world_x`.`CountryInfo` WHERE (JSON_EXTRACT(doc,'$.Name') = 'United States') LIMIT 1

Worked fine!

Now, how about the performance? We can simply take the SQL query and run EXPLAIN:

mysql> explain SELECT doc FROM `world_x`.`CountryInfo` WHERE (JSON_EXTRACT(doc,'$.Name') = 'United States') LIMIT 1
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: CountryInfo
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 239
     filtered: 100.00
        Extra: Using where
1 row in set, 1 warning (0.00 sec)

Hmm, it looks like it is not using an index. That’s because there is no index on Name. Can we add one? Sure, we can add a virtual column and then index it:

mysql> alter table CountryInfo add column Name varchar(255)
    -> GENERATED ALWAYS AS (json_unquote(json_extract(`doc`,'$.Name'))) VIRTUAL;
Query OK, 0 rows affected (0.12 sec)
Records: 0  Duplicates: 0  Warnings: 0
mysql> alter table CountryInfo add key (Name);
Query OK, 0 rows affected (0.02 sec)
Records: 0  Duplicates: 0  Warnings: 0
mysql> explain SELECT doc FROM `world_x`.`CountryInfo` WHERE (JSON_EXTRACT(doc,'$.Name') = 'United States') LIMIT 1
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: CountryInfo
   partitions: NULL
         type: ref
possible_keys: name
          key: name
      key_len: 768
          ref: const
         rows: 1
     filtered: 100.00
        Extra: NULL
1 row in set, 1 warning (0.00 sec)

That is really cool! We have added an index, and now the original query starts using it. Note that we do not have to reference the new field: the MySQL optimizer is smart enough to translate (JSON_EXTRACT(doc,'$.Name') = 'United States') into an index scan on the virtual column.

But please note: JSON attributes are case-sensitive. If you use (doc,'$.name') instead of (doc,'$.Name'), it will not generate an error, but it will silently break the search: all queries looking for “Name” will return 0 rows.
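
A two-line illustration of the gotcha (the counts are what I would expect given the data shown above, not a re-verified run):

SELECT COUNT(*) FROM CountryInfo WHERE JSON_EXTRACT(doc,'$.Name') = 'United States';  -- 1
SELECT COUNT(*) FROM CountryInfo WHERE JSON_EXTRACT(doc,'$.name') = 'United States';  -- 0, and no warning or error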

Finally, if you looked closely at the output of db.getCollection("CountryInfo").find('Name= "United States"').limit(1), you noticed that the database has outdated info:

"government": {
            "GovernmentForm": "Federal Republic",
            "HeadOfState": "George W. Bush",
            "HeadOfState_title": "President"
        }

Let’s change “George W. Bush” to “Barack Obama” using the .modify clause:

mysql-js> db.CountryInfo.modify("Name = 'United States'").set("government.HeadOfState", "Barack Obama" );
Query OK, 1 item affected (0.02 sec)
mysql-js> db.CountryInfo.find('Name= "United States"')
[
    {
        "GNP": 8510700,
        "IndepYear": 1776,
        "Name": "United States",
        "_id": "USA",
        "demographics": {
            "LifeExpectancy": 77.0999984741211,
            "Population": 278357000
        },
        "geography": {
            "Continent": "North America",
            "Region": "North America",
            "SurfaceArea": 9363520
        },
        "government": {
            "GovernmentForm": "Federal Republic",
            "HeadOfState": "Barack Obama",
            "HeadOfState_title": "President"
        }
    }
]
1 document in set (0.00 sec)

Conclusion

Document store is an interesting concept and a good add-on on top of the existing MySQL JSON feature. Using the new .find/.add/.modify methods instead of the original SQL statements can be convenient in some cases.

Some might ask, “why do you want to use document store and store information in JSON inside the database if it is relational anyway?” Storing data in JSON can be quite useful in some cases, for example:

  • You already have JSON (e.g., from external feeds) and need to store it anyway. Using the JSON datatype will be more convenient and more efficient.
  • You have a flexible schema, typical for the Internet of Things, for example, where some sensors might only send temperature data while others send temperature/humidity/light (but light information is only recorded during the day), etc. Storing it in the JSON format can be more convenient, so that you do not have to declare all possible fields in advance and do not have to run “alter table” if a new sensor starts sending new types of data (a tiny sketch of this follows the list).
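
A tiny sketch of that flexible-schema idea (the table and field names are made up for illustration): readings with different shapes land in the same JSON column, with no ALTER TABLE in between.

CREATE TABLE sensor_events (
  id bigint unsigned NOT NULL AUTO_INCREMENT PRIMARY KEY,
  doc json DEFAULT NULL
) ENGINE=InnoDB;

INSERT INTO sensor_events (doc) VALUES ('{"sensor": "s1", "temperature": 21.5}');
INSERT INTO sensor_events (doc) VALUES ('{"sensor": "s2", "temperature": 20.1, "humidity": 55, "light": 800}');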

In the next two blog posts, I will show how to use document store for Internet of Things / event streaming, and how to use X Protocol for asynchronous queries in MySQL.

by Alexander Rubin at May 24, 2016 10:36 PM

pt-online-schema-change (if misused) can’t save the day

In this blog post we’ll discuss pt-online-schema-change, and how to correctly use it.

Always use pt-osc?

Altering large tables can still be a problematic DBA task, even now with the improved Online DDL features in MySQL 5.6 and 5.7. Some ALTER types are still not online, or are sometimes just too expensive to execute on a busy production master.

So in some cases, we may want to apply an ALTER first on the slaves, taking them out of the traffic pool one by one and bringing them back after the ALTER is done. In the end, we can promote one of the already altered slaves to be the new master, so that the downtime/maintenance window is greatly minimized. The ex-master can be altered later, without affecting production. Of course, this method works best when the schema change is backwards-compatible.
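
A rough outline of that per-slave step (the table name is borrowed from the example further down; taking the node out of the traffic pool is load-balancer specific and not shown):

-- on each slave, after it has been taken out of the traffic pool:
ALTER TABLE db1.sbtest1 ENGINE=InnoDB;   -- or whatever backwards-compatible change is needed
-- let replication catch up (Seconds_Behind_Master back to 0), then return the node to the pool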

So far so good, but there is another problem. Let’s say the table is huge and the ALTER takes a lot of time on the slave. When it is a DML-blocking type of ALTER (for example when using MySQL 5.5.x or older), there will be a long slave lag if the table is being written to by the replication SQL thread at the same time. So what do we do to speed up the process and avoid the altered slave lagging behind? One tempting idea is to use pt-online-schema-change on the slave, since it can do the ALTER in a non-blocking fashion.

Let’s see how that would work. I need to rebuild a big table on a slave running MySQL 5.6.16 (the “null alter” only became online in 5.6.17) to reclaim disk space after some rows were deleted.

This example demonstrates the process (db1 is the master, db2 is the slave):

[root@db2 ~]# pt-online-schema-change --execute --alter "engine=innodb" D=db1,t=sbtest1
 No slaves found.  See --recursion-method if host db2 has slaves.
 Not checking slave lag because no slaves were found and --check-slave-lag was not specified.
 Operation, tries, wait:
 analyze_table, 10, 1
 copy_rows, 10, 0.25
 create_triggers, 10, 1
 drop_triggers, 10, 1
 swap_tables, 10, 1
 update_foreign_keys, 10, 1
 Altering `db1`.`sbtest1`...
 Creating new table...
 Created new table db1._sbtest1_new OK.
 Altering new table...
 Altered `db1`.`_sbtest1_new` OK.
 2016-05-16T10:50:50 Creating triggers...
 2016-05-16T10:50:50 Created triggers OK.
 2016-05-16T10:50:50 Copying approximately 591840 rows...
 Copying `db1`.`sbtest1`:  51% 00:28 remain
(...)

The tool is still working during the operation, and the table receives some writes on the master:

db1 {root} (db1) > update db1.sbtest1 set k=k+2 where id<100;
 Query OK, 99 rows affected (0.06 sec)
 Rows matched: 99  Changed: 99  Warnings: 0
db1 {root} (db1) > update db1.sbtest1 set k=k+2 where id<100;
 Query OK, 99 rows affected (0.05 sec)
 Rows matched: 99  Changed: 99  Warnings: 0

which are applied on the slave right away, as the table allows writes all the time.

(...)
 Copying `db1`.`sbtest1`:  97% 00:01 remain
 2016-05-16T10:51:53 Copied rows OK.
 2016-05-16T10:51:53 Analyzing new table...
 2016-05-16T10:51:53 Swapping tables...
 2016-05-16T10:51:53 Swapped original and new tables OK.
 2016-05-16T10:51:53 Dropping old table...
 2016-05-16T10:51:53 Dropped old table `db1`.`_sbtest1_old` OK.
 2016-05-16T10:51:53 Dropping triggers...
 2016-05-16T10:51:53 Dropped triggers OK.
 Successfully altered `db1`.`sbtest1`.

Done! No slave lag, and the table is rebuilt. But... let’s just make sure the data is consistent between the master and the slave (you can use pt-table-checksum):

db1 {root} (db1) > select max(k) from db1.sbtest1 where id<100;
 +--------+
 | max(k) |
 +--------+
 | 392590 |
 +--------+
 1 row in set (0.00 sec)
db2 {root} (test) > select max(k) from db1.sbtest1 where id<100;
 +--------+
 | max(k) |
 +--------+
 | 392586 |
 +--------+
 1 row in set (0.00 sec)

No, it is not! The slave is clearly missing the updates that happened during the pt-osc run. Why?

The explanation is simple: pt-online-schema-change relies on triggers. The triggers are used to make the writes happening on the original table also populate the temporary table copy, so that both tables are consistent when the final table switch happens at the end of the process. So what is the problem here? It’s the binary log format: in ROW-based replication, the triggers are not fired on the slave! And my master is running in ROW mode:

db1 {root} (db1) > show variables like 'binlog_format';
 +---------------+-------+
 | Variable_name | Value |
 +---------------+-------+
 | binlog_format | ROW   |
 +---------------+-------+
 1 row in set (0.01 sec)
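
While pt-osc is copying rows on the slave, the mechanism is easy to see there. A quick sketch (the trigger names are generated by the tool, so what you see will differ):

SHOW TRIGGERS FROM db1 LIKE 'sbtest1'\G
-- with binlog_format=ROW the slave's SQL thread applies row images directly and never fires
-- these triggers, so the _sbtest1_new copy on the slave misses the replicated writes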

So, if I use pt-online-schema-change on the master, the data inconsistency problem does not happen. But using it on a slave is just dangerous!

Conclusion

Whenever you use pt-online-schema-change, make sure you are not executing it on a slave instance. For that reason, I escalated this bug report: https://bugs.launchpad.net/percona-toolkit/+bug/1221372. Also, in many cases a normal ALTER will work well enough. As in my example, to rebuild the table separately on each slave in lockless mode, I would just need to upgrade to a more recent 5.6 version.
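
A minimal pre-flight sketch (not part of pt-osc itself): before running the tool, confirm that the host you are on is not a replica, and keep the master's binlog format in mind.

SHOW SLAVE STATUS\G                      -- an empty result means this host is not a replica
SHOW VARIABLES LIKE 'binlog_format';     -- ROW on the master means slave-side triggers will not fire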

BTW, if you’re wondering about Galera replication (used in Percona XtraDB Cluster, etc.): even though it also uses a ROW-based format, it is not a problem there. The pt-osc triggers are created on all nodes, thanks to the synchronous, write-anywhere nature of the replication. It does not matter which node you start pt-online-schema-change on, or which other nodes your applications write to at the same time. No slaves, no problem! 🙂

by Przemysław Malkowski at May 24, 2016 06:27 PM

Webinar Thursday May 26: Troubleshooting MySQL hardware resource usage

Join Sveta on Thursday, May 26, 2016, at 10 am PDT (UTC-7) for her webinar Troubleshooting MySQL hardware resource usage.

MySQL does not just run on its own. It stores data on disk, and keeps data and temporary results in memory. It uses CPU resources to perform operations, and a network to communicate with its clients.

In this webinar, we’ll discuss common resource usage issues, how they affect MySQL Server performance, and methods to find out how resources are being used. We will employ both OS-level tools and new features in Performance Schema that provide detailed information on what exactly is happening inside MySQL Server.

Register for the webinar here.

Sveta Smirnova, Principal Technical Services Engineer

Sveta joined Percona in 2015.

Her main professional interests are problem solving, working with tricky issues and bugs, finding patterns that can solve typical issues quicker, and teaching others how to deal with MySQL issues, bugs and gotchas effectively. Before joining Percona, Sveta worked as a Support Engineer in the MySQL Bugs Analysis Support Group at MySQL AB, then Sun, then Oracle.

She is the author of the book MySQL Troubleshooting and of the JSON UDF functions for MySQL.

by Dave Avery at May 24, 2016 03:46 PM

Jean-Jerome Schmidt

Join us for our European Polyglot Persistence Meetups Tour this summer!

We’ve been gearing up for this in the past months and we’re delighted to announce the first dates of our European Polyglot Persistence Meetups this summer. We’re starting off with Amsterdam, then moving on to Dublin, Paris, Berlin, Stockholm and London.

Sign up for a Polyglot Persistence Meetup

Some of you may ask what Polyglot Persistence is all about …

Polyglot Persistence means that when storing data, it is best to use multiple data storage technologies, chosen based upon the way data is being used by the application. In a world where developers reign, these developers would choose the optimal programming language for the job - and this trickles down to the data tier. The former is often referred to as polyglot languages and the latter as polyglot persistence. However, this freedom comes at a cost - more complexity.

In our world of 9s, we like to look at the database aspects of DevOps, and this is also where we’re focussing with this new meetup series.

Initially, the idea is to cover MySQL, PostgreSQL and MongoDB storage backends with the topics of deployment, scaling, configuration, management, backups and monitoring. And with time and as members suggest topics, the scope is likely to broaden.

Sign up for a Polyglot Persistence Meetup

Our meetups schedule so far:

  • 06 June - Amsterdam
  • 08 June - Dublin
  • 16 June - Paris
  • 17 June - Berlin
  • Stockholm tbc
  • London tbc

Please sign up for the meetup of your choice, as we’re announcing locations, speakers etc. via the Meetup platform.

For our kick-off in Amsterdam, we’re lucky to be sponsored by booking.com and we’ll be hosting the meetup in their offices.

We look forward to seeing you there and to the polyglot discussions ahead!

by Severalnines at May 24, 2016 02:30 PM