Planet MariaDB

February 05, 2016

Peter Zaitsev

Measuring Percona Server Docker CPU/network overhead

Docker

Precona Server DockerNow that we have our Percona Server Docker images, I wanted to measure the performance overhead when we run the database in the container. Since Docker promises to use a lightweight container, in theory there should be very light overhead. We need to verify that claim, however. In this post I will show the numbers for CPU and network intensive workloads, and later I will take a look at IO.

For the CPU-bound load, I will use a sysbench OLTP read-only workload with data fitting into memory (so there is no IO performed, and the execution path only goes through the network and CPU).

My server is 24 cores (including hyper-threads), with Intel(R) Xeon(R) CPU E5-2643 v2 @ 3.50GHz CPUs, RAM: 256GB, OS: Ubuntu 14.04. The Docker version is the latest on the day of publishing, which is 1.9.1.

First, I measured the throughput on a bare server, without containers – this will be the baseline. For reference, the command I used is the following:

/opt/sysbench/sysbench --test=/opt/tests/db/oltp.lua --oltp_tables_count=8 --oltp_table_size=10000000 --num-threads=16 --mysql-host=172.18.0.2 --mysql-user=root --oltp-read-only=on --max-time=1800 --max-requests=0 --report-interval=10 run

On the bare metal system, the throughput is 7100 transactions per second (tps).

In the next experiment, I started Percona Server in a Docker container and connect to it from the host:

docker run -e MYSQL_ALLOW_EMPTY_PASSWORD=1 --name ps13 -p 3306:3306 -v /data/flash/d1/:/var/lib/mysql -v /data/flash/my.cnf:/etc/my.cnf --net=host percona/percona-server:5.6.28

In this case, the container exposed port 3306 to the host, and we used that as an access point in sysbench.

The throughput in this scenario is 2200 tps!!! That is a significant overhead. I suspect it comes from the Docker gateway, which is added to the execution path when we connect through port forwarding.

So to avoid the Docker gateway, in the next run I used the host network by running the container with

--net=host
:

docker run -e MYSQL_ALLOW_EMPTY_PASSWORD=1 --name ps13 -v /data/flash/d1/:/var/lib/mysql -v /data/flash/my.cnf:/etc/my.cnf --net=host percona/percona-server:5.6.28

In this case the container ran directly in the host network stack, so this should exclude any Docker network overhead. In this case, the throughput is basically back to 7100 tps.

From these tests, I can make an important conclusion. There is NO measurable CPU overhead when running Percona Server in a Docker container. But the network path raises some questions.

So in the next experiment I ran both sysbench and MySQL in two different containers, connected over the Docker network bridge.

I created a sysbench container, which you can get from:

 https://hub.docker.com/r/percona/sysbench/

To run sysbench:

docker run --name sb -t percona/sysbench

Just for the reference, I created a Docker network:

docker network create sysbenchnet

and connected both containers to the same network:

docker network connect sysbenchnet ps13; docker network connect sysbenchnet sb;

In this configuration, the throughput I’ve observed is 6300 tps.

So there is still some network overhead, but not as significant as with the port gateway case.

For the last example, I again excluded the network path and ran the sysbench container inside the MySQL container network stack using the following command:

docker run --name sb --net container:ps13 -t percona/sysbench

The throughput in this configuration is back to 7100 tps. 

And the conclusion, again, is that there is no CPU overhead even if we run both client and server inside containers, but there is some network overhead – even when running on the same host. It will be interesting to measure the network overhead when the containers are on different physical hosts.

The following chart summarizes the results:

dockeroverhead

Next time I will try to measure IO overhead in Docker containers.

by Vadim Tkachenko at February 05, 2016 06:55 PM

Shlomi Noach

MySQL Community Awards 2016: Call for Nominations!

The 2016 MySQL Community Awards event will take place, as usual, in Santa Clara, during the Percona Live Data Performance Conference, April 2016.

The MySQL Community Awards is a community based initiative. The idea is to publicly recognize contributors to the MySQL ecosystem. The entire process of discussing, voting and awarding is controlled by an independent group of community members, typically based of past winners or their representatives, as well as known contributors.

It is a self-appointed, self-declared, self-making-up-the-rules-as-it-goes committee. It is also very aware of the importance of the community; a no-nonsense, non-political, adhering to tradition, self criticizing committee.

The Call for Nominations is open. We are seeking the community’s assistance in nominating candidates in the following categories:

MySQL Community Awards: Community Contributor of the year 2016

This is a personal award; a winner would a person who has made contribution to the MySQL ecosystem. This could be via development, advocating, blogging, speaking, supporting, etc. All things go.

MySQL Community Awards: Application of the year 2016

An application, project, product etc. which supports the MySQL ecosystem by either contributing code, complementing its behaviour, supporting its use, etc. This could range from a one man open source project to a large scale social service.

MySQL Community Awards: Corporate Contributor of the year 2016

A company who made contribution to the MySQL ecosystem. This might be a corporate which released major open source code; one that advocates for MySQL; one that help out community members by... anything.

For a list of previous winners, please see MySQL Hall of Fame.

Process of nomination and voting

Anyone can nominate anyone. When nominating, please make sure to provide a brief explanation on why the candidate is eligible to get the award. Make a good case!

The committee will review all nominations and vote; it typically takes two rounds of votes to pick the winners, and a lot of discussion.

There will be up to three winners in each category.

Methods of nomination:

  • Send en email to mysql.community.awards [ at ] gmail.com
  • Comment to this post
  • Assuming you can provide a reasonable description in 140 characters, tweet your nomination at #MySQLAwards.

Please submit your nominations no later than Monday, February 29, 2016.

The committee

Members of the committee are:

  • Baron Schwartz, Colin Charles, Daniël van Eeden, Davi Arnaut, Frederic Descamps, Geoffrey Anderson, Giuseppe Maxia, Justin Swanhart, Mark Leith, Morgan Tocker, Philip Stoev, Ronald Bradford, Santiago Lertora

Jeremy Cole and myself (Shlomi Noach) are acting as co-secretaries; we will be non-voting (except for breaking ties).

The committee communicates throughout the nomination and voting process to exchange views and opinions.

The awards

Awards are traditionally donated by some party whose identity remains secret. We are now securing the donation, but if you feel an urgent need to be an anonymous donator, please contact us in private, and thank you!

Support

This is a community effort; we ask for your support in spreading the word and of course in nominating candidates. Thanks!

by shlomi at February 05, 2016 03:09 PM

MariaDB Foundation

MariaDB JIRA is moving

The MariaDB JIRA instance that currently is in use for project and issue tracking will change. The current instance is hosted in Atlassian’s cloud and it has worked well, but we have hit the maximum user limit of 2000 users. It’s fantastic to see how many of you actually report bugs and other issues in the MariaDB […]

The post MariaDB JIRA is moving appeared first on MariaDB.org.

by rasmus at February 05, 2016 12:37 PM

Valeriy Kravchuk

MySQL Support People - Percona Support

I planned to continue this series of posts with the one about MySQL Support engineers who joined us in Oracle while I was working there, but based on recent events in my life I'd prefer to postpone it and move directly to the team I worked for during last 3+ years, Percona Support Team.

Disclamer: In the list below I still try to pay attention to public contribution by each of engineers mentioned to MySQL Community, mostly in a form of bug reports. This is not the only way they contributed, but I have to stick to this format in current series of posts. I also base my comments mostly on my memory and quick search of public sources of information, so I may be mistaking about the roles, time periods and other details. Please, inform me if you think there is something to fix.

So, here is the list of my former colleagues who formed the best MySQL Support team in the industry in 2013-2015 while working with me in Percona:
  • Miguel Angel Nieto - Miguel was my mentor during my first days in Percona, so I have to start with him. He joined Percona in 2011 and was a team manager of EMEA Support team for a long time. We always managed to work together in a productive and efficient way. Miguel is a real expert in Percona XtraDB Cluster and almost every new technology we had to support, up to MongoDB recently. He used to act as a great consultant as well when Support engineers worked together with Consulting on the same pool of issues, and he still takes SSC (Support Shift Captain) shifts, so he can be the first person from Percona whom you deal with in case of problems even today. I see 13 public bug reports from Miguel and suggest to check his Bug #77654. See also his 12 bug reports for Percona software at  https://bugs.launchpad.net/~miguelangelnieto/+reportedbugs.
  • Ovais Tariq - when I joined Percona Ovais was the only Principal Support Engineer in the team. He was really great in everything technical he did, from complex data recovery consulting cases to query optimization, and on top of that he was a really good writer. It was a pleasure to read every email in issues he worked on, it was like reading a good technical blog post if not a book. I see 16 public bug reports for MySQL by Ovais. Check his replication-related Bug #70923, for example. He had also reported 20 bugs for Percona software at https://bugs.launchpad.net/~ovais-tariq/+reportedbug. In 2013 Ovais moved to Consulting and a year later, in 2014, he quit from Percona to become a Lead MySQL DBA - Automation Engineer and later a Lead Reliability Engineer at Lithium. He is also active at TwinDB site.
  • Fernando Ipar - As far as I know, he was the first full time Support engineer and dedicated Support Shift Captain in Percona since September, 2009. For two years later he was a Director of Global Support. When I joined he was working in Consulting and recently he seems to do Development. But during all this time Fernando was interested in Support-related discussions and helped my colleagues promptly when needed. I see 11 bugs reported by him for MySQL, including funny optimizer regression since 5.1.x, Bug #66825. Fernando also reported 10 bugs for Percona software at https://bugs.launchpad.net/~fipar/+reportedbugs.
  • Michael Rikmas (a.k.a. Mixa) - he moved to Support from Administrative team in 2011, and there he was one of the first Percona employees. He always cared about Percona operations and customers, covering endless SSC shifts any time a day, sometimes for 12+ hours in a row. He was always ready to step into anything any time a day to help, if needed. While working in Support he improved his technical skills a lot in all MySQL-related areas, so he ended up as a good problem solver and eventually quit from Percona (in 2014) to join PSCE as a Consultant. I see only one bug he reported for MySQL, Bug #62426. See also https://bugs.launchpad.net/~michael.rikmas/+reportedbugs.
  • Martin Arrieta - Martin joined Percona in January, 2012, but was already a well known Support engineer when I joined. In 2013 he moved to Consulting and two years later quit from Percona in September, 2015, to join Pythian as Database Consultant. Martin had reported 6 MySQL bugs in public (he was really interested in MySQL Fabric), but he is better known as a blog posts author. I think we always cooperated really well with him. 
  • Marcos Albe (a.k.a. Markus) - he is the most hard working and successful Support engineer of all times in Percona. Marcos was also my team manager after I switched to AMER Support Team. Marcos joined Percona in 2010 and by 2012 he was not only a hard working SSC, but also a full stack expert. Marcos created many useful hints for the internal knowledge base and is also know as a great speaker at Percona Live conferences (I've met him for the first time at Percona Live New York, in 2012, where he had talks on several topics). The way he works (he is ready to call you using any media, join any chat, log in promptly and spend entire night helping any customer interactively) defines famous "Percona style" of providing Support (whatever I may think about it) and gave him huge experience with all technologies even remotely related to MySQL. Marcos had reported 4 public MySQL bugs, including a 5.6.x regression one, Bug #71616, that is still "Verified". I also see 5 bugs reported by him at https://bugs.launchpad.net/~markus-albe/+reportedbugs.
  • Nickolay Ihalainen - Nickolay is a classical Percona Consultant (who still works in Support for a long time), with huge real life experience of managing MySQL at scale and full application stack troubleshooting. He knows a lot about Linux, hardware used with MySQL and, as any other great consultant in Percona, is always ready to step into the issue of any complexity and length, call customer, write code, recover InnoDB data or fix a bug quickly. He works in Percona since January 2010, reported 3 MySQL bugs in public, including Bug #78051 that is still "Verified". I also see 7 bugs reported by Nickolay for Percona software at https://bugs.launchpad.net/~ihanick/+reportedbugs. I studied a lot while working with and talking to him.
  • Przemyslaw Malkowski - he joined Percona in August, 2012, just before me. We were at the same "bootcamp" session in famous "Hotel Percona" and worked closely together since that time. Przemek is a Principal Support Engineer since July, 2015. He is the only one who was promoted to that level while I was working in Percona, not just joined like that. He works hard on all kinds of complex issues and is one of key experts in the industry in Galera and Percona XtraDB Cluster (He can easily parse and understand obscure, huge logs from multiple nodes, and explain the interactions and problems  that happened, quickly. This is the skill I always missed). Moreover, he is a well known bug reporter, with 22 bugs reported for upstream MySQL, including infamous Bug #78777 that mislead numerous MySQL users and customers for months, and 37 (!) bugs reported for Percona software at https://bugs.launchpad.net/~pmalkowski/+reportedbugs. Przemek is able to work with any kind of annoying customer, on issues of any complexity, and still remain helpful. I really enjoyed working with him in one team, a dream team actually!
  • Nilnandan Joshi (a.k.a. Nil) - he joined Percona in May, 2012 and successfully worked there as Support Engineer till January, 2016. He managed to master all key tools and technologies in MySQL, while working hard on boring duties like SSC and bugs processing. Now Nil is a Big Data Engineer at zData Inc and is probably working with Hadoop or something of that kind. He also was visiting "Hotel Percona" with me in 2012 for a "bootcamp", where we spent a lot of time talking about the way InnoDB works (and drinking some beer). Since that time we were working in a close cooperation till his last day in Percona. Nil helped me a lot with bugs processing when I've got a task to manage it from Support side, and was one of few engineers in Support who were ready to report or properly process/verify any bug, build any Percona software, test it on all kinds of weird Linux versions and distros, using all kinds of virtual and real machines. Nil reported 13 public MySQL bugs, including Bug #79469. I also see 9 bugs reported by him for Percona software at https://bugs.launchpad.net/~nilnandan-joshi/+reportedbugs. I miss our useful work together on bugs and Percona Live sessions already...
  • Jervin Real - he worked in Percona Support since early 2010 and until summer 2013, when he moved to Consulting (he worked great there as well, always ready to step in when we needed help in Support). When I joined he was a key Support engineer for APAC customers. Jervin is a well known blogger, who writes a lot about HA and other technologies for MySQL. I finally met him in real life few days ago, at FOSDEM conference, where he was speaking about TokuDB. Now Jervin is a Technical Service Manager in Percona. He had contributed a lot to MySQL Community also with his 20 public bug reports, including Bug #77715 that is still "Open". I also see 34 bugs reported by Jervin for Percona software at https://bugs.launchpad.net/~revin/+reportedbugs.
  • Jaime Sicam - Jaime joined Percona Support in June, 2011 and later became a good manager of APAC team in Support. He always worked hard and helped colleagues with anything, from covering SSC shift to complex troubleshooting. He is a great engineer capable to deal with any kind of issues, but he was our key expert in anything related to PAM and authentication in general. Jaime reported 7 bugs for MySQL software, including infamous Bug #77344, and 11 bugs for Percona software at https://bugs.launchpad.net/~jssicam/+reportedbugs. We rarely were both online and working together, but I already miss him.
  • Muhammad Irfan - he joined Percona Support soon after me, in December 2012. Muhammad actively started to work and cooperate with me and other senior colleagues from the beginning, and soon had become a high skilled engineer in everything related to MySQL, including XtraDB Cluster and bugs processing. He is a quite famous blogger, who was happily writing about new features, tools and best practices (while I prefer to write about bugs and problems, that is, worst practices). I am proud to get a change to help him with some posts as a reviewer. Muhammad spent fair amount of time on bugs processing and reporting as well, as we can conclude from his 8 MySQL bug reports including "Verified" Bug #73094 and 9 bugs reported for Percona software at https://bugs.launchpad.net/~muhammad-irfan/+reportedbugs.
  • Fernando Laudares (a.k.a. Nando) - he joined Percona in January, 2013 and worked in AMER Support Team. I'd call him a student of Marcos, as he quickly worked out similar habits and approaches to Support, "Percona style". Fernando always tried his best to help both customers and colleagues, without any limits, and this quickly made him one of the key team members. He works hard, on many issues, day after day without becoming tired in a visible way. Fernando had written many great blog posts and is great in all kinds of automation/testing environments setups. He was fast in studying, accepting and following new technologies we were supposed to support in Percona. His community contributions also include recent enough Bug #77684 and 2 bugs reported at https://bugs.launchpad.net/~fernando-laudares/+reportedbugs.
  • Jericho Rivera - with mostly sysadmin and developer background, Jericho had to work really hard since October, 2013 to reach the level of MySQL expertise we required in Percona Support. But he did that and became a key support provider in 2015, that can deal efficiently with all kinds of customer issues. He helped us to manage Support servers all this time and became a local expert in Docker and other technologies (as you can figure out from his blog posts). He had reported Bug #77073 for MySQL server (still "Verified") and 3 bugs for Percona software at https://bugs.launchpad.net/~jericho-rivera/+reportedbugs. It was really great to see how he becomes better every month, until he ended up just doing an awesome job one day.
  • Peiran Song - I was really happy when she joined us in 2014. By May 2014, just in a couple of months, she already did a great job in Support, working on many complex issues. She was really good with troubleshooting, InnoDB and query optimization. We discussed not only technical details, but also procedures and values of Support, a lot. Peiran reported 6 upstream MySQL bugs, including incredible Bug #73369 that is still waiting for explanation and fix, and 8 bugs for Percona software at https://bugs.launchpad.net/~peiran-song/+reportedbugs. It was really sad to meet her in 2015 during Percona Live just to find out that she can not stay with us any more (and not because of any her faults). Peiran is a Data Architect at Smule, Inc now.
  • Justin Swanhart - Justin worked for Percona at different positions (mostly as a Consultant and Instructor), since 2010. In August 2013 he joined Support as a Principal Support Engineer, and since that time we often worked together, in the same AMER team. He had got the Community Contributor of the year 2015 award, for his great contributions including (but not limited to) his software. Justin created 40 MySQL community bug reports, including Bug #76210, and I see 12 bugs reported by him for Percona software at https://bugs.launchpad.net/~greenlion/+reportedbugs. Justin worked well, if you ask me, and helped customers a lot, while being a source of unique expertise and great insights for colleagues and customers (based on his trainer and software developer experience, among other things), but one sad day in June 2015 he found himself fired in the middle of the shift. I tried to find out what was really that much wrong with his work, and found literally nothing myself, so I let others to comment on what happened and why. I think firing Justin was a very bad mistake, even though he had better NOT to comment on it in public in the way he did... I hope we'll work together one day again, maybe even soon.
  • Bill Karwin - Bill worked for Percona since 2010, mostly as a Consultant and Trainer. We were really lucky that he joined Support in 2014 full time (as a Principal Support Engineer and Senior Knowledge Manager). Using his huge experience in SQL and everything MySQL, he started to play a key role in AMER team, working a lot and providing great service to customers. He is very famous for his answers at Quora and StackOverflow. He is a great writer and speaker on MySQL topics, and his book on SQL anti-patters is well known. Bill is literally second to nobody in helping MySQL Community. Check also his 9 MySQL bugs including Bug #73283 (still "Verified"), and 2 bugs at https://bugs.launchpad.net/~bill-karwin/+reportedbugs. At the end of 2014 Bill left us, probably because it was boring for him to resolve the same problems again and again for customers, instead of doing things right from the very beginning. He is now a Senior Database Architect and Software Engineer at SchoolMessenger.
  • Agustin Gallego (a.k.a. Guli) - I met him first in 2012, at Percona Live New York. He worked for Percona since February, 2012, but joined Support later, in December, 2013. Since that time Agustin worked as SSC and Support Engineer in AMER Support team and played an important role in making it operational and efficient. He had reported Bug #77186 for MySQL and 6 bugs for Percona software at https://bugs.launchpad.net/~agustin-gallego/+reportedbugs. He is gaining experience every day, cooperates well with colleagues and now is a very reliable, respected and useful team member.
  • Roben Paul Namuag - Paul joined Percona Support in July, 2013. In January, 2015 he left us to continue working in Percona as Remote DBA, but we still discuss technical details and, sometimes, life after that. Paul had good and harder times in Support, but he always tried to do his best and played an important role in APAC team, both as SSC and as an experienced engineer working on complex enough issues, even when there was nobody around to help. He had reported 3 MySQL bugs, including Bug #77180 that is till "Open", and 2 bugs for Percona software at https://bugs.launchpad.net/~paul-namuag/+reportedbugs. I'll remember him as a kind, helpful and friendly person.
  • Akshay Suryavanshi - we chatted with with Akshay a lot before he joined Percona, he was interested in a lot of in-depth details on how MySQL works. Eventually he joined Percona Support, in March 2013, but immediately switched to Remote DBA role and had a great success there, moving later to management positions, up to DBA Team Lead and Senior position in Technical Operations recently. Then, in December 2015, he quit from Percona, to become a MySQL Engineer at Tumblr. He helped community with his blog posts and webinars, and also reported Bug #70818
  • Pablo Padua - he joined Percona in 2014 as a Jr. Support Engineer, and made a lot of progress since that. Unfortunately I can not find any public MySQL bug reports by him, but he worked a lot as SSC in AMER Support Team, and he gets more experience every day. We often worked together on technical issues, when I just provided some background help, checks and comments, while he did real time work for customer. It had been a pleasure to cooperate that way, and, as a result, for many areas in MySQL Pablo now is ready to work without any assistance.
  • Abdelhak Errami - he joined us as Senior Support Engineer in April 2015, when Tokutek was acquired. Since that time he played a key role in TokuDB Support, but also often helped with pure MySQL issues as well. We spent a good time together during Percona Live. Abdel often works on TokuDB bugs, check DB-922 as one of recent examples.
  • Joe Laflamme - I think he also joined us some time in 2015 in the process of Tokutek acquisition. Check DB-809 for one of his TokuDB related bug reports. He coordinated TokuDB bug fixing and support during the period when I describe. Now he is a manager at Percona Technical Services.
I had not mentioned Sveta Smirnova, who joined Percona in March 2015, because she was already mentioned in the previous post. I still plan to write about managers and coordinators of Support who had never reported MySQL bugs in public separately, so stay tuned Tom Basil, Peter Farkas and Erzsebet Olsovzsky. I remember about you and I value your contribution a lot, but engineers first...

I had not named colleagues in Percona who joined after November 1, 2015 explicitly here, and this is the date of the beginning of the entirely new epoch in the history of Percona that should be described separately, if ever.

by Valeriy Kravchuk (noreply@blogger.com) at February 05, 2016 08:47 AM

February 04, 2016

Peter Zaitsev

MySQL password expiration features to help you comply with PCI-DSS

MySQL passwordPCI Compliance (section 8.2.4) requires users to change password every 90 days. Until MySQL 5.6.6 there wasn’t a built-in way to comply with this requirement.

Since MySQL version 5.6.6 there’s a password_expired feature which allows to set a user’s password as expired.
This has been added to the mysql.user table and its default value it’s “N.” You can change it to “Y” using the

ALTER USER
 statement.

Here’s an quick example on how to set expiration date for a MySQL user account:

mysql> ALTER USER 'testuser'@'localhost' PASSWORD EXPIRE;

Once this is set to “Y” the username will still be able to login to the MySQL server, but it will not be able to run any queries before setting the new password. You will instead get an ERROR 1820 message:

mysql> SHOW DATABASES;
ERROR 1820 (HY000): You must SET PASSWORD before executing this statement

Keep in mind that this does not affect any current connections the account has open.

After setting a new password, all operations performed using the account will be allowed (according to the account privileges):

mysql> SET PASSWORD=PASSWORD('mechipoderranen');
Query OK, 0 rows affected (0.00 sec)
mysql> SHOW DATABASES;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| data               |
| logs               |
| mysql              |
| performance_schema |
| test               |
+--------------------+
6 rows in set (0.00 sec)
mysql>

This allows administrators to perform password expiration by scheduling the

ALTER USER
 via cron.

Since MySQL 5.7.4, this has been improved and there’s a new feature to set a policy for password expiration, that provides more control through a global variable,

default_password_lifetime
which allows to set a global automatic password expiration policy.

Example usage:

Setting a default value on our configuration file. This will set all account passwords to expire every 90 days, and will start counting from the day this variable was set effective on your MySQL server:

[mysqld]
default_password_lifetime=90

Setting a global policy for the passwords to never expire. Note this is the default value (so it is not strictly necessary to declare in the configuration file):

[mysqld]
default_password_lifetime=0

This variable can also be changed at runtime if the user has SUPER privileges granted:

mysql> SET GLOBAL default_password_lifetime = 90;
Query OK, 0 rows affected (0.00 sec)

You can also set specific values for each user account using

ALTER USER
. This will override the global password expiration policy. Please note that
ALTER USER
 only understands
INTERVAL
 expressed in
DAY
:

ALTER USER ‘testuser’@‘localhost' PASSWORD EXPIRE INTERVAL 30 DAY;

Disable password expiration:

ALTER USER 'testuser'@'localhost' PASSWORD EXPIRE NEVER;

Set to default value, which is the current value of

default_password_lifetime
:

ALTER USER 'testuser'@'localhost' PASSWORD EXPIRE DEFAULT;

Since MySQL 5.7.6, you can use the

ALTER USER
 to change the user’s password:

mysql> ALTER USER USER() IDENTIFIED BY '637h1m27h36r33K';
Query OK, 0 rows affected (0.00 sec)

For more information on this variable, please refer to the documentation page: https://dev.mysql.com/doc/refman/5.7/en/password-expiration-policy.html

Bonus post:

Another new feature in MySQL 5.7.8 related to user management is locking/unlocking user accounts when

CREATE USER
, or at a later time running the
ALTER USER
 statement.

In this example, we will first create a username with the

ACCOUNT LOCK
:

mysql> CREATE USER 'furrywall'@'localhost' IDENTIFIED BY '71m32ch4n6317' ACCOUNT LOCK;
Query OK, 0 rows affected (0.00 sec)

As you can see below, the newly created user gets an ERROR 3118 message while trying to login:

$ mysql -ufurrywall -p
Enter password:
ERROR 3118 (HY000): Access denied for user 'furrywall'@'localhost'. Account is locked.

We can unlock the account using the

ALTER USER ... ACCOUNT UNLOCK;
 statement:

mysql>ALTER USER 'furrywall'@'localhost' ACCOUNT UNLOCK;
Query OK, 0 rows affected (0.00 sec)

Now the user account is unlocked and accessible:

$ mysql -ufurrywall -p
Enter password:
Welcome to the MySQL monitor.  Commands end with ; or g.
Your MySQL connection id is 17
Server version: 5.7.8-rc MySQL Community Server (GPL)
Copyright (c) 2000, 2015, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or 'h' for help. Type 'c' to clear the current input statement.
mysql>

If necessary, you can lock it again:

mysql> ALTER USER 'furrywall'@'localhost' ACCOUNT LOCK;
Query OK, 0 rows affected (0.00 sec)

Please check this following documentation for more details: https://dev.mysql.com/doc/refman/5.7/en/account-locking.html

by Pablo Padua at February 04, 2016 03:12 PM

MariaDB AB

Vote for Database of the Year on LinuxQuestions.org

mariadb

Vote for MariaDB for Database of the Year!

Vote MariaDB for Database of the Year! MariaDB is nominated for Database of the Year by LinuxQuestions.org. LQ for short is Linux user community that each year votes for the best database of the year.

You can, of course, vote for MariaDB!

The site requires a simple registration - which may take a few moments, but it is well worth your time!

  1. Register on linuxquestions.org
  2. Activate account by email
  3. Post a message to get the right to vote, e.g. write on the vote thread
  4. After a delay, vote

Be patient. It takes a couple of hours for registration to be approved. Once approved, you will be able to vote.

Tags: 

by mariadb at February 04, 2016 02:24 PM

Jean-Jerome Schmidt

Webinar Replay & Slides: Managing MySQL Replication for High Availability

Thanks to everyone who participated in this week’s live webinar on Managing MySQL Replication for High Availability led by our colleague Krzysztof Książek, Senior Support Engineer at Severalnines. The webinar included theory discussions bundled with live demo demonstrations on all of the key elements that were discussed, which made for a nicely interactive webinar to watch.

If you missed the session and/or would like to watch the replay and read through the slides in your own time, they are now available online for sign up and viewing.

Whether you’re looking into deploying a MySQL Replication topology or maintaining one, you’ll find great insight here about topology changes, managing slave lag, promoting slaves, repairing replication issues, fixing broken nodes, managing schema changes and scheduling backups. Multi-datacenter replication was also covered.

Replay details

Get access to the replay here

Read the slides here

Get access to all of our replays here

Agenda

View the full agenda here!

Speaker

Krzysztof Książek, Senior Support Engineer at Severalnines, is a MySQL DBA with experience managing complex database environments for companies like Zendesk, Chegg, Pinterest and Flipboard. This webinar builds upon recent blog posts and related webinar series by Krzysztof on how to become a MySQL DBA.

If you have any questions or would like a personalised live demo, please do contact us.

Follow our technical blogs here: http://severalnines.com/blog

Blog category:

by Severalnines at February 04, 2016 11:44 AM

February 03, 2016

Peter Zaitsev

New GIS Features in MySQL 5.7

MySQL 5.7MySQL 5.7 has been released, and there are some exciting new features now available that I’m going to discuss in this blog — specifically around geographic information system (GIS).

I’ve used GIS features in MySQL for a long time. In my previous blog entries I’ve shown how to create geo-enabled applications with MySQL 5.6 and use MySQL 5.6 geo-spatial functions. In this blog post, I’ll look into what is new in MySQL 5.7 and how we can use those new features in practice for geo-enabled applications.

New in MySQL 5.7

MySQL 5.7 introduces the following major improvements and features for GIS:

  1. Spatial indexes for InnoDB. Finally it is here! This was a long overdue feature, which also prevented many companies from converting all tables to InnoDB.
  2. st_distance_sphere: native function to calculate a distance between two points on earth. Finally it is here as well! Like many others, I’ve created my stored procedure to calculate the distance between points on earth using haversine formula. The native function is ~20x faster than the stored procedure (in an artificial benchmark, see below). This is not surprising, as stored procedures are slow computationally – especially for trigonometrical functions.
  3. New functions: GeoHash and GeoJSON. With GeoJSON we can generate the results that are ready for visualizing on Google Maps.
  4. New GIS implementation based on Boost.Geometry library. This is great news, as originally GIS was implemented independently from scratch with a very limited set of features. Manyi Lu from MySQL server team provides more reasoning behind the choice of Boost.Geometry.

This is the great news. The bad news is that except for the st_distance_sphere, all other functions use planar geometry (no change since MySQL 5.6) and do not support Spatial Reference System Identifier (SRID). That means that if I want to calculate the distance of my favorite bike path in miles or kilometers, I’ll still have to use a stored function (see below for an example) or write an application code for that. Native function st_distance will ignore SRID for now and return a value which represents a distance on a planar – not very useful for our purposes (may be useful for order by / compare).

Distance on Sphere

MySQL 5.7 introduces the function st_distance_sphere, which uses a haversine formula to calculate distance. He is the example:

mysql> select st_distance_sphere(point(-78.7698947, 35.890334), point(-122.38657, 37.60954));
+--------------------------------------------------------------------------------+
| st_distance_sphere(point(-78.7698947, 35.890334), point(-122.38657, 37.60954)) |
+--------------------------------------------------------------------------------+
|                                                             3855600.7928957273 |
+--------------------------------------------------------------------------------+
1 row in set (0.00 sec)

The distance is in meters by default (you can also change the radius of the earth to meters using the 3rd optional parameter, default: 6,370,986). Although our earth is represented as an oblate spheroid, all practical applications use the distance on a sphere. The difference between the haversine formula and more precise (and much slower) functions is negligible for our purposes.

The st_distance_sphere is much faster than using stored routines. Here is the artificial benchmark:

mysql> select benchmark(1000000, haversine_distance_sp(37.60954, -122.38657, 35.890334, -78.7698947));
+-----------------------------------------------------------------------------------------+
| benchmark(1000000, haversine_distance_sp(37.60954, -122.38657, 35.890334, -78.7698947)) |
+-----------------------------------------------------------------------------------------+
|                                                                                       0 |
+-----------------------------------------------------------------------------------------+
1 row in set (22.55 sec)
mysql> select benchmark(1000000, st_distance_sphere(point(-78.7698947, 35.890334), point(-122.38657, 37.60954)));
+----------------------------------------------------------------------------------------------------+
| benchmark(1000000, st_distance_sphere(point(-78.7698947, 35.890334), point(-122.38657, 37.60954))) |
+----------------------------------------------------------------------------------------------------+
|                                                                                                  0 |
+----------------------------------------------------------------------------------------------------+
1 row in set (0.77 sec)

haversine_distance_sp is a stored routine implementation of the same algorithm.

InnoDB GIS example: find 10 restaurants near me 

In my previous blog post I’ve demonstrated how to use st_within function to find restaurants inside my zipcode (US postal code) and sort by distance. In MySQL 5.7 there will be 2 changes:

  1. We can use InnoDB table
  2. We can use st_distance_sphere function

For this example, I’ve converted Open Street Map data to MySQL and then created a new InnoDB table:

CREATE TABLE `points_new` (
  `OGR_FID` int(11) NOT NULL AUTO_INCREMENT,
  `SHAPE` geometry NOT NULL,
  `osm_id` text,
  `name` text,
  `barrier` text,
  `highway` text,
  `ref` text,
  `address` text,
  `is_in` text,
  `place` text,
  `man_made` text,
  `other_tags` text,
  UNIQUE KEY `OGR_FID` (`OGR_FID`),
  SPATIAL KEY `SHAPE` (`SHAPE`)
) ENGINE=InnoDB AUTO_INCREMENT=13660668 DEFAULT CHARSET=latin1

SHAPE is declared as geometry (and stores points in this table). We also have SPATIAL KEY SHAPE in the InnoDB table.

The following query will find all cafe or restaurants in Durham, NC (zipcode: 27701):

SELECT osm_id, name, round(st_distance_sphere(shape, st_geomfromtext('POINT (-78.9064543 35.9975194)', 1) ), 2) as dist
FROM points_new
WHERE st_within(shape,
      (select shape from zcta.tl_2013_us_zcta510 where zcta5ce10='27701') )
	  and (other_tags like '%"amenity"=>"cafe"%' or other_tags like '%"amenity"=>"restaurant"%')
	  and name is not null
ORDER BY dist asc LIMIT 10;

Table tl_2013_us_zcta510 stores the shapes of polygons for all US zipcodes. (It needs to be converted to MySQL.) In this example I’m using st_within to filter only the POIs I need, and st_distance_sphere to get the distance from my location (-78.9064543 35.9975194 are the coordinates of Percona’s office in Durham) to the restaurants.

Explain plan:

mysql> EXPLAIN
    -> SELECT osm_id, name, round(st_distance_sphere(shape, st_geomfromtext('POINT (-78.9064543 35.9975194)', 1) ), 2) as dist
    -> FROM points_new
    -> WHERE st_within(shape,
    ->       (select shape from zcta.tl_2013_us_zcta510 where zcta5ce10='27701') )
    ->   and (other_tags like '%"amenity"=>"cafe"%' or other_tags like '%"amenity"=>"restaurant"%')
    ->   and name is not null
    -> ORDER BY dist asc LIMIT 10G
*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: points_new
   partitions: NULL
         type: range
possible_keys: SHAPE
          key: SHAPE
      key_len: 34
          ref: NULL
         rows: 21
     filtered: 18.89
        Extra: Using where; Using filesort
*************************** 2. row ***************************
           id: 2
  select_type: SUBQUERY
        table: tl_2013_us_zcta510
   partitions: NULL
         type: ref
possible_keys: zcta5ce10
          key: zcta5ce10
      key_len: 8
          ref: const
         rows: 1
     filtered: 100.00
        Extra: NULL
2 rows in set, 1 warning (0.00 sec)

That looks pretty good: MySQL is using and index on the SHAPE field (even with the subquery, btw).

Results:

mysql> SELECT osm_id, name, round(st_distance_sphere(shape, st_geomfromtext('POINT (-78.9064543 35.9975194)', 1) ), 2) as dist, st_astext(shape)
    -> FROM points_new
    -> WHERE st_within(shape,
    ->       (select shape from zcta.tl_2013_us_zcta510 where zcta5ce10='27701') )
    ->   and (other_tags like '%"amenity"=>"cafe"%' or other_tags like '%"amenity"=>"restaurant"%')
    ->   and name is not null
    -> ORDER BY dist asc LIMIT 10;
+------------+----------------------------+--------+--------------------------------------+
| osm_id     | name                       | dist   | st_astext(shape)                     |
+------------+----------------------------+--------+--------------------------------------+
| 880747417  | Pop's                      | 127.16 | POINT(-78.9071795 35.998501)         |
| 1520441350 | toast                      | 240.55 | POINT(-78.9039761 35.9967069)        |
| 2012463902 | Pizzeria Toro              | 256.44 | POINT(-78.9036457 35.997125)         |
| 398941519  | Parker & Otis              | 273.39 | POINT(-78.9088833 35.998997)         |
| 881029843  | Torero's                   | 279.96 | POINT(-78.90829140000001 35.9995516) |
| 299540833  | Fishmonger's               | 300.01 | POINT(-78.90850250000001 35.9996487) |
| 1801595418 | Lilly's Pizza              | 319.83 | POINT(-78.9094462 35.9990732)        |
| 1598401100 | Dame's Chicken and Waffles | 323.82 | POINT(-78.9031929 35.9962871)        |
| 685493947  | El Rodeo                   | 379.18 | POINT(-78.909865 35.999523)          |
| 685504784  | Piazza Italia              | 389.06 | POINT(-78.9096472 35.9998794)        |
+------------+----------------------------+--------+--------------------------------------+
10 rows in set (0.13 sec)

0.13 seconds response time on AWS t2.medium box sounds reasonable to me. The same query on the MyISAM table shows ~same response time: 0.14 seconds.

GeoJSON feature and Google Maps

Another nice feature of MySQL 5.7 GIS is GeoJSON function: you can convert your result set to GeoJSON, which can be used with other applications (for example Google Maps API).

Let’s say I want to visualize the above result set on Google Map. As the API requires a specific format, I can use concat / group_concat to apply the format inside the SQL:

SELECT CONCAT('{
  "type": "FeatureCollection",
  "features": [
  ',
   GROUP_CONCAT('{
   "type": "Feature",
      "geometry": ', ST_AsGeoJSON(shape), ',
      "properties": {}
   }'),
  ']
}') as j
FROM points_new
WHERE st_within(shape,
      (select shape from zcta.tl_2013_us_zcta510 where zcta5ce10='27701') )
	  and (other_tags like '%"amenity"=>"cafe"%' or other_tags like '%"amenity"=>"restaurant"%')
	  and name is not null

I will get all the restaurants and cafes in zipcode 27701. Here I’m using ST_AsGeoJSON(shape) to convert to GeoJSON, and concat/group_concat to “nest” the whole result into the format suitable for Google Maps.

Result:

mysql> set group_concat_max_len = 1000000;
Query OK, 0 rows affected (0.00 sec)
mysql> SELECT CONCAT('{
    '>   "type": "FeatureCollection",
    '>   "features": [
    '>   ',
    ->    GROUP_CONCAT('{
    '>    "type": "Feature",
    '>       "geometry": ', ST_AsGeoJSON(shape), ',
    '>       "properties": {}
    '>    }'),
    ->   ']
    '> }') as j
    -> FROM points_new
    -> WHERE st_within(shape,
    ->       (select shape from zcta.tl_2013_us_zcta510 where zcta5ce10='27701') )
    ->   and (other_tags like '%"amenity"=>"cafe"%' or other_tags like '%"amenity"=>"restaurant"%')
    ->   and name is not null
*************************** 1. row ***************************
j: {
  "type": "FeatureCollection",
  "features": [
  {
   "type": "Feature",
      "geometry": {"type": "Point", "coordinates": [-78.890852, 35.9903403]},
      "properties": {}
   },{
   "type": "Feature",
      "geometry": {"type": "Point", "coordinates": [-78.8980807, 35.9933562]},
      "properties": {}
   },{
   "type": "Feature",
      "geometry": {"type": "Point", "coordinates": [-78.89972490000001, 35.995879]},
      "properties": {}
   } ... ,{
   "type": "Feature",
      "geometry": {"type": "Point", "coordinates": [-78.9103211, 35.9998494]},
      "properties": {}
   },{
   "type": "Feature",
      "geometry": {"type": "Point", "coordinates": [-78.9158326, 35.9967114]},
      "properties": {}
   }]
}
1 row in set (0.14 sec)

I did not include the full result set for the lack of space; I also had to change the group concat max length, otherwise MySQL will cut the result of the group_concat function.

Now I can visualize it:

 

 

 

 

 

 

 

 

 

 

Example: Find the longest bike path

MySQL 5.7 (as well as the older versions) supports st_length function to calculate a length of a linestring. However, even in MySQL 5.7 st_length can’t calculate the distance on earth. To find the distance of a linestring I’ve created a very simple stored procedure:

DELIMITER //
CREATE DEFINER=CURRENT_USER() FUNCTION `ls_distance_sphere`(ls GEOMETRY) RETURNS DECIMAL(20,8)
    DETERMINISTIC
BEGIN
DECLARE i, n INT DEFAULT 0;
DECLARE len DECIMAL(20,8) DEFAULT 0;
SET i = 1;
SET n = ST_NumPoints(ls);
 WHILE i < n DO
    SET len = len +  st_distance_sphere(st_pointN(ls, i), st_pointN(ls, i+1));
SET i = i + 2;
 END WHILE;
 RETURN len;
END //
DELIMITER ;

As the Open Street Map data has the information about roads in North America, we can use this function to calculate the length (in meters) for every road it stores:

mysql> select name, ls_distance_sphere(shape) from lines_new where highway = 'cycleway' and name is not null limit 10;
+---------------------------------------+---------------------------+
| name                                  | ls_distance_sphere(shape) |
+---------------------------------------+---------------------------+
| Highbury Park Drive Bypass            |                0.97386664 |
| Ygnacio Canal Trail                   |                0.86093199 |
| South Marion Parkway                  |                1.06723424 |
| New River Greenway                    |                1.65705401 |
| Northern Diversion Trail              |                2.08269808 |
| Gary L. Haller Trail;Mill Creek Trail |                2.09988209 |
| Loop 1                                |                2.05297129 |
| Bay Farm Island Bicycle Bridge        |                2.51141623 |
| Burrard Street                        |                1.84810259 |
| West 8th Avenue                       |                1.76338236 |
+---------------------------------------+---------------------------+
10 rows in set (0.00 sec)

Index the polygon/area distance using MySQL 5.7 virtual fields

To really answer the question “what is the longest bikepath (cyclepath) in North America?” we will have to order by stored function result. This will cause a full table scan and a filestort, which will be extremely slow for 30 millions of rows. The standard way to fix this is to materialize this road distance: add an additional field to the table and store the distance there.

In MySQL 5.7 we can actually use Generated (Virtual) Columns feature:

CREATE TABLE `lines_new` (
  `OGR_FID` int(11) NOT NULL AUTO_INCREMENT,
  `SHAPE` geometry NOT NULL,
  `osm_id` int(11) DEFAULT NULL,
  `name` varchar(255) DEFAULT NULL,
  `highway` varchar(60) DEFAULT NULL,
  `waterway` text,
  `aerialway` text,
  `barrier` text,
  `man_made` text,
  `other_tags` text,
  `linestring_length` decimal(15,8) GENERATED ALWAYS AS (st_length(shape)) VIRTUAL,
  PRIMARY KEY (`OGR_FID`),
  SPATIAL KEY `SHAPE` (`SHAPE`),
  KEY `linestring_length` (`linestring_length`),
  KEY `highway_len` (`highway`,`linestring_length`)
) ENGINE=InnoDB AUTO_INCREMENT=27077492 DEFAULT CHARSET=latin1

Unfortunately, MySQL 5.7 does not support non-native functions (stored procedures or UDF) in generated columns, so I have to use st_length in this example. Ordering by value of st_length may be OK though:

mysql> select name, ls_distance_sphere(shape) from lines_new where highway = 'cycleway' and name is not null order by linestring_length desc limit 10;
+-----------------------------+---------------------------+
| name                        | ls_distance_sphere(shape) |
+-----------------------------+---------------------------+
| Confederation Trail         |            55086.92572725 |
| Cowboy Trail                |            43432.06768706 |
| Down East Sunrise Trail     |            42347.39791330 |
| Confederation Trail         |            29844.91038542 |
| Confederation Trail         |            26141.04655981 |
| Longleaf Trace              |            29527.66063726 |
| Cardinal Greenway           |            30613.24487294 |
| Lincoln Prairie Grass Trail |            19648.26787218 |
| Ghost Town Trail            |            25610.52158647 |
| Confederation Trail         |            27086.54829531 |
+-----------------------------+---------------------------+
10 rows in set (0.02 sec)

The query is very fast as it uses an index on both highway and linestring:

mysql> explain select name, ls_distance_sphere(shape) from lines_new where highway = 'cycleway' and name is not null order by linestring_length desc limit 10G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: lines_new
   partitions: NULL
         type: ref
possible_keys: highway_len
          key: highway_len
      key_len: 63
          ref: const
         rows: 119392
     filtered: 90.00
        Extra: Using where
1 row in set, 1 warning (0.00 sec)

Conclusion

MySQL 5.7 contains a great set of features to work with geospatial data. Finally, spatial indexes are supported in InnoDB; st_distance_sphere as a native function is very useful. Unfortunately, other spatial functions only work with planar coordinates and do not support SRID. I hope this will be fixed in the new releases.

by Alexander Rubin at February 03, 2016 03:25 PM

Henrik Ingo

Moving to MongoDB Engineering

It will soon be 3 years that I've been with MongoDB. I joined the company amidst a strong growth spurt, and 5 months later the HR website told me that I had now been in the company longer than 50% of my colleagues.

read more

by hingo at February 03, 2016 12:15 PM

February 02, 2016

Peter Zaitsev

Percona Live Crash Courses: for MySQL and MongoDB!

Percona Live

Percona Live Crash Courses for MySQL and MongoDB

The database community constantly tells us how hard it is to find someone with MySQL and MongoDB DBA skills who can help with the day-to-day management of their databases. This is especially difficult when companies don’t have a full-time requirement for a DBA. Developers, system administrators and IT staff spend too much time trying to solve basic database problems that keep them from doing their day job. Eventually the little problems or performance inefficiencies that start to pile up  lead to big problems.  

In answer to this growing need, Percona Live is once again hosting Crash Courses for developers, systems administrators, and other technical resources. This year, we’ve compacted the training into a single day, and are offering two options: MySQL 101 and MongoDB 101!

Don’t let the name fool you: these courses are led by Percona MySQL experts who will show you the fundamentals of MySQL or MongoDB tools and techniques.  

And it’s not just for DBAs: developers are encouraged to attend to hone their database skills. Developers who create code that can scale to match the demands of the online community are both a resource and and an investment.

Below are a list of the topics covered for each course:

MySQL 101 Topics

MongoDB 101 Topics

  • Schema Review 101: How and What You Should Be Looking at…
  • Choosing a MySQL High Availability Solution Today
  • MySQL Performance Troubleshooting Best Practices
  • Comparing Synchronous Replication Solutions in the Cloud
  • Cost Optimizations Through MySQL Performance Optimizations
  • SQL with MySQL or NoSQL with MongoDB?
  • MongoDB for MySQL DBA’s
  • MongoDB Storage Engine Comparison
  • MongoDB 3.2: New Features Overview

 

Attendees will return ready to quickly and correctly take care of the day-to-day and week-to-week management of your MySQL or MongoDB environment.

The schedule and non-conference cost for the 101 courses are:

  • MySQL 101: Tuesday April 19th ($400)
  • MongoDB 101: Wednesday April 20th ($400)
  • Both MySQL and MongoDB 101 sessions ($700)

(Tickets to the 101 sessions do not grant access to the main Percona Live breakout sessions. Full Percona Live conferences passes will grant admission to the 101 sessions. 101 Crash Course attendees will have full access to Percona Live keynote speakers the exhibit hall and receptions.)

As a special promo, the first 101 people to purchase the 101 talks receive a $299.00 discount off the ticket price! Each session only costs $101! Get both sessions for a mere $202! Register now, and use the following codes for your first 101 discount:

  • Single101= $299 off of either the MySQL or MongoDB tickets
  • Double101= $498 off of the combined MySQL/MongoDB ticket

Sign up now for special track pricing. Click here to register.

Birds of a Feather

Birds of a Feather (BOF) sessions enable attendees with interests in the same project or topic to enjoy some quality face time. BOFs can be organized for individual projects or broader topics (e.g., best practices, open data, standards). Any attendee or conference speaker can propose and moderate an engaging BOF. Percona will post the selected topics and moderators online and provide a meeting space and time. The BOF sessions will be held Tuesday, April 19, 2016 at 6:00 p.m. The deadline for BOF submissions is February 7.

Lightning Talks

Lightning Talks provide an opportunity for attendees to propose, explain, exhort, or rant on any MySQL, NoSQL or Data in the Cloud-related topic for five minutes. Topics might include a new idea, successful project, cautionary story, quick tip, or demonstration. All submissions will be reviewed, and the top 10 will be selected to present during one of the scheduled breakout sessions during the week. Lighthearted, fun or otherwise entertaining submissions are highly welcome. The deadline for submitting a Lightning Talk topic is February 7, 2016.

by Kortney Runyan at February 02, 2016 07:45 PM

Experimental Percona Docker images for Percona Server

Docker

Percona DockerDocker is incredibly popular tool for deploying software, so we decided to provide a Percona Docker image for both Percona Server MySQL and Percona Server for MongoDB.

We want to create an easy way to try our products.

There are actually some images available from https://hub.docker.com/_/percona/, but these images are provided by Docker itself, not from Percona.

In our images, we provide all the varieties of storage engines available in Percona Server (MySQL/MongoDB).

Our images are available from https://hub.docker.com/r/percona/.

The simplest way to get going is to run the following:

docker run --name ps -e MYSQL_ROOT_PASSWORD=secret -d percona/percona-server:latest

for Percona Server/MySQL, and:

docker run --name psmdb -d percona/percona-server-mongodb:latest

for Percona Server/MongoDB.

It is very easy to try the different storage engines that comes with Percona Server for MongoDB. For example, to use RocksDB, run:

docker run --name psmdbrocks -d percona/percona-server-mongodb:latest
--storageEngine=RocksDB

or PerconaFT:

docker run --name psmdbperconaft -d percona/percona-server-mongodb:latest
--storageEngine=PerconaFT

We are looking for any feedback  you’d like to provide: if this is useful, and what improvements we could make.

by Vadim Tkachenko at February 02, 2016 05:02 PM

February 01, 2016

MariaDB AB

Recent release of MariaDB 10.1.11 contains two new authentication plugins

wlad

The recent release of MariaDB 10.1.11 contains two new authentication plugins:

Named pipe plugin

This plugins works only if user logs in using named pipe. It uses the operating system's username of currently logged on user running the client program. The plugin mirrors functionality of the already existing Unix socket authentication plugin on Windows.

GSSAPI plugin

For this plugin, the more correct name would be GSSAPI/SSPI plugin. It offer:

  • Kerberos authentication, on Unixes (via GSSAPI) and Windows (via SSPI)
  • Windows authentication, also in cases of standalone workstation (i.e outside of domain), via SSPI. Thus, this authentication plugins offers the functionality of MySQL Enterprise Windows authentication plugin and plus cross-platform interoperability.

This plugin was first written by Shuang Qui during the Google summer of Code back in 2013, and also contains contributions by Robbie Harwood of Redhat. Thanks for your contributions!

Tags: 

by wlad at February 01, 2016 08:39 PM

Peter Zaitsev

InnoDB and TokuDB on AWS

InnoDB and TokuDBIn a recent post, Vadim compared the performance of Amazon Aurora and Percona Server on AWS. This time, I am comparing write throughput for InnoDB and TokuDB, using the same workload (sysbench oltp/update/update_non_index) and a similar set-up (r3.xlarge instance, with general purpose ssd, io2000 and io3000 volumes) to his experiments.

All the runs used 16 threads for sysbench, and the following MySQL configuration files for InnoDB and TokuDB respectively:

[mysqld]
table-open-cache-instances=32
table_open_cache=8000
innodb-flush-method            = O_DIRECT
innodb-log-files-in-group      = 2
innodb-log-file-size           = 16G
innodb-flush-log-at-trx-commit = 1
innodb_log_compressed_pages     =0
innodb-file-per-table          = 1
innodb-buffer-pool-size        = 20G
innodb_write_io_threads        = 8
innodb_read_io_threads         = 32
innodb_open_files              = 1024
innodb_old_blocks_pct           =10
innodb_old_blocks_time          =2000
innodb_checksum_algorithm = crc32
innodb_file_format              =Barracuda
innodb_io_capacity=1500
innodb_io_capacity_max=2000
metadata_locks_hash_instances=256
innodb_max_dirty_pages_pct=90
innodb_flush_neighbors=1
innodb_buffer_pool_instances=8
innodb_lru_scan_depth=4096
innodb_sync_spin_loops=30
innodb-purge-threads=16

[mysqld]
tokudb_read_block_size=16K
tokudb_fanout=128
table-open-cache-instances=32
table_open_cache=8000
metadata_locks_hash_instances=256
[mysqld_safe]
thp-setting=never

You can see the full set of graphs here, and the complete results here.

Let me start illustrating the results with this summary graph for the io2000 volume, showing how write throughput varies over time, per engine and workload (for all graphs, size is in 1k rows, so 1000 is actually 1M):

We can see a few things already:

  • InnoDB has better throughput for smaller table sizes.
  • The reverse is true as size becomes big enough (after 10M rows here).
  • TokuDB’s advantage is not noticeable on the oltp workload, though it is for InnoDB.

Let’s dig in a bit more and look at the extreme ends in terms of table size, starting with 1M rows:

and ending in 50M:

In the first case, we can see that not only does InnoDB show better write throughput, it also shows less variance. In the second case, we can confirm that the difference does not seem significant for oltp, but it is for the other workloads.

This should come as no surprise, as one of the big differences between TokuDB’s Fractal trees and InnoDB’s B-tree implementation is the addition of message buffers to nodes, to handle writes (the other big difference, for me, is node size). For write-intensive workloads, TokuDB needs to do a lot  less tree traversing than InnoDB (in fact, this is done only to validate uniqueness constraints when required, otherwise writes are just injected into the message buffer and the buffer is flushed to lower levels of the tree asynchronously. I refer you to this post for more details).

For oltp, InnoDB is at advantage at smaller table sizes, as it does not need to scan message buffers all across the search path when reading (nothing is free in life, and this is the cost for TokuDB’s advantage for writes). I suspect this advantage is lost for high enough table sizes because at that point, either engine will be I/O bound anyway.

My focus here was write throughput, but as a small example see how this is reflected on response time if we pick the 50M table size and drop oltp from the mix:

At this point, you may be wondering why I focused on the io2000 results (and if you’re not, bear with me please!). The reason is the results for io3000 and the general purpose ssd showed characteristics that I attribute to latency on the volumes. You can see what I mean by looking at the io3000 graph:

I say “I attribute” because, unfortunately, I do not have any metrics other than sysbench’s output to go with (an error I will amend on future benchmarks!). I have seen the same pattern while working on production systems on AWS, and in those cases I was able to correlate it with increases in stime and/or qtime on diskstats. The fact that this is seen on the lower and higher capacity volumes for the same workload, but not the io2000 one, increases my confidence in this assumption.

Conclusion

I would not consider TokuDB a general purpose replacement for InnoDB, by which I mean I would never blindly suggest someone to migrate from one to the other, as the performance characteristics are different enough to make this risky without a proper assessment.

That said, I believe TokuDB has great advantages for the right scenarios, and this test highlights some of its strengths:

  • It has a significant advantage over InnoDB on slower devices and bigger data sets.
  • For big enough data sets, this is even the case on fast devices and write intensive workloads, as the B-tree becomes I/O bound much faster

Other advantages of TokuDB over InnoDB, not directly evidenced from these results, are:

  • Better compression (helped by the much larger block size).
  • Better SSD lifetime, due to less and more sequential writes (sequential writes have, in theory at least, no write amplification compared to random ones, so even though the sequential/random difference should not matter for SSDs for performance, it does for lifetime).

by Fernando Ipar at February 01, 2016 04:38 PM

January 29, 2016

Federico Razzoli

Reusing Prepared Statements in MariaDB 10.1

I may be wrong, but I think that MariaDB has a strange characteristic: it has many good features, some of which were not implemented intentionally. Well, this sentence is weird too, I know. But I have some examples: I’m not sure that the SEQUENCE engine was designed to generate a sequence of dates and/or times; but it can. And I believe that the CONNECT author had no idea that someone would have used his engine to create cursors for CALL statements; but I do.

Now I have a new example. MariaDB 10.1 supports compound statements out of stored procedures. Which means that you can write IF or WHILE in your install.sql files to create your databases in a dynamic way. This is done via the BEGIN NOT ATOMIC construct.

I played with this feature, like I usually do with new features. And what I’ve found out is amazing for me: BEGIN NOT ATOMIC allows us to nest prepared statements!

Uh, wait… maybe what I’ve just written sounds weird to you. Maybe you’re thinking: “prepared statements can be nested since they were first implemented!”. Which is only true in the documentation. The docs doesn’t lie of course, but it doesn’t work out there, in the real world’s complexity. PREPARE, EXECUTE and DEALLOCATE PREPARE statements cannot be prepared, and this limitation can be very frustrating if you try to write a reusable (perhaps public) stored procedure library.

I tried to explain the reason in this post, but it was becoming way too long, so I had to delete that boring explanation. I’ll just mention an example. You can write a procedure that prepares and executes a statement; but the prepared statement cannot call the procedure itself, recursively. Why? Because you cannot reuse its name, and annot dynamically generate a name for the new prepared statement. If this explanation is too short, just code and you’ll find out.

How can BEGIN NOT ATOMIC possibly fix this problem? Well, for a reason that’s far beyond me, the following blocks can be prepared and executed:

  • BEGIN NOT ATOMIC PREPARE ... ; END;
  • BEGIN NOT ATOMIC EXECUTE ... ; END;
  • BEGIN NOT ATOMIC DEALLOCATE PREPARE ... ; END;

Now you may be thinking that this feature is totally useless. But it isn’t. Thanks to this change, I’ve written a procedure that:

  • Executes a specified SQL string.
  • Autogenerates a “free” name for that statement, or uses an id passed by the user.
  • Returns the autogenerated id, so you can reuse it, or deallocate the statement.

Writing this procedure has been a bit annoying, because after all it uses a dirty trick. But now the procedure is written, and the dirty trick is encapsulated in it. You can use it as if it was a native feature, and forget the trick. Here’s the code:

CREATE DATABASE IF NOT EXISTS _;

CREATE TABLE IF NOT EXISTS _.prepared_statement
(
    id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY
)
    ENGINE = MEMORY
;

CREATE PROCEDURE _.exec(IN p_sql TEXT, INOUT p_stmt_id INTEGER UNSIGNED)
BEGIN
    IF p_stmt_id IS NULL THEN
        INSERT INTO _.prepared_statement (id) VALUES (DEFAULT);
        SET p_stmt_id := LAST_INSERT_ID();
    END IF;
    
    SET @_SQL_exec := CONCAT(
        'BEGIN NOT ATOMIC PREPARE stmt_dyn_', p_stmt_id, ' '
        , 'FROM ', QUOTE(p_sql), IF(RIGHT(p_sql, 1) = ';', ' ', '; ')
        , 'END;'
    );
    PREPARE _stmt_exec FROM @_SQL_exec;
    EXECUTE _stmt_exec;
    
    SET @_SQL_exec := CONCAT(
        'BEGIN NOT ATOMIC EXECUTE stmt_dyn_', p_stmt_id, '; END;'
    );
    PREPARE _stmt_exec FROM @_SQL_exec;
    EXECUTE _stmt_exec;
    DEALLOCATE PREPARE _stmt_exec;
    SET @_SQL_exec := NULL;
END;

How do I use it? Very simple:

MariaDB [_]> -- redundant: @id is not set, so it's NULL
MariaDB [_]> SET @id := NULL;
Query OK, 0 rows affected (0.00 sec)
MariaDB [_]> CALL _.exec('SELECT 42 AS answer', @id);
+--------+
| answer |
+--------+
| 42 |
+--------+
1 row in set (0.00 sec)
Query OK, 0 rows affected (0.00 sec)
MariaDB [_]> -- reuse @id
MariaDB [_]> CALL _.exec('SHOW SCHEMAS LIKE \'info%\'', @id);
+--------------------+
| Database (info%) |
+--------------------+
| information_schema |
+--------------------+
1 row in set (0.00 sec)
Query OK, 0 rows affected (0.00 sec)

I am writing a general purpose stored procedure library, that should make stored procedures developing more friendly. It will include this procedure, as well as a procedure for deallocating a specified statement, or all statements you prepared. As soon as the code is interesting and tested, I’ll make it public.

Enjoy!
Federico


by Federico at January 29, 2016 09:47 PM

Peter Zaitsev

EXPLAIN FORMAT=JSON knows everything about UNIONs: union_result and query_specifications

EXPLAIN FORMAT=JSON

EXPLAIN FORMAT=JSONReady for another post in the EXPLAIN FORMAT=JSON is Cool series! Great! This post will discuss how to see all the information that is contained in optimized queries with

UNION
 using the
union_result
 and
query_specifications
 commands.

 

When optimizing complicated queries with

UNION
, it is easy to get lost in the regular
EXPLAIN
  output trying to identify which part of the output belongs to each part of the
UNION
.

Let’s consider the following example:

mysql> explain
    ->     select emp_no, last_name, 'low_salary' from employees
    ->     where emp_no in (select emp_no from salaries
    ->         where salary < (select avg(salary) from salaries))
    -> union
    ->     select emp_no, last_name, 'high salary' from employees
    ->     where emp_no in (select emp_no from salaries
    ->         where salary >= (select avg(salary) from salaries))G
*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: employees
   partitions: NULL
         type: ALL
possible_keys: PRIMARY
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 299778
     filtered: 100.00
        Extra: NULL
*************************** 2. row ***************************
           id: 1
  select_type: PRIMARY
        table: salaries
   partitions: NULL
         type: ref
possible_keys: PRIMARY,emp_no
          key: PRIMARY
      key_len: 4
          ref: employees.employees.emp_no
         rows: 9
     filtered: 33.33
        Extra: Using where; FirstMatch(employees)
*************************** 3. row ***************************
           id: 3
  select_type: SUBQUERY
        table: salaries
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 2557022
     filtered: 100.00
        Extra: NULL
*************************** 4. row ***************************
           id: 4
  select_type: UNION
        table: employees
   partitions: NULL
         type: ALL
possible_keys: PRIMARY
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 299778
     filtered: 100.00
        Extra: NULL
*************************** 5. row ***************************
           id: 4
  select_type: UNION
        table: salaries
   partitions: NULL
         type: ref
possible_keys: PRIMARY,emp_no
          key: PRIMARY
      key_len: 4
          ref: employees.employees.emp_no
         rows: 9
     filtered: 33.33
        Extra: Using where; FirstMatch(employees)
*************************** 6. row ***************************
           id: 6
  select_type: SUBQUERY
        table: salaries
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 2557022
     filtered: 100.00
        Extra: NULL
*************************** 7. row ***************************
           id: NULL
  select_type: UNION RESULT
        table: <union1,4>
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: NULL
     filtered: NULL
        Extra: Using temporary
7 rows in set, 1 warning (0.00 sec)
Note (Code 1003): /* select#1 */ select `employees`.`employees`.`emp_no` AS `emp_no`,`employees`.`employees`.`last_name` AS `last_name`,'low_salary' AS `low_salary` from `employees`.`employees` semi join (`employees`.`salaries`) where ((`employees`.`salaries`.`emp_no` = `employees`.`employees`.`emp_no`) and (`employees`.`salaries`.`salary` < (/* select#3 */ select avg(`employees`.`salaries`.`salary`) from `employees`.`salaries`))) union /* select#4 */ select `employees`.`employees`.`emp_no` AS `emp_no`,`employees`.`employees`.`last_name` AS `last_name`,'high salary' AS `high salary` from `employees`.`employees` semi join (`employees`.`salaries`) where ((`employees`.`salaries`.`emp_no` = `employees`.`employees`.`emp_no`) and (`employees`.`salaries`.`salary` >= (/* select#6 */ select avg(`employees`.`salaries`.`salary`) from `employees`.`salaries`)))

While we can guess that subquery 3 belongs to the first query of the union, and subquery 6 belongs to the second (which has number 4 for some reason), we have to be very careful (especially in our case) when queries use the same tables in both parts of the

UNION
.

The main issue with the regular

EXPLAIN
 for
UNION
  is that it has to re-present the hierarchical structure as a table. The same issue occurs when you want to store objects created in programming language, such as Java, in the database.

EXPLAIN FORMAT=JSON
, on the other hand, has hierarchical structure and more clearly displays how
UNION
 was optimized:

mysql> explain format=json select emp_no, last_name, 'low_salary' from employees where emp_no in (select emp_no from salaries  where salary < (select avg(salary) from salaries)) union select emp_no, last_name, 'high salary' from employees where emp_no in (select emp_no from salaries where salary >= (select avg(salary) from salaries))G
*************************** 1. row ***************************
EXPLAIN: {
  "query_block": {
    "union_result": {
      "using_temporary_table": true,
      "table_name": "<union1,4>",
      "access_type": "ALL",
      "query_specifications": [
        {
          "dependent": false,
          "cacheable": true,
          "query_block": {
            "select_id": 1,
            "cost_info": {
              "query_cost": "921684.48"
            },
            "nested_loop": [
              {
                "table": {
                  "table_name": "employees",
                  "access_type": "ALL",
                  "possible_keys": [
                    "PRIMARY"
                  ],
                  "rows_examined_per_scan": 299778,
                  "rows_produced_per_join": 299778,
                  "filtered": "100.00",
                  "cost_info": {
                    "read_cost": "929.00",
                    "eval_cost": "59955.60",
                    "prefix_cost": "60884.60",
                    "data_read_per_join": "13M"
                  },
                  "used_columns": [
                    "emp_no",
                    "last_name"
                  ]
                }
              },
              {
                "table": {
                  "table_name": "salaries",
                  "access_type": "ref",
                  "possible_keys": [
                    "PRIMARY",
                    "emp_no"
                  ],
                  "key": "PRIMARY",
                  "used_key_parts": [
                    "emp_no"
                  ],
                  "key_length": "4",
                  "ref": [
                    "employees.employees.emp_no"
                  ],
                  "rows_examined_per_scan": 9,
                  "rows_produced_per_join": 299778,
                  "filtered": "33.33",
                  "first_match": "employees",
                  "cost_info": {
                    "read_cost": "302445.97",
                    "eval_cost": "59955.60",
                    "prefix_cost": "921684.48",
                    "data_read_per_join": "4M"
                  },
                  "used_columns": [
                    "emp_no",
                    "salary"
                  ],
                  "attached_condition": "(`employees`.`salaries`.`salary` < (/* select#3 */ select avg(`employees`.`salaries`.`salary`) from `employees`.`salaries`))",
                  "attached_subqueries": [
                    {
                      "dependent": false,
                      "cacheable": true,
                      "query_block": {
                        "select_id": 3,
                        "cost_info": {
                          "query_cost": "516948.40"
                        },
                        "table": {
                          "table_name": "salaries",
                          "access_type": "ALL",
                          "rows_examined_per_scan": 2557022,
                          "rows_produced_per_join": 2557022,
                          "filtered": "100.00",
                          "cost_info": {
                            "read_cost": "5544.00",
                            "eval_cost": "511404.40",
                            "prefix_cost": "516948.40",
                            "data_read_per_join": "39M"
                          },
                          "used_columns": [
                            "salary"
                          ]
                        }
                      }
                    }
                  ]
                }
              }
            ]
          }
        },
        {
          "dependent": false,
          "cacheable": true,
          "query_block": {
            "select_id": 4,
            "cost_info": {
              "query_cost": "921684.48"
            },
            "nested_loop": [
              {
                "table": {
                  "table_name": "employees",
                  "access_type": "ALL",
                  "possible_keys": [
                    "PRIMARY"
                  ],
                  "rows_examined_per_scan": 299778,
                  "rows_produced_per_join": 299778,
                  "filtered": "100.00",
                  "cost_info": {
                    "read_cost": "929.00",
                    "eval_cost": "59955.60",
                    "prefix_cost": "60884.60",
                    "data_read_per_join": "13M"
                  },
                  "used_columns": [
                    "emp_no",
                    "last_name"
                  ]
                }
              },
              {
                "table": {
                  "table_name": "salaries",
                  "access_type": "ref",
                  "possible_keys": [
                    "PRIMARY",
                    "emp_no"
                  ],
                  "key": "PRIMARY",
                  "used_key_parts": [
                    "emp_no"
                  ],
                  "key_length": "4",
                  "ref": [
                    "employees.employees.emp_no"
                  ],
                  "rows_examined_per_scan": 9,
                  "rows_produced_per_join": 299778,
                  "filtered": "33.33",
                  "first_match": "employees",
                  "cost_info": {
                    "read_cost": "302445.97",
                    "eval_cost": "59955.60",
                    "prefix_cost": "921684.48",
                    "data_read_per_join": "4M"
                  },
                  "used_columns": [
                    "emp_no",
                    "salary"
                  ],
                  "attached_condition": "(`employees`.`salaries`.`salary` >= (/* select#6 */ select avg(`employees`.`salaries`.`salary`) from `employees`.`salaries`))",
                  "attached_subqueries": [
                    {
                      "dependent": false,
                      "cacheable": true,
                      "query_block": {
                        "select_id": 6,
                        "cost_info": {
                          "query_cost": "516948.40"
                        },
                        "table": {
                          "table_name": "salaries",
                          "access_type": "ALL",
                          "rows_examined_per_scan": 2557022,
                          "rows_produced_per_join": 2557022,
                          "filtered": "100.00",
                          "cost_info": {
                            "read_cost": "5544.00",
                            "eval_cost": "511404.40",
                            "prefix_cost": "516948.40",
                            "data_read_per_join": "39M"
                          },
                          "used_columns": [
                            "salary"
                          ]
                        }
                      }
                    }
                  ]
                }
              }
            ]
          }
        }
      ]
    }
  }
}
1 row in set, 1 warning (0.00 sec)
Note (Code 1003): /* select#1 */ select `employees`.`employees`.`emp_no` AS `emp_no`,`employees`.`employees`.`last_name` AS `last_name`,'low_salary' AS `low_salary` from `employees`.`employees` semi join (`employees`.`salaries`) where ((`employees`.`salaries`.`emp_no` = `employees`.`employees`.`emp_no`) and (`employees`.`salaries`.`salary` < (/* select#3 */ select avg(`employees`.`salaries`.`salary`) from `employees`.`salaries`))) union /* select#4 */ select `employees`.`employees`.`emp_no` AS `emp_no`,`employees`.`employees`.`last_name` AS `last_name`,'high salary' AS `high salary` from `employees`.`employees` semi join (`employees`.`salaries`) where ((`employees`.`salaries`.`emp_no` = `employees`.`employees`.`emp_no`) and (`employees`.`salaries`.`salary` >= (/* select#6 */ select avg(`employees`.`salaries`.`salary`) from `employees`.`salaries`)))

First it puts member

union_result
 in the
query_block
  at the very top level:

EXPLAIN: {
  "query_block": {
    "union_result": {

The

union_result
 object contains information about how the result set of the
UNION
 was processed:

"using_temporary_table": true,
      "table_name": "<union1,4>",
      "access_type": "ALL",

And also contains the 

query_specifications
 array which also contains all the details about queries in the
UNION
:

"query_specifications": [
        {
          "dependent": false,
          "cacheable": true,
          "query_block": {
            "select_id": 1,
<skipped>
        {
          "dependent": false,
          "cacheable": true,
          "query_block": {
            "select_id": 4,

This representation is much more clear, and also contains all the details which the regular

EXPLAIN
misses for regular queries.

Conclusion:

EXPLAIN FORMAT=JSON
 not only contains additional optimization information for each query in the
UNION
, but also has a hierarchical structure that is more suitable for the hierarchical nature of the
UNION
.

by Sveta Smirnova at January 29, 2016 07:09 PM

MariaDB Foundation

MariaDB 10.1.11 now available

The MariaDB project is pleased to announce the immediate availability of MariaDB 10.1.11. See the release notes and changelog for details on this release. Download MariaDB 10.1.11 Release Notes Changelog What is MariaDB 10.1? MariaDB APT and YUM Repository Configuration Generator Thanks, and enjoy MariaDB!

The post MariaDB 10.1.11 now available appeared first on MariaDB.org.

by Daniel Bartholomew at January 29, 2016 07:01 PM

Peter Zaitsev

Percona XtraDB Cluster 5.6.28-25.14 is now available

How to calculate the correct size of Percona XtraDB Cluster's gcache

Percona XtraDB Cluster 5.6.26-25.12Percona is glad to announce the new release of Percona XtraDB Cluster 5.6 on January 29, 2016. Binaries are available from the downloads area or from our software repositories.

Percona XtraDB Cluster 5.6.28-25.14 is now the current release, based on the following:

All of Percona software is open-source and free, and all the details of the release can be found in the 5.6.28-25.14 milestone at Launchpad.

For more information about relevant Codership releases, see this announcement.

Bugs Fixed:

  • 1494399: Fixed issue caused by replication of events on certain system tables (for example, mysql.slave_master_info, mysql.slave_relay_log_info). Replication in the Galera eco-system is now avoided when bin-logging is disabled for said tables.
    NOTE: As part of this fix, when bin-logging is enabled, replication in the Galera eco-system will happen only if BINLOG_FORMAT is set to either ROW or STATEMENT. The recommended format is ROW, while STATEMENT is required only for the pt-table-checksum tool to operate correctly. If BINLOG_FORMAT is set to MIXED, replication of events in the Galera eco-system tables will not happen even with bin-logging enabled for those tables.
  • 1522385: Fixed GTID holes caused by skipped replication. A slave might ignore an event replicated from master, if the same event has already been executed on the slave. Such events are now propagated in the form of special GTID events to maintain consistency.
  • 1532857: The installer now creates a /var/lib/galera/ directory (assigned to user nobody), which can be used by garbd in the event it is started from a directory that garbd cannot write to.

Known Issues:

  • 1531842: Two instances of garbd cannot be started from the same working directory. This happens because each instance creates a state file (gvwstate.dat) in the current working directory by default. Although garbd is configured to use the base_dir variable, it was not registered due to a bug. Until garbd is fixed, you should start each instance from a separate working directory.

Help us improve our software quality by reporting any bugs you encounter using our bug tracking system. As always, thanks for your continued support of Percona!

by Alexey Zhebel at January 29, 2016 01:34 PM

January 28, 2016

Peter Zaitsev

Vote Percona Server in LinuxQuestions.org Members Choice Awards

Percona Server

Percona ServerPercona is calling on you! Vote Percona for Database of the Year in LinuxQuestions.org Members Choice Awards 2015. Help our Percona Server get recognized as one of the best database options for data performance. Percona Server is a free, fully compatible, enhanced, open source drop-in replacement for MySQL® that provides superior performance, scalability and instrumentation.

LinuxQuestions.org, or LQ for short, is a community-driven, self-help web site for Linux users. Each year, LinuxQuestions.org holds an annual competition to recognize the year’s best-in-breed technologies. The winners of each category are determined by the online Linux community!

You can vote now for your favorite products of 2015 (Percona, of course!). This is your chance to be heard!

Voting ends on February 10th, 2016. You must be a registered member of LinuxQuestions.org with at least one post on their forums to vote.

by Dave Avery at January 28, 2016 09:13 PM

Jean-Jerome Schmidt

Get all the insight on open source database management and infrastructure operations with Severalnines whitepapers

Whether you’re looking into ways to automate various aspects of administering your open source databases or to take better control of your data, we have the relevant whitepaper that will help you in your quest and hopefully provide you with good food for thought on how to achieve your database management objectives.

Management and Automation of Open Source Databases

As the adoption of open source databases, such as MySQL / MariaDB, PostgreSQL or MongoDB, increases in the enterprise, especially for mission-critical applications, so does the need for robust and integrated tools. Operational staff need to able to manage everything from provisioning, capacity, performance and availability of the database environment. This is needed to minimize the risk for service outages or poor application performance.

This whitepaper discusses the database infrastructure lifecycle, what tools to build (or buy) for effective management, database deployment options beyond Chef or Puppet, important aspects of monitoring and managing open source database infrastructures and how ClusterControl enables a systematic approach to open source database operations.

You may also be interested in our related blog series on:

All of our white papers can be downloaded here: http://severalnines.com/whitepapers

Happy clustering!

Blog category:

by Severalnines at January 28, 2016 08:57 PM

Peter Zaitsev

Setup a MongoDB replica/sharding set in seconds

MongoDB sharding

MongoDBIn the MySQL world, we’re used to playing in the MySQL Sandbox. It allows us to deploy a testing replication environment in seconds, without a great deal of effort or navigating multiple virtual machines. It is a tool that we couldn’t live without in Support.

In this post I am going to walk through the different ways we have to deploy a MongoDB replica/sharding set test in a similar way. It is important to mention that this is not intended for production, but to be used for troubleshooting, learning or just playing around with replication.

Replica Set regression test’s diagnostic commands

MongoDB includes a .js that allows us to deploy a replication set from the MongoDB’s shell. Just run the following:

# mongo --nodb
> var rstest = new ReplSetTest( { name: 'replicaSetTest', nodes: 3 } )
> rstest.startSet()
ReplSetTest Starting Set
ReplSetTest n is : 0
ReplSetTest n: 0 ports: [ 31000, 31001, 31002 ]	31000 number
{
	"useHostName" : true,
	"oplogSize" : 40,
	"keyFile" : undefined,
	"port" : 31000,
	"noprealloc" : "",
	"smallfiles" : "",
	"rest" : "",
	"replSet" : "replicaSetTest",
	"dbpath" : "$set-$node",
	"restart" : undefined,
	"pathOpts" : {
		"node" : 0,
		"set" : "replicaSetTest"
	}
}
ReplSetTest Starting....
[...]

At some point our mongod daemons will be running, each with its own data directory and port:

2133 pts/0    Sl+    0:01 mongod --oplogSize 40 --port 31000 --noprealloc --smallfiles --rest --replSet replicaSetTest --dbpath /data/db/replicaSetTest-0 --setParameter enableTestCommands=1
 2174 pts/0    Sl+    0:01 mongod --oplogSize 40 --port 31001 --noprealloc --smallfiles --rest --replSet replicaSetTest --dbpath /data/db/replicaSetTest-1 --setParameter enableTestCommands=1
 2213 pts/0    Sl+    0:01 mongod --oplogSize 40 --port 31002 --noprealloc --smallfiles --rest --replSet replicaSetTest --dbpath /data/db/replicaSetTest-2 --setParameter enableTestCommands=1

Perfect. Now we need to initialize the replicaset:

> rstest.initiate()
{
	"replSetInitiate" : {
		"_id" : "replicaSetTest",
		"members" : [
			{
				"_id" : 0,
				"host" : "debian:31000"
			},
			{
				"_id" : 1,
				"host" : "debian:31001"
			},
			{
				"_id" : 2,
				"host" : "debian:31002"
			}
		]
	}
}
 m31000| 2016-01-24T10:42:36.639+0100 I REPL     [ReplicationExecutor] Member debian:31001 is now in state SECONDARY
 m31000| 2016-01-24T10:42:36.639+0100 I REPL     [ReplicationExecutor] Member debian:31002 is now in state SECONDARY
[...]

and it is done!

> rstest.status()
{
	"set" : "replicaSetTest",
	"date" : ISODate("2016-01-24T09:43:41.261Z"),
	"myState" : 1,
	"members" : [
		{
			"_id" : 0,
			"name" : "debian:31000",
			"health" : 1,
			"state" : 1,
			"stateStr" : "PRIMARY",
			"uptime" : 329,
			"optime" : Timestamp(1453628552, 1),
			"optimeDate" : ISODate("2016-01-24T09:42:32Z"),
			"electionTime" : Timestamp(1453628554, 1),
			"electionDate" : ISODate("2016-01-24T09:42:34Z"),
			"configVersion" : 1,
			"self" : true
		},
		{
			"_id" : 1,
			"name" : "debian:31001",
			"health" : 1,
			"state" : 2,
			"stateStr" : "SECONDARY",
			"uptime" : 68,
			"optime" : Timestamp(1453628552, 1),
			"optimeDate" : ISODate("2016-01-24T09:42:32Z"),
			"lastHeartbeat" : ISODate("2016-01-24T09:43:40.671Z"),
			"lastHeartbeatRecv" : ISODate("2016-01-24T09:43:40.677Z"),
			"pingMs" : 0,
			"configVersion" : 1
		},
		{
			"_id" : 2,
			"name" : "debian:31002",
			"health" : 1,
			"state" : 2,
			"stateStr" : "SECONDARY",
			"uptime" : 68,
			"optime" : Timestamp(1453628552, 1),
			"optimeDate" : ISODate("2016-01-24T09:42:32Z"),
			"lastHeartbeat" : ISODate("2016-01-24T09:43:40.672Z"),
			"lastHeartbeatRecv" : ISODate("2016-01-24T09:43:40.690Z"),
			"pingMs" : 0,
			"configVersion" : 1
		}
	],
	"ok" : 1
}

There are many more commands you can run, just type rstest. and then press Tab twice to get the list. Follow this link if you need more info:

http://api.mongodb.org/js/current/symbols/_global_.html#ReplSetTest

What about sharding? Pretty similar:

> var shtest = new ShardingTest({ shards: 2, mongos: 1 })

This is the documentation link if you need more info:

http://api.mongodb.org/js/current/symbols/_global_.html#ShardingTest

It is important to mention that if you close the mongo shell where you run the commands, then all the spawned mongod will also shut down.


Mtools

mtools is a collection of tools and scripts that make MongoDB’s DBA lives much easier. It includes mlaunch, which can be used to start replicate sets and sharded systems for testing.

https://github.com/rueckstiess/mtools

The mlaunch tool requires pymongo, so you need to install it:

# pip install pymongo

You can also use pip to install mtools:

# pip install mtools

Then, we can just start our replica set. In this case, with two nodes and one arbiter:

# mlaunch --replicaset --nodes 2 --arbiter --name "replicaSetTest" --port 3000
launching: mongod on port 3000
launching: mongod on port 3001
launching: mongod on port 3002
replica set 'replicaSetTest' initialized.
# ps -x | grep mongod
10246 ?        Sl     0:03 mongod --replSet replicaSetTest --dbpath /root/data/replicaSetTest/rs1/db --logpath /root/data/replicaSetTest/rs1/mongod.log --port 3000 --logappend --fork
10257 ?        Sl     0:03 mongod --replSet replicaSetTest --dbpath /root/data/replicaSetTest/rs2/db --logpath /root/data/replicaSetTest/rs2/mongod.log --port 3001 --logappend --fork
10274 ?        Sl     0:03 mongod --replSet replicaSetTest --dbpath /root/data/replicaSetTest/arb/db --logpath /root/data/replicaSetTest/arb/mongod.log --port 3002 --logappend --fork

Done. You can also deploy a shared cluster, or a sharded replica set. More information in the following link:

https://github.com/rueckstiess/mtools/wiki/mlaunch


Ognom Toolkit

“It is a set of utilities, functions and tests with the goal of making the life of MongoDB/TokuMX administrators easier.”

This toolkit has been created by Fernando Ipar and Sveta Smirnova, and includes a set of scripts that allow us to deploy a testing environment for both sharding and replication configurations. The main difference is that you can specify what storage engine will be the default, something you cannot do with other to methods.

https://github.com/Percona-Lab/ognom-toolkit

We have the tools we need under “lab” directory. Most of the names are pretty self-explanatory:

~/ognom-toolkit/lab# ls
README.md  start_multi_dc_simulation  start_sharded_test  stop_all_mongo    stop_sharded_test
common.sh  start_replica_set	      start_single	  stop_replica_set  stop_single

So, let’s say we want a replication cluster with four nodes that will use PerconaFT storage engine. We have to do the following:

Set a variable with the storage engine we want to use:

# export MONGODB_ENGINE=PerconaFT

Specify where is our mongod binary:

# export MONGOD=/usr/bin/mongod

Start our 4 nodes replica set:

# ./start_replica_set
Starting 4 mongod instances
2016-01-25T12:36:04.812+0100 I STORAGE  Compression: snappy
2016-01-25T12:36:04.812+0100 I STORAGE  MaxWriteMBPerSec: 1024
2016-01-25T12:36:04.813+0100 I STORAGE  Crash safe counters: 0
about to fork child process, waiting until server is ready for connections.
forked process: 1086
child process started successfully, parent exiting
[...]
MongoDB shell version: 3.0.8
connecting to: 127.0.0.1:27001/test
{
	"set" : "rsTest",
	"date" : ISODate("2016-01-25T11:36:09.039Z"),
	"myState" : 1,
	"members" : [
		{
			"_id" : 0,
			"name" : "debian:27001",
			"health" : 1,
			"state" : 1,
			"stateStr" : "PRIMARY",
			"uptime" : 5,
			"optime" : Timestamp(1453721767, 5),
			"optimeDate" : ISODate("2016-01-25T11:36:07Z"),
			"electionTime" : Timestamp(1453721767, 2),
			"electionDate" : ISODate("2016-01-25T11:36:07Z"),
			"configVersion" : 4,
			"self" : true
		},
		{
			"_id" : 1,
			"name" : "debian:27002",
			"health" : 1,
			"state" : 5,
			"stateStr" : "STARTUP2",
			"uptime" : 1,
			"optime" : Timestamp(0, 0),
			"optimeDate" : ISODate("1970-01-01T00:00:00Z"),
			"lastHeartbeat" : ISODate("2016-01-25T11:36:07.991Z"),
			"lastHeartbeatRecv" : ISODate("2016-01-25T11:36:08.093Z"),
			"pingMs" : 0,
			"configVersion" : 2
		},
		{
			"_id" : 2,
			"name" : "debian:27003",
			"health" : 1,
			"state" : 0,
			"stateStr" : "STARTUP",
			"uptime" : 1,
			"optime" : Timestamp(0, 0),
			"optimeDate" : ISODate("1970-01-01T00:00:00Z"),
			"lastHeartbeat" : ISODate("2016-01-25T11:36:07.991Z"),
			"lastHeartbeatRecv" : ISODate("2016-01-25T11:36:08.110Z"),
			"pingMs" : 2,
			"configVersion" : -2
		},
		{
			"_id" : 3,
			"name" : "debian:27004",
			"health" : 1,
			"state" : 0,
			"stateStr" : "STARTUP",
			"uptime" : 1,
			"optime" : Timestamp(0, 0),
			"optimeDate" : ISODate("1970-01-01T00:00:00Z"),
			"lastHeartbeat" : ISODate("2016-01-25T11:36:08.010Z"),
			"lastHeartbeatRecv" : ISODate("2016-01-25T11:36:08.060Z"),
			"pingMs" : 18,
			"configVersion" : -2
		}
	],
	"ok" : 1
}

Now, just start using it:

rsTest:PRIMARY> db.names.insert({ "a" : "Miguel"})
rsTest:PRIMARY> db.names.stats()
{
	"ns" : "mydb.names",
	"count" : 1,
	"size" : 36,
	"avgObjSize" : 36,
	"storageSize" : 16384,
	"capped" : false,
	"PerconaFT" : {
[...]


Conclusion

When dealing with bugs, troubleshooting or testing some application that needs a complex MongoDB infrastructure, these processes can save us lot of time. No need of set up multiple virtual machines, deal with networking and human mistakes. Just say “I want a sharded cluster, do it for me.”

by Miguel Angel Nieto at January 28, 2016 07:09 PM

January 27, 2016

Peter Zaitsev

MongoDB revs you up: What storage engine is right for you? (Part 4)

MongoDB

MongoDBDifferentiating Between MongoDB Storage Engines: PerconaFT

In this series of posts, we discussed what a storage engine is, and how you can determine the characteristics of one versus the other:

“A database storage engine is the underlying software that a DBMS uses to create, read, update and delete data from a database. The storage engine should be thought of as a “bolt on” to the database (server daemon), which controls the database’s interaction with memory and storage subsystems.”

Generally speaking, it’s important to understand what type of work environment the database is going to interact with, and to select a storage engine that is tailored to that environment.

The first post looked at MMAPv1, the original default engine for MongoDB (through release 3.0). The second post examined WiredTiger, the new default MongoDB engine. The third post reviewed RocksDB, an engine developed for the Facebook environment.

This post will cover PerconaFT. PerconaFT was developed out of Percona’s acquisition of Tokutek, from their TokuDB product.

PerconaFT

Find it in: Percona Builds

PerconaFT is the newest version of the Fractal Tree storage engine that was designed and implemented by Tokutek, which was acquired by Percona in April of 2015. Designed at MIT, SUNY Stony Brook and Rutgers, the Fractal Tree is a data structure that aimed to remove disk bottlenecks from databases that were using the B-tree with datasets that were several times larger that cache.

PerconaFT is arguably the most “mature” storage engine for MongoDB, with support for document level concurrency and compression. The Fractal Tree was first commercially implemented in June of 2013 in TokuMX, a fork of MongoDB, with an advanced feature set.

As described previously, the Fractal Tree (which is available for MongoDB in the PerconaFT storage engine) is a write-optimized data structure utilizing many log-like “queues” called message buffers, but has an arrangement like that of a read-optimized data structure. With the combination of these properties, PerconaFT can provide high performance for applications with high insert rates, while providing very efficient lookups for update/query-based applications. This will theoretically provide very predictable and consistent performance as the database grows. Furthermore, PerconaFT typically provides, comparatively, the deepest compression rates of any of the engines we’ve discussed in this series.

An ideal fit for the PerconaFT storage engine is a system with varied workloads, where predictable vertical scaling is required in addition to the horizontal scaling provide MongoDB. Furthermore, the ability of PerconaFT to maintain performance while compressing – along with support for multiple compression algorithms (snappy, quicklz, zlib and lzma) – make it one of the best options for users looking to optimize their data footprint.

Conclusion

Most people don’t know that they have a choice when it comes to storage engines, and that the choice should be based on what the database workload will look like. Percona’s Vadim Tkachenko performed an excellent benchmark test comparing the performances of PerconaFT and WiredTiger to help specifically differentiate between these engines.

Part 1: Intro and the MMAPv1 storage engine.

Part 2: WiredTiger storage engine.

Part 3: RocksDB storage engine.

by Jon Tobin at January 27, 2016 08:13 PM

Percona CEO Peter Zaitsev discusses working remotely with Fortune Magazine

remote worker

working remotelyAs a company that believes in and supports the open source community, embracing innovation and change is par for the course at Percona. We wouldn’t be the company we are today without fostering a culture that rewards creative thinking and rapid evolution.

Part of this culture is making sure that Percona is a place where people love to work, and can transmit their passion for technology into tangible rewards – both personally and financially. One of the interesting facts about Percona’s culture is that almost 95 percent of its employees are working remotely. Engineers, support, marketing, even executive staff – most of these people interact daily via electronic medium rather than in person. Percona’s staff is worldwide across 29 countries and 19 U.S. states. How does that work? How do you make sure that the staff is happy, committed, and engaged enough to stay on? How do you attract prospective employees with this unusual model?

It turns out that not only does it work, but it works very well. It can be challenging to manage the needs of such a geographically diverse group, but the rewards (and the results) outweigh the effort.

The secret is, of course, good communication, an environment of respect and personal empowerment.

Percona’s CEO Peter Zaitsev recently provided some of his thoughts to Fortune magazine about how our business model helps to not only to foster incredible dedication and innovation, but create a work environment that encourages passion, commitment and teamwork.

Read about his ideas on Percona’s work model here.

Oh, and by the way, Percona is currently hiring! Perhaps a career here might fit in with your plans . . .

by Dave Avery at January 27, 2016 07:33 PM

MariaDB AB

MariaDB & Database Security

maria-luisaraviol

Data, Security, Requirements, Procedures, Access, System, etc. Picture is under creative commons, from https://www.flickr.com/photos/purpleslog/2907496392One of the key issues in 2016 for DBAs to tackle will be Database Security, mainly associated to the increasing adoption of public and private clouds, as well as mission critical applications running on open source databases in large Enterprises.

Database security is one of the key topics for all the major vendors in the MySQL and MariaDB ecosystem. Oracle has just released version 5.7 of MySQL, with more features for standard authentication and proxy users, long awaited by the Community. Enterprise customers can also benefit of a PAM authentication plugin that can support LDAP. Percona has improved its PAM plugin and it is very much focused on features that are related to security, naming audit.

The recent release of the 10.1 version of MariaDB has given it a significant boost in security features, available, as usual, to the whole Community.

The efforts of the MariaDB team for 10.1 and the development on 10.2 are focused on 5 specific areas:

  • Internal security and password check
  • PAM and LDAP authentication
  • Kerberos
  • User Roles
  • Database Encryption

Internal security and password check

With 10.1, MariaDB has introduced the Password Validation Plugin API. This means that it is now easy for users and contributors to create their own validator beyond what is already available. Does your organisation require a two-factor authentication provided by a selected vendor? It is now possible to implement it with relatively little effort. Some examples of the implementation with Google Authenticator are already available by Community contributors.

10.1 also provides ready-made plugins, such as the simple_password_check, where users can set simple checks like minimum length and mandatory characters, and the cracklib_password_check, where the criteria for a password in MariaDB must match the CrackLib checking library.

PAM and LDAP authentication

The PAM Authentication Plugin has been added to MariaDB long time ago (since 5.2). The plugin allows DBAs to set a database environment where users can share passwords from normal shell logins and other services. In addition to that, an integration with LDAP (using the pam_ldap shared library) allows DB users to authenticate against a LDAP server.

Kerberos

The Kerberos plugin has been in a development stage for quite a long period for MariaDB. The engineering team is now committed to add a production-ready version of this plugin for 10.2. The progress of this plugin can be followed on the public Jira for MariaDB Server here.

User Roles

User roles have been introduced in MariaDB 10.0 and they have been improved with 10.1. Now DBAs can set roles, i.e. they can bundle a set of privileges and associate them to a role, then they can grant a role and automatically grant the related privileges to a user or to a set of users. The SET DEFAULT ROLE statement has been added to 10.1, in order to define a default set of privileges for new users. Extra qualifiers are now available for the CREATE/DROP ROLE statements.

Database Encryption

Database encryption is probably the most important and interesting aspect for database security that is now available in MariaDB 10.1. New features include tablespaces, tables and logs encryption, a new key management file plugin and new parameters used to tune the encryption, such as rotation keys, table scrubbing, binary and relay log encryption.

These features raise MariaDB in database security to a level that goes beyond what has been commonly perceived for open source databases, and make 10.1 the most secure open source database for the Cloud.

About the Author

maria-luisaraviol's picture

Maria-Luisa Raviol is a Senior Sales Engineer with over 20 years industry experience.

by maria-luisaraviol at January 27, 2016 09:11 AM

January 26, 2016

Peter Zaitsev

Finding MySQL Table Size on Disk

MySQL table size

MySQL table sizeSo you want to know how much space a given MySQL table takes on disk. Looks trivial, right? Shouldn’t this information be readily available in the

INFORMATION_SCHEMA.TABLES
? Not so fast!

This simple question actually is quite complicated in MySQL. MySQL supports many storage engines (some of which don’t store data on disk at all) and these storage engines often each store data in different layouts. For example, there are three “basic” layouts that the InnoDB storage engine supports for MySQL 5.7, with multiple variations for

row_formats
 and two types of available compression.

So let’s simplify the situation: instead of a general question, let’s ask how to find the table size on disk for an InnoDB table stored in its own tablespace (as the parameter

innodb_file_per_table=1
 provides).

Before we get to the answer, let me show you the table size graph that I get by running

sysbench
 prepare (basically populating tables with multi-value inserts):

Click graphic to enlarge

This graphs shows the table size defined by

data_length
 plus
index_length
 
captured from
INFORMATION_SCHEMA.TABLES
. You would expect gradual table growth as data is inserted into it, rather than a flat table size followed by jumps (sometimes by 10GB or more).

The graph does not match how data is changing on disk, where it is growing gradually (as expected):

-rw-r----- 1 mysql mysql 220293234688 Jan 25 17:03 sbtest1.ibd
-rw-r----- 1 mysql mysql 220310011904 Jan 25 17:03 sbtest1.ibd
-rw-r----- 1 mysql mysql 222499438592 Jan 25 17:07 sbtest1.ibd

As we see from this experiment, MySQL does not really maintain live

data_length
 and
index_length
 
values,  but rather refreshes them periodically – and rather irregularly. The later part of the graph is especially surprising, where we see a couple of data refreshes becoming more regular. This is different from the first part of the graph which seems to be in line with statistics being updated when when 10 percent of the rows are changed.  (manual)

What makes it especially confusing is that there are other values such as

table_rows
,
data_free
 
or
update_time
 
 that are updated in the real time (even though I can’t imagine why table size related values would be any more difficult to maintain in real time!).

Is there way to get real time

data_length
 and
index_length
 updates as we query
information_schema
? There is, but it is costly.

To get

information_schema
 to provide accurate information in MySQL 5.7, you need to do two things: disable
innodb_stats_persistent
 
and enable
innodb_stats_on_metadata
 
– both of which come with significant side effects.

Disabling persistent statistics means InnoDB has to refresh the statistics each time the server starts, which is expensive and can produce volatile query plans between restarts. Enabling

innodb_stats_on_metadata
 makes access to
information_schema
 
slower, much slower, as I wrote few years ago.

Is there a better way? It turns out there is. You can look into the tablespaces information table using

INNODB_SYS_TABLESPACES
 to see the actual file size. Unlike
index_length
 
and
data_length
,
INNODB_SYS_TABLESPACES
 
is updated in real time with no special configuration required:

mysql> select * from INFORMATION_SCHEMA.INNODB_SYS_TABLESPACES where name='sbinnodb/sbtest1' G
*************************** 1. row ***************************
        SPACE: 42
         NAME: sbinnodb/sbtest1
         FLAG: 33
  FILE_FORMAT: Barracuda
   ROW_FORMAT: Dynamic
    PAGE_SIZE: 16384
ZIP_PAGE_SIZE: 0
   SPACE_TYPE: Single
FS_BLOCK_SIZE: 4096
    FILE_SIZE: 245937209344
ALLOCATED_SIZE: 245937266688
1 row in set (0.00 sec)

The great thing about using this table is that it also handles new “Innodb Page Compression” properly showing the difference between

file_size
  (which is the logical file size on disk) and
allocated_size
 
(which is space allocated for this file and can be significantly smaller):

mysql> select * from INFORMATION_SCHEMA.INNODB_SYS_TABLESPACES where name='sbinnodb/testcomp' G
*************************** 1. row ***************************
        SPACE: 48
         NAME: sbinnodb/testcomp
         FLAG: 33
  FILE_FORMAT: Barracuda
   ROW_FORMAT: Dynamic
    PAGE_SIZE: 16384
ZIP_PAGE_SIZE: 0
   SPACE_TYPE: Single
FS_BLOCK_SIZE: 4096
    FILE_SIZE: 285212672
ALLOCATED_SIZE: 113004544
1 row in set (0.00 sec)

Finally, let’s look into how different InnoDB compression variants impact the information provided in

information_schema
.   

If you use the old Innodb compression (Innodb Table Compression) you will see the compressed data size shown in

data_length
 and
index_length
 
as result. For example,
avg_row_length
 will be much lower than you would expect.

If you use the new InnoDB compression in MySQL 5.7 (Innodb Page Compression) you will see the values corresponding to file size, not allocated size as shown in

information_schema
.

Conclusion
Answering the trivial question “How much space does this table take on disk?” is really not a simple request in MySQL – look at the obvious place and you’re likely to get the wrong value. Look at

INFORMATION_SCHEMA.INNODB_SYS_TABLESPACES
 to get the actual file size value for InnoDB tables.  

by Peter Zaitsev at January 26, 2016 10:16 PM

Daniël van Eeden

When simple SQL can be complex

I think SQL is a very simple language, but ofcourse I'm biased.

But even a simple statement might have more complexity to it than you might think.

Do you know what the result is of this statement?
SELECT FALSE = FALSE = TRUE;
scroll down for the answer.



























The answer is: it depends.

You might expect it to return false because the 3 items in the comparison are not equal. But that's not the case.

In PostgreSQL this is the result:
postgres=# SELECT FALSE = FALSE = TRUE;
?column?
----------
t
(1 row)
So it compares FALSE against FALSE, which results in TRUE and then That is compared against TRUE, which results in TRUE. PostgreSQL has proper boolean literals.

Next up is MySQL:
mysql> SELECT FALSE = FALSE = TRUE;
+----------------------+
| FALSE = FALSE = TRUE |
+----------------------+
| 1 |
+----------------------+
1 row in set (0.00 sec)
This is similar but it's slightly different. The result is 1 because in MySQL TRUE and FALSE evalueate to 0 and 1. If you use BOOLEAN in your DDL this will be changed to tinyint(1). But note that the (1) is only the display width and doesn't change the storage space (tinyint is 1 byte).

And SQLite has yet another result:
sqlite> SELECT FALSE = FALSE = TRUE;
Error: no such column: FALSE
This is because SQLite doesn't have a boolean type and you're expected to use 0 and 1.
If we use the suggested solution we get the same result as with MySQL.
sqlite> SELECT 0 = 0 = 1;
1

What about the SQL standard?

There is a boolean literal in the SQL:1999 standard according to this Wikipedia article. Note that 1999 is 17 years ago. It is an optional feature so it isn't required. Note that a boolean can have 3 values according to the standard. It can be TRUE, FALSE or UNKNOWN. It suggests that the UNKNOWN literal may evaluate to NULL. Neither MySQL, PostgreSQL or SQLite implements the UNKNOWN literal.

What about commercial databases?

DB2, Oracle and SQL Server don't have a boolean type according to this webpage. For DB2 this has changed, according to this page from IBM BOOLEAN support was added in DB2 9.7.0. It supports TRUE, FALSE and NULL, but not UNKNOWN for what I can see.
Ingres 10.0 has full standards complient support for BOOLEAN according to their wiki.

Interestingly enough there are multiple suggestions about what to use when there is no boolean type: BIT, CHAR(1), NUMBER(1). This blogpost from Peter Zaitsev also lists another option: CHAR(0).

So even something simple as a boolean might be less portable than you might have thought it was.

But what about doing a real three-way compare in SQL?

One solution would be to use the & operator:
postgres=# SELECT FALSE::int & FALSE::int & TRUE::int;
?column?
----------
0
(1 row)
 
mysql [(none)] > SELECT FALSE & FALSE & TRUE;
+----------------------+
| FALSE & FALSE & TRUE |
+----------------------+
| 0 |
+----------------------+
1 row in set (0.00 sec)
 
sqlite> SELECT 0 & 0 & 1;
0

by Daniël van Eeden (noreply@blogger.com) at January 26, 2016 08:56 PM

Erkan Yanar

MariaDB AB

MariaDB Security and Encryption at London MySQL Meetup Group

maria-luisaraviol

MariaDB Security and Encryption at London MySQL Meetup GroupIn December 2015, MariaDB Evangelist Colin Charles was asked to present on MariaDB Security and Encryption at the London MySQL Meetup group. This blog is a summary of Colin’s presentation.

A few words about meetup groups

In December 2015, the London MySQL Meetup Group meeting took place at an amazing location: the Yoox Net-a-Porter Group offices at Westfield London Shopping Centre. A brilliant location and fantastic host: Yoox-Net-a-Porter not only sponsored the venue, but also offered great food and drinks (special thanks to them!).

The goal of the London MySQL Meetup Group is to keep up with the MySQL ecosystem awareness and it’s great to see rooms full of old and new faces every new meeting. Some of the group members work for the three major distribution companies, some are DBAs, some are developers, but the aim is to learn and to learn from each other’s experiences and this is really the amazing spirit of the group.

The meetings are normally scheduled every two months and at the end of every meeting the organiser requests the audience to suggests topics for the next upcoming Meetups. It’s then on Ivan Zoratti, the group organiser, to work on these suggestions and ensure that the group can have the right speakers presenting and covering the requested topics.

Volunteers are absolutely welcome and anybody who wants to share any kind of MySQL/MariaDB/Percona experience or report results of specific tests or benchmarks can apply and present.

In May 2015, the Meetup main topic was focused on the last Percona Live event in Santa Clara. The purpose of the meeting was to share with the community all the news, announcements and also some details of the upcoming releases of MySQL, Percona and MariaDB that have been presented in the Percona Live sessions.

Special guest for the evening was Colin Charles, Chief Evangelist at MariaDB Corporation who talked about MariaDB 10.1 features in deep. The audience showed great interest in the new features of MariaDB 10.1 and at the end of the meeting many participants requested to have a future Meetup focused on security and encryption in MariaDB 10.1. That’s why Colin was asked to return as a special guest to cover what MariaDB has developed in terms of security to back up enterprises to accomplish the security requirements.

Colin’s presentation on MariaDB 10.1 security and encryption

As requested, this time Colin started the presentation introducing MariaDB 10.1 encryption and how the MariaDB approach on database encryption was since the beginning focused on tablespace and table level encryption more than encrypting the whole database.

In MariaDB 10.1 GA tablespace encryption encrypts everything including the binary logs, temporary tables and the binlog caches (10.1.5).

Colin also explained the MariaDB engagement in providing the best possible encryption solution for the MariaDB users: several months were needed to ensure that the MariaDB encryption solution was properly tested and absolutely reliable, even if this causes some months delay in releasing MariaDB 10.1.

He enriched the presentation with practical examples and technical hints - see the slides here on slideshare.

Colin also introduced the key management plugin which is an encryption plugin that reads encryption keys from a file and key rotation solution that MariaDB has implemented. Another interesting topic which was discussed was related to encryption and compression (FusionIO or InnoDB compression for instance) that MariaDB does to compress first and then encrypt.

Encryption does not come out of the box, encryption has to be enabled in the configuration file. Unfortunately, at the moment some bits and pieces are still missing: encryption only works with InnoDB, XtraDB and Aria storage engines. Also, Galera encryption is not fully supported yet, and Xtrabackup, at the moment, does not read encrypted binary logs.

Colin highlighted especially the security plugins such as the password validation plugin, the audit plugin and the authentication plugin. Next MariaDB release will probably also include the Kerberos authentication plugin which is actually already completed and under testing.

Questions came up regarding how to switch on and off encryption, our Java connector and the interaction with the authentication plugins (MariaDB Java connector support both PAM and Kerberos plugin).

Next meetup will take place at The Lamb, one of the “traditional” places for these Meetups on the 17th of February, all the details can be found here.

References

https://mariadb.com/kb/en/mariadb/data-at-rest-encryption/
https://mariadb.com/kb/en/mariadb/password-validation/

Tags: 

About the Author

maria-luisaraviol's picture

Maria-Luisa Raviol is a Senior Sales Engineer with over 20 years industry experience.

by maria-luisaraviol at January 26, 2016 11:59 AM

January 25, 2016

Peter Zaitsev

EXPLAIN FORMAT=JSON has details for subqueries in HAVING, nested selects and subqueries that update values

EXPLAIN FORMAT=JSONOver several previous blog posts, we’ve already discussed what information the 

EXPLAIN FORMAT=JSON
 output provides for some subqueries. You can review those discussions here, here and here. EXPLAIN FORMAT=JSON shows many details that you can’t get with other commands. Let’s now finish this topic and discuss the output for the rest of the subquery types.

First, let’s look at the subquery in the 

HAVING
 clause, such as in the following example:

select count(emp_no), salary
from salaries
group by salary
having salary > ALL (select avg(s)
                     from (select dept_no, sum(salary) as s
                           from salaries join dept_emp using (emp_no) group by dept_no) t
                     )

This example prints the number of employees and their salaries, if their salary is greater than the average salary in their department.

EXPLAIN FORMAT=JSON
 provides a lot details on how this subquery is optimized:

mysql> explain format=json select count(emp_no), salary from salaries group by salary having salary > ALL (select avg(s) from (select dept_no, sum(salary) as s from salaries join dept_emp using (emp_no) group by dept_no) t)G
*************************** 1. row ***************************
EXPLAIN: {
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "3073970.40"
    },
    "grouping_operation": {
      "using_temporary_table": true,
      "using_filesort": true,
      "cost_info": {
        "sort_cost": "2557022.00"
      },
      "table": {
        "table_name": "salaries",
        "access_type": "ALL",
        "rows_examined_per_scan": 2557022,
        "rows_produced_per_join": 2557022,
        "filtered": "100.00",
        "cost_info": {
          "read_cost": "5544.00",
          "eval_cost": "511404.40",
          "prefix_cost": "516948.40",
          "data_read_per_join": "39M"
        },
        "used_columns": [
          "emp_no",
          "salary",
          "from_date"
        ]
      },
      "having_subqueries": [
        {
          "dependent": false,
          "cacheable": true,
          "query_block": {
            "select_id": 2,
            "cost_info": {
              "query_cost": "771970.25"
            },
            "table": {
              "table_name": "t",
              "access_type": "ALL",
              "rows_examined_per_scan": 3087841,
              "rows_produced_per_join": 3087841,
              "filtered": "100.00",
              "cost_info": {
                "read_cost": "154402.05",
                "eval_cost": "617568.20",
                "prefix_cost": "771970.25",
                "data_read_per_join": "94M"
              },
              "used_columns": [
                "dept_no",
                "s"
              ],
              "materialized_from_subquery": {
                "using_temporary_table": true,
                "dependent": false,
                "cacheable": true,
                "query_block": {
                  "select_id": 3,
                  "cost_info": {
                    "query_cost": "1019140.27"
                  },
                  "grouping_operation": {
                    "using_filesort": false,
                    "nested_loop": [
                      {
                        "table": {
                          "table_name": "dept_emp",
                          "access_type": "index",
                          "possible_keys": [
                            "PRIMARY",
                            "emp_no",
                            "dept_no"
                          ],
                          "key": "dept_no",
                          "used_key_parts": [
                            "dept_no"
                          ],
                          "key_length": "4",
                          "rows_examined_per_scan": 331570,
                          "rows_produced_per_join": 331570,
                          "filtered": "100.00",
                          "using_index": true,
                          "cost_info": {
                            "read_cost": "737.00",
                            "eval_cost": "66314.00",
                            "prefix_cost": "67051.00",
                            "data_read_per_join": "5M"
                          },
                          "used_columns": [
                            "emp_no",
                            "dept_no"
                          ]
                        }
                      },
                      {
                        "table": {
                          "table_name": "salaries",
                          "access_type": "ref",
                          "possible_keys": [
                            "PRIMARY",
                            "emp_no"
                          ],
                          "key": "PRIMARY",
                          "used_key_parts": [
                            "emp_no"
                          ],
                          "key_length": "4",
                          "ref": [
                            "employees.dept_emp.emp_no"
                          ],
                          "rows_examined_per_scan": 9,
                          "rows_produced_per_join": 3087841,
                          "filtered": "100.00",
                          "cost_info": {
                            "read_cost": "334520.92",
                            "eval_cost": "617568.35",
                            "prefix_cost": "1019140.27",
                            "data_read_per_join": "47M"
                          },
                          "used_columns": [
                            "emp_no",
                            "salary",
                            "from_date"
                          ]
                        }
                      }
                    ]
                  }
                }
              }
            }
          }
        }
      ]
    }
  }
}
1 row in set, 1 warning (0.00 sec)
Note (Code 1003): /* select#1 */ select count(`employees`.`salaries`.`emp_no`) AS `count(emp_no)`,`employees`.`salaries`.`salary` AS `salary` from `employees`.`salaries` group by `employees`.`salaries`.`salary` having <not>((`employees`.`salaries`.`salary` <= <max>(/* select#2 */ select avg(`t`.`s`) from (/* select#3 */ select `employees`.`dept_emp`.`dept_no` AS `dept_no`,sum(`employees`.`salaries`.`salary`) AS `s` from `employees`.`salaries` join `employees`.`dept_emp` where (`employees`.`salaries`.`emp_no` = `employees`.`dept_emp`.`emp_no`) group by `employees`.`dept_emp`.`dept_no`) `t`)))

We see that the subquery in the 

HAVING
 clause is not dependent, but cacheable:

"having_subqueries": [
        {
          "dependent": false,
          "cacheable": true,

It has its own query block:

"query_block": {
            "select_id": 2,

Which accesses table “t”:

"table": {
              "table_name": "t",
              "access_type": "ALL",
              "rows_examined_per_scan": 3087841,
              "rows_produced_per_join": 3087841,
              "filtered": "100.00",
              "cost_info": {
                "read_cost": "154402.05",
                "eval_cost": "617568.20",
                "prefix_cost": "771970.25",
                "data_read_per_join": "94M"
              },
              "used_columns": [
                "dept_no",
                "s"
              ],

Table “t” was also materialized from the subquery:

],
              "materialized_from_subquery": {
                "using_temporary_table": true,
                "dependent": false,
                "cacheable": true,
                "query_block": {
                  "select_id": 3,

Another kind of subquery is in the 

SELECT
 list. If we want to compare the salary of an employee with the average salary in the company, for example, we can use the query
select emp_no, salary, (select avg(salary) from salaries) from salaries
. Lets examine the 
EXPLAIN
 output:

mysql> explain format=json select emp_no, salary, (select avg(salary) from salaries) from salariesG
*************************** 1. row ***************************
EXPLAIN: {
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "516948.40"
    },
    "table": {
      "table_name": "salaries",
      "access_type": "ALL",
      "rows_examined_per_scan": 2557022,
      "rows_produced_per_join": 2557022,
      "filtered": "100.00",
      "cost_info": {
        "read_cost": "5544.00",
        "eval_cost": "511404.40",
        "prefix_cost": "516948.40",
        "data_read_per_join": "39M"
      },
      "used_columns": [
        "emp_no",
        "salary"
      ]
    },
    "select_list_subqueries": [
      {
        "dependent": false,
        "cacheable": true,
        "query_block": {
          "select_id": 2,
          "cost_info": {
            "query_cost": "516948.40"
          },
          "table": {
            "table_name": "salaries",
            "access_type": "ALL",
            "rows_examined_per_scan": 2557022,
            "rows_produced_per_join": 2557022,
            "filtered": "100.00",
            "cost_info": {
              "read_cost": "5544.00",
              "eval_cost": "511404.40",
              "prefix_cost": "516948.40",
              "data_read_per_join": "39M"
            },
            "used_columns": [
              "salary"
            ]
          }
        }
      }
    ]
  }
}
1 row in set, 1 warning (0.00 sec)
Note (Code 1003): /* select#1 */ select `employees`.`salaries`.`emp_no` AS `emp_no`,`employees`.`salaries`.`salary` AS `salary`,(/* select#2 */ select avg(`employees`.`salaries`.`salary`) from `employees`.`salaries`) AS `(select avg(salary) from salaries)` from `employees`.`salaries`

EXPLAIN FORMAT=JSON
 in this case shows that the subquery is part of the first
query_block
, not dependent and cacheable.

The last type of subquery I want to discuss is the subquery updating values. For example, I added a new column to the

titles
 table from the standard employees database:

mysql> alter table titles add column full_title varchar(100);
Query OK, 0 rows affected (24.42 sec)
Records: 0  Duplicates: 0  Warnings: 0

Now I want

full_title
 to contain both the department’s name and title, separated by a space. I can use 
UPDATE
 with the subquery to achieve this:

update titles
set full_title=concat((select dept_name
                       from departments
                       join dept_emp using(dept_no)
                       where dept_emp.emp_no=titles.emp_no and dept_emp.to_date='9999-01-01')
               ,' ', title)
where to_date = '9999-01-01';

To find out how it is optimized, we can use

EXPLAIN FORMAT=JSON
:

mysql> explain format=json update titles set full_title=concat((select dept_name from departments join dept_emp using(dept_no) where dept_emp.emp_no=titles.emp_no and dept_emp.to_date='9999-01-01') ,' ', title) where to_date = '9999-01-01'G
*************************** 1. row ***************************
EXPLAIN: {
  "query_block": {
    "select_id": 1,
    "table": {
      "update": true,
      "table_name": "titles",
      "access_type": "index",
      "key": "PRIMARY",
      "used_key_parts": [
        "emp_no",
        "title",
        "from_date"
      ],
      "key_length": "59",
      "rows_examined_per_scan": 442843,
      "filtered": "100.00",
      "using_temporary_table": "for update",
      "attached_condition": "(`employees`.`titles`.`to_date` = '9999-01-01')"
    },
    "update_value_subqueries": [
      {
        "dependent": true,
        "cacheable": false,
        "query_block": {
          "select_id": 2,
          "cost_info": {
            "query_cost": "1.35"
          },
          "nested_loop": [
            {
              "table": {
                "table_name": "dept_emp",
                "access_type": "ref",
                "possible_keys": [
                  "PRIMARY",
                  "emp_no",
                  "dept_no"
                ],
                "key": "PRIMARY",
                "used_key_parts": [
                  "emp_no"
                ],
                "key_length": "4",
                "ref": [
                  "employees.titles.emp_no"
                ],
                "rows_examined_per_scan": 1,
                "rows_produced_per_join": 0,
                "filtered": "10.00",
                "cost_info": {
                  "read_cost": "1.00",
                  "eval_cost": "0.02",
                  "prefix_cost": "1.22",
                  "data_read_per_join": "1"
                },
                "used_columns": [
                  "emp_no",
                  "dept_no",
                  "to_date"
                ],
                "attached_condition": "(`employees`.`dept_emp`.`to_date` = '9999-01-01')"
              }
            },
            {
              "table": {
                "table_name": "departments",
                "access_type": "eq_ref",
                "possible_keys": [
                  "PRIMARY"
                ],
                "key": "PRIMARY",
                "used_key_parts": [
                  "dept_no"
                ],
                "key_length": "4",
                "ref": [
                  "employees.dept_emp.dept_no"
                ],
                "rows_examined_per_scan": 1,
                "rows_produced_per_join": 0,
                "filtered": "100.00",
                "cost_info": {
                  "read_cost": "0.11",
                  "eval_cost": "0.02",
                  "prefix_cost": "1.35",
                  "data_read_per_join": "5"
                },
                "used_columns": [
                  "dept_no",
                  "dept_name"
                ]
              }
            }
          ]
        }
      }
    ]
  }
}
1 row in set, 1 warning (0.00 sec)
Note (Code 1276): Field or reference 'employees.titles.emp_no' of SELECT #2 was resolved in SELECT #1

We can see in this output that the subquery is dependent, not cacheable, and will be executed for each row that needs to be updated.

Conclusion:

EXPLAIN FORMAT=JSON
  provides various information about all kind of subqueries.

by Sveta Smirnova at January 25, 2016 10:18 PM

MariaDB Foundation

2015 in the MariaDB Foundation

The mariadb.org website had over one million page views in 2015, a growth of about 9% since 2014. Good growth has been visible all over the MariaDB ecosystem and we can conclude that 2015 was a successful year for MariaDB. Increased adoption MariaDB was included for the first time in an official Debian release (version […]

The post 2015 in the MariaDB Foundation appeared first on MariaDB.org.

by Otto Kekäläinen at January 25, 2016 10:31 AM

January 22, 2016

Peter Zaitsev

Peter Zaitsev webinar January 27th: Compression In Open Source Databases

Database Compression

CompressionPercona invites you to attend a webinar Wednesday, January 27th, with CEO Peter Zaitsev: Compression In Open Source Databases. Register now!

Data growth has been tremendous in the last decade and shows no signs of stopping. To deal with this trend database technologies have implemented a number of approaches, and data compression is by far the most common and important. Compression in open source databases is complicated, and there are a lot of different approaches – each with their own implications.

In this talk we will perform a survey of compression in some of the most popular open source database engines including: Innodb, TokuDB, MongoDB, WiredTiger, RocksDB, and PostgreSQL.

Important information:

Webinar: Compression In Open Source Databases

Presenter: Peter Zaitsev, CEO, Percona

Date: Wednesday, January 27, 2016

Time: 10:00am PST (UTC – 8)

Register now, and we look forward to seeing you there!

About Peter Zaitsev, CEO Percona:

Peter co-founded Percona in 2006, assuming the role of CEO. Percona helps companies of all sizes maximize their success with MySQL. Percona was named to the Inc. 5000 in 2013. Peter was an early employee at MySQL AB, eventually leading the company’s High Performance Group. A serial entrepreneur, Peter co-founded his first startup while attending Moscow State University where he majored in Computer Science. As CEO of Percona, Peter enjoys mixing business leadership with hands on technical expertise. Peter is co-author of High Performance MySQL published by O’Reilly, one of the most popular books on MySQL performance. Peter blogs regularly on MySQLPerformanceBlog.com and speaks frequently at conferences. Peter lives in North Carolina with his wife and two children. In his spare time, Peter enjoys travel and spending time outdoors.

by Dave Avery at January 22, 2016 11:44 PM

Percona Live Data Performance Conference 2016: news you need to know!

Percona Live

Percona LiveThe Percona Live Data Performance Conference 2016 is rapidly approaching, and we’re looking forward to providing an outstanding experience April 18-21 for all whom attend.

Percona Live is the premier event for the rich and diverse open source community and businesses that thrive in the MySQL and NoSQL marketplaces. Attendees include DBAs, sysadmins, developers, architects, CTOs, and CEOs representing organizations from industry giants such as Oracle to start-ups. Vendors increasingly rely on the conference as a major opportunity to connect with potential high-value customers from around the world.

Below are some highlights for the upcoming conference regarding the conference schedule, Tutorial sessions, Birds of a Feather talks, and Lightning talks.

Conference Schedule

Percona Live is packed with engaging sessions, helpful tutorials, and brief talks that will both enlighten and entertain attendees — featuring the best and brightest from the database and open source communities! Below are just a few of the exciting talks that will happen at the Percona Live Data Performance Conference 2016. (You can find the full schedule here.):

Session: Dirty Little Secrets

Speakers:   Jeremy Tinley, Sr. MySQL Operations Engineer, Etsy

Jenni Snyder, MySQL DBA, Yelp

Jonah Berquist, Database Infrastructure Engineer, GitHub

Geoffrey Anderson, Database Operations Engineer, Box

Silvia Botros, Sr. MySQL DBA, SendGrid

Shlomi Noach, Sr. Systems Engineer, GitHub

Session: What’s New in MySQL

Speakers:   Geir Høydalsvik, Sr. Software Development Director, Oracle

Simon Mudd, DBA, booking.com

Session: Espresso: LinkedIn’s distributed document store on top of MySQL

Speakers:   Yun Sun, Staff Software Engineer, LinkedIn

Eun-Gyu Kim, Staff Software Engineer, LinkedIn

Davi Arnaut Software Engineer, LinkedIn

Session: Shifting the Paradigm: MongoDB and the MEAN Stack

Speakers:   Kat Styons, Senior Full Stack Developer, The Washington Post

Sruti Cheedalla, Senior Web Developer, The Washington Post

 

The full schedule can be found here.

Tutorials

Percona Live tutorial sessions provided expert insight into various technology topics. The Percona Live tutorial schedule is also up, you can find it here.

Birds of a Feather

Birds of a Feather (BOF) sessions enable attendees with interests in the same project or topic to enjoy some quality face time. BOFs can be organized for individual projects or broader topics (e.g., best practices, open data, standards). Any attendee or conference speaker can propose and moderate an engaging BOF. Percona will post the selected topics and moderators online and provide a meeting space and time. The BOF sessions will be held Tuesday, April 19, 2016 at 6:00 p.m. The deadline for BOF submissions is February 7.

Lightning Talks

Lightning Talks provide an opportunity for attendees to propose, explain, exhort, or rant on any MySQL, NoSQL or data-in-the-cloud-related topic for five minutes. Topics might include a new idea, successful project, cautionary story, quick tip, or demonstration.   The deadline for submitting a Lightning Talk topic is February 7, 2016.

All submissions will be reviewed, and the top ten will be selected to present during one of the scheduled breakout sessions during the week. Lighthearted, fun or otherwise entertaining submissions are highly welcome.

 

We’re looking forward to seeing you at Percona Live!

 

by Kortney Runyan at January 22, 2016 05:08 PM

January 21, 2016

Peter Zaitsev

Tired of MySQL Making You Wait? Webinar: Questions and Answers

MySQLWe’d like to thank everybody for joining us on January 7th for our “Tired of MySQL Making You Wait?” webinar with Percona’s Alexander Rubin, Principal Consultant and SolarWinds’ Janis Griffin, Database Evangelist.

Too often developers and DBAs struggle to pinpoint the root cause of performance issues and then spend too much time in trying to fix them. In the webinar, we discussed how you can significantly increase the performance of your applications while also reducing database response time.

You can find an archived version of the webinar here.

Below are the questions that were asked during the webinar, with responses from Alex and Janis. If you need further clarification, please respond in the comments.

Thanks again, and we look forward to you joining us at our next webinar (with Percona CEO Peter Zaitsev), Compression In Open Source Databases!

 

Q: Are there special tuning tips for Galera Cluster?

A: Since Galera Cluster (Percona XtraDB Cluster) is based on MySQL, all query tuning tips will apply as well. There are a number of Galera Cluster configuration tips available: for example the blog post at this link talks about tuning the PXC for maximum write throughput: https://www.percona.com/blog/2015/06/03/optimizing-percona-xtradb-cluster-write-hotspots/

 

Q: Does DPA support Galera Cluster ?

A: Yes, DPA has the ability to group the cluster together to see load balancing, top 15 SQLs across the cluster, plus the top wait states.

 

Q: Can I create a covered index when I have “group by” and “order by” instructions together?

A: Yes, you can create a covered index and MySQL will use it to satisfy the query (you will see “using index”). If you have “group by” and “order by” on a different columns, however, MySQL will still have to perform a filesort and create a temp table. To create this index, specify all the following fields in your query in the index:

  1. All fields in the “where” condition
  2. The “group by” fields
  3. The “order by” fields
  4. The fields that the query is selecting.

Please note the limitations of such approach described here: http://dev.mysql.com/doc/refman/5.7/en/order-by-optimization.html

 

Q: Can we use DPA with Azure MySQL?

A: Yes, DPA will monitor, tune and analyze the SQL server performance running on Microsoft Azure.

 

Q: Do you know if MariaDB has or is planning to follow with these virtual fields and/or SYS schema enhancements from MySQL 5.7?

A: MariaDB has had virtual or computed columns since version 5.2. I don’t believe MariaDB comes with the sys schema already installed, but you can download and install it.

 

Q: Does DPA support PostgreSQL? If not, is it in the roadmap?

A: Currently, DPA does not support PostgresSQL. However, we continually re-evaluate it with each new release.

 

Q: Does DPA support RDS instances?

A: Yes, DPA supports the monitoring of RDS instances.

 

Q: Does the performance schema show any information about how the load data is performing?

A: MySQL 5.5 performance_schema became available in 5.5.3 and has only 11 tables. Most of the tables deal with wait events and file information. In addition, you would need turn on the consumers and enable the instrumentation of the wait events. Once you’ve done that, you will be able to see the threads and what they are waiting on.

 

Q: I didn’t understand the reasoning that leads to the index on ORDER BY. I can’t link it to the previous slide query.

A: I assume this question is about the ORDER BY + LIMIT optimization. When you create an index on the ORDER BY field only, MySQL can start reading the whole table in the order of the index. As the index is sorted, it can start fetching the rows and filter out the rows that don’t match the ORDER BY condition. As there is a LIMIT N on the query, MySQL will stop after fetching N rows.

 

Q: How can I analyze parts of a stored procedure that runs nightly to see where by bottlenecks are? It has 100+ update queries that it performs every night to build a table with one million plus rows.

A: You can do it using the slow query log in Percona Server (5.5/5.6) and/or Performance Schema in MySQL 5.7. If you are running Percona Server, you can enable extended stored procedures logging as described here: https://www.percona.com/doc/percona-server/5.6/diagnostics/slow_extended.html. Another way is using a deprecated “show profile” method as described here: https://www.percona.com/blog/2009/01/19/profiling-mysql-stored-routines/

 

Q: How will DPA use the index when there are more than five columns in the “where” conditions? How would you create indexes?

A: I would suggest checking the “cardinality” of the fields (= number of unique values). Usually (unless you create a covered index or are optimizing the group by) it makes much more sense to limit the number of fields in an index, and only include the fields with the high cardinality. For example, PRIMARY KEY or UNIQUE INDEX works best, whereas the “gender” field (with only two unique values, “male” and “female”) would not be very useful.

 

Q: How would the analytics tool work in an open stack VM environment, where we have 100 database servers?

A: One installation of DPA can monitor hundreds of database servers. In fact, we have several very large companies that monitor 1000s of servers worldwide.

 

Q: If you have a small table with only 100 records, is it worth creating indexes on specific fields or just do a table scan?

A: If the table is only 100 records and you are not joining it with other tables, it usually does not make sense to add indexes. But because the table is so small it doesn’t really matter either way.

 

Q: Is the SolarWinds tool better than MONyog, and how expensive is the license cost for this?

A: MONyog is also a monitoring tool, but it doesn’t have the advisors, alarms, granularity, history, or customizations that DPA gives you. The retail cost per server is currently $1,995 per monitored server, but is heavily discounted the more you purchase.

 

Q: In many cases, due to the randomness and complexity of queries thrown at various tables, I end up creating a lot of indexes. At what point would there be too many indexes? Should I then create MySQL views instead of indexes? Should one use MySQL views at all to optimize searches?

A: First of all there are no “materialized views” in MySQL, so it is not a useful replacement for indexes. You can create “summary” tables manually, which will usually help a lot. Although it is hard to say when you have too many indexes, lots of indexes can decrease the performance of your insert/update/delete operations, as well as confuse MySQL. So a great many indexes might cause MySQL to start choosing a wrong index when doing selects.

 

Q: Sometime, we need to add indices for different queries for the same table. Eventually, the table has too many indices. Any suggestion for such cases?

A: See the response to the previous question.

 

Q: Is there a way in DPA to see what queries are currently running? In other words, to know about slow queries as they run rather than only knowing about them historically?

A: Yes. In the “Current” dashboard, click the “Currently Active Sessions” box. With this option, you can sort by longest running, etc.

 

Q: Why is delay indexed in the composite key? It only covers the query, but the temp table can be avoided by the first two fields?

A: You are referring to this example:

mysql> alter table ontime_2012
add key covered(dayofweek, Carrier, DepDelayMinutes);
explain select max(DepDelayMinutes), Carrier, dayofweek from ontime_2012 where dayofweek =7 group by CarrierG
...                    
possible_keys: DayOfWeek,covered
          key: covered
          key_len: 2
          ref: const
          rows: 905138
          Extra: Using where; Using index

The reason we add DepDelayMinutes is to make the index covered, so MySQL will be able to satisfy the query with an index only.

 

by Alexander Rubin at January 21, 2016 03:33 PM

January 20, 2016

Jean-Jerome Schmidt

Managing MySQL Replication for High Availability

Join us on February 2nd for this new webinar on Managing MySQL Replication for High Availability led by Krzysztof Książek, Senior Support Engineer at Severalnines. This is part of our ongoing ‘Become a MySQL DBA’ series.

Deploying a MySQL Replication topology is only the beginning of your journey. Maintaining it also involves topology changes, managing slave lag, promoting slaves, repairing replication issues, fixing broken nodes, managing schema changes and scheduling backups. Multi-datacenter replication also adds another dimension of complexity. It is always good to be prepared up front and know how to deal with these cases.

In this webinar we will cover deployment and management of MySQL replication topologies using ClusterControl, show how to schedule backups, promote slaves and what are the most important metrics to keep a close eye on. We will also cover how you can deal with schema and topology changes as well as some of the most common replication issues.

Date & time

Europe/MEA/APAC

Tuesday, February 2nd at 09:00 GMT / 10:00 CET (Germany, France, Sweden)
Register Now

North America/LatAm

Tuesday, February 2nd at 09:00 Pacific Time (US) / 12:00 Eastern Time (US)
Register Now

Agenda

  • Deployment of MySQL replication topologies using ClusterControl
  • Schedule backups
  • Promote slaves
  • Important metrics to keep an eye on
  • Schema changes
  • Topology changes
  • Common replication issues

Speaker

Krzysztof Książek, Senior Support Engineer at Severalnines, is a MySQL DBA with experience managing complex database environments for companies like Zendesk, Chegg, Pinterest and Flipboard. This webinar builds upon recent blog posts and related webinar series by Krzysztof on how to become a MySQL DBA.

We look forward to “seeing” you there and to some good discussions!

To read our new MySQL Replication online tutorial, please visit:
http://severalnines.com/tutorials/mysql-replication-high-availability-tutorial

To view all the blogs of the ‘Become a MySQL DBA’ series visit:
http://www.severalnines.com/blog-categories/db-ops

To view all our webinar replays, please visit:
http://severalnines.com/webinars-replay

Blog category:

by Severalnines at January 20, 2016 07:02 PM

MariaDB Foundation

What’s new in MariaDB Connector/C 3.0 – Part I: SSL

New SSL alternatives SSL connections in previous versions of MariaDB Connector/C are based on the OpenSSL library. But because of the OpenSSL heartbleed bug, licensing problems and the lack of support for different transport layers, we decided to add support for alternative SSL implementations. That’s why Connector/C 3.0 can use not only OpenSSL, but also […]

The post What’s new in MariaDB Connector/C 3.0 – Part I: SSL appeared first on MariaDB.org.

by Georg Richter at January 20, 2016 03:49 PM

Peter Zaitsev

MongoDB revs you up: What storage engine is right for you? (Part 3)

MongoDBDifferentiating Between MongoDB Storage Engines: RocksDB

In this series of posts, we discussed what a storage engine is, and how you can determine the characteristics of one versus the other:

“A database storage engine is the underlying software that a DBMS uses to create, read, update and delete data from a database. The storage engine should be thought of as a “bolt on” to the database (server daemon), which controls the database’s interaction with memory and storage subsystems.”

Generally speaking, it’s important to understand what type of work environment the database is going to interact with, and to select a storage engine that is tailored to that environment.

The first post looked at MMAPv1, the original default engine for MongoDB (through release 3.0). The second post examined WiredTiger, the new default MongoDB engine.

This post will cover RocksDB. RocksDB builds on LevelDB, Google’s open source key value database library. It was designed to address several scenarios:

  1. Scale to run on servers with many CPU cores.
  2. Use fast storage efficiently.
  3. Be flexible to allow for innovation.
  4. Support IO-bound, in-memory, and write-once workloads.

RocksDB

Find it in: Percona Builds

RocksDB, designed originally at Facebook, uses LSM trees to store data, unlike most other storage engines which are using B-Trees.

LSM trees are designed to amortize the cost of writes: data is written to log files that are sequentially written to disk and never modified. Then a background thread merges the log files (compaction) into a tree like structure. With this design a single I/O can flush to disk tens or hundreds of write operations.

The tradeoff is that reading a document is more complex and therefore slower than for a B-Tree; because we don’t know in advance in which log file the latest version of the data is stored, we may need to read multiple files to perform a single read. RocksDB uses bloom filters and fractional cascading to minimize the impact of these issues.

As far as workload fit, RocksDB can provide very good insert and query performance while providing compression ratios that are typically better than wiredTiger and slightly worse than PerconaFT. Also, RocksDB is theoretically better than PerconaFT at keeping up with the frequent and heavy delete workloads that accompany TTL indexes in high insert workloads.

Percona is excited to offer enterprise support for RocksDB! RocksDB as part of our MongoDB support options: https://www.percona.com/services/support/rocksdb-support.

Conclusion

Most people don’t know that they have a choice when it comes to storage engines, and that the choice should be based on what the database workload will look like. Percona’s Vadim Tkachenko performed an excellent benchmark test comparing the performances of, PerconaFT and WiredTiger to help specifically differentiate between these engines.

In Part Four of this blog series, we’ll take a closer look at Percona’s MongoDB storage engine: PerconaFT.

Part 1: Intro and the MMAPv1 storage engine.

Part 2: WiredTiger storage engine.

by Jon Tobin at January 20, 2016 02:22 PM

Serge Frezefond

MariaDB and Native JSON support ?

A question raised by my previous post is : What about MariaDB and native JSON support ? In my previous post I mentioned the possibility to use the MariaDB CONNECT storage Engine to store and access JSON content in normal text field. Of course having a native JSON datatype brings more value. It introduces JSON [...]

by Serge at January 20, 2016 02:03 PM

January 19, 2016

Peter Zaitsev

Dealing with corrupted InnoDB data

MySQL

MySQLData corruption! It can happen. Maybe because of a bug or storage problem that you didn’t expect, or MySQL crashes when a page checksum’s result is different from what it expected. Either way, corrupted data can and does occur. What do you do then?

Let’s look at the following example and see what can be done when you face this situation.

We have some valuable data:

> select * from t limit 4;
+---+--------+
| i | c      |
+---+--------+
| 1 | Miguel |
| 2 | Angel  |
| 3 | Miguel |
| 4 | Angel  |
+---+--------+
> select count(*) from t;
+----------+
| count(*) |
+----------+
|  2097152 |
+----------+

One day the query you usually run fails and your application stops working. Even worse, it causes the crash already mentioned:

> select * from t where i=2097151;
ERROR 2006 (HY000): MySQL server has gone away

Usually this is the point when panic starts. The error log shows:

2016-01-13 08:01:48 7fbc00133700 InnoDB: uncompressed page, stored checksum in field1 2912050650, calculated checksums for field1: crc32 1490770609, innodb 1549747911, none 3735928559, stored checksum in field2 1670385167, calculated checksums for field2: crc32 1490770609, innodb 2416840536, none 3735928559, page LSN 0 130051648, low 4 bytes of LSN at page end 1476903022, page number (if stored to page already) 4651, space id (if created with >= MySQL-4.1.1 and stored already) 7
InnoDB: Page may be an index page where index id is 22
InnoDB: (index "PRIMARY" of table "test"."t")
InnoDB: Database page corruption on disk or a failed
InnoDB: file read of page 4651.
InnoDB: You may have to recover from a backup.
InnoDB: It is also possible that your operating
InnoDB: system has corrupted its own file cache
InnoDB: and rebooting your computer removes the
InnoDB: error.
InnoDB: If the corrupt page is an index page
InnoDB: you can also try to fix the corruption
InnoDB: by dumping, dropping, and reimporting
InnoDB: the corrupt table. You can use CHECK
InnoDB: TABLE to scan your table for corruption.
InnoDB: See also http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
InnoDB: Database page corruption on disk or a failed
InnoDB: file read of page 4651.
InnoDB: You may have to recover from a backup.
2016-01-13 08:01:48 7fbc00133700 InnoDB: Page dump in ascii and hex (16384 bytes):
 len 16384; hex ad925dda0000122b0000122affffffff0000000007c06e4045bf00000000000000000
[...]

OK, our database is corrupted and it is printing the page dump in ASCII and hex. Usually, the recommendation is to recover from a backup. In case you don’t have one, the recommendation would be the same as the one given by the error log. When we hit corruption, first thing we should try is dumping the data and then re-importing to another server (if possible). So, how we can read a corrupted TABLE and avoid the crash? In most cases, the 

innodb_force_recovery
  option will help us. It has values from 1 to 6. They are documented here:

http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html

The idea is to start with 1. If that doesn’t work, proceed to 2. If it fails again, then go to 3 . . . until you find a value that allows you to dump the data. In this case I know that the problem is a corrupted InnoDB page, so a value of 1 should be enough:

“Lets the server run even if it detects a corrupt page. Tries to make SELECT * FROM tbl_name jump over corrupt index records and pages, which helps in dumping tables.”

We add

innodb_force_recovery=1
 and restart the service. Now it’s time to try and dump our data with
mysqldump
. If the corruption is even worse you need to keep trying different modes. For example, I have this error:

> create table t2 like t;
> insert into t2 select * from t;
ERROR 1034 (HY000): Incorrect key file for table 't'; try to repair it
> insert into t2 select * from t;
ERROR 1712 (HY000): Index t is corrupted

innodb_force_recovery=1
 doesn’t work here. It doesn’t allow me to dump the data:

# mysqldump -uroot -pmsandbox --port 5623 -h 127.0.0.1 --all-databases > dump.sql
Error: Couldn't read status information for table t ()

but in my test server, it seems that

innodb_force_recovery=3
  helps.

This procedure sounds good and usually works. The problem is that the feature is mostly broken after 5.6.15.

innodb_force_recovery
 values greater or equal 4 won’t allow the database to start:

2015-07-08 10:25:25 315 [ERROR] Unknown/unsupported storage engine: InnoDB
2015-07-08 10:25:25 315 [ERROR] Aborting

Bug are reported and verified here: https://bugs.mysql.com/bug.php?id=77654

That means that if you have Insert Buffer, Undo Log or Redo log corruption (values 4, 5 and 6) you can’t continue. What to do?

  • You can install a older version of MySQL (previous to 5.6.15) to use higher values of
    innodb_force_recovery
    . Modes 4, 5 and 6 can corrupt your data (even more) so they are dangerous. If there are no backups this is our only option, so my recommendation would be to make a copy of the data we have now and then proceed with higher values of
    innodb_force_recovery
    .

or

  • If you are using Percona Server,
    innodb_corrupt_table_action
      can be used to dump the data. You can use the value “salvage”. When the option value is salvage, XtraDB allows read access to a corrupted tablespace, but ignores corrupted pages.

https://www.percona.com/doc/percona-server/5.6/reliability/innodb_corrupt_table_action.html

If you can’t still dump your data, then you should try more advance solutions like Undrop for InnoDB. Also, it would be good idea to start planning to create regular database backups.    :)

by Miguel Angel Nieto at January 19, 2016 08:06 PM

January 18, 2016

Erkan Yanar

Proposals Percona Live 2016


Ahoi, I made three proposals for Percona Live. (Even I know being a lonely (sniff) freelancer with no backup). 
At least I'm for sure an early adopter about the three topics I submitted and know for sure what I'm talking about *g*   

Even being Puppet certified. Ansible is my love. Never been that fast in doing automation \o/

This is a tutorial about Docker and Galera/MySQL. 


This talk is going to tell something about the "Docker Batteries" helping you to even bootstrap clusters etc. 

Feel free to vote for \o/

by erkan at January 18, 2016 11:52 PM

Valeriy Kravchuk

MySQL Support People - The Next Generation

My first post in this series caused quite an active discussion on Facebook. Readers correctly noted some mistakes in dates and missing names in the list. I've corrected some of the mistakes already and will correct some others later. What was also noted is that initially support for MySQL was provided by developers (this is really important, we'll get back to this later), and of them many had never even got a title of Support engineer. Of those who had not I listed only Monty...

I just want to explain why I made these mistakes and/or why I (intentionally) had not listed developers etc. First of all, I based my post on my memories (so I could be also wrong about start dates and many details) and that's why I could not list those whom I had never closely worked with. I worked with Monty while in Support and he was involved, to some extent, into my work on the problems that were interesting for me, and he provided support well before I joined. That's why he is in the list. My posts are not about the history of MySQL (or even the history of MySQL Support) - they are about MySQL Support as I was introduced to it in 2005 and about people who made it look and feel as I understand it now - the best, most efficient and useful service for MySQL users.

There are pure managers among these people, those who had never reported any MySQL bug in public probably and rarely (if at all) resolved any pure technical problems for customers. They well deserve separate posts, but here I want to name two of them who played the most important role from my point of few: Tom Basil (Director of MySQL Support in MySQL and Sun till 2008) and Peter Farkas (Director of Support in Percona till 2015).

Now it's time to get back to the topic of this post. "The Next Generation" are engineers who started to provide support for MySQL some time in 2005 or later, when the system as I remember it (created mostly by those named in the first post led by Tom Basil) was already in place, with Eventum as an issue tracking system, public bugs database managed by Support, SOP in place in Wiki, SSC duties defined, and so on. I probably have to include myself into this new generation as well, but this series is not about me.

So, these great engineers joined MySQL Support in MySQL and/or Sun (that had not changed that much in the way we worked, mostly accepted and appreciated our experience) after me or before me, but whom I consider the "next generation" for sure. There are really famous people in MySQL world among them:
  • Geert Vanderkelen - he actually joined in April, 2005, so before me, as Senior Support engineer and MySQL Cluster expert. I just do not understand why I missed him in the previous list and no one corrected me. His list of public bugs reported (I see 171!) is quite impressive. He keeps reporting bugs in public, check Bug #79621, the last one he recently reported. Some day in 2011 he got tires of Support (this happens to many creative people) and moved to development where he works on his Connector/Python project, now in Oracle.
  • Mark Leith - he had became an essential team member immediately when joined, some day in 2005 after me. With this Oracle database background he helped to make the entire MySQL more aware about Oracle database, related tools and approaches (before him there was only Peter Zaitsev who created many worklogs obviously inspired by Oracle, and me, "Oracle agent in MySQL", the rest of the company mostly pretended Oracle never existed, non-standard or they do not care about it, at least until Oracle acquired InnoDB at the end of 2005...). He was a manager of AMER Support team in MySQL and Sun. Mark had reported a lot of bugs in public (I see 177) both while in Support and later when he led MySQL Enterprise Monitor development. Check one of his recent bug reports, Bug #76049. For me he is also a "godfather" of Performance_Schema and father of sys schema in MySQL. Mark is a Senior Software Development Manager in Oracle now and still a very active community member contributing a lot with his blog and presentations.
  • Domas Mituzas - he is all times the greatest MySQL Support engineer I had ever worked with. I probably had to stop at this phrase, but as one of my main job duties used to be "PR manager for Domas" once, I'll continue. He managed to assign 14 issues per single 8 hours shift back in 2008, and resolve 7 or 8 of them completely by the end of that shift. During his first days in MySQL Support in 2005 he managed to teach customer from China how to use Chinese properly in Oracle RDBMS, while doing MySQL support. He managed MySQL in Wikipedia and had his own fork of MySQL maintained before most of current forks appeared. He reported some 53 or so MySQL bugs in public (check also his older account at bugs database), my all times favorite being Bug #12113, that allowed me to resolve at least a dozen of customer issues over years and was fixed somehow only in MariaDB 10.1 recently. He is well known as a speaker about MySQL and cool blogger. Check this his famous post about query cache! His work probably requires a book to describe it, not a post, and he is mentioned in the books. Domas, who joined MySQL Support in October, 2005 is a small data engineer in Facebook since 2009. I am proud that I worked in one team with him, honestly.
  • Chris Calender - he joined probably soon after me and he still provides support in MariaDB. He is an active blogger and bug reporter (I see 69 reports). Check his relatively recent Bug #71395. His posts on building MySQL and MariaDB from source on Windows helped me a lot back in 2012. I always had a huge respect to everything he did.
  • Morgan Tocker - now a MySQL Product Manager in Oracle (this seems to be a popular career path among former MySQL Support engineers...), Morgan joined MySQL Support back in January, 2006. He worked on MySQL in different roles and companies since that, including Percona where he was Director of Training, returning back to Oracle for MySQL Community Manager role. He is an active blogger and bug reporter (I see 41 public bug reports from him). Check his old Bug #28338 that is still "Verified".
  • Shawn Green - he joined MySQL in 2006 and had been a very reliable team member in AMER Support team from the very beginning, great SSC (no wonder, he served in the US Navy) and helpful friend to many of us. He had reported 25 bugs in public as far as I can see. Check his latest still "Verified" public feature request, Bug #74640. He is still a MySQL Senior Principal Technical Support Engineer in Oracle, even though he has enough problems to deal with outside of MySQL during last years.
  • Kyle Joiner - he joined in 2007 based on Shawn's recommendation and with the goal to help us with DB2 Storage engine support (MySQL used to work on IBM System i and use its DB2 database as a storage engine back then). He is probably the strongest MySQL Support engineer, literally. He was a great engineer and SSC in AMER Support (I always felt safe as weekend SSC knowing that Kyle is supposed to be on call, he was always ready to help). Kyle had reported 14 MySQL bugs in public. Check his Bug #48049 that is still "Verified". He quit Oracle to join MariaDB recently, where he is still a Support Engineer.
  • Sean Pringle - he joined MySQL Support in 2006 and worked in APAC Team that I always tried my best in cooperation with. He was a great Support Engineer and also was my guide in San Francisco and Cupertino when I visited USA for the first time in 2007. He left us in 2009 and worked in SkySQL later. I try to keep in touch and still hope to continue working with him one day in one team.
  • Tonči Grgin - he joined MySQL in March, 2006, during last (for me, in MySQL AB life) company meeting in Sorrento. We met in the airport of Munich and later Sinisa, who knew him for decades as a developer and MySQL user, just told me that I have to teach Tonči everything for him to be ready to work in the Bugs Verification Team by the end of the meeting. I did that, and he was my first successful student in MySQL Support. His is my family friend and I had an honor to visit his home near Zagreb and his father's home in Split. Tonči worked on processing bugs for various connectors and complex support issues related to connectors, and later, already in Oracle, he moved to development, to fix those connectors finally, and eventually he became a manager of MySQL Connectors team in Oracle. As far as I know that team recently disappeared and he is now working in Oracle on MySQL performance improvements on Windows. He was always active not only in processing bugs, but also in reporting them, so I see 67 he had reported in public. Check Bug #48228, still "Verified".
  • Sveta Smirnova - she joined probably in May, 2006, but back in Sorrento we were already discussing steps to hire her. As I insisted we should get her into the Bugs Verification Team, I've  got a task to be her mentor, and forced her to work on bugs even before she joined. Long story in short, she managed to follow my steps and do everything I did, but better, in MySQL, Sun, Oracle and now Percona, where she is my colleague and also a Principal Support Engineer. She also do a lot more than I did, as a developer, blogger and presenter at various conferences for years. She is well known as an author of a book on MySQL Troubleshooting published in 2012 by O'Reilly. She is famous. Sveta reported a lot of bugs, check also her older account, 256 in total. Check her latest bug report, Bug #79596. Awful bug, really... To summarize, I am proud to work with her these days, and she is a key member of our team in Percona now.
  • Todd Farmer had reported 327(!) bugs in public and is a very active blogger. He is one of they key contributors to MySQL Community for sure. He joined MySQL as a Support Engineer back in August, 2006 and then was AMER team manger. He was brave enough to step in as a Director of MySQL Support in Oracle back in 2010, and while he is not in my list of top 3 Directors of Support of all times, he still did a lot of useful things for MySQL Support in Oracle. Since 2013 Todd is a Director, Technical Product Management, MySQL in Oracle.
  • Matthey Montgomery - yet another great MySQL Cluster Support engineer and public bug reporter (I see 106 bug reports from him), he joined us in August, 2006. He is a Principal Technical Support Engineer in Oracle now and still deals with NDB. This is incredible! Check Bug #20924, one of his first bug reports that is still "Verified". MySQL had never been able to deal with data types properly, especially in explicit and implicit conversions... Topic for another post, sorry.
  • Johan Idrén - he joined MySQL Support in November, 2006, and worked on all kinds of issues initially, being very productive. Later he specialized in Merlin (a.k.a. MySQL Enterprise Manager now) and when Oracle acquired Sun he decided to move to SkySQL to continue doing MySQL Support. Recently he works as a Systems Engineer in DICE (EA Digital Illusions CE AB). I see 22 public bugs that he had reported, including Bug #28331 that is still "Verified". 
  • Susanne Ebrecht (now Holzgraefe) - yes, we have a family of MySQL support engineers who once worked in the company while not yet a family. Susanne joined us in March, 2007, and is actually of PostgreSQL community origin! She was an active bug reporter, I see 165 bugs she reported, so eventually she ended up in Bugs Verification team where we worked together. She worked mostly on bugs for GUI tools, and one of them is still "Verified", see Bug #55497. Susanne quit with her future husband Hartmut at the end of 2010, when they had either to join Oracle or move elsewhere. She is a consultant now.
  • Andrew Hutchings (a.k.a. Linuxjedi) - I thought he was with us in MySQL AB already, but quick check shows that he joined as MySQL Support Engineer for Sun Microsystems only in 2008. Andrew worked on very complex MySQL Cluster and C API issues. He also developed new features and bug fixes for MySQL and MySQL Cluster in his spare time. Andrew was one of only a few MySQL engineers to win a Sun employee recognition award. Later he had really hard times with Oracle and eventually quit to work on Drizzle. Andrew is a Technical Product Manager at NGINX, Inc now.
  • Gustaf Thorslund - he joined us in October 2007 and was Support Engineer specialized in MySQL Cluster and NDB API. He is probably the best one in anything related to NDB API. With long  breaks he took for parental leave, study etc, he is still working as a Senior (still?) Technical Support Engineer in Oracle. We always had a lot of fun discussing life, work, Ada programming and other matters (excluding NDB that I try to avoid by all means). I see he had reported 17 bugs in public, but they are all closed by now.
  • Gary Pendergast - I am not sure when he joined us, but, as far as I remember, some time after the meeting in Riga (and Sun's acquisition of MySQL) he had quit. Probably it was early 2009 or so. He was a good Support Engineer in APAC Support Team and he is still working with MySQL. Check his blog.
  • Andrii Nikitin - he probably worked in MySQL as developer (of GUI tools) in 2007 already. Later he moved to Support and quickly became a great colleague for us. Switching roles is hard when you are already good in something, but he managed to deal with that as well as with many other hard problems in his life. He actively worked on bugs and I can see 86 bugs he had reported in public. Check his latest public bug that is still "Verified", Bug #77789. He is a Senior Principal Technical Support Engineer in Oracle now and is a core hard working EMEA Support team member.
  • Leandro Morgado - he joined us in Sun in 2008 and worked really well from day one in EMEA team. He used to be active bug reporter and I see 46 his public bug reports, including Bug #43523 that is still "Verified". He is still a Senior(?) Technical Support Engineer in Oracle.
  • Ligaya Turmelle - she joined us in April, 2008 and is well known as a full time Goddess, part time MySQL DBA, occasional PHP programmer and active member of both MySQL and PHP communities as a speaker at various conferences. She probably had several accounts in the public bugs database, but I was able to identify only two of them quickly. Check her Bug #67056 - it's full of words I am trying to avoid by all means, like NDB and memcached... I've got a chance to met her in person at Percona Live 2015 in Santa Clara and we had a drink (or two) together. She is a Principal Technical Support Engineer in Oracle now.
  • Rene' Cannao' - he joined us in Sun in August, 2008, and I remember him as a very capable and reliable Support Engineer in EMEA Team from day one. He had several accounts in public bugs database, so it's hard to aggregate all bugs he had reported over years, but there were many. Check his recent regression Bug #78680. He seems to be working for Dropbox now and also develops his ProxySQL software.
  • Trent Lloyd - he joined MySQL back in June, 2007 and since that time he works in Support (now he is a MySQL Principal Technical Support Engineer in Oracle), often on complex issues including those related to HA, DRBD and Pacemaker and tuning Linux for MySQL. I see 13 his public bug reports, including Bug #43775 that is still "Verified".
  • Mikiya Okuno - he joined MySQL in October, 2007, and was one of those few engineers who ran Japanese language support in MySQL (yes, we did that, with a separate project in Eventum and separate team of engineers in Japan that I worked with a lot!). He joined us from Sun to be soon aquired back by Sun with the rest of MySQL. Mikiya had reported 107 public bug for MySQL software. Check his Bug #65035 that is still "Verified". Mikiya is a MySQL Technical Analyst at Oracle now.
  • Meiji Kimura - he joined MySQL Support in September, 2007 and is a Support Engineer in MySQL KK, Sun and Oracle since that time. He is an active bug reporter and seems to have several accounts in public bugs database. Check his Bug #80018 opened today!
  • Roel Van de Paar leads QA in Percona now. With his 296 public bug reports he is a key contributor to MySQL quality in general. His first public bug report, Bug #41908, is still "Verified" and not fixed. Roel joined MySQL Support (already in Sun) in 2008 and was one of the key support providers in APAC team before he switched to QA. Roel contributes to community (and MySQL Support, indirectly) also with his numerous great posts in Percona's blog.
  • Oli Sennhauser - he was a MySQL Consultant from 2006 or so, but later in Sun he worked in MySQL Support for a couple of years, and I think he was great in this role. He reported 24 bugs recently (for some reason I think he had a second account as well), check the latest of them still "Verified", Bug #78898. Now Oli is a Senior MySQL Consultant and CTO at FromDual GmbH.
  • Bogdan Kecman - he joined us in October, 2007, specialized in MySQL Cluster support and quickly became one of the most useful engineers in this area. His knowledge of hardware also always impressed me. He is still in Oracle and now is probably a member of my former Bugs Verification Team there, as I see his name as bug assignee from time to time. He was an active bug reporter, with 44 bugs reported in public. Check Bug #44760, still "Verified".
  • Ben Krug - he joined back in December, 2007 and played mostly SSC and later TAM role for Facebook. He is a Senior MySQL Support Engineer in Oracle now. Ben was active enough bug reporter, but it's a bit hard to consolidate different accounts he seemed to use over these years to count all bugs. Check his Bug #70056 that I consider really serious for many users and customers who plan to upgrade.
  • Jonathon Coombes - he joined us in Sun in July 2008 as Senior MySQL Support Engineer, mostly to work on MySQL Cluster and MySQL Enterprise Monitor issues. He still works in Oracle and does more or less the same job as Principal MySQL Support Engineer. I remember him as a great team mate. He reported 45 bugs for MySQL software in public, including Bug #48789 that is still "Verified".
  • Roger Nay - he joined us in Sun in October, 2008. He worked a lot on MySQL Enterprise Monitor, as obvious from the list of his 42 public bug reports, including Bug #57574. But he was good in other MySQL software as well. Roger still works as a Senior (?) MySQL Support Engineer in Oracle. Not sure what is his current focus of interests though.
I have to stop for now. The post is already huge... This list is surely not complete (I do remember some other names, for example, but fail to find public bugs they reported) and I am ready to correct it and add new names based on your feedback provided either here in comments or on Facebook.

Stay tuned, next time I'll list those who joined MySQL Support only in Oracle, but still contributed (and still contributing) a lot for MySQL Community.

by Valeriy Kravchuk (noreply@blogger.com) at January 18, 2016 04:56 AM

January 15, 2016

Peter Zaitsev

Making Apache Spark Four Times Faster

Apache SparkThis is a followup to my previous post Apache Spark with Air ontime performance data.

To recap an interesting point in that post: when using 48 cores with the server, the result was worse than with 12 cores. I wanted to understand the reason is was true, so I started digging. My primary suspicion was that Java (I never trust Java) was not good dealing with 100GB of memory.

There are few links pointing to the potential issues with a huge HEAP:

http://stackoverflow.com/questions/214362/java-very-large-heap-sizes
https://blog.codecentric.de/en/2014/02/35gb-heap-less-32gb-java-jvm-memory-oddities/

Following the last article’s advice, I ran four instances of Spark’s slaves. This is an old technique to better utilize resources, as often (as is well known from old MySQL times) one instance doesn’t scale well.

I added the following to the config:

export SPARK_WORKER_INSTANCES=4
export SPARK_WORKER_CORES=12
export SPARK_WORKER_MEMORY=25g

The full description of the test can be found in my previous post Apache Spark with Air ontime performance data

The results:

Click graphic to enlarge

Click graphic to enlarge

Although the results for four instances still don’t scale much after using 12 cores, at least there is no extra penalty for using more.

It could be that the dataset is just not big enough to show the setup’s full potential.

I think there is a clear indication that with the 25GB HEAP size, Java performs much better than with 100GB – at least with Oracle’s JDK (there are comments that a third-party commercial JDK may handle this better).

This is something to keep in mind when working with Java-based servers (like Apache Spark) on high end servers.

by Vadim Tkachenko at January 15, 2016 10:52 PM

January 14, 2016

Peter Zaitsev

OpenSSH CVE-2016-0777: Details and Mitigation

OpenSSH

OpenSSHEarlier today advisories were sent out regarding OpenSSH versions 5.4 through 7.1., informing users about a security bug in the software. In essence, the advisory instructed people to add the  

UseRoaming no
 option to their ssh_config file, with a promise for further information to be made available shortly.

 

The post on the security issue at OpenBSD Journal can be seen here: http://undeadly.org/cgi?action=article&sid=20160114142733

This information was then later released detailing the issue and the implications here: http://www.openssh.com/txt/release-7.1p2

The statement below summarized the main issue:

The matching server code has never been shipped, but the client code was enabled by default and could be tricked by a malicious server into leaking client memory to the server, including private client user keys.”

So what does this all mean? Simply speaking, this means a malicious or compromised server could potentially retrieve the users private SSH keys from memory. The stolen keys could then be used to authenticate against servers.

(2FA helps to protect servers from the use of stolen keys, however this is not in as widespread use as it should be.)

The short summary is in lieu of an update to the software, you can use the following mitigation options to protect yourself:

  1. In your ~/.ssh/config:
    Host * UseRoaming no
  2. In your ssh_config:
    Linux: /etc/ssh/ssh_config OSX: /private/etc/ssh/ssh_config
  3. On each CLI execution:
    ssh -oUseRoaming=no <hostname>
        

Personally, I’ve used a combination of 1 and 2, as often a ~/.ssh/config cleanup is required. Make sure and check that your OpenSSH is correctly configured, and keep watching for updates.

by David Busby at January 14, 2016 08:04 PM

Prometheus as an Engine for MySQL Monitoring

prometheusWhen I first discovered Graphite years ago, I was very impressed with its monitoring capabilities.  Compared to many RRD-based tools that were popular at the time (like Cacti), Graphite separated the captured data and graphs, allowing you to do all kinds of math and transformations while visualizing data. For example, I could plot the relationship between system queries and disk IO, and capture how the system was getting more IO bound over time. It also had reasonably high performance, allowing me to capture high-resolution data for medium-sized systems.

Just last year I discovered Prometheus, and it also impressed me. I think it has the potential to take Graphite’s flexibility to the next level. Though I am in no way a Prometheus expert, I  want to share my understanding and thoughts on it so far.

Data Model

The data model is perhaps what attracted me to Prometheus the most. While it’s not obvious at first, when you do figure it out it has fantastic flexibility.

In the data model used by Whisper and Carbon in Graphite, you will use something like this to store MySQL data:

myapp.store.mysql.db01.status.questions = 5000

You can set up any hierarchy you like, but it has to have a hierarchy.

What Prometheus does instead is allow you to use a set of key-value pairs. The same data shown above could be presented like this:

questions_total{app=”myapp”,subsystem=”store”,engine=”mysql”,host=”db01”, source=”status”} = 5000

(You most likely wouldn’t use this exact structure, but it’s good for illustration.)

The difference between these approaches it that Prometheus provides you multiple dimensions on which you can filter and aggregate, plus you can add those dimensions later as you need them (without needing to redesign your tree hierarchy).

These labels are very dynamic, and I can change them in a second. For example, a MySQL server reporting as a “Master” might start reporting as a “Slave” in the very next second, and its data will be aggregated differently.

This is especially important in the modern, often cloud-based and virtualized world. For example, using Prometheus it is very easy to tag servers by their region or availability zones. I can also do things like compute MySQL space usage by both the database and storage engine. The possibilities are endless.

Data Capture

Unlike Graphite – where the main model is push and the hosts themselves choose what kind of information they want to push to monitoring system and at which intervals – with Prometheus you set up “Exporters” that have the ability to export the data. It is up to the Prometheus server configuration to choose what data to sample and how frequently.

The clear benefit of Prometheus’ approach is that you can have as many servers as you like pulling the data, so it is very easy to create a development version of your system and play around with it – without affecting production. It also provides a simple pathway to high availability.

(Both the push and pull approaches have their benefits and drawbacks. Brian Brazil wrote an excellent article advertising the pull model of monitoring.)

Prometheus does create a few challenges for me. Unless I want to set up Service Discovery, it is a hassle to monitor any development/test VMs I might spin up (that would otherwise not be open to external access at all). While this isn’t the main use case for Prometheus, it is helpful for me to test the dashboard’s behavior with different operating systems, workloads, etc.

A more significant issue I discovered is dealing with some data that can’t be captured to multiple locations, because the data capture causes the data to change.

Here is specific example: if I look at the

events_statements_summary_by_digest
 table in
PERFORMANCE_SCHEMA
, there is a
MAX_TIMER_WAIT
 
field that shows me what the maximum query execution time is for the query pattern. If I want to get the maximum query execution time for every minute, for example, I would need to “truncate” the table to reset the statistics and let the maximum value be computed again. If I don’t perform that operation, the data becomes meaningless. If I make the exporter to reset the statistics during the poll, however, I can’t pull it from two Prometheus servers.

This is one instance where Prometheus’ performance schema design could be better. I could set up a Cron job or Event to clear out the statistics regularly and get a  proper maximum value for every five minutes, but that isn’t an overly convenient solution.

Another issue I discovered is that Prometheus doesn’t have any protection from bad (long) samples, or a very good method of detecting of them. Let’s imagine that I have a MySQL server and I’m sampling status data every second. For some reason the call to

SHOW GLOBAL STATUS
 took five seconds to execute. The truth is we don’t really know where in those five seconds the
SHOW GLOBAL STATUS
 
output corresponds – it might be at very start, it might be at the very end. As such, you don’t really know how to process the counters. Whatever you do, you’re likely to be wrong. My preference in this case it to simply discard such samples, because even missing one percent of the samples is unlikely to change the whole picture. Constantly questioning whether you really had a couple of seconds where the QPS spiked to ten times the normal rate, or that it’s an invalid sample, is not something I on which I want to waste a lot of time!

My preferred approach is to configure the

SHOW GLOBAL STATUS
 capture so that if it takes more than ten percent of the capture interval, it will be discarded. For example, with a one second capture I would allow 100ms for the capture. If the system is not keeping up with this scale, I would be better to not fool myself and reduce the capture resolution to around five seconds.

The only protection Prometheus allows is to configure the scrape_timeout, but unfortunately it is only limited to one second resolution at this point. This is too coarse for any high-resolution capture.

Finally, it is also inconvenient to specify different resolutions for different data. In MySQL there is a often a lot of data that I want to capture, but the resolution needed for each capture is different. For example,

SHOW GLOBAL STATUS
 with one second resolution is must. At the same time, capturing the table size information from
INFORMATION_SCHEMA
 
with a one second resolution would put too much load on MySQL, especially if there are a lot of tables. That level of resolution in this case isn’t really needed.

An attractive thing about Prometheus is that the Prometheus development team uses it a lot for MySQL monitoring, so the MySQL Exporter is really good. Most MySQL monitoring plugins I find resort to reporting just a few basics statistics, which is not nearly enough for advanced diagnostics. The Prometheus MySQL exporter gets tons of stuff and has been adding more in every version.

I also very much like that the Prometheus Exporters are designed using HTTP protocol. This means it is very easy to debug or see what kind of data they capture. They present it simply using a web-browser:

HTTP browser

Computational Model

I think the basic operations in Prometheus are pretty intuitive, but if you look at some of the advanced behaviors you’re going to find some inconveniences and some things that are likely to surprise you.

One inconvenience is that Prometheus is mainly designed for working with high resolution data. If there are more than five minute holes (by default) in the time series, they could disappear from the graphs. As I mentioned, for MySQL there is quite a lot of information that it makes sense to capture at a resolution lower than five minutes.

Prometheus functions are looking in the “past,” and designed in a way that the value of the function at any time (T) when it could be computed is not going to change. It all looks clean and logical, but it causes issues with holes in the data capture.  

As an example, let’s imagine following five seconds where the total number of questions from the start successfully scrapped some seconds but not others (due to a network issue, overload, etc.):

1  –  10
2  –  20
3  –  X
4  –  X
5  –  200

 When we capture data of “counter” type the most important value it has is not the actual counter value at the given time but the rate of change of the counter at different time intervals. If in this case, for example, the query rate was ten QPS for intervals one through two seconds, this can be clearly computed. But what was the query rate in the three through four second interval? We don’t really have exact data, but that is fine: we know there have been 180 queries during the two through five second interval, giving us 60 QPS (which we can use for the three through four seconds interval).

This is NOT, however, how Prometheus will compute it if you use a high irate() function (which is suppose to give you highest resolution possible). When you evaluate irate() at T=4, it doesn’t have access to the T=5 value, even if it is in the database. Instead, it will look back and find the matching previous interval (one through two) and use the corresponding value of ten QPS.

I find this pretty puzzling and inconvenient.

There is also the rate() function, which can be used to get the average rate for the period.  Unfortunately it can’t estimate the rate for a smaller period based on the available data for a longer period. So for example if I ask rate() function to compute a query rate at T=4, looking one second back, it will return no data. This isn’t a big deal when you’re working with data manually, but if you’re building zoomable dashboards it means you can zoom in to the point where the data will disappear (rather than stopping at the best possible value available).

Storage

Prometheus has its own high performance storage system which is based in part on LevelDB. It is highly optimized for time series and can achieve a very high level of compression. Be ready, though: all your label combinations will create a different time series on the low level, and will require a lot of files. This isn’t really a problem with SSD drives and modern file systems, but it something to look out for.

The capture engine and storage systems are rather efficient. Even though Prometheus does not have built in clustering for “scaling out,” you can reportedly get more than 300K metrics per second captured on a single node. You can also use multiple Prometheus servers as needed.

The problem I found with Prometheus’ storage is that it is very self contained: you can only use it from Prometheus or access it from the HTTP API. There are no tools at this point to export it for advanced analysis with R, or to dump the whole database into something like JSON format so it can be loaded into a different database engine. Some of these features might already be on roadmap.

Purging and Aggregation

Retention configuration in Prometheus is pretty spartan. You can set

storage.local.retention
 to the length you want to store the data, but that’s it. You can’t configure it to purge different data at different times. You can run multiple Prometheus instances to achieve this, but it’s quite a hassle.  It’s also not possible to instruct Prometheus to automatically build summaries in order to execute low resolution queries faster.

For example if I have MySQL’s query rate captured every second, but I want to view the data over a long time period (e.g., how it changed over last three months to estimate growth trends), data aggregated at hour intervals would be enough for that purpose.

There is support for recording rules to help achieve some of this, but it is not explicit or convenient in my opinion.

Looking at the Prometheus roadmap, some of these issues might not be fixed in Prometheus but achieved through integrating other systems such as InfluxDB (where experimental support already exists).

Purpose

A lot of these limitations make sense if you look at the purpose for which Prometheus was created: getting high-resolution data and being able to provide as much troubleshooting information as possible to its Alerting engine. It is not really designed for storing extensive history. Too bad! I would very much like to get both of those properties in the single system!

Visualization

As you install Prometheus, it has a built-in Expression Browser, which is great for debugging and interactive analyses. It also allows you to see what data you actually have in the database. It will disappoint you, however, if you’re looking for beautiful graphs!

HTTP installer

This shows I have the information about MySQL query rate from two servers, as well as the available and configured labels.

If I want to pick one server and look at the average rate of queries per five minutes, I can do this:

HTTP graphs

There are some tools available in the graph to chose the time range and resolution.

You should aware that visualizing data with the rate() function often shows you things that do not exist. In this case, it looks like the number of queries was gradually creeping up. In reality, I just started the benchmark so the number of queries jumped almost immediately. This is what the real situation looks like (using irate()):

HTTP graphs 2

As I explained before, irate() does not handle missing data points very well, plus it behaves somewhat bizarrely when you “zoom out” – providing instant rate information at sparse intervals (e.g., the instant rate computed every one second over 60 seconds) rather than smoothing things to averages.

There is also the PromDash tool available for Prometheus, which gives you nicer looking dashboards and supports a lot of Prometheus’ features. Now that Grafana has official support for Prometheus, it is my preferred tool to build dashboards – especially since it supports multiple data sources besides Prometheus.

Summary

I’m very excited about Prometheus. It allows me to get a lot of information easily and use it for Performance analyses in benchmarking or troubleshooting. It would be great if it also had a simple integrated solution for long term storage and trending. I am also looking forward to better integration with Grafana and better documentation on how to create Prometheus-based dashboards – especially with some Prometheus-based examples!  

Note: All above was written about Prometheus 0.16.1. Prometheus is rapidly evolving and may  change with newer versions.

by Peter Zaitsev at January 14, 2016 03:23 PM

January 13, 2016

Valeriy Kravchuk

MySQL Support People - Those Who Were There First

I'd like to devote this long weekend post, the first in a new series, to my current and former colleagues who once worked or still work in a company that provided public MySQL Support service and had a job role of MySQL Support engineer. The list of companies include MySQL AB/Inc ("good old MySQL"), Sun, Oracle, Percona, MariaDB, FromDual, maybe more (I named only those that surely provided or provides MySQL Support for customers outside of the company).

This is not the first time that I write about "people of MySQL", in a wider sense "those who contribute to MySQL Community". Last time it was about famous MySQL bug reporters. That's because for me, on personal and professional level, MySQL is about people who work on it, not about companies, services or money we all make while working on it. But today I want to concentrate on "MySQL Support People" mostly, and name many of them, those who still do MySQL Support right now when you read this, to those who are now CEOs, CTOs, Directors, VPs, Product Managers, DBAs and Developers. They all have two things in common: they once provided support for MySQL in the same team with me, and their contribution to MySQL Community is visible (can be found easily at http://bugs.mysql.com or elsewhere). Click on names to see some of their public contributions to MySQL (bugs reported).

You may want to ask why I highlight "Support" in this post? It happened so that recently I have to emphasize in public the importance of Support as a core service for MySQL. Regular readers probably noted this here and there...

Let me write today about those who started to provide MySQL Support in "good old MySQL" before me:
  • Shane Bester - he started to work in MySQL Support back in 2005 and he still actively provides support in Oracle, as far as I know. Click on his name to see a list of 1029 (at the moment of writing) public bug reports he created for MySQL software! Start with Bug #79591 if you are interested in regression bugs in MySQL 5.7. I wish I'd be able to do the magic he does every day with creating repeatable test case for MySQL bugs.
  • Sinisa Milivojevic - rumors say he probably was the first Support Engineer in MySQL. Maybe not, but he does this job to some extent since 1999 or so. I see 74 bugs reported, started from Bug #4. So, he was already contributing to MySQL Community back in 2002 for sure! Recently he seems to verify public bug reports from time to time while still working in Oracle. Not sure what else he does, but does it really matter? He was one of MySQL people who influenced me a lot and I tried to play "better Sinisa" for a couple of years at my current job.
  • Victoria Reznichenko - a well known Queen of MySQL Support in Oracle now probably enjoys her maternity leave (I hope Oracle does not force her to work during this time). She was already a well known Support provider in EMEA when I joined, as one can clearly conclude from the fact that her first Bug #860 (of 102 I can see) was reported back in July 2103, 2 years before I joined MySQL AB. Surely her opinion would always matter for me.
  • Alexander Keremidarski (a.k.a. Salle) - when I joined he was team manager of EMEA Support team. As far as I know he has a similar role now in MariaDB. A lof of MySQL support practices are related to his name in my mind. Based on Bug #11 he had reported, he contributes to MySQL Community at least since 2002. Maybe it was him, not Sinisa, who was the first employee with MySQL Support Engineer title. I let them argue about that.
  • Timothy Smith - he tested my knowledge when they decided to give me Support Engineer job in MySQL. In 2005 he was an APAC team manager. Probably his activity was the reason why I ended up in Support (they were faster). I've used his approach to testing candidates later. I see 130 bugs reported by Tim, and many are still not fixed. I am not sure if he is still related to MySQL in any way, but he built a great team and I always was happy to work with him.
  • Matt Lord - he was already a key Support engineer in AMER team when I joined. 123 public bug reports till he moved to the dark side in Oracle, where he is a MySQL Product Manager now.
  • Harrison Fisk - he probably did consulting or training when I joined, but I clearly remember him in Support soon after that and until he moved to Facebook (after playing TAM role for them from some time). I see 108 bug reports and Bug #53825, for example, makes me wanting to double check current InnoDB code. I wish I have that gdb functions they discuss with Sunny at hand. Harrison had the highest rank in Support all the time we worked together. I would not be surprised if I am still 5+ years behind Harrison in my understanding of InnoDB internals (or anything related to MySQL)...
  • Arjen Lentz - he was a key Support engineer in APAC team when I joined. Now he is the Exec. Director at Open Query Pty Ltd, yet another MySQL services provider. His first public bug report that I can see in the list of 84 total, Bug #108, is almost 13 years old.
  • James Day - he worked all kind of crazy hours (probably he is the only one besides Shane and me who was SSC for all 3 shifts, EMEA, AMER and APAC more than once) and was already a well known guru in complex issues handling back in 2005. I see just 25 public bug reports from him, but he had processed many bugs and many of his comments in InnoDB-related bug reports are priceless. Read Bug #69842, for example. I'll remember our busy night APAC shift we worked together (with nobody else around) forever... He still does MySQL Support in Oracle, as far as I know.
  • Kolbe Kegel - he joined MySQL several weeks before me and he still provides Support in MariaDB. He had always been a very reliable and smart engineer. Check the latest of his 133 public bug reports, Bug #74517.
  • Hartmut Holzgraefe - Hartmut helped me a lot during my very first days in MySQL. He was ready to answer any question, help with code (he probably pushed the patch to have my Celeron CPU supported by the build scripts), whatever I asked. He provided support for NDB cluster, and people who are able to do this form a separate "super team" in MySQL Support that gain my huge respect. The do magic by supporting software I consider NOT supportable without direct communication channel with its developers even today. He had reported 420 bugs as far as I can see, and Bug #77018 is the last "Verified" one of them reported for NDB cluster.
  • Axel Schwenke - he was a key support provider in EMEA when I joined and he is also from that "super team" of NDB-enabled magicians. By the way, originally I was hired to provide NDB support eventually and went to Bugs Verification Team only for some part of usual 6 months probation period. But I skipped getting back to NDB cluster happily, and was able to mostly ignore it for 10 next years... Axel works in MariaDB and recently tried to fight with Oracle on performance benchmarks several times. Check this, Bug #77586, the last of his 69 public bug reports.
  • Miguel Solorzano - my first and only team manager in MySQL/Sun/Oracle. He still manages the same Bugs Verification Team and still works on bugs processing in a very visible way (Google for his name and count hits). His first bug report out of 115 total, Bug #326, is almost 13 years old, and before devoting himself to bugs processing he worked as developer on porting MySQL to Windows. I've got my first email form him with instructions few days before I started, and this email was enough for me to start doing my job from day one. I've forwarded it to maybe a dozen of colleagues who joined later. To summarize, Miguel was an ideal manager for me at some stage. The amount of time and efforts he devoted to MySQL Community is incredible!
  • Bogdan Degtyariov - Bogdan joined MySQL Support few weeks before me and worked on Connectors (mostly ODBC, but others as well, for some time), both when they were mentioned in customer issues and in public bug reports. He had reported 64 bugs. I am not sure what happened to him recently, as his latest public bug report is 5 years old and we had no chance to get in touch since I left Oracle in 2012. I hope he still works on MySQL in Oracle, no matter what is his current job role.
  • Dean Ellis - he was (or soon became) AMER team manger in Support when I joined. He was also a key Support provider for AMER and defined many proper procedures over years. The last of them that still affects every MySQL user was so called "BPS process" for setting priorities of bug fixes in Oracle MySQL. It was him who set up the grounds of the process that was later implemented (from Support side) by Sinisa, me and Sveta Smirnova. I see 140 bugs reported but I am sure he had more than one account and reports date back to 2003. Some of his famous bug reports include Bug #4291 and Bug #1118 (implemented only in 5.7.5+). He was a Vice President, Technical Support at MariaDB last time I checked.
  • Vasily Kyshkin (a.k.a. WAX) - he was a key engineer in my Bugs Verification Team when I joined. No wonder he reported just few bugs - our task was to get the counter of open and not processed bugs smaller, not bigger. It had been <90 for months, not 528 as i see now (and 500+ for months recently). So, engineers from bugs team, mostly reported bugs when requested by customers. I was really sorry when he decided to leave MySQL, but the habit to not care much about Support engineers (and even less - about those working on community bug reports) was not invented recently. Some had it back during "good old MySQL" days...
  • Aleksey Kishkin (a.k.a. Walrus) - according to many, he was the best SSC (Support Shift Captain) of MySQL AB times. I can not argue with that (even if I'd like to think it was me :). He worked a lot and was always ready to help with bugs processing. He left MySQL AB in 2006 probably, then eventually returned. Not sure what is he doing now. As a side note, those engineers with nicknames in this list surely worked since the times were there were just a handful of people in entire MySQL... In 2005 I was asked about the preferred nickname (and I used "openxs", you can search bugs on that, it's still my main UNIX user name everywhere, but it was not widely used in the company).
  • Michael Widenius (a.k.a. Monty) - yes, creator of MySQL provided Support for it and worked in tight cooperation with first Support engineers to define some of the approaches and practices of doing Support. This is what I've heard and can conclude based on his bug reports. Check his Bug #18 to understand the state of MySQL quality back in 2002 :)
  • Jorge del Conde - when I joined Jorge worked in Bugs Verification Team, a lot, on all kinds of bugs. But essentially he was a developer in MySQL since 1998 (one of the very first employees). More than 7 last years he works as Senior Developer at Microsoft. I was really sorry back in 2006 when he had to quit from MySQL Support. Not the last time when I could do nothing to keep a team member, but it's always painful to me.
  • Peter Zaitsev - yes, founder and CEO of Percona was a leader of High Performance Team in Support back in 2005. this team worked on complex performance-related issues and run all kinds of benchmarks. Moreover, Peter had SSC shifts scheduled sometimes, and at least once was late for his shift so I had to call him. He had reported 133 public bugs. Check Bug #59899 as one of the last examples of my communication with Peter from the other side of public bugs database.
  • Vadim Tkachenko - current CTO and co-founder of Percona was the only person from MySQL AB whom I've seen in the real life before I joined. He met me to evaluate if I am good enough to join support. We had some coffee at talked about transaction isolation levels and probably something else. Vadim worked with Peter Zaitsev in High Performance Team and formally was a Performance Engineer, but he worked on usual support issues as well and took SSC shifts. Vadim managed to report 23 public bugs while working in MySQL, and his Bug #14347 is still just "Verified".
  • Alexey Kopytov (a.k.a. Kaamos) - he also worked in High Performance Team in 2005 when I joined, and is famous for his sysbench tool and numerous contributions to Percona Server and Percona XtraBackup since 2010 when he joined Percona. He left Percona somewhat silently some time ago, so I am not sure where exactly he is working know, but he still writes a lot about MySQL and keeps reporting bugs (I see 90 of them) and contributing patches. Check his last bug report and patch in Bug #79487.
  • Lachlan Mulcahy - I remember him as a great APAC team manager after Timothy Smith moved back to USA, but quick check shows that he was a Support Engineer when I joined. List of bugs he reported (49 in total) move us back to September 2004, that's probably when he started. Check his Bug #36151 that is still "Verified". Now he is a Production Engineering Manager at Facebook.
  • Sasha Pachev was a developer in MySQL, but some time in during 2006 or 2007 he worked in Support as well, part time. Check his last public bug report with a patch suggested (out of 8 that I see) that is still "Open", Bug #60593. There are also still "Verified" optimizer bugs reported by him.
  • Jani Tolonen was mostly a developer, but he played many roles and worked in Support part time. His 23 public bug reports covers everything from mysql command line client to Maria storage engine. I am not sure what he is doing now.
  • Indrek Siitan - when I joined and for some time after that he played mostly a "full time SSC" (Support Coordinator) role mostly in Support. But quick check for bugs gives a list of 15, starting with Bug #138, so he surely did other work as well. Now he seems to be a Full-Stack Web Developer (and probably he had always been a web developer in one of his roles).
I studied a lot from all these people while working with them. They surely know what MySQL Support is about and contributed a lot both to MySQL Community and to my personal ideas on what's good and bad in MySQL services.

UPDATE: my readers on Facebook noted that there were other engineers as well who worked in Support or who had provided Support in MySQL before I joined and before many people of those I've listed started to do that. I tried to work based on my memory and also, just because of format of this blog, I had to list those who has several real bugs reported in public. I did some corrections and additions in the list based on feedback I've got.

If I missed your name in the list or had written something about your work that is not correct, please, forgive me my old memory and correct me.

Next time I'll write about those great engineers who joined MySQL Support after me and became much better than I'd ever be able to become. Stay tuned!

For now just remember the following statement that I've used more than once in different discussions already:
"There are two kinds of people: people of MySQL and all other people. You can not judge them by the same rules."
I had not changed my mind since the first time I stated that while defending one of colleagues in internal discussion.

by Valeriy Kravchuk (noreply@blogger.com) at January 13, 2016 07:38 PM

January 12, 2016

MariaDB AB

MariaDB Enterprise Now Supports MariaDB 10.1

nishantvyas

Customers have been excited to get their hands on the performance and maintenance enhancements and data encryption capabilities made available through MariaDB 10.1. We are happy to announce that MariaDB Enterprise and Enterprise Cluster subscriptions now support MariaDB 10.1 and make all of these features available to our subscription customers.

With MariaDB Enterprise supporting 10.1, users can fully and transparently encrypt their databases and protect their data-at-rest, and benefit from password validation and role-based access control improvements. Adding this critical security layer requires no changes to existing applications, and critically, maintains the same high level of performance. Data breaches cost an average of $3.8M, and we encourage all MariaDB users, as well as companies using MySQL, to make the easy migration to MariaDB Enterprise supporting MariaDB 10.1 to reduce their security risk. The new certified binaries are now available for download.

Support for MariaDB 10.1 also provides MariaDB Enterprise subscribers with

High availability enhancements with full integration of the community Galera multi-master cluster technology

Scalability enhancements:

  • Optimistic parallel replication – all transactions will be considered to be run in parallel, giving another performance boost in master-to-slave replication
  • Slave execution of triggers when using row-based replication
  • WebScaleSQL performance enhancements

Performance enhancements:

  • Query timeouts
  • InnoDB improvements such as multi-threaded flush, page compression for FusionIO/nvmfs
  • Optimizer enhancements including EXPLAIN JSON and EXPLAIN ANALYZE (with FORMAT=JSON)

by nishantvyas at January 12, 2016 08:48 PM

Peter Zaitsev

Play the Percona Powerball Pool!!

percona powerball

percona powerballThe Only Sure Thing is Percona Powerball Pool

Everyone is talking about the upcoming Powerball lottery draw. 1.4 BILLION dollars!! And growing! Millions of people are imagining what they would do IF they win. It’s the stuff of dreams.

That is literally true. The chances of winning the Powerball Lottery are 1 in 292.2 million. Or roughly speaking, the chances of picking the right combination of numbers is like flipping a coin and getting heads 28 times in a row. You’re more likely to get struck by lightning (twice) or bitten by a shark.

Sorry.

You know what is a sure thing? Percona’s ability to optimize your database performance and increase application performance. Our Support and Percona Care consultants will give you a 1 in 1 chance of making your database run better, solving your data performance issues, and improving the performance of your applications.

However, in the spirit of moment, Percona has bought 10 sets of Powerball numbers and have posted them on Facebook, Twitter and LinkedIn. It’s the Percona Powerball Pool! Like either post and share it, and you are qualified for one (1) equal share of the winnings! Use #perconapowerball when you share.

Here are the numbers:

percona powerball

We at Percona can’t promise a huge Powerball windfall (in fact, as data experts we’re pretty sure you won’t win!), but we can promise that our consultants are experts at helping you with your full LAMP stack environments. Anything affecting your data performance – on that we can guarantee you a win!

Full rules are here.

by Dave Avery at January 12, 2016 06:25 PM

Percona Server 5.6.28-76.1 is now available

Percona Server 5.6.28-76.1Percona is glad to announce the release of Percona Server 5.6.28-76.1 on January 12, 2016. Download the latest version from the Percona web site or from the Percona Software Repositories.

Based on MySQL 5.6.28, including all the bug fixes in it, Percona Server 5.6.28-76.1 is the current GA release in the Percona Server 5.6 series. Percona Server is open-source and free – and this is the latest release of our enhanced, drop-in replacement for MySQL. Complete details of this release can be found in the 5.6.28-76.1 milestone on Launchpad.

Bugs Fixed:

  • Clustering secondary index could not be created on a partitioned TokuDB table. Bug fixed #1527730 (DB-720).
  • When enabled, super-read-only option could break statement-based replication while executing a multi-table update statement on a slave. Bug fixed #1441259.
  • Running OPTIMIZE TABLE or ALTER TABLE without the ENGINE clause would silently change table engine if enforce_storage_engine variable was active. This could also result in system tables being changed to incompatible storage engines, breaking server operation. Bug fixed #1488055.
  • Setting the innodb_sched_priority_purge variable (available only in debug builds) while purge threads were stopped would cause a server crash. Bug fixed #1368552.
  • Small buffer pool size could cause XtraDB buffer flush thread to spin at 100% CPU. Bug fixed #1433432.
  • Enabling TokuDB with ps_tokudb_admin script inside the Docker container would cause an error due to insufficient privileges even when running as root. In order for this script to be used inside Docker containers this error has be been changed to a warning that a check is impossible. Bug fixed #1520890.
  • InnoDB status will start printing negative values for spin rounds per wait, if the wait number, even though being accounted as a signed 64-bit integer, will not fit into a signed 32-bit integer. Bug fixed #1527160 (upstream #79703).

Other bugs fixed: #1384595 (upstream #74579), #1384658 (upstream #74619), #1471141 (upstream #77705), #1179451, #1524763 and #1530102.

Release notes for Percona Server 5.6.28-76.1 are available in the online documentation. Please report any bugs on the launchpad bug tracker .

by Hrvoje Matijakovic at January 12, 2016 02:30 PM

Percona Server 5.5.47-37.7 is now available

Percona Server 5.5.47-37.7
Percona is glad to announce the release of Percona Server 5.5.47-37.7 on January 12, 2016. Based on MySQL 5.5.47, including all the bug fixes in it, Percona Server 5.5.47-37.7 is now the current stable release in the 5.5 series.

Percona Server is open-source and free. Details of the release can be found in the 5.5.47-37.7 milestone on Launchpad. Downloads are available here and from the Percona Software Repositories.

Bugs Fixed:

  • Running OPTIMIZE TABLE or ALTER TABLE without the ENGINE clause would silently change table engine if enforce_storage_engine variable was active. This could also result in system tables being changed to incompatible storage engines, breaking server operation. Bug fixed #1488055.

Other bugs fixed: #1179451, #1524763, and #1530102.

Release notes for Percona Server 5.5.47-37.7 are available in our online documentation. Bugs can be reported on the launchpad bug tracker.

by Hrvoje Matijakovic at January 12, 2016 02:10 PM

Colin Charles

FOSDEM 2016 – See you in Brussels

Over the weekend I read in the FT (paywall): Is Brussels safe? Ring a local resident to find out. I’m sure it will be fine, and you will want to be there for FOSDEM, happening 30-31 January 2016. 

There is the excellent one day track, that is the MySQL & Friends Devroom (site). Talks hail from Oracle, MariaDB Corporation, Percona and more. We don’t have a booth this year, but we do have amazingly good content on Saturday. I’m happy to have been part of the committee that chose the talks, but you know that this is a labour of love put on by Frédéric Descamps, Liz van Dijk, Dimitri Vanoverbeke, and Kenny Gryp. I’m sure the party will be awesome.

But that is not all! In the distributions devroom, you can see me give a talk at 11:00-11:20 titled Distributions from the view of a package. This is an important topic, because you start seeing MariaDB Server becoming the default in many distributions with the last holdout being Debian. But there is a lot of discussion, especially from the security standpoint there now, about MySQL overall. But that’s not the focus of my talk — I’m going to talk to you about how we, as upstream, have had to deal with distributions, changing requirements, etc. overall. I’ve done this since the MySQL days, so have quite a bit of experience dealing with it. 

If you are making software and want to be included and supported across all distributions, I highly recommend you coming to my talk. If you happen to decide to live in an ecosystem where there are forks, I also promise to make it useful for you.

And on Sunday, you will want to go visit the RocksDB Storage Engine for MySQL talk by none other than Yoshinori Matsunobu of Facebook. This will be at the main track and I highly recommend you visit it — I’m sure Sergei Petrunia will also make an appearance as he spends a lot of time on this too.

All in, I’m extremely excited to be at FOSDEM 2016. And you don’t need to ring a local resident to find out if its going to be safe/fun — come for the learning, stay for the beer ;-)

by Colin Charles at January 12, 2016 10:57 AM

SCALE14x – lots of MySQL content there

One of my favourite events run by a grassroots organisation is SCALE, and they are now doing their 14th edition, SCALE14x. If you’re into opensource software as well as all things open, this is the place to be from January 21-24 2016. It is at a new location in Pasadena (so not quite next to LAX as it was previously), but this is due to growth — so kudos to the team.

From MariaDB Corporation you get to see Max Mether (Scaling MySQL & MariaDB – I’m extremely interested in seeing what he has to say and will likely blog the session) and me (The MySQL Server Ecosystem in 2016).

One thing is for sure is that the topic I plan to present on will surely come under contention since I also represent a server maker — however I believe I will be extremely objective and will put up blog posts before/after the event as well as slides, because it is clear that MySQL is now going to be 21 years old and the ecosystem has grown tremendously. Let me reiterate my main thesis: MySQL server development has been at its most vibrant since the Oracle acquisition — the ecosystem is flourishing, and Oracle is doing a great job with MySQL, Percona with Percona Server, MariaDB Corporation/MariaDB Foundation with MariaDB Server, and lets not forget the wonderful work from the WebScaleSQL Consortium, Facebook’s MySQL tree and even Alibaba’s tree (the Twitter tree seems to be sadly not really maintained much these days, but there was innovation coming out of it in the past).

There are also going to be many other great talks at the MySQL track on Friday, from Peter Zaitsev, Dave Stokes (I’m excited about the JSON support in MySQL 5.7), Ovais Tariq/Aleksandr Kuzminsky on indexes, and Janis Griffin on query tuning. There’s also an excellent PostgreSQL track and I think one of the highlights should also be the keynote from Mark Shuttleworth at UbuCon on Thursday.

See you at SCALE14x? Oh, before I forget, MariaDB Corporation also has a booth, so you will get to see Rod Allen manning it and I’m sure there will be giveaways of some sort. 

If you have any feedback about the MySQL Server ecosystem and its developments, please feel free to leave a comment here or send an email to me. Thanks!

by Colin Charles at January 12, 2016 10:41 AM

Peter Zaitsev

Percona Monitoring Plugins 1.1.6 release

Percona is glad to announce the release of Percona Monitoring Plugins 1.1.6.

Changelog:

  • Added new RDS instance classes to RDS scripts.
  • Added boto profile support to RDS scripts.
  • Added AWS region support and ability to specify all regions to RDS scripts.
  • Added ability to set AWS region and boto profile on data source level in Cacti.
  • Added period, average time and debug options to pmp-check-aws-rds.py.
  • Added ability to override Nginx server status URL path on data source level in Cacti.
  • Made Memcached and Redis host configurable for Cacti script.
  • Added the ability to lookup the master’s server_id when using pt-heartbeat with pmp-check-mysql-replication-delay.
  • Changed how memory stats are collected by Cacti script and pmp-check-unix-memory.
    Now /proc/meminfo is parsed instead of running free command. This also fixes pmp-check-unix-memory for EL7.

  • Set default MySQL connect timeout to 5s for Cacti script. Can be overridden in the config.
  • Fixed innodb transactions count on the Cacti graph for MySQL 5.6 and higher.
  • Fixed –login-path option in Nagios scripts when using it along with other credential options.

Thanks to contributors: David Andruczyk, Denis Baklikov, Mischa ter Smitten, Mitch Hagstrand.

The project is fully hosted on Github now including issues and Launchpad project is discontinued.

A new tarball is available from downloads area or in packages from our software repositories. The plugins are fully supported for customers with a Percona Support contract and free installation services are provided as part of some contracts. You can find links to the documentation, forums and more at the project homepage.

Percona Monitoring PluginsAbout Percona Monitoring Plugins
Percona Monitoring Plugins are monitoring and graphing components designed to integrate seamlessly with widely deployed solutions such as Nagios, Cacti and Zabbix.

by Roman Vynar at January 12, 2016 06:01 AM

January 11, 2016

Peter Zaitsev

Bare-metal servers for button-push database-as-a-service

bare-metal serversEnterprises’ demand flexibility, scalability and efficiency to keep up with the demands of their customers — while maintaining the bottom line. To solve this, they’re running to cloud infrastructure services to both cut costs and take advantage of cutting-edge technology innovations. Clouds have brought simplicity and ease of use to infrastructure management. However, with this ease of use often comes some sacrifice: namely, performance.

Performance degradation often stems from the introduction of virtualization and a hypervisor layer. While the hypervisor enables the flexibility and management capabilities needed to orchestrate multiple virtual machines on a single box, it also creates additional processing overhead.

Regardless, cloud servers also have huge advantages: they deploy at lightning speed and enable hassle-free private networking without the need for a private VLAN from the datacenter. They also allow the customer near instantaneous scalability without the burden of risky capital expenditures.

Bare-metal servers are one solution to this trade-off. A bare metal server is all about plain hardware. It is a single-tenant physical server that is completely dedicated to a single data-intensive workload. It prioritizes performance and reliability. A bare-metal server provides a way to enable cloud services that eliminates the overhead of virtualization, but retains the flexibility, scalability and efficiency.

On certain CPU-bound workloads, bare metal servers can outperform a cloud server of the same configuration by four times. Database management systems, being very sensitive to both CPU performance and IO speed, can obviously benefit from access to a bare metal environment.

Combine a bare metal server accessible via a cloud service with a high performance MySQL solution and you get all benefits of the cloud without sacrificing performance. This is an ideal solution for startups, side projects or even production applications.

In fact this is just what we’ve done with a partnership between Percona and Servers.com, where you can automatically provision Percona Server for MySQL on one of their bare metal servers. You can learn more about this service here.

by Dave Avery at January 11, 2016 05:50 PM

MongoDB revs you up: What storage engine is right for you? (Part 2)

MongoDB

MongoDBDifferentiating Between MongoDB Storage Engines: WiredTiger

In our last post, we discussed what a storage engine is, and how you can determine the characteristics of one versus the other. From that post:

“A database storage engine is the underlying software that a DBMS uses to create, read, update and delete data from a database. The storage engine should be thought of as a “bolt on” to the database (server daemon), which controls the database’s interaction with memory and storage subsystems.”

Check out the full post here.

Generally speaking, it’s important to understand what type of work environment the database is going to interact with, and to select a storage engine that is tailored to that environment.

The last post looked at MMAPv1, the original default engine for MongoDB (through release 3.0). This post will examine the new default MongoDB engine: WiredTiger.

WiredTiger

Find it in: MongoDB or Percona builds

MongoDB, Inc. introduced WiredTiger for document-level concurrency control for write operations in MongoDB v3.0. As a result of this introduction, multiple clients can now modify different documents of a collection at the same time. WiredTiger in MongoDB currently only supports B-trees for the data structure. However, it also has the ability to use LSM-trees, but it is not currently implemented in the MongoDB version of the engine.

WiredTiger has a few interesting features, most notably compression, document-level locking, and index prefix compression. B-trees, due to their rigidity in disk interaction and chattiness with storage, are not typically known for their performance when used with compression. However, WiredTiger has done an excellent job of maintaining good performance with compression and gives a decent performance/compression ratio with the “snappy” compression algorithm. Be that as it may, if deeper compression is necessary, you may want to evaluate another storage engine. Index prefix compression is a unique feature that should improve the usefulness of the cache by decreasing the size of indexes in memory (especially very repetitive indexes).

WiredTiger’s ideal use cases include data that are likely to stay within a few multiples of cache size. One can also expect good performance from TTL-like workloads, especially when data is within the limit previously mentioned.

Conclusion

Most people don’t know that they have a choice when it comes to storage engines, and that the choice should be based on what the database workload will look like. Percona’s Vadim Tkachenko performed an excellent benchmark test comparing the performances of RocksDB, PerconaFT and WiredTiger to help specifically differentiate between these engines.

In Part Three of this blog series, we’ll take a closer look at the RocksDB storage engine.

Part 1: Intro and the MMAPv1 storage engine.

 

by Jon Tobin at January 11, 2016 05:48 PM

Percona XtraDB Cluster 5.6.27-25.13 is now available

Percona XtraBackup Logo

Percona XtraDB Cluster 5.6.26-25.12Percona is glad to announce the new release of Percona XtraDB Cluster 5.6 on January 11, 2016. Binaries are available from the downloads area or from our software repositories.

Percona XtraDB Cluster 5.6.27-25.13 is now the current release, based on the following:

All of Percona software is open-source and free, and all the details of the release can be found in the 5.6.26-25.12 milestone at Launchpad.

For more information about relevant Codership releases, see this announcement.

NOTE: Due to new dependency on libnuma1 package in Debian/Ubuntu, please run one of the following commands to upgrade the percona-xtradb-cluster-server-56 package:

  • aptitude safe-upgrade
  • apt-get dist-upgrade
  • apt-get install percona-xtradb-cluster-server-5.6

New Features:

  • There is a new script for building Percona XtraDB Cluster from source. For more information, see Compiling and Installing from Source Code.
  • wsrep_on is now a session only variable. That means toggling it will not affect other clients connected to said node. Only the session/client modifying it will be affected. Trying to toggle wsrep_on in the middle of a transaction will now result in an error. Trx will capture the state of wsrep_on during start and will continue to use it. Start here means when the first logical changing statement is executed within transaction context.

Bugs Fixed:

  • #1261688 and #1292842: Fixed race condition when two skipped replication transactions were rolled back, which caused [ERROR] WSREP: FSM: no such a transition ROLLED_BACK ->ROLLED_BACK with LOAD DATA INFILE
  • #1362830: Corrected xtrabackup-v2 script to consider only the last specified log_bin directive in my.cnf. Multiple log_bin directives caused SST to fail.
  • #1370532: Toggling wsrep_desync while node is paused is now blocked.
  • #1404168: Removed support for innodb_fake_changes variable.
  • #1455098: Fixed failure of LDI on partitioned table. This was caused by partitioned table handler disabling bin-logging and Native Handler (InnoDB) failing to generate needed bin-logs eventually causing skipping of statement replication.
  • #1503349: garbd now uses default port number if it is not specified in sysconfig.
  • #1505184: Corrected wsrep_sst_auth script to ensure that user name and password for SST is passed to XtraBackup through internal command-line invocation. ps -ef doesn’t list these credentials so passing it internally is fine, too.
  • #1520491: FLUSH TABLE statements are not replicated any more, because it lead to an existing upstream fix pending deadlock error. This fix also takes care of original fix to avoid increment of local GTID.
  • #1528020: Fixed async slave thread failure caused by redundant updates of mysql.event table with the same value. Redundant updates are now avoided and will not be bin-logged.
  • Fixed garb init script causing new UUIDs to be generated every time it runs. This error was due to missing base_dir configuration when gardb didn’t have write-access to current working directory. garbd will now try to use cwd. Then it will try to use /var/lib/galera (like most Linux daemons). If it fails to use or create /var/lib/galera, it will throw a fatal error.
  • Fixed replication of DROP TABLE statement with a mix of temporary and non-temporary tables (for example, DROP TABLE temp_t1, non_temp_t2), which caused errorneous DROP TEMPORARYTABLE stmt on replicated node. Corrected it by detecting such scenarios and creating temporary table on the replicated node, which is then dropped by follow-up DROP statement. All this workload should be part of same unit as temporary tables are session-specific.
  • Fixed error when wsrep_cluster_name value over 32 characters long caused gmcast message to exceed maximum length. Imposed a limit of 32 character on wsrep_cluster_name.
  • Added code to properly handle default values for wsrep_* variables, which caused an error/crash.
  • Fixed error when a CREATE TABLE AS SELECT (CTAS) statement still tried to certify a transaction on a table without primary key even if certification of tables without primary key was disabled. This error was caused by CTAS setting trx_id (fake_trx_id) to execute SELECT and failing to reset it back to -1 during INSERT as certification is disabled.
  • Fixed crashing of INSERT .... SELECT for MyISAM table with wsrep_replicate_myisam set to ON. This was caused by TOI being invoked twice when source and destination tables were MyISAM.
  • Fixed crash when caching write-set data beyond configured limit. This was caused by TOI flow failing to consider/check error resulting from limit enforcement.
  • Fixed error when loading MyISAM table from schema temporary table (with wsrep_replicate_myisam set to ON). This was caused by temporary table lookup being done usingget_table_name(), which could be misleading as table_name for temporary tables is set to temporary generated name. Original name of the table is part of table_alias. The fix corrected condition to consider both table_name and alias_name.
  • Fixed error when changing wsrep_provider in the middle of a transaction or as part of a procedure/trigger. This is now blocked to avoid inconsistency.
  • Fixed TOI state inconsistency caused by DELAYED_INSERT on MyISAM table (TOI_END was not called). Now the DELAYED_ qualifier will be ignored and statement will be interpreted as normal INSERT.
  • Corrected locking semantics for FLUSH TABLES WITH READ LOCK (FTWRL). It now avoids freeing inheritted lock if follow-up FLUSH TABLE statement fails. Only frees self-acquired lock.
  • Fixed crash caused by GET_LOCK + wsrep_drupal_282555_workaround. GET_LOCK path failed to free all instances of user-level locks after it inherited multiple-user-locks from Percona Server. The cleanup code now removes all possible references of locks.
  • Fixed cluster node getting stuck in Donor/Desync state after a hard recovery, because of an erroneous type cast in source code.
  • Corrected DDL and DML semantics for MyISAM:
    • DDL (CREATE/DROP/TRUNCATE) on MyISAM will be replicated irrespective of wsrep_replicate_miysam value
    • DML (INSERT/UPDATE/DELETE) on MyISAM will be replicated only if wsrep_replicate_myisam is enabled
    • SST will get full transfer irrespective of wsrep_replicate_myisam value (it will get MyISAM tables from donor if any)
    • Difference in configuration of pxc-cluster node on enforce_storage_engine front may result in picking up different engine for same table on different nodes
    • CREATE TABLE AS SELECT (CTAS) statements use non-TOI replication and are replicated only if there is involvement of InnoDB table that needs trx (involvement of MyISAM table will cause CTAS statement to skip replication)

Known Issues:

  • 1330941: Conflict between wsrep_OSU_method set to RSU and wsrep_desync set to ON was not considered a bug.
  • 1443755: Causal reads introduces surprising latency in single node clusters.
  • 1522385: Holes are introduced in Master-Slave GTID eco-system on replicated nodes if any of the cluster nodes are acting as asynchronous slaves to an independent master.
  • SST fails with innodb_data_home_dir/innodb_log_home_dir. This is a bug in Percona XtraBackup. It should be fixed in the next 2.3.2 release. Until then, please use 2.2.12 that doesn’t have this issue.
  • Enabling wsrep_desync (from previous OFF state) will wait until previous wsrep_desync=OFF operation is completed.

Help us improve our software quality by reporting any bugs you encounter using our bug tracking system. As always, thanks for your continued support of Percona!

by Alexey Zhebel at January 11, 2016 12:16 PM

Jean-Jerome Schmidt

ClusterControl Developer Studio: write your first database advisor

Did you ever wonder what triggers the advice in ClusterControl that your disk is filling up? Or the advice to create primary keys on InnoDB tables if they don’t exist? These advisors are mini scripts written in the ClusterControl Domain Specific Language (DSL) that is a Javascript-like language. These scripts can be written, compiled, saved, executed and scheduled in ClusterControl. That is what the ClusterControl Developer Studio blog series will be about.

Today we will cover the Developer Studio basics and show you how to create your very first advisor where we will pick two status variables and give advice about their outcome.

The advisors

Advisors are mini scripts that are executed by ClusterControl, either on-demand or after a schedule. They can be anything from simple configuration advice, warning on thresholds or more complex rules for predictions or cluster-wide automation tasks based on the state of your servers or databases. In general, advisors perform more detailed analysis, and produce more comprehensive recommendations than alerts.

The advisors are stored inside the ClusterControl database and you can add new or alter/modify existing advisors. We also have an advisor Github repository where you can share your advisors with us and other ClusterControl users.

The language used for the advisors is the so called ClusterControl DSL and is an easy to comprehend language. The semantics of the language can be best compared to Javascript with a couple of differences, where the most important differences are:

  • Semicolons are mandatory
  • Various numeric data types like integers and unsigned long long integers.
  • Arrays are two dimensional and single dimensional arrays are lists.

You can find the full list of differences in the ClusterControl DSL reference.

The Developer Studio interface

The Developer Studio interface can be found under Cluster > Manage > Developer Studio. This will open an interface like this:

Advisors

The advisors button will generate an overview of all advisors with their output since the last time they ran:

You can also see the schedule of the advisor in crontab format and the date/time since the last update. Some advisors are scheduled to run only once a day so their advice may no longer reflect the reality, for instance if you already resolved the issue you were warned about. You can manually re-run the advisor by selecting the advisor and run it. Go to the “compile and run” section to read how to do this.

Importing advisors

The Import button will allow you to import a tarball with new advisors in them. The tarball has to be created relative to the main path of the advisors, so if you wish to upload a new version of the MySQL query cache size script (s9s/mysql/query_cache/qc_size.js) you will have to make the tarball starting from the s9s directory.

By default the import will create all (sub)folders of the import but not overwrite any of the existing advisors. If you wish to overwrite them you have to select the “Overwrite existing files” checkbox.

Exporting advisors

You can export the advisors or a part of them by selecting a node in the tree and pressing the Export button. This will create a tarball with the files in the full path of the structure presented. Suppose we wish to make a backup of the s9s/mysql advisors prior to making a change, we simply select the s9s/mysql node in the tree and press Export:

Note: make sure the s9s directory is present in /home/myuser/.

This will create a tarball called /home/myuser/s9s/mysql.tar.gz with an internal directory structure s9s/mysql/*

Creating a new advisor

Since we have covered exports and imports, we can now start experimenting. So let’s create a new advisor! Click on the New button to get the following dialogue:

In this dialogue, you can create your new advisor with either an empty file or pre fill it with the Galera or MySQL specific template. Both templates will add the necessary includes (common/mysql_helper.js) and the basics to retrieve the Galera or MySQL nodes and loop over them.

Creating a new advisor with the Galera template looks like this:

#include "common/mysql_helper.js"

Here you can see that the mysql_helper.js gets included to provide the basis for connecting and querying MySQL nodes.

var WARNING_THRESHOLD=0;
…
if(threshold > WARNING_THRESHOLD)

The warning threshold is currently set to 0, meaning if the measured threshold is greater than the warning threshold, the advisor should warn the user. Note that the variable threshold is not set/used in the template yet as it is a kickstart for your own advisor.

var hosts     = cluster::Hosts();
var hosts     = cluster::mySqlNodes();
var hosts     = cluster::galeraNodes();

The statements above will fetch the hosts in the cluster and you can use this to loop over them. The difference between them is that the first statement includes all non-MySQL hosts (also the CMON host), the second all MySQL hosts and the last one only the Galera hosts. So if your Galera cluster has MySQL asynchronous read slaves attached, those hosts will not be included.

Other than that these objects will all behave the same and feature the ability to read their variables, status and query against them.

Advisor buttons

Now that we have created a new advisor there are six new button available for this advisor:

Save will save your latest modifications to the advisor (stored in the CMON database), Move will move the advisor to a new path and Remove will obviously remove the advisor.

More interesting is the second row of buttons. Compiling the advisor will compile the code of the advisor. If the code compiles fine, you will see this message in the Messages dialogue below the code of the advisor:

While if the compilation failed, the compiler will give you a hint where it failed:

In this case the compiler indicates a syntax error was found on line 24.

The compile and run button will not only compile the script but also execute it and its output will be shown in the Messages, Graph or Raw dialogue. If we compile and run the table cache script from the auto_tuners, we will get output similar to this:

Last button is the schedule button. This allows you to schedule (or unschedule) your advisors and add tags to it. We will cover this at the end of this post when we have created our very own advisor and want to schedule it.

My first advisor

Now that we have covered the basics of the ClusterControl Developer Studio, we can now finally start to create a new advisor. As an example we will create a advisor to look at the temporary table ratio. Create a new advisor as following:

The theory behind the advisor we are going to create is simple: we will compare the number of temporary tables created on disk against the total number of temporary tables created:

tmp_disk_table_ratio = Created_tmp_disk_tables / (Created_tmp_tables + Created_tmp_disk_tables) * 100;

First we need to set some basics in the head of the script, like the thresholds and the warning and ok messages. All changes and additions have been marked in bold:

var WARNING_THRESHOLD=20;
var TITLE="Temporary tables on disk ratio";
var ADVICE_WARNING="More than 20% of temporary tables are written to disk. It is advised to review your queries, for example, via the Query Monitor.";
var ADVICE_OK="Temporary tables on disk are not excessive." ;

We set the threshold here to 20 percent which is considered to be pretty bad already. But more on that topic once we have finalised our advisor.

Next we need to get these status variables from MySQL. Before we jump to conclusions and execute some “SHOW GLOBAL STATUS LIKE ‘Created_tmp_%’” query, there is already a function to retrieve the status variable of a MySQL instance:

statusVar = readStatusVariable(host, <statusvariablename>);

We can use this function in our advisor to fetch the Created_tmp_disk_tables and Created_tmp_tables.

for (idx = 0; idx < hosts.size(); ++idx)
{
   host        = hosts[idx];
   map         = host.toMap();
   connected     = map["connected"];
   var advice = new CmonAdvice();
   var tmp_tables = readStatusVariable(host, ‘Created_tmp_tables’);
   var tmp_disk_tables = readStatusVariable(host, ‘Created_tmp_disk_tables’);

And now we can calculate the temporary disk tables ratio:

var tmp_disk_table_ratio = tmp_disk_tables / (tmp_tables + tmp_disk_tables) * 100;

And alert if this ratio is greater than the threshold we set in the beginning:

if(checkPrecond(host))
{
   if(tmp_disk_table_ratio > WARNING_THRESHOLD) {
      advice.setJustification("Temporary tables written to disk is excessive");
      msg = ADVICE_WARNING;
   }
   else {
      advice.setJustification("Temporary tables written to disk not excessive");
      msg = ADVICE_OK;
   }
}

It is important to assign the Advice to the msg variable here as this will be added later on into the advice object with the setAdvice function. The full script for completeness:

#include "common/mysql_helper.js"

/**
* Checks the percentage of max ever used connections
*
*/
var WARNING_THRESHOLD=20;
var TITLE="Temporary tables on disk ratio";
var ADVICE_WARNING="More than 20% of temporary tables are written to disk. It is advised to review your queries, for example, via the Query Monitor.";
var ADVICE_OK="Temporary tables on disk are not excessive.";

function main()
{
   var hosts     = cluster::mySqlNodes();
   var advisorMap = {};

   for (idx = 0; idx < hosts.size(); ++idx)
   {
       host        = hosts[idx];
       map         = host.toMap();
       connected     = map["connected"];
       var advice = new CmonAdvice();
       var tmp_tables = readStatusVariable(host, 'Created_tmp_tables');
       var tmp_disk_tables = readStatusVariable(host, 'Created_tmp_disk_tables');
       var tmp_disk_table_ratio = tmp_disk_tables / (tmp_tables + tmp_disk_tables) * 100;
       
       if(!connected)
           continue;

       if(checkPrecond(host))
       {
          if(tmp_disk_table_ratio > WARNING_THRESHOLD) {
              advice.setJustification("Temporary tables written to disk is excessive");
              msg = ADVICE_WARNING;
              advice.setSeverity(0);
          }
          else {
              advice.setJustification("Temporary tables written to disk not excessive");
              msg = ADVICE_OK;
          }
       }
       else
       {
           msg = "Not enough data to calculate";
           advice.setJustification("there is not enough load on the server or the uptime is too little.");
           advice.setSeverity(0);
       }

       advice.setHost(host);
       advice.setTitle(TITLE);
       advice.setAdvice(msg);
       advisorMap[idx]= advice;
   }

   return advisorMap;
}

Now you can play around with the threshold of 20, try to lower it to 1 or 2 for instance and then you probably can see how this advisor will actually give you advice on the matter.

As you can see, with a simple script you can check two variables against each other and report/advice based upon their outcome. But is that all? There are still a couple of things we can improve!

Improvements on my first advisor

The first thing we can improve is that this advisor doesn’t make a lot of sense. What the metric actually reflects is the total number of temporary tables on disk since the last FLUSH STATUS or startup of MySQL. What it doesn’t say is at what rate it actually creates temporary tables on disk. So we can convert the Created_tmp_disk_tables to a rate using the uptime of the host:

var tmp_disk_table_rate = tmp_disk_tables / uptime;

This should give us the number of temporary tables per second and combined with the tmp_disk_table_ratio, this will give us a more accurate view on things. Again, once we reach the threshold of two temporary tables per second, we don’t want to immediately send out an alert/advice.

Another thing we can improve is to not use the readStatusVariable function from the mysql_helper.js library. This function executes a query to the MySQL host every time we read a status variable, while CMON already retrieves most of them every second and we don’t need a real-time status anyway. It’s not like two or three queries will kill the hosts in the cluster, but if many of these advisors are run in a similar fashion, this could create heaps of extra queries.

In this case we can optimize this by retrieving the status variables in a map using the host.sqlInfo()function and retrieve everything at once as a map. This function contains the most important information of the host, but it does not contain all. For instance the variable uptime that we need for the rate is not available in the host.sqlInfo()map and has to be retrieved with the readStatusVariable function.

This is what our advisor will look like now, with the changes/additions marked in bold:

#include "common/mysql_helper.js"

/**
* Checks the percentage of max ever used connections
*
*/
var RATIO_WARNING_THRESHOLD=20;
var RATE_WARNING_THRESHOLD=2;
var TITLE="Temporary tables on disk ratio";
var ADVICE_WARNING="More than 20% of temporary tables are written to disk and current rate is more than 2 temporary tables per second. It is advised to review your queries, for example, via the Query Monitor.";
var ADVICE_OK="Temporary tables on disk are not excessive.";

function main()
{
   var hosts     = cluster::mySqlNodes();
   var advisorMap = {};

   for (idx = 0; idx < hosts.size(); ++idx)
   {
       host        = hosts[idx];
       map         = host.toMap();
       connected     = map["connected"];
       var advice = new CmonAdvice();
       var hostStatus = host.sqlInfo();
       var tmp_tables = hostStatus['CREATED_TMP_TABLES'];
       var tmp_disk_tables = hostStatus['CREATED_TMP_DISK_TABLES'];
       var uptime = readStatusVariable(host, 'uptime');
       var tmp_disk_table_ratio = tmp_disk_tables / (tmp_tables + tmp_disk_tables) * 100;
       var tmp_disk_table_rate = tmp_disk_tables / uptime;

       if(!connected)
           continue;

       if(checkPrecond(host))
       {
          if(tmp_disk_table_rate > RATE_WARNING_THRESHOLD && tmp_disk_table_ratio > RATIO_WARNING_THRESHOLD) {
              advice.setJustification("Temporary tables written to disk is excessive: " + tmp_disk_table_rate + " tables per second and overall ratio of " + tmp_disk_table_ratio);
              msg = ADVICE_WARNING;
              advice.setSeverity(0);
          }
          else {
              advice.setJustification("Temporary tables written to disk not excessive");
              msg = ADVICE_OK;
          }
       }
       else
       {
           msg = "Not enough data to calculate";
           advice.setJustification("there is not enough load on the server or the uptime is too little.");
           advice.setSeverity(0);
       }

       advice.setHost(host);
       advice.setTitle(TITLE);
       advice.setAdvice(msg);
       advisorMap[idx]= advice;
   }

   return advisorMap;
}

Scheduling my first advisor

After we have saved this new advisor, compiled it and run, we now can schedule this advisor. Since we don’t have an excessive workload, we will probably run this advisor once per day.

The base scheduling mode has every minute, 5 minutes, hour, day, month preset and this is exactly what we need. Changing this to advanced will unlock the other greyed out input fields. These input fields work exactly the same as a crontab, so you can even schedule for a particular day, day of the month or even set it on weekdays.

Blog category:

by Severalnines at January 11, 2016 11:51 AM

January 09, 2016

Daniël van Eeden

Using Connector/J with Python

With Python you would normally use MySQL Connector/Python or the older MySQLdb to connect from Python to MySQL, but there are more options.

There are also multiple Python implementations: CPython (the main implementation), PyPy, Jython and IronPython. PyPy tries to be faster than CPython by using a Just-in-Time compiler. Jython runs on the JVM and IronPython runs on the .NET CLR.

Connector/Python by default (Without the C Extension) is a pure Python implementation and can work with most if not all implementations. And for MySQLdb there is a drop-in replacement called PyMySQL, which is a pure python implementation.

So there are many options already. But for at least Jython it is also possible to use a Java (JDBC) driver.

But why would you use a different Python implementation? There are multiple reasons for that:

  • Speed. PyPy can be faster and Jython has no Global Interpreter Lock (GIL) , which can allow for more concurrent execution.
  • To access 'native' code. e.g. call Java code from Jython or C# from IronPython.
  • Use existing infrastructure. You can deploy a Jython application on Tomcat.
  • Create testcases, healthchecks etc. which uses the same settings and infrastucture as your Java application with the benefits of a scripting language.

I wanted to test how Connector/J behaves with regards to TLS (the successor of SSL).

Setup

The first step is to get Jython, and Connector/J on your system. On Fedora 23 this is easily done with a dnf install jython mysql-connector-java.

Then I used MySQL Sandbox to setup a MySQL 5.7.10 sandbox. To enable TLS I did a ./my sql_ssl_rsa_setup, which is the Sandbox version of mysql_ssl_rsa_setup. If you have a pre-5.7 version then you can use mysslgen instead.

To convert the CA certificate from the PEM format to the Java Key Store (JKS) format I used keytool.

$ keytool -importcert -trustcacerts -file ca.pem -keystore /tmp/mysql57keystore.jks
Enter keystore password:
Re-enter new password:
Owner: CN=MySQL_Server_5.7.10_Auto_Generated_CA_Certificate
Issuer: CN=MySQL_Server_5.7.10_Auto_Generated_CA_Certificate
Serial number: 1
Valid from: Fri Jan 08 16:23:16 CET 2016 until: Mon Jan 05 16:23:16 CET 2026
Certificate fingerprints:
MD5: B5:B5:2B:53:5C:91:A2:6A:64:B5:C9:12:85:A0:CE:CC
SHA1: 85:F1:AB:14:15:33:65:A8:71:4D:00:A6:C6:FC:8F:7F:BE:95:BA:B0
SHA256: CB:B9:D5:BC:26:76:37:3A:66:67:99:95:5B:3B:8E:95:84:6C:A4:5F:52:39:EF:2A:23:36:6E:AB:B0:3E:81:E0
Signature algorithm name: SHA256withRSA
Version: 1
Trust this certificate? [no]: yes
Certificate was added to keystore

Then I had to set my CLASSPATH in order for Jython to find Connector/J.

$ export CLASSPATH=/usr/share/java/mysql-connector-java.jar

Running the test

I used this to test the database connetion:

#!/usr/bin/jython
from __future__ import with_statement
from com.ziclix.python.sql import zxJDBC
from java.lang import System

System.setProperty("javax.net.ssl.trustStore","/tmp/mysql57keystore.jks");
System.setProperty("javax.net.ssl.trustStorePassword","msandbox");

jdbc_url = 'jdbc:mysql://127.0.0.1:18785/test?useSSL=true'
with zxJDBC.connect(jdbc_url, 'msandbox', 'msandbox', 'com.mysql.jdbc.Driver') as c:
with c.cursor() as cur:
cur.execute('SHOW SESSION STATUS LIKE \'Ssl_%\'')
for result in cur:
print('%-40s: %s' % result)
raw_input('Press any key to continue...')

This resulted in a working connection. From the database side it looks like this:

mysql> SELECT ATTR_NAME, ATTR_VALUE FROM
-> performance_schema.session_connect_attrs WHERE PROCESSLIST_ID=40;
+------------------+----------------------+
| ATTR_NAME | ATTR_VALUE |
+------------------+----------------------+
| _runtime_version | 1.8.0_65 |
| _client_version | 5.1.36-SNAPSHOT |
| _client_name | MySQL Connector Java |
| _client_license | GPL |
| _runtime_vendor | Oracle Corporation |
+------------------+----------------------+
5 rows in set (0.00 sec)

mysql> SELECT * FROM performance_schema.status_by_thread WHERE
-> THREAD_ID=(SELECT THREAD_ID FROM performance_schema.threads
-> WHERE PROCESSLIST_ID=40) and VARIABLE_NAME LIKE 'Ssl_version';
+-----------+---------------+----------------+
| THREAD_ID | VARIABLE_NAME | VARIABLE_VALUE |
+-----------+---------------+----------------+
| 65 | Ssl_version | TLSv1 |
+-----------+---------------+----------------+
1 row in set (0.00 sec)

I wanted to see if upgrading Connector/J would change anythong. So I downloaded the latest release and change my CLASSPATH to only include that.

mysql> SELECT ATTR_NAME, ATTR_VALUE FROM
-> performance_schema.session_connect_attrs WHERE PROCESSLIST_ID=45;
+------------------+----------------------+
| ATTR_NAME | ATTR_VALUE |
+------------------+----------------------+
| _runtime_version | 1.8.0_65 |
| _client_version | 5.1.38 |
| _client_name | MySQL Connector Java |
| _client_license | GPL |
| _runtime_vendor | Oracle Corporation |
+------------------+----------------------+
5 rows in set (0.00 sec)

mysql> SELECT * FROM performance_schema.status_by_thread WHERE
-> THREAD_ID=(SELECT THREAD_ID FROM performance_schema.threads
-> WHERE PROCESSLIST_ID=45) and VARIABLE_NAME LIKE 'Ssl_version';
+-----------+---------------+----------------+
| THREAD_ID | VARIABLE_NAME | VARIABLE_VALUE |
+-----------+---------------+----------------+
| 70 | Ssl_version | TLSv1.1 |
+-----------+---------------+----------------+
1 row in set (0.00 sec)

And it did. Connector/J 5.1.38 uses TLSv1.1 instead of TLSv1.0

by Daniël van Eeden (noreply@blogger.com) at January 09, 2016 03:00 PM

Valeriy Kravchuk

"I'm Winston Wolf, I solve problems."

My (few) readers are probably somewhat tired of boring topics of metadata locks and gdb breakpoints that I discuss a lot this year, so for this weekend I decided to concentrate on something less technical but still important to me - the way I prefer to follow while providing support for MySQL.

Before I continue, it's time to add the explicit disclaimer: the views on how support engineer should work expressed below are mine alone and not those of my current (or any previous) employer. Specific case I describe may be entirely fictional and has nothing to do with any real life customer. I love thy customers in reality...

One of my favorite movies of all times is Pulp Fiction. Coincidentally, it was released in 1994, more or less at the same time when providing technical support started to become one of my regular job duties, not just a hobby or boring part of sysadmin job role I had to play even when not wanted. I had to provide support for the software I had written as soon as it had got first customer, then moved on to helping my colleagues (whom we developed software with together) with all kinds of technical problems they had, from proper coding style to linking Pro*C programs to, well, getting more disk space on NFS server that was actually my workstation. In 10 years or so this ended up with sending CV to MySQL AB and getting a Support Engineer job there (instead of a Developer one I applied for).

So, with this "support" role that I played already when I watched the "Pulp Fiction" for the first time, it was natural for me to find out that my favorite character there is "The Wolf". As soon as I started to provide MySQL Support, I've added the quote that is used as a title of this post to my LJ blog, as a motto. This person and his approach to "customers" of his very specific "service" (resolving all kinds of weird problems) looked ideal to me, and over years this had not changed. Let me remind you this great dialog with Vincent Vega (see the script for the context and details if you do not remember them by heart; I've fixed one typo there while quoting, bugs are everywhere, you know...):

               The Wolf and Jimmie turn, heading for the bedroom, leaving 
Vincent and Jules standing in the kitchen.

VINCENT
(calling after him)
A "please" would be nice.

The Wolf stops and turns around.

THE WOLF
Come again?

VINCENT
I said a "please" would be nice.

The Wolf takes a step toward him.

THE WOLF
Set is straight, Buster. I'm not
here to say "please."I'm here to
tell you what to do. And if self-
preservation is an instinct you
possess, you better fuckin' do it
and do it quick. I'm here to help.
If my help's not appreciated, lotsa
luck gentlemen.

JULES
It ain't that way, Mr. Wolf. Your
help is definitely appreciated.

VINCENT
I don't mean any disrespect. I just
don't like people barkin' orders at
me.

THE WOLF
If I'm curt with you, it's because
time is a factor. I think fast, I
talk fast, and I need you guys to
act fast if you want to get out of
this. So pretty please, with sugar
on top, clean the fuckin' car.

Over the years of doing support I've found out how important it is to tell it straight and honestly from the very beginning: "I'm not here to say "please."I'm here to tell you what to do."

In services the approach is often the following: "The customer is always right". In reality, speaking about customers of technical services at least, it may NOT be so. Customer may be wrong with his ideas about the root cause of the problem, and (as the wiki page linked above says) can be dishonest, have unrealistic expectations, and/or try to misuse a software. What's even more important, customers are rarely right when they say how services should be provided for them.

All these:
"Can I speak on Skype with Valerii? ... Can I chat with him on Skype please? It's a lot easier ... and I have questions" 
may end up with a chat, or may end up with an email that was already sent while this chat with oncall engineer happened, explaining both possible root causes of the problem, asking followup questions to define the real root case, and listing next steps to pinpoint or workaround the problem.

As this is still a technical blog, the problem in the case I had in mind while writing the above, was a very slow query execution on MariaDB 10.x, where query had 80K IDs in the IN list and previously it was executed fast on Percona XtraDB Cluster 5.5.x. The range of root causes to suspect was wide enough initially, from Bug #20932 to Bug #76030 and maybe to some MariaDB specific bugs to search for, to disk I/O problems, lock waits (what if the SELECT query was executed from SERIALIZABLE transaction). So, I kept asking for more diagnostic outputs and insisting on getting them in emails, and yes, long outputs are better shared via emails! Sharing them in chat or, even more, discussing over phone, or seeing them over a shared desktop session (the approaches many customers insist on) is neither faster nor more convenient. I say what to execute, you do that, copy/paste the output and send email (or reply in the issue via web interface, if you prefer to paste there), "pretty please, with sugar on top".

As soon as evidence provided shown there is no locking or disk I/O problem, my investigation was concentrated around this important fact: it was fast on 5.5.x and is slow on MariaDB 10.x with the same data. What is the difference in these versions that matter? Most likely it's in the optimizer and new optimizations they have in MariaDB! 

Do I know all the MariaDB optimizations by heart? No, surely, I have to go read about them, check how proper optimizer switches are named and make up my mind about suggestions. Is it OK to do all these while hanging on phone with customer or chatting with him? Well, maybe, if customer prefers to listen to my loud typing and sounds in my neighborhood... I'd prefer NOT to listen to any of that, ever, and not to hang on the chat with 10 minutes in between messages. So, the only chat reply customer got was: "Please reply to his request and I'll have him follow up."

The last my request at this moment was simple:
"I do not see optimizer_switch set explicitly in your my.cnf, so I assume defaults there as https://mariadb.com/kb/en/mariadb/optimizer-switch/ shows. Can you, please, check if setting this:

set optimizer_switch='extended_keys=off';

before running the problematic query allows it to run faster?
"
Even though the bug that led me to this idea (after reading the details of what optimizations MariaDB provided and how they are controlled) is still "Open", and I had never been able to create a test case not depending on any customer data, I've seen the problem more than once. I think fast, and I type fast, so I shared this suggestion immediately (1h 20 minutes after customer started to describe the problem in chat, 43 minutes after customer provided all the outputs required in email). And you know what happened? Chat continued more or less like this:
"wow... optimizer_switch with extended_keys off worked... from 1000k+ seconds to ... 1 second ... rofl ...  so extended_keys=on by default ... it literally halted our database today almost ... thank you for your assistance ... i would have never of thought to change that... and you guys caught it, so props"
So, the immediate problem was resolved and, you know, the resolution started with my explicit refusal to join any chat until I see the evidence and outputs requested in emails. It also ended with email suggesting to switch off one specific optimization that is used by default and is known to me, the one pretending to be expert, from several previous cases (similar or not so much, as I never seen anything like this this happening on MariaDB before). It took 1 hour 20 minutes from initial problem statement communicated to problem resolved, and all this time I worked asynchronously and concurrently on this and few other issues, and had not said or typed a word to customer in chat.

This is how The Wolf solves problems: 
"I'm here to tell you what to do. And if self-preservation is an instinct you possess, you better fuckin' do it and do it quick. I'm here to help. If my help's not appreciated, lotsa luck gentlemen"
I do it the same way, and. IMHO, any expert should do it this way. If a customer always knows better what to do and how to communicate, why they ended up with a problem that brought them to me? 

I try to prove every point I make while working on problems, and I expect from the other side to apply the same level of efforts - they have to prove me the way they want issue handled is better than the one I prefer and suggest, if they think this is the problem. After that I'll surely do what's the best from them, knowing why do we both do it so. 

In reality, the problem usually lies elsewhere... Now, it's time to re-read my New Year wishes post.

by Valeriy Kravchuk (noreply@blogger.com) at January 09, 2016 01:13 PM

Peter Zaitsev

ordering_operation: EXPLAIN FORMAT=JSON knows everything about ORDER BY processing

EXPLAIN FORMAT=JSON

EXPLAIN FORMAT=JSONWe’ve already discussed using the ORDER BY clause with subqueries. You can also, however, use the 

ORDER BY
 clause with sorting results of one of the columns. Actually, this is most common way to use this clause.

Sometimes such queries require using temporary tables or filesort, and a regular

EXPLAIN
  clause provides this information. But it doesn’t show if this job is needed for
ORDER BY
 or for optimizing another part of the query.

For example, if we take a pretty simple query ( 

select distinct last_name from employees order by last_name asc
) and run
EXPLAIN
  on it, we can see that both the temporary table and filesort were used. However, we can’t identify if these were applied to
DISTINCT
, or to
ORDER BY
, or to any other part of the query.

mysql> explain select distinct last_name from employees order by last_name ascG
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: employees
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 299379
     filtered: 100.00
        Extra: Using temporary; Using filesort
1 row in set, 1 warning (0.00 sec)
Note (Code 1003): /* select#1 */ select distinct `employees`.`employees`.`last_name` AS `last_name` from `employees`.`employees` order by `employees`.`employees`.`last_name`

EXPLAIN FORMAT=JSON
 tells us exactly what happened:

mysql> explain format=json select distinct last_name from employees order by last_name ascG
*************************** 1. row ***************************
EXPLAIN: {
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "360183.80"
    },
    "ordering_operation": {
      "using_filesort": false,
      "duplicates_removal": {
        "using_temporary_table": true,
        "using_filesort": true,
        "cost_info": {
          "sort_cost": "299379.00"
        },
        "table": {
          "table_name": "employees",
          "access_type": "ALL",
          "rows_examined_per_scan": 299379,
          "rows_produced_per_join": 299379,
          "filtered": "100.00",
          "cost_info": {
            "read_cost": "929.00",
            "eval_cost": "59875.80",
            "prefix_cost": "60804.80",
            "data_read_per_join": "13M"
          },
          "used_columns": [
            "emp_no",
            "last_name"
          ]
        }
      }
    }
  }
}
1 row in set, 1 warning (0.00 sec)
Note (Code 1003): /* select#1 */ select distinct `employees`.`employees`.`last_name` AS `last_name` from `employees`.`employees` order by `employees`.`employees`.`last_name`

In the output above you see can see that 

ordering_operation
 does not use filesort:

"ordering_operation": {
      "using_filesort": false,

But

DISTINCT
 does:

"duplicates_removal": {
        "using_temporary_table": true,
        "using_filesort": true,

If we remove the 

DISTINCT
 clause, we will find that 
ORDER BY
 started using filesort, but does not need to create a temporary table:

mysql> explain format=json select last_name from employees order by last_name ascG
*************************** 1. row ***************************
EXPLAIN: {
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "360183.80"
    },
    "ordering_operation": {
      "using_filesort": true,
      "cost_info": {
        "sort_cost": "299379.00"
      },
<rest of the output skipped>

This means that in the case of the first query, a sorting operation proceeded in parallel with the duplicate keys removal.

Conclusion:

EXPLAIN FORMAT=JSON
  provides details about
ORDER BY
  optimization which cannot be seen with a regular
EXPLAIN
 operation.

by Sveta Smirnova at January 09, 2016 06:25 AM

January 08, 2016

Serge Frezefond

MariaDB JSON text indexing

It is not new that we can store a JSON content in a normal table text field. This has always been the case in the past. But two key features were missing : filtering based on JSON content attributes and indexing of the JSON content. With MariaDB 10.1 CONNECT storage Engine we offer support for [...]

by Serge at January 08, 2016 05:43 PM

Peter Zaitsev

Apache Spark with Air ontime performance data

Apache SparkThere is a growing interest in Apache Spark, so I wanted to play with it (especially after Alexander Rubin’s Using Apache Spark post).

To start, I used the recently released Apache Spark 1.6.0 for this experiment, and I will play with “Airlines On-Time Performance” database from
http://www.transtats.bts.gov/DL_SelectFields.asp?Table_ID=236&DB_Short_Name=On-Time. You can find the scripts I used here https://github.com/Percona-Lab/ontime-airline-performance. The uncompressed dataset is about 70GB, which is not really that huge overall, but quite convenient to play with.

As a first step, I converted it to Parquet format. It’s a column based format, suitable for parallel processing, and it supports partitioning.

The script I used was the following:

# bin/spark-shell --packages com.databricks:spark-csv_2.11:1.3.0
val df = sqlContext.read.format("com.databricks.spark.csv").option("header", "true").option("inferSchema", "true").load("/data/opt/otp/On_Time_On_Time_Performance_*.csv")
sqlContext.setConf("spark.sql.parquet.compression.codec", "snappy")
df.write.partitionBy("Year").parquet("/data/flash/spark/otp")

Conveniently, by using just two commands (three if to count setting compression, “snappy” in this case) we can convert ALL of the .csv files into Parquet (doing it in parallel).

The datasize after compression is only 3.5GB, which is a quite impressive compression factor of 20x. I’m guessing the column format with repeatable data allows this compression level.

In general, Apache Spark makes it very easy to handle the Extract, Transform and Load (ETL) process.

Another one of Spark’s attractive features is that it automatically uses all CPU cores and execute complexes in parallel (something MySQL still can’t do). So I wanted to understand how fast it can execute a query compared to MySQL,  and how efficient it is in using multiple cores.

For this I decided to use a query such as:
"SELECT avg(cnt) FROM (SELECT Year,Month,COUNT(*) FROM otp WHERE DepDel15=1 GROUP BY Year,Month) t1"

Which translates to the following Spark DataFrame manipulation:

(pFile.filter("DepDel15=1").groupBy("Year","Month").count()).agg(avg("count")).show()

I should note that Spark is perfectly capable of executing this as SQL query, but I want to learn more about DataFrame manipulation.

The full script I executed is:

val pFile = sqlContext.read.parquet("/mnt/i3600/spark/otp1").cache();
for( a <- 1 to 6){
println("Try: " +a )
val t1=System.currentTimeMillis;
(pFile.filter("DepDel15=1").groupBy("Year","Month").count()).agg(avg("count")).show();
val t2=System.currentTimeMillis;
println("Try: "+a+" Time: " + (t2-t1))
}
exit

And I used the following command line to call the script:

for i in `seq 2 2 48` ; do bin/spark-shell --executor-cores $i -i run.scala  | tee -a $i.schema.res ; done

which basically tells it to use from 2 to 48 cores (the server I use has 48 CPU cores) in steps of two.

I executed this same query six times. The first time is a cold run, and data is read from the disk. The rest are hot runs, and the query should be executed from memory (this server has 128GB of RAM, and I allocated 100GB to the Spark executor).

I measured the execution time in cold and hot runs, and how it changed as more cores were added.

There was a lot of variance in the execution time of the hot runs, so I show all the results to demonstrate any trends.

Cold runs:
cold

More cores seem to help, but after a certain point – not so much.

Hot runs:
hot

The best execution time was when 14-22 cores were used. Adding more cores after that, seems to actually make things worse. I would guess that the datasize is small enough so that the communication and coordination overhead cost exceeded the benefits of more parallel processing cores.

Comparing to MySQL

Just to have some points for comparison, I executed the same query in MySQL 5.7 using the following table schema: https://github.com/Percona-Lab/ontime-airline-performance/blob/master/mysql/create_table.sql

The hot execution time for the same query in MySQL (MySQL can use only one CPU core to execute one query) is 350 seconds (or 350,000ms to compare with the data on charts) when using the table without indexes. This is about 11 times worse than the best execution time in Spark.

If we use a small trick and createa  covering index in MySQL designed for this query:

"ALTER TABLE ontime ADD KEY (Year,Month,DepDel15)"

then we can improve execution time to 90 seconds. This is still worse than Spark, but the difference is not as big. We can’t, however, create index for each ad-hoc query, while Spark is capable of processing a variety of queries.

In conclusion, I can say that Spark is indeed an attractive option for data analytics queries
(and in fact it can do much more). It is worth keep in mind, however, that in this experiment
it did not scale well with multiple CPU cores. I wonder if the same problem appears when we use multiple server nodes.

If you have recommendations on how I can improve the results, please post it in comments.

Spark configuration I used (in Standalone cluster setup):

export MASTER=spark://`hostname`:7077
export SPARK_MEM=100g
export SPARK_DAEMON_MEMORY=2g
export SPARK_LOCAL_DIRS=/mnt/i3600/spark/tmp
export SPARK_WORKER_DIR=/mnt/i3600/spark/tmp

by Vadim Tkachenko at January 08, 2016 01:28 AM

January 07, 2016

Valeriy Kravchuk

Exploring Metadata Locks with gdb - How One Can Use This?

In the previous post in this series I've concluded that metadata locks are acquired in "batches" and the function that implements this is called MDL_context::acquire_locks. Let's check quickly what it does to confirm where wait for metadata lock really happens. We need this to proceed finally from studying what locks are set and when (this is a long and complicated topic to spend time on out of general interest) to more practical topic: how to find the session that holds the blocking metadata lock in MySQL versions before 5.7.x.

I'll continue to use Percona Server 5.6.27 for now, just because I have it installed and have a source code at hand. So, MDL_context class is defined in sql/mdl.h file as follows:

/**
  Context of the owner of metadata locks. I.e. each server
  connection has such a context.
*/

class MDL_context
{
public:
  typedef I_P_List<MDL_ticket,
                   I_P_List_adapter<MDL_ticket,
                                    &MDL_ticket::next_in_context,
                                    &MDL_ticket::prev_in_context> >
          Ticket_list;

  typedef Ticket_list::Iterator Ticket_iterator;

  MDL_context();
...

  bool try_acquire_lock(MDL_request *mdl_request);
  bool acquire_lock(MDL_request *mdl_request, ulong lock_wait_timeout);
  bool acquire_locks(MDL_request_list *requests, ulong lock_wait_timeout);
...

  unsigned long get_lock_owner(MDL_key *mdl_key);...
private:
  THD *get_thd() const { return m_owner->get_thd(); }...

};

I've highlighted some of the functions that we may use later. Now, this is how MDL_context::aquire_locks() implementation looks like in mdl.cc:

/**
  Acquire exclusive locks. There must be no granted locks in the
  context.

  This is a replacement of lock_table_names(). It is used in
  RENAME, DROP and other DDL SQL statements.

  @param  mdl_requests  List of requests for locks to be acquired.

  @param lock_wait_timeout  Seconds to wait before timeout.

  @note The list of requests should not contain non-exclusive lock requests.
        There should not be any acquired locks in the context.

  @note Assumes that one already owns scoped intention exclusive lock.

  @retval FALSE  Success
  @retval TRUE   Failure
*/

bool MDL_context::acquire_locks(MDL_request_list *mdl_requests,
                                ulong lock_wait_timeout)
{

...
  for (p_req= sort_buf; p_req < sort_buf + req_count; p_req++)
  {
    if (acquire_lock(*p_req, lock_wait_timeout))
      goto err;
  }
  my_free(sort_buf);
  return FALSE;

err:
  /*
    Release locks we have managed to acquire so far.
    Use rollback_to_savepoint() since there may be duplicate
    requests that got assigned the same ticket.
  */
  rollback_to_savepoint(mdl_svp);
  /* Reset lock requests back to its initial state. */
...

  my_free(sort_buf);
  return TRUE;
}


Comments make me think that for some metadata locks we may see some different function called, but let's deal with exclusive ones for now, those that can be blocked. Now, the MDL_context::aquire_lock() we eventually call for each request:

/**
  Acquire one lock with waiting for conflicting locks to go away if needed.

  @param mdl_request [in/out] Lock request object for lock to be acquired

  @param lock_wait_timeout [in] Seconds to wait before timeout.

  @retval  FALSE   Success. MDL_request::ticket points to the ticket
                   for the lock.
  @retval  TRUE    Failure (Out of resources or waiting is aborted),
*/

bool
MDL_context::acquire_lock(MDL_request *mdl_request, ulong lock_wait_timeout)
{
  MDL_lock *lock;
  MDL_ticket *ticket= NULL;
  struct timespec abs_timeout;
  MDL_wait::enum_wait_status wait_status;
...

  /*
    Our attempt to acquire lock without waiting has failed.
    As a result of this attempt we got MDL_ticket with m_lock
    member pointing to the corresponding MDL_lock object which
    has MDL_lock::m_rwlock write-locked.
  */
  lock= ticket->m_lock;

  lock->m_waiting.add_ticket(ticket);

  /*
    Once we added a pending ticket to the waiting queue,
    we must ensure that our wait slot is empty, so
    that our lock request can be scheduled. Do that in the
    critical section formed by the acquired write lock on MDL_lock.
  */
  m_wait.reset_status();

  /*
    Don't break conflicting locks if timeout is 0 as 0 is used
    To check if there is any conflicting locks...
  */
  if (lock->needs_notification(ticket) && lock_wait_timeout)
    lock->notify_conflicting_locks(this);

  mysql_prlock_unlock(&lock->m_rwlock);

  will_wait_for(ticket);

  /* There is a shared or exclusive lock on the object. */
  DEBUG_SYNC(get_thd(), "mdl_acquire_lock_wait");

  find_deadlock();

  struct timespec abs_shortwait;
  set_timespec(abs_shortwait, 1);
  wait_status= MDL_wait::EMPTY;

  while (cmp_timespec(abs_shortwait, abs_timeout) <= 0)
  {
    /* abs_timeout is far away. Wait a short while and notify locks. */
    wait_status= m_wait.timed_wait(m_owner, &abs_shortwait, FALSE,
                                   mdl_request->key.get_wait_state_name());

    if (wait_status != MDL_wait::EMPTY)
      break;
    /* Check if the client is gone while we were waiting. */
    if (! m_owner->is_connected())
    {
      /*
       * The client is disconnected. Don't wait forever:
       * assume it's the same as a wait timeout, this
       * ensures all error handling is correct.
       */
      wait_status= MDL_wait::TIMEOUT;
     break;
    }

    mysql_prlock_wrlock(&lock->m_rwlock);
    if (lock->needs_notification(ticket))
      lock->notify_conflicting_locks(this);
    mysql_prlock_unlock(&lock->m_rwlock);
    set_timespec(abs_shortwait, 1);
  }
...


So, here we check if our request is deadlocking and then we wait. It's clear that any session that is hanging while waiting for the metadata lock will have MDL_context::aquire_lock() in the backtrace.

To check this assumption let me set up a usual test with SELECT * FROM t1 executed in active transaction and TRUNCATE TABLE blocked. This is how it may look like in the SHOW PROCESSLIST:

mysql> show processlist;
+----+------+-----------+------+---------+------+---------------------------------+-------------------+-----------+---------------+
| Id | User | Host      | db   | Command | Time | State                           | Info              | Rows_sent | Rows_examined |
+----+------+-----------+------+---------+------+---------------------------------+-------------------+-----------+---------------+
|  2 | root | localhost | test | Query   |  121 | Waiting for table metadata lock | truncate table t1 |         0 |             0 |
|  3 | root | localhost | test | Sleep   |    2 |                                 | NULL              |         1 |             0 |
|  4 | root | localhost | test | Query   |    0 | init                            | show processlist  |         0 |             0 |
+----+------+-----------+------+---------+------+---------------------------------+-------------------+-----------+---------------+
3 rows in set (0.00 sec)

In general (and in this case) we may NOT be able to find a thread that is executing some statement for longer than out blocked session is waiting for the metadata lock. So, which one is the blocking one?

Let's try to find a thread that mentions  MDL_context::aquire_lock() in the backtrace:

[root@centos percona-server]# gdb -p `pidof mysqld` -ex "set pagination 0" -ex "thread apply all bt" -batch 2>/dev/null | grep 'MDL_context::acquire_lock'
#3  0x000000000064a74b in MDL_context::acquire_lock (this=0x7fc005ff5140, mdl_request=0x7fc00614e488, lock_wait_timeout=) at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:2380
#4  0x000000000064ae1b in MDL_context::acquire_locks (this=0x7fc005ff5140, mdl_requests=) at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:2500
[root@centos percona-server]#

Note that I've executed the command as root, because mysqld process is owned by root in this case.

So, we see exactly one backtrace with this call:

MDL_context::acquire_lock (this=0x7fc005ff5140, mdl_request=0x7fc00614e488, lock_wait_timeout=)

and from the code review we know that 0x7fc00614e488 is of type MDL_request *. We also know that 0x7fc005ff5140 is of type MDL_context * (this pointer in the method of that class). Now we can attach gdb to the mysqld running and try to see what we can do with that pointers:

[root@centos percona-server]# gdb -p `pidof mysqld`
GNU gdb (GDB) Red Hat Enterprise Linux (7.2-83.el6)
...

(gdb) set $pmdlr=(MDL_request *)0x7fc00614e488
(gdb) p $pmdlr
$1 = (MDL_request *) 0x7fc00614e488
(gdb) p $pmdlr->key
$2 = {m_length = 9, m_db_name_length = 4,
  m_ptr = "\003test\000t1", '\000' <repeats 378 times>,
  static m_namespace_to_wait_state_name = {{m_key = 102,
      m_name = 0xb9eb8c "Waiting for global read lock", m_flags = 0}, {
      m_key = 103, m_name = 0xb9eba9 "Waiting for backup lock", m_flags = 0}, {
      m_key = 104, m_name = 0xb9ed68 "Waiting for schema metadata lock",
      m_flags = 0}, {m_key = 105,
      m_name = 0xb9ed90 "Waiting for table metadata lock", m_flags = 0}, {
      m_key = 106,
      m_name = 0xb9edb0 "Waiting for stored function metadata lock",
      m_flags = 0}, {m_key = 107,
      m_name = 0xb9ede0 "Waiting for stored procedure metadata lock",
      m_flags = 0}, {m_key = 108,
      m_name = 0xb9ee10 "Waiting for trigger metadata lock", m_flags = 0}, {
      m_key = 109, m_name = 0xb9ee38 "Waiting for event metadata lock",
      m_flags = 0}, {m_key = 110, m_name = 0xb9ebc1 "Waiting for commit lock",
      m_flags = 0}, {m_key = 79, m_name = 0xb9295a "User lock", m_flags = 0}, {
      m_key = 111, m_name = 0xb9ebd9 "Waiting for binlog lock", m_flags = 0}}}
(gdb) p &($pmdlr->key)
$3 = (MDL_key *) 0x7fc00614e4a8


So, we can get a pointer to the MDL_key. Now, in MDL_context class we had seen the following method:

  unsigned long get_lock_owner(MDL_key *mdl_key);

that returns the id of the owning thread, and lucky we are, we have a pointer to MDL_context, so we can refer to this function and call it if needed:

(gdb) set $pmdlc=(MDL_context *)0x7fc005ff5140
(gdb) p $pmdlc
$4 = (MDL_context *) 0x7fc005ff5140
(gdb) p $pmdlc->get_lock_owner(&($pmdlr->key))
$5 = 3


Now, what's that value printed, 3? It's the id of thread in the SHOW PROCESSLIST that holds the blocking metadata lock. So, we quit from gdb:

(gdb) q
A debugging session is active.

        Inferior 1 [process 3003] will be detached.

Quit anyway? (y or n) y
Detaching from program: /usr/sbin/mysqld, process 3003


and then we know what thread to kill to release the blocking lock and let TRUNCATE to proceed:

mysql> show processlist;
+----+------+-----------+------+---------+------+---------------------------------+-------------------+-----------+---------------+
| Id | User | Host      | db   | Command | Time | State                           | Info              | Rows_sent | Rows_examined |
+----+------+-----------+------+---------+------+---------------------------------+-------------------+-----------+---------------+
|  2 | root | localhost | test | Query   | 1291 | Waiting for table metadata lock | truncate table t1 |         0 |             0 |
|  3 | root | localhost | test | Sleep   | 1172 |                                 | NULL              |         1 |             0 |
|  4 | root | localhost | test | Query   |    0 | init                            | show processlist  |         0 |             0 |
+----+------+-----------+------+---------+------+---------------------------------+-------------------+-----------+---------------+
3 rows in set (0.00 sec)

mysql> kill 3;
Query OK, 0 rows affected (0.05 sec)

mysql> show processlist;
+----+------+-----------+------+---------+------+-------+------------------+-----------+---------------+
| Id | User | Host      | db   | Command | Time | State | Info             | Rows_sent | Rows_examined |
+----+------+-----------+------+---------+------+-------+------------------+-----------+---------------+
|  2 | root | localhost | test | Sleep   | 1305 |       | NULL             |         0 |             0 |
|  4 | root | localhost | test | Query   |    0 | init  | show processlist |         0 |             0 |
+----+------+-----------+------+---------+------+-------+------------------+-----------+---------------+
2 rows in set (0.00 sec)


So, from now on you know how to find out in MySQL 5.6 the blocking session/thread for any session that is waiting on metadata with just gdb and some grep.

This is not the only way, you can get the same result from different functions in different MDL-related classes. Your findings may depend on what part of the code you traced or read first, but essentially that's it: you can easily find a blocking thread for any waiting MDL lock request, using just a couple of gdb commands! No need to kill them one by one in a hope.

For any MySQL DBA gdb can be the tool useful in their routine work, NOT only while studying core dumps.

In the following blog posts in this series I'll get back to studying source code and will try to work with MySQL 5.7 in a similar way. Stay tuned!

by Valeriy Kravchuk (noreply@blogger.com) at January 07, 2016 04:55 PM

January 06, 2016

Peter Zaitsev

MongoDB revs you up: What storage engine is right for you? (Part 1)

MongoDB

MongoDBDifferentiating Between MongoDB Storage Engines

The tremendous data growth of the last decade has affected almost all aspects of applications and application use. Since nearly all applications interact with a database at some point, this means databases needed to adapt to the change in usage conditions as well. Database technology has grown significantly in the last decade to meet the needs of constantly changing applications. Enterprises often need to scale, modify, or replace their databases in order to meet new business demands.

Within a database management system (DBMS), there are many levels that can affect performance, including the choice of your database storage engine. Surprisingly, many enterprises don’t know they have a choice of storage engines, or that specific storage engine types are architected to handle specific scenarios. Often the best option depends on what function the database in question is designed to fulfill.

With Percona’s acquisition of Tokutek, we’ve moved from a mostly-MySQL company to having several MongoDB-based software options available.

MongoDB is a cross-platform, NoSQL, document-oriented database. It doesn’t use the traditional table-based relational database structure, and instead employs JSON-type documents with dynamic schemas. The intention is making the integration of certain application data types easier and faster.

This blog (the first in a series) will briefly review some of the available options for a MongoDB database storage engine, and the pros and cons of each. Hopefully it will help database administrators, IT staff, and enterprises realize that when it comes to MongoDB, you aren’t limited to a single storage engine choice.

What is a Storage Engine?

A database storage engine is the underlying software that a DBMS uses to create, read, update and delete data from a database. The storage engine should be thought of as a “bolt on” to the database (server daemon), which controls the database’s interaction with memory and storage subsystems. Thus, the storage engine is not actually the database, but a service that the database consumes for the storage and retrieval of information. Given that the storage engine is responsible for managing the information stored in the database, it greatly affects the overall performance of the database (or lack thereof, if the wrong engine is chosen).

Most storage engines are organized using one of the following structures: a Log-Structured Merge (LSM) tree, B-Tree or Fractal tree.

  • LSM Tree. An LSM tree has performance characteristics that make it attractive for providing indexed access to files with high insert volume. LSM trees seek to provide the excellent insertion performance of log type storage engines, while minimizing the impact of searches in a data structure that is “sorted” strictly on insertion order. LSMs buffer inserts, updates and deletes by using layers of logs that increase in size, and then get merged in sorted order to increase the efficiency of searches.
  • B-Tree. B-Trees are the most commonly implemented data structure in databases. Having been around since the early 1970’s, they are one of the most time-tested storage engine “methodologies.” B-Trees method of data maintenance makes searches very efficient. However, the need to maintain a well-ordered data structure can have a detrimental effect on insertion performance.
  • Fractal Tree. A Fractal Tree index is a tree data structure much like that of a B-tree (designed for efficient searches), but also ingests data into log-like structures for efficient memory usage in order to facilitate high-insertion performance. Fractal Trees were designed to ingest data at high rates of speed in order to interact efficiently with the storage for high bandwidth applications.

Fractal Trees and the LSM trees sound very similar. The main differentiating factor, however, is the manner in which they sort the data into the tree for efficient searches. LSM trees merge data into a tree from a series of logs as the logs fill up. Fractal Trees sort data into log-like structures (message buffers) along the proper data path in the tree.

What storage engine is best?

That question is not a simple one. In order decide which engine to choose, it’s necessary to determine the core functionality provided in each engine. Core functionality can generally be aggregated into three areas:

  • Locking types. Locking within database engines defines how access and updates to information are controlled. When an object in the database is locked for updating, other processes cannot modify (or in some cases read) the data until the update has completed. Locking not only affects how many different applications can update the information in the database, it can also affect queries on that data. It is important to monitor how queries access data, as the data could be altered or updated as it is being accessed. In general, such delays are minimal. The bulk of the locking mechanism is devoted to preventing multiple processes updating the same data. Since both additions (INSERT statements) and alterations (UPDATE statements) to the data require locking, you can imagine that multiple applications using the same database can have a significant impact. Thus, the “granularity” of the locking mechanism can drastically affect the throughput of the database in “multi-user” (or “highly-concurrent”) environments.
  • Indexing. The indexing method can dramatically increase database performance when searching and recovering data. Different storage engines provide different indexing techniques, and some may be better suited for the type of data you are storing. Typically, every index defined on a collection is another data structure of the particular type the engine uses (B-tree for WiredTiger, Fractal Tree for PerconaFT, and so forth). The efficiency of that data structure in relation to your workload is very important. An easy way of thinking about it is viewing every extra index as having performance overhead. A data structure that is write-optimized will have lower overhead for every index in a high-insert application environment than a non-write optimized data structure would. For use cases that require a large number of indexes, choosing an appropriate storage engine can have a dramatic impact.
  • Transactions. Transactions provide data reliability during the update or insert of information by enabling you to add data to the database, but only to commit that data when other conditions and stages in the application execution have completed successfully. For example, when transferring information (like a monetary credit) from one account to another, you would use transactions to ensure that both the debit from one account and the credit to the other completed successfully. Often, you will hear this referred to as “atomicity.” This means the operations that are bundled together are an immutable unit: either all operations complete successfully, or none do. Despite the ability of RocksDB, PerconaFT and WiredTiger to support transactions, as of version 3.2 this functionality is not available in the MongoDB storage engine API. Multi-document transactions cannot be used in MongoDB. However, atomicity can be achieved at the single document level. According to statements from MongoDB, Inc., multi-document transactions will be supported in the future, but a firm date has not been set as of this writing.

Now that we’ve established a general framework, we’ll move onto discussing engines. For the first blog in this series, we’ll look at MMAPv1 (the default storage engine that comes with MongoDB up until the release 3.0).

MMAPv1

Find it in: MongoDB or Percona builds

MMAPv1 is MongoDB’s original storage engine, and was the default engine in MongoDB 3.0 and earlier. It is a B-tree based system that offloads much of the functions of storage interaction and memory management to the operating system. MongoDB is based on memory mapped files.

The MMAP storage engine uses a process called “record allocation” to grab disk space for document storage. All records are contiguously located on disk, and when a document becomes larger than the allocated record, it must allocate a new record. New allocations require moving a document and updating all indexes that refer to the document, which takes more time than in-place updates and leads to storage fragmentation. Furthermore, MMAPv1 in it’s current iterations usually leads to high space utilization on your filesystem due to over-allocation of record space and it’s lack of support for compression.

As mentioned previously, a storage engine’s locking scheme is one of the most important factors in overall database performance. MMAPv1 has collection-level locking – meaning only one insert, update or delete operation can use a collection at a time. This type of locking scheme creates a very common scenario in concurrent workloads, where update/delete/insert operations are always waiting for the operation(s) in front of them to complete. Furthermore, oftentimes those operations are flowing in more quickly than they can be completed in serial fashion by the storage engine. To put it in context, imagine a giant supermarket on Sunday afternoon that only has one checkout line open: plenty of customers, but low throughput!

Given the storage engine choices brought about by the storage engine API in MongoDB 3.0, it is hard to imagine an application that demands the MMAPv1 storage engine for optimized performance. If you read between the lines, you could conclude that MongoDB, Inc. would agree given that the default engine was switched to WiredTiger in v3.2.

Conclusion

Most people don’t know that they have a choice when it comes to storage engines, and that the choice should be based on what the database workload will look like. Percona’s Vadim Tkachenko performed an excellent benchmark test comparing the performances of RocksDB, PerconaFT and WiredTiger to help specifically differentiate between these engines.

In the next post, we’ll examine the ins and outs of MongoDB’s new default storage engine, WiredTiger.

 

 

by Jon Tobin at January 06, 2016 06:31 PM

Valeriy Kravchuk

Exploring Metadata Locks with gdb - Double Checking the Initial Results

Some results in my initial post in this series led me to questions that I'll try to answer here. First of all, I noted that SELECT from a single table ended up with just one metadata lock request:

(gdb) b MDL_request::init
Breakpoint 1 at 0x648f13: file /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc, line 1266.
Breakpoint 2 at 0x648e70: file /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc, line 1245.
warning: Multiple breakpoints were set.
Use the "delete" command to delete unwanted breakpoints.
(gdb) c
Continuing.
[Switching to Thread 0x7ff224c9f700 (LWP 2017)]

Breakpoint 2, MDL_request::init (this=0x7ff1fbe425a8,
    mdl_namespace=MDL_key::TABLE, db_arg=0x7ff1fbe421c8 "test",
    name_arg=0x7ff1fbe421d0 "t1", mdl_type_arg=MDL_SHARED_READ,
    mdl_duration_arg=MDL_TRANSACTION)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:1245
1245    {


MDL_SHARED_READ lock on the table is expected, we read it after all. Let's check with backtrace where this request happens:






(gdb) bt
#0  MDL_request::init (this=0x7ff1fbe425a8, mdl_namespace=MDL_key::TABLE,
    db_arg=0x7ff1fbe421c8 "test", name_arg=0x7ff1fbe421d0 "t1",
    mdl_type_arg=MDL_SHARED_READ, mdl_duration_arg=MDL_TRANSACTION)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:1245
#1  0x00000000006d3033 in st_select_lex::add_table_to_list (this=Unhandled dwarf expression opcode 0xf3
)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/sql_parse.cc:7479
#2  0x000000000077eb6d in MYSQLparse (YYTHD=0x7ff1fbb65000)
    at /var/lib/jenkins/jobs/percona-server-5.6-source-tarballs/workspace/sql/sql_yacc.yy:10888
#3  0x00000000006decef in parse_sql (thd=0x7ff1fbb65000,
    parser_state=0x7ff224c9e130, creation_ctx=0x0)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/sql_parse.cc:9005
#4  0x00000000006df1b1 in mysql_parse (thd=0x7ff1fbb65000, rawbuf=Unhandled dwarf expression opcode 0xf3
)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/sql_parse.cc:6878
#5  0x00000000006e0d1f in dispatch_command (command=<value optimized out>,
    thd=0x7ff1fbb65000, packet=Unhandled dwarf expression opcode 0xf3
)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/sql_parse.cc:1442
#6  0x00000000006ad692 in do_handle_one_connection (thd_arg=Unhandled dwarf expression opcode 0xf3
)
...


So, it happens as soon as we noted a table to read from while parsing the query text. What surprised me initially is no other MDL lock requests, nothing at schema level, for example. I expected that metadata locks should prevent dropping the schema when I read from some table there, but had not seen the schema-level lock for this.

To get more details on this and try to make sure I do not miss anything by studying a corner case, I've added a row to the table and then tried to do the same SELECT in explicit transaction, and (after SELECT completed, but transaction remained active) I tried to DROP DATABASE from the second session while this transaction was still active. (This test will also let us understand what metadata locks are requested for DROP DATABASE, by the way.) I've got the following in gdb:

[New Thread 0x7ff224c5e700 (LWP 2048)]
[Switching to Thread 0x7ff224c5e700 (LWP 2048)]

Breakpoint 2, MDL_request::init (this=0x7ff224c5a040,
    mdl_namespace=MDL_key::GLOBAL, db_arg=0xba2ae4 "", name_arg=0xba2ae4 "",
    mdl_type_arg=MDL_INTENTION_EXCLUSIVE, mdl_duration_arg=MDL_STATEMENT)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:1245
1245    {
(gdb) bt
#0  MDL_request::init (this=0x7ff224c5a040, mdl_namespace=MDL_key::GLOBAL,
    db_arg=0xba2ae4 "", name_arg=0xba2ae4 "",
    mdl_type_arg=MDL_INTENTION_EXCLUSIVE, mdl_duration_arg=MDL_STATEMENT)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:1245
#1  0x00000000007d1ed7 in lock_schema_name (thd=0x7ff1fbbb5000,
    db=0x7ff1e581d0b0 "test")
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/lock.cc:794
#2  0x00000000006b03d4 in mysql_rm_db (thd=0x7ff1fbbb5000,
    db=0x7ff1e581d0b0 "test", if_exists=Unhandled dwarf expression opcode 0xf3
)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/sql_db.cc:787
...

---Type <return> to continue, or q <return> to quit---q
Quit
(gdb) c
Continuing.

Breakpoint 2, MDL_request::init (this=0x7ff224c5a1f0,
    mdl_namespace=MDL_key::BACKUP, db_arg=0xba2ae4 "", name_arg=0xba2ae4 "",
    mdl_type_arg=MDL_INTENTION_EXCLUSIVE, mdl_duration_arg=MDL_STATEMENT)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:1245
1245    {
(gdb) c
Continuing.

Breakpoint 2, MDL_request::init (this=0x7ff224c5a3a0,
    mdl_namespace=MDL_key::SCHEMA, db_arg=0x7ff1e581d0b0 "test",
    name_arg=0xba2ae4 "", mdl_type_arg=MDL_EXCLUSIVE,
    mdl_duration_arg=MDL_TRANSACTION)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:1245
1245    {
(gdb) bt
#0  MDL_request::init (this=0x7ff224c5a3a0, mdl_namespace=MDL_key::SCHEMA,
    db_arg=0x7ff1e581d0b0 "test", name_arg=0xba2ae4 "",
    mdl_type_arg=MDL_EXCLUSIVE, mdl_duration_arg=MDL_TRANSACTION)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:1245
#1  0x00000000007d1fb2 in lock_schema_name (thd=0x7ff1fbbb5000,
    db=0x7ff1e581d0b0 "test")
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/lock.cc:801
#2  0x00000000006b03d4 in mysql_rm_db (thd=0x7ff1fbbb5000,
    db=0x7ff1e581d0b0 "test", if_exists=Unhandled dwarf expression opcode 0xf3
)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/sql_db.cc:787
#3  0x00000000006daaa2 in mysql_execute_command (thd=0x7ff1fbbb5000)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/sql_parse.cc:4371
#4  0x00000000006df518 in mysql_parse (thd=0x7ff1fbbb5000, rawbuf=Unhandled dwarf expression opcode 0xf3
)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/sql_parse.cc:6972
...

---Type <return> to continue, or q <return> to quit---q
Quit
(gdb) c
Continuing.

Breakpoint 2, MDL_request::init (this=0x7ff1e58313b0,
    mdl_namespace=MDL_key::TABLE, db_arg=0x7ff1e5831570 "test",
    name_arg=0x7ff1e5831575 "t1", mdl_type_arg=MDL_EXCLUSIVE,
    mdl_duration_arg=MDL_TRANSACTION)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:1245
1245    {
(gdb) bt
#0  MDL_request::init (this=0x7ff1e58313b0, mdl_namespace=MDL_key::TABLE,
    db_arg=0x7ff1e5831570 "test", name_arg=0x7ff1e5831575 "t1",
    mdl_type_arg=MDL_EXCLUSIVE, mdl_duration_arg=MDL_TRANSACTION)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:1245
#1  0x00000000006b0857 in find_db_tables_and_rm_known_files (
    thd=0x7ff1fbbb5000, db=0x7ff1e581d0b0 "test", if_exists=Unhandled dwarf expression opcode 0xf3
)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/sql_db.cc:1108
#2  mysql_rm_db (thd=0x7ff1fbbb5000, db=0x7ff1e581d0b0 "test", if_exists=Unhandled dwarf expression opcode 0xf3
)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/sql_db.cc:812
...


Now it's clear what happens, eventually we have to drop the table while dropping the schema, and here we surely hit a blocking metadata lock that SELECT set! Let's continue:

---Type <return> to continue, or q <return> to quit---q
Quit

(gdb) c
Continuing.

Breakpoint 2, MDL_request::init (this=0x7ff1e581ece0,
    mdl_namespace=MDL_key::SCHEMA, db_arg=0x7ff1e5831570 "test",
    name_arg=0xba2ae4 "", mdl_type_arg=MDL_INTENTION_EXCLUSIVE,
    mdl_duration_arg=MDL_TRANSACTION)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:1245
1245    {
(gdb) bt
#0  MDL_request::init (this=0x7ff1e581ece0, mdl_namespace=MDL_key::SCHEMA,
    db_arg=0x7ff1e5831570 "test", name_arg=0xba2ae4 "",
    mdl_type_arg=MDL_INTENTION_EXCLUSIVE, mdl_duration_arg=MDL_TRANSACTION)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:1245
#1  0x000000000068da14 in lock_table_names (thd=0x7ff1fbbb5000,
    tables_start=0x7ff1e5831010, tables_end=Unhandled dwarf expression opcode 0xf3
)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/sql_base.cc:5033
#2  0x00000000006b05df in mysql_rm_db (thd=0x7ff1fbbb5000,
    db=0x7ff1e581d0b0 "test", if_exists=Unhandled dwarf expression opcode 0xf3
)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/sql_db.cc:834
...


So, while our request for MDL_EXCLUSIVE lock for the table surely could not be satisfied, server continued with some further requests:

---Type <return> to continue, or q <return> to quit---q
Quit
(gdb) c
Continuing.

Breakpoint 2, MDL_request::init (this=0x7ff224c5a1f0,
    mdl_namespace=MDL_key::GLOBAL, db_arg=0xba2ae4 "", name_arg=0xba2ae4 "",
    mdl_type_arg=MDL_INTENTION_EXCLUSIVE, mdl_duration_arg=MDL_STATEMENT)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:1245
1245    {
(gdb) c
Continuing.

Breakpoint 2, MDL_request::init (this=0x7ff224c5a3a0,
    mdl_namespace=MDL_key::BACKUP, db_arg=0xba2ae4 "", name_arg=0xba2ae4 "",
    mdl_type_arg=MDL_INTENTION_EXCLUSIVE, mdl_duration_arg=MDL_STATEMENT)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:1245
1245    {
(gdb) c
Continuing.


and only at this stage it hanged (waiting for the transaction where SELECT happened to complete). As soon as I committed there:

Breakpoint 2, MDL_request::init (this=0x7ff224c5a260,
    mdl_namespace=MDL_key::TABLE, db_arg=0xb925ef "mysql",
    name_arg=0xb94906 "proc", mdl_type_arg=MDL_SHARED_READ,
    mdl_duration_arg=MDL_TRANSACTION)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:1245
1245    {
(gdb) c
Continuing.

Breakpoint 2, MDL_request::init (this=0x7ff224c5a360,
    mdl_namespace=MDL_key::TABLE, db_arg=0xb925ef "mysql",
    name_arg=0xb94906 "proc", mdl_type_arg=MDL_SHARED_WRITE,
    mdl_duration_arg=MDL_TRANSACTION)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:1245
1245    {




So, we tried to read and then maybe change the mysql.proc table (to drop procedures created in this database, honestly I was not sure if there was any one there at the moment). Next:

(gdb) c
Continuing.

Breakpoint 2, MDL_request::init (this=0x7ff224c59cb0,
    mdl_namespace=MDL_key::GLOBAL, db_arg=0xba2ae4 "", name_arg=0xba2ae4 "",
    mdl_type_arg=MDL_INTENTION_EXCLUSIVE, mdl_duration_arg=MDL_STATEMENT)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:1245
1245    {
(gdb) c
Continuing.

Breakpoint 2, MDL_request::init (this=0x7ff224c599c0,
    mdl_namespace=MDL_key::BACKUP, db_arg=0xba2ae4 "", name_arg=0xba2ae4 "",
    mdl_type_arg=MDL_INTENTION_EXCLUSIVE, mdl_duration_arg=MDL_STATEMENT)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:1245
1245    {
(gdb) c
Continuing.

Breakpoint 2, MDL_request::init (this=0x7ff224c5a250,
    mdl_namespace=MDL_key::TABLE, db_arg=0xb925ef "mysql",
    name_arg=0xc111a8 "event", mdl_type_arg=MDL_SHARED_WRITE,
    mdl_duration_arg=MDL_TRANSACTION)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:1245
1245    {
(gdb) c
Continuing.




That is, some lock requests and we try to delete events in this database from mysql.event table obviously. Let's continue:

Breakpoint 2, MDL_request::init (this=0x7ff224c59900,
    mdl_namespace=MDL_key::GLOBAL, db_arg=0xba2ae4 "", name_arg=0xba2ae4 "",
    mdl_type_arg=MDL_INTENTION_EXCLUSIVE, mdl_duration_arg=MDL_STATEMENT)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:1245
1245    {
(gdb) bt
#0  MDL_request::init (this=0x7ff224c59900, mdl_namespace=MDL_key::GLOBAL,
    db_arg=0xba2ae4 "", name_arg=0xba2ae4 "",
    mdl_type_arg=MDL_INTENTION_EXCLUSIVE, mdl_duration_arg=MDL_STATEMENT)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:1245
#1  0x000000000068c13a in open_table (thd=0x7ff1fbbb5000,
    table_list=0x7ff224c59eb0, ot_ctx=0x7ff224c59c10)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/sql_base.cc:2895
#2  0x0000000000694585 in open_and_process_table (thd=0x7ff1fbbb5000,
    start=0x7ff224c59e48, counter=0x7ff224c59e50, flags=2048,
    prelocking_strategy=0x7ff224c59ea0)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/sql_base.cc:4797
#3  open_tables (thd=0x7ff1fbbb5000, start=0x7ff224c59e48,
    counter=0x7ff224c59e50, flags=2048, prelocking_strategy=0x7ff224c59ea0)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/sql_base.cc:5304
#4  0x0000000000694d74 in open_and_lock_tables (thd=0x7ff1fbbb5000,
    tables=0x7ff224c59eb0, derived=Unhandled dwarf expression opcode 0xf3
)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/sql_base.cc:5960


#5  0x000000000085039d in open_and_lock_tables (thd=0x7ff1fbbb5000, lock_type=Unhandled dwarf expression opcode 0xf3
)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/sql_base.h:477
#6  Event_db_repository::open_event_table (thd=0x7ff1fbbb5000, lock_type=Unhandled dwarf expression opcode 0xf3
)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/event_db_repository.cc:619
#7  0x0000000000850bfc in Event_db_repository::drop_schema_events (this=Unhandled dwarf expression opcode 0xf3
)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/event_db_repository.cc:1011---Type <return> to continue, or q <return> to quit---q
Quit
(gdb) c
Continuing.

Breakpoint 2, MDL_request::init (this=0x7ff224c59610,
    mdl_namespace=MDL_key::BACKUP, db_arg=0xba2ae4 "", name_arg=0xba2ae4 "",
    mdl_type_arg=MDL_INTENTION_EXCLUSIVE, mdl_duration_arg=MDL_STATEMENT)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:1245
1245    {
(gdb) bt
#0  MDL_request::init (this=0x7ff224c59610, mdl_namespace=MDL_key::BACKUP,
    db_arg=0xba2ae4 "", name_arg=0xba2ae4 "",
    mdl_type_arg=MDL_INTENTION_EXCLUSIVE, mdl_duration_arg=MDL_STATEMENT)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:1245
#1  0x00000000007d1dfe in Global_backup_lock::acquire_protection (
    this=0x7ff1fbbb6a08, thd=0x7ff1fbbb5000, duration=MDL_STATEMENT,
    lock_wait_timeout=31536000)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/lock.cc:1225
#2  0x000000000068ce20 in open_table (thd=Unhandled dwarf expression opcode 0xf3
)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/sql_base.cc:3258
#3  0x0000000000694585 in open_and_process_table (thd=0x7ff1fbbb5000,
    start=0x7ff224c59e48, counter=0x7ff224c59e50, flags=2048,
    prelocking_strategy=0x7ff224c59ea0)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/sql_base.cc:4797
#4  open_tables (thd=0x7ff1fbbb5000, start=0x7ff224c59e48,
    counter=0x7ff224c59e50, flags=2048, prelocking_strategy=0x7ff224c59ea0)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/sql_base.cc:5304
#5  0x0000000000694d74 in open_and_lock_tables (thd=0x7ff1fbbb5000,
    tables=0x7ff224c59eb0, derived=Unhandled dwarf expression opcode 0xf3
)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/sql_base.cc:5960

...
---Type <return> to continue, or q <return> to quit---q
 at /uQuit
(gdb) c
Continuing.

Breakpoint 2, MDL_request::init (this=0x7ff224c5ab00,
    mdl_namespace=MDL_key::BINLOG, db_arg=0xba2ae4 "", name_arg=0xba2ae4 "",
    mdl_type_arg=MDL_INTENTION_EXCLUSIVE, mdl_duration_arg=MDL_EXPLICIT)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:1245
1245    {
(gdb) c
Continuing.

We ended up with writing to the binary log and DROP DATABASE was completed.

Let's summarize this experience. First of all, it seems in current implementation there is no need to set metadata locks at schema level to prevent dropping the database we read from with SELECT - table level metadata lock will eventually block the DROP DATABASE request. If we have many tables in the database this may happen not at the very first one, so we can drop some tables before getting blocked. This is something to study later. I'd really prefer DROP DATABASE to be atomic and do not even start if it can not complete at the moment. See Bug #79610 also.

We see table-level metadata lock requests for mysql.proc and mysql.events tables when DROP DATABASE is executed. This is expected and reasonable - we can NOT drop the database (schema) if there are some stored functions, procedures or events are defined there that are used.

We had also noted that even if some metadata lock is requested but can not be obtained, processing may continue for some time until stopped, and then processing resumes when we finally get the required metadata lock.

This is the code to study in sql/lock.cc:

 760 /**
 761   Obtain an exclusive metadata lock on a schema name.
 762
 763   @param thd         Thread handle.
 764   @param db          The database name.
 765
 766   This function cannot be called while holding LOCK_open mutex.
 767   To avoid deadlocks, we do not try to obtain exclusive metadata
 768   locks in LOCK TABLES mode, since in this mode there may be
 769   other metadata locks already taken by the current connection,
 770   and we must not wait for MDL locks while holding locks.
 771
 772   @retval FALSE  Success.
 773   @retval TRUE   Failure: we're in LOCK TABLES mode, or out of memory,
 774                  or this connection was killed.
 775 */
 776
 777 bool lock_schema_name(THD *thd, const char *db)
 778 {
 779   MDL_request_list mdl_requests;
 780   MDL_request global_request;
 781   MDL_request backup_request;
 782   MDL_request mdl_request;
 783
 784   if (thd->locked_tables_mode)
 785   {
 786     my_message(ER_LOCK_OR_ACTIVE_TRANSACTION,
 787                ER(ER_LOCK_OR_ACTIVE_TRANSACTION), MYF(0));
 788     return TRUE;
 789   }
 790
 791   if (thd->global_read_lock.can_acquire_protection())
 792     return TRUE;
 793   global_request.init(MDL_key::GLOBAL, "", "", MDL_INTENTION_EXCLUSIVE,
 794                       MDL_STATEMENT);
 795
 796   if (thd->backup_tables_lock.abort_if_acquired())
 797     return true;
 798   thd->backup_tables_lock.init_protection_request(&backup_request,
 799                                                   MDL_STATEMENT);
 800
 801   mdl_request.init(MDL_key::SCHEMA, db, "", MDL_EXCLUSIVE, MDL_TRANSACTION)     ;
 802
 803   mdl_requests.push_front(&mdl_request);
 804   mdl_requests.push_front(&backup_request);
 805   mdl_requests.push_front(&global_request);
 806
 807   if (thd->mdl_context.acquire_locks(&mdl_requests,
 808                                      thd->variables.lock_wait_timeout))
 809     return TRUE;
 810
 811   DEBUG_SYNC(thd, "after_wait_locked_schema_name");
 812   return FALSE;


From this code we see that actually we acquire metadata locks in batches(!). This explaines why we had not stopped at our first "blocked" request immediately, and complicates tracing in gdb. We also see that we have some mdl_context object in THD structure, and we may use this later to study pending requests per thread. One day I'll start setting breakpoints at MDLcontext::acquire_locks as well.

It's also more or less clear what MDL_key::GLOBAL namespace is used for - this is a metadata lock requested by FLUSH TABLES WITH READ LOCK global lock, and thus we check it for any "write" operation, even before we try to do it. If I execute FLUSH TABLES WITH READ LOCK, I get the following in my gdb session with a breakpoint set:

[Switching to Thread 0x7ff224c9f700 (LWP 2017)]

Breakpoint 2, MDL_request::init (this=0x7ff224c9c9f0,
    mdl_namespace=MDL_key::GLOBAL, db_arg=0xba2ae4 "", name_arg=0xba2ae4 "",
    mdl_type_arg=MDL_SHARED, mdl_duration_arg=MDL_EXPLICIT)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:1245
1245    {
(gdb) c
Continuing.

Breakpoint 2, MDL_request::init (this=0x7ff224c9c9f0,
    mdl_namespace=MDL_key::COMMIT, db_arg=0xba2ae4 "", name_arg=0xba2ae4 "",
    mdl_type_arg=MDL_SHARED, mdl_duration_arg=MDL_EXPLICIT)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:1245
1245    {
(gdb) c
Continuing.


and then it's completed. Now, on UNLOCK TABLES nothing happens at MDL level.

The last topic I'd like to discuss in this post are metadata locks with namespace defined like this: mdl_namespace=MDL_key::BACKUP, "backup locks". Manual for 5.7 does not mention them, but they still exist in the code even there:

[openxs@centos ~]$ grep -rn MDL_key::BACKUP ~/git/mysql-server/*
[openxs@centos ~]$ echo $?
1

In MySQL we see the following in mdl.h:

  enum enum_mdl_namespace { GLOBAL=0,
                            TABLESPACE,
                            SCHEMA,
                            TABLE,
                            FUNCTION,
                            PROCEDURE,
                            TRIGGER,
                            EVENT,
                            COMMIT,
                            USER_LEVEL_LOCK,
                            LOCKING_SERVICE,
                            /* This should be the last ! */
                            NAMESPACE_END };


So, backup locks is the unique feature of Percona Server, explained in the manual. It is implemented with MDL locks. As soon as we run LOCK TABLES FOR BACKUP, we see in gdb session:

Breakpoint 2, MDL_request::init (this=0x7ff224c9ca90,
    mdl_namespace=MDL_key::BACKUP, db_arg=0xba2ae4 "", name_arg=0xba2ae4 "",
    mdl_type_arg=MDL_SHARED, mdl_duration_arg=MDL_EXPLICIT)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:1245
1245    {


This reminded me about the difference Percona Server can make, so I'll try to re-check my results with MySQL 5.6 next time as well. Stay tuned!

by Valeriy Kravchuk (noreply@blogger.com) at January 06, 2016 12:18 PM

Open Query

Web Security: SHA1 SSL Deprecated

You may not be aware that the mechanism used to fingerprint the SSL certificates that  keep your access to websites encrypted and secure is changing. The old method, known as SHA1 is being deprecated – meaning it will no longer be supported. As per January 2016 various vendors will no longer support creating certificates with SHA1, and browsers show warnings when they encounter an old SHA1 certificate. Per January 2017 browsers will reject old certificates.

The new signing method, known as SHA2, has been available for some time. Users have had a choice of signing methods up until now, but there are still many sites using old certificates out there. You may want to check the security on any SSL websites you own or run!

To ensure your users’ security and privacy, force https across your entire website, not just e-commerce or other sections. You may have noticed this move on major websites over the last few years.

For more information on the change from SHA1 to SHA2 you can read:

To test if your website is using a SHA1 or SHA2 certificate you can use one of the following tools:

Open Query also offers a Security Review package, in which we check on a broad range of issues in your system’s front-end and back-end and provide you with an assessment and recommendations. This is most useful if you are looking at a form of security certification.

by Nikolai Lusan at January 06, 2016 02:41 AM

January 05, 2016

Valeriy Kravchuk

Exploring Metadata Locks with gdb - First Steps

Metadata locks are used in MySQL since version 5.5.3 and are available in GA MySQL versions for 6 years already. Still they are far from clearly documented (even less their implementation is documented in details - try to find anything about metadata locks in current MySQL Internals manual) and often causes "unexpected" problems for users.

Only since MySQL 5.7.3 (and only for a few months in GA releases since 5.7.9) we have an easy, official and documented way to check metadata locks set by different sessions using the metadata_locks table in  Performance Schema. I've already explained how to use it in my blog post at Percona's blog. Still, most of MySQL servers in production are NOT 5.7.x today, so majority of MySQL users have to wonder what metadata locks are set and when, desperately killing old sessions in a hope to blindly find one holding the blocking metadata lock. In some cases the blocking session is easy to identify from SHOW PROCESSLIST or SHOW ENGINE INNODB STATUS, but if we have several long running sessions that executes nothing at the moment (or something that does not refer to the table we can not TRUNCATE or ALTER, and there are no table level locks mentioned in INNODB STATUS to help us as a hint), there is not so much one can do...

I was not ready to give up easily on this and just suggest to check/kill longest active transactions/sessions one by one, check for running mysqldump --single-transaction etc every time somebody asks why TRUNCATE TABLE is blocked on MySQL 5.5.x or 5.6.x. Based on my recent preferences of using gdb for every task that does not have official "SQL way" to complete, I decided last year to spend some time checking the source code (in mdl.cc and mdl.h) and try to use gdb to find out what metadata locks are set/requested and in what order when specific SQL statements are executed, how to "see" all the metadata locks if you are NOT on MySQL 5.7.x and, most importantly for real life cases, how to find out what exact session is holding the blocking metadata lock. It seems I've collected enough logs of debugging sessions and spent enough time reading the code to start sharing my findings that may become useful for wider audience. Hence this my first post in the series I plan to devote to metadata lock studies.

In this post I'll concentrate on how to check what metadata locks are requested and in what order. For this, we should concentrate on MDL_request class (yes, metadata locks are implemented in C++, with classes, methods, overloaded operators, friend classes etc), see mdl.h:




/**
  A pending metadata lock request.

  A lock request and a granted metadata lock are represented by
  different classes because they have different allocation
  sites and hence different lifetimes. The allocation of lock requests is
  controlled from outside of the MDL subsystem, while allocation of granted
  locks (tickets) is controlled within the MDL subsystem.

  MDL_request is a C structure, you don't need to call a constructor
  or destructor for it.
*/

class MDL_request
{
public:
  /** Type of metadata lock. */
  enum          enum_mdl_type type;  /** Duration for requested lock. */
  enum enum_mdl_duration duration;
  /**
    Pointers for participating in the list of lock requests for this context.
  */
  MDL_request *next_in_list;
  MDL_request **prev_in_list;
  /**
    Pointer to the lock ticket object for this lock request.
    Valid only if this lock request is satisfied.
  */
  MDL_ticket *ticket;

  /** A lock is requested based on a fully qualified name and type. */
  MDL_key key;
public:
  static void *operator new(size_t size, MEM_ROOT *mem_root) throw ()
  { return alloc_root(mem_root, size); }
  static void operator delete(void *ptr, MEM_ROOT *mem_root) {}

  void init(MDL_key::enum_mdl_namespace namespace_arg,
            const char *db_arg, const char *name_arg,
            enum_mdl_type mdl_type_arg,
            enum_mdl_duration mdl_duration_arg);
  void init(const MDL_key *key_arg, enum_mdl_type mdl_type_arg,
            enum_mdl_duration mdl_duration_arg);
...

};

I've highlighted class members that will be important for further checks. Metadata lock types are defined in mdl.h as follows (most comments were removed, read them in the source code!):

enum enum_mdl_type {
  /*
    An intention exclusive metadata lock. Used only for scoped locks.
    Owner of this type of lock can acquire upgradable exclusive locks on
    individual objects.
    Compatible with other IX locks, but is incompatible with scoped S and
    X locks.
  */
  MDL_INTENTION_EXCLUSIVE= 0,
  MDL_SHARED,

  MDL_SHARED_HIGH_PRIO,
  MDL_SHARED_READ,
  MDL_SHARED_WRITE,
  MDL_SHARED_UPGRADABLE,
  MDL_SHARED_NO_WRITE,
  /*
    An exclusive metadata lock.
    A connection holding this lock can modify both table's metadata and data.
    No other type of metadata lock can be granted while this lock is held.
    To be used for CREATE/DROP/RENAME TABLE statements and for execution of
    certain phases of other DDL statements.
  */
  MDL_EXCLUSIVE,
  /* This should be the last !!! */
  MDL_TYPE_END};


Metadata locks can be hold for different duration (till the end of statement, till the end of transactions -most of them), defined as follows:

enum enum_mdl_duration {
  /**
    Locks with statement duration are automatically released at the end
    of statement or transaction.
  */
  MDL_STATEMENT= 0,
  /**
    Locks with transaction duration are automatically released at the end
    of transaction.
  */
  MDL_TRANSACTION,
  /**
    Locks with explicit duration survive the end of statement and transaction.
    They have to be released explicitly by calling MDL_context::release_lock().
  */
  MDL_EXPLICIT,
  /* This should be the last ! */
  MDL_DURATION_END };


With this in mind, we are almost ready to start with gdb, the idea is to set breakpoint on MDL_request::init and then see what requests are made and in what order. This is how it's defined in mdl.cc:

/**
  Initialize a lock request.

  This is to be used for every lock request.

  Note that initialization and allocation are split into two
  calls. This is to allow flexible memory management of lock
  requests. Normally a lock request is stored in statement memory
  (e.g. is a member of struct TABLE_LIST), but we would also like
  to allow allocation of lock requests in other memory roots,
  for example in the grant subsystem, to lock privilege tables.

  The MDL subsystem does not own or manage memory of lock requests.

  @param  mdl_namespace  Id of namespace of object to be locked
  @param  db             Name of database to which the object belongs
  @param  name           Name of of the object
  @param  mdl_type       The MDL lock type for the request.
*/

void MDL_request::init(MDL_key::enum_mdl_namespace mdl_namespace,
                       const char *db_arg,
                       const char *name_arg,
                       enum_mdl_type mdl_type_arg,
                       enum_mdl_duration mdl_duration_arg)
{
  key.mdl_key_init(mdl_namespace, db_arg, name_arg);
  type= mdl_type_arg;
  duration= mdl_duration_arg;
  ticket= NULL;
}


I've got a lot of "hints" from the above, up to the idea that MDL_keys can probably be created "manually" in gdb (not that it worked well). What we miss in the above is the list of metadata lock namespaces, so let's get back to mdl.h:

class MDL_key
{
public:
#ifdef HAVE_PSI_INTERFACE
  static void init_psi_keys();
#endif

  /**
    Object namespaces.
    Sic: when adding a new member to this enum make sure to
    update m_namespace_to_wait_state_name array in mdl.cc!

    Different types of objects exist in different namespaces
     - TABLE is for tables and views.
     - FUNCTION is for stored functions.
     - PROCEDURE is for stored procedures.
     - TRIGGER is for triggers.
     - EVENT is for event scheduler events
    Note that although there isn't metadata locking on triggers,
    it's necessary to have a separate namespace for them since
    MDL_key is also used outside of the MDL subsystem.
  */
  enum enum_mdl_namespace { GLOBAL=0,
                            BACKUP,
                            SCHEMA,
                            TABLE,
                            FUNCTION,
                            PROCEDURE,
                            TRIGGER,
                            EVENT,
                            COMMIT,
                            USER_LOCK,           /* user level locks. */
                            BINLOG,
                            /* This should be the last ! */
                            NAMESPACE_END };
...


As a side note, I quoted the code and worked with Percona Server 5.6.27, but this should not matter much - these parts of related code in 5.5.x and 5.6.x is more or less similar/same. In 5.7 there were many notable changes. For example, there is no BINLOG namespace, but there is one for the locking service...

So, with all these details in mind, let's try to use the gdb to find out what metadata locks are requested by a couple of simple statements, SELECT and TRUNCATE. I attached gdb to Percona Server 5.6.27 with binary log enabled (and debug symbols installed), and did the following:

[root@centos openxs]# gdb -p `pidof mysqld`
GNU gdb (GDB) Red Hat Enterprise Linux (7.2-83.el6)
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
...

Reading symbols from /usr/lib64/mysql/plugin/tokudb_backup.so...Reading symbols from /usr/lib/debug/usr/lib64/mysql/plugin/tokudb_backup.so.debug...done.
done.
Loaded symbols for /usr/lib64/mysql/plugin/tokudb_backup.so
0x00007feb2d145113 in poll () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.166.el6_7.3.x86_64 jemalloc-3.6.0-1.el6.x86_64 keyutils-libs-1.4-5.el6.x86_64 krb5-libs-1.10.3-42.el6.x86_64 libaio-0.3.107-10.el6.x86_64 libcom_err-1.41.12-22.el6.x86_64 libgcc-4.4.7-16.el6.x86_64 libselinux-2.0.94-5.8.el6.x86_64 libstdc++-4.4.7-16.el6.x86_64 nss-softokn-freebl-3.14.3-23.el6_7.x86_64 numactl-2.0.9-2.el6.x86_64 openssl-1.0.1e-42.el6_7.1.x86_64 zlib-1.2.3-29.el6.x86_64
(gdb) b MDL_request::init
Breakpoint 1 at 0x648f13: file /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc, line 1266.
Breakpoint 2 at 0x648e70: file /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc, line 1245.
warning: Multiple breakpoints were set.
Use the "delete" command to delete unwanted breakpoints.
(gdb) c
Continuing.


Now, in a separate session connected to the test database with simple (InnoDB) table t1, I did:

mysql> select * from t1;

and got the following in the gdb session:






[Switching to Thread 0x7feb2f44a700 (LWP 2319)]

Breakpoint 2, MDL_request::init (this=0x7feb064445a0,
    mdl_namespace=MDL_key::TABLE, db_arg=0x7feb06444760 "test",
    name_arg=0x7feb064441c8 "t1", mdl_type_arg=MDL_SHARED_READ,
    mdl_duration_arg=MDL_TRANSACTION
)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:1245
1245    {
(gdb) c
Continuing.





As soon as I continued, select returned me results. So, we see that SELECT had requested an MDL_SHARED_READ lock till the end of transaction for the table t1 in the database test. This already forces me to ask some followup questions to myself, but let's postpone them (and replies) to some later post. Let's try to do TRUNCATE TABLE:

mysql> truncate table t1;

I've got the following in the gdb session:

Breakpoint 2, MDL_request::init (this=0x7feb06444488,
    mdl_namespace=MDL_key::TABLE, db_arg=0x7feb06444648 "test",
    name_arg=0x7feb064440b0 "t1", mdl_type_arg=MDL_EXCLUSIVE,
    mdl_duration_arg=MDL_TRANSACTION)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:1245
1245    {
(gdb) c
Continuing.

Breakpoint 2, MDL_request::init (this=0x7feb06444660,
    mdl_namespace=MDL_key::SCHEMA, db_arg=0x7feb06444648 "test",
    name_arg=0xba2ae4 "", mdl_type_arg=MDL_INTENTION_EXCLUSIVE,
    mdl_duration_arg=MDL_TRANSACTION)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:1245
1245    {
(gdb) c
Continuing.

Breakpoint 2, MDL_request::init (this=0x7feb2f447720,
    mdl_namespace=MDL_key::GLOBAL, db_arg=0xba2ae4 "", name_arg=0xba2ae4 "",
    mdl_type_arg=MDL_INTENTION_EXCLUSIVE, mdl_duration_arg=MDL_STATEMENT)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:1245
1245    {
(gdb) c
Continuing.

Breakpoint 2, MDL_request::init (this=0x7feb2f4478d0,
    mdl_namespace=MDL_key::BACKUP, db_arg=0xba2ae4 "", name_arg=0xba2ae4 "",
    mdl_type_arg=MDL_INTENTION_EXCLUSIVE, mdl_duration_arg=MDL_STATEMENT)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:1245
1245    {
(gdb) c
Continuing.

Breakpoint 2, MDL_request::init (this=0x7feb06444808,
    mdl_namespace=MDL_key::SCHEMA, db_arg=0x7feb06444648 "test",
    name_arg=0xba2ae4 "", mdl_type_arg=MDL_INTENTION_EXCLUSIVE,
    mdl_duration_arg=MDL_TRANSACTION)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:1245
1245    {
(gdb) c
Continuing.

Breakpoint 2, MDL_request::init (this=0x7feb2f4472c0,
    mdl_namespace=MDL_key::GLOBAL, db_arg=0xba2ae4 "", name_arg=0xba2ae4 "",
    mdl_type_arg=MDL_INTENTION_EXCLUSIVE, mdl_duration_arg=MDL_STATEMENT)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:1245
1245    {
(gdb) c
Continuing.

Breakpoint 2, MDL_request::init (this=0x7feb2f447470,
    mdl_namespace=MDL_key::BACKUP, db_arg=0xba2ae4 "", name_arg=0xba2ae4 "",
    mdl_type_arg=MDL_INTENTION_EXCLUSIVE, mdl_duration_arg=MDL_STATEMENT)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:1245
1245    {
(gdb) c
Continuing.

Breakpoint 2, MDL_request::init (this=0x7feb2f447410,
    mdl_namespace=MDL_key::GLOBAL, db_arg=0xba2ae4 "", name_arg=0xba2ae4 "",
    mdl_type_arg=MDL_INTENTION_EXCLUSIVE, mdl_duration_arg=MDL_STATEMENT)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:1245
1245    {

(gdb) c
Continuing.

Breakpoint 2, MDL_request::init (this=0x7feb2f447a60,
    mdl_namespace=MDL_key::COMMIT, db_arg=0xba2ae4 "", name_arg=0xba2ae4 "",
    mdl_type_arg=MDL_INTENTION_EXCLUSIVE, mdl_duration_arg=MDL_EXPLICIT)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:1245
1245    {
(gdb) c
Continuing.

Breakpoint 2, MDL_request::init (this=0x7feb2f446900,
    mdl_namespace=MDL_key::BINLOG, db_arg=0xba2ae4 "", name_arg=0xba2ae4 "",
    mdl_type_arg=MDL_INTENTION_EXCLUSIVE, mdl_duration_arg=MDL_EXPLICIT)
    at /usr/src/debug/percona-server-5.6.27-76.0/sql/mdl.cc:1245
1245    {
(gdb) c
Continuing.


and only after all these 10 (if I am not mistaking) MDL requests I've got TRUNCATE TABLE executed. Some of these requests are clear, like the very first MDL_EXCLUSIVE for the table we were truncating, or the next MDL_INTENTION_EXCLUSIVE one, for the test schema. Surely we need exclusive access to the table until the end of transaction, and if you read comments in mdl.h carefully it's clear that any active SELECT from the table will block TRUNCATE (this was a big surprise to many old MySQL users now upgrading to version 5.5+). Some other lock requests at the end (the one for COMMIT and BINLOG namespaces) also look reasonable - we do have to commit the TRUNCATE and write it to the binary log. Other requests may be far from clear.

We'll discuss them all eventually, but for now my goal was to show that the method of metadata locks study works (in a same way as it worked for studying InnoDB lock requests), we probably picked up a useful function to set a breakpoint on, to begin with, and that the method shows a very detailed information on what happens with metadata locks even if compared with what MySQL 5.7 officially provides via metadata_locks table in Performance Schema (where many requests can be missed with simple query, for example because they do not "live" for a long time and may be "gone" at the moment when we query the table).

More posts will appear in this series soon. Stay tuned!

by Valeriy Kravchuk (noreply@blogger.com) at January 05, 2016 07:42 PM

Peter Zaitsev

grouping_operation, duplicates_removal: EXPLAIN FORMAT=JSON has all details about GROUP BY

EXPLAIN FORMAT=JSONIn the previous EXPLAIN FORMAT=JSON is Cool! series blog post, we discussed the  

group_by_subqueries
  member (which is child of
grouping_operation
). Let’s now focus on the 
grouping_operation
  and other details of 
GROUP BY
  processing.

grouping_operation
 simply shows the details of what happens when the 
GROUP BY
 clause is run:

mysql> explain format=json select dept_no from dept_emp group by dept_noG
*************************** 1. row ***************************
EXPLAIN: {
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "14.40"
    },
    "grouping_operation": {
      "using_filesort": false,
      "table": {
        "table_name": "dept_emp",
        "access_type": "range",
        "possible_keys": [
          "PRIMARY",
          "emp_no",
          "dept_no"
        ],
        "key": "dept_no",
        "used_key_parts": [
          "dept_no"
        ],
        "key_length": "4",
        "rows_examined_per_scan": 9,
        "rows_produced_per_join": 9,
        "filtered": "100.00",
        "using_index_for_group_by": true,
        "cost_info": {
          "read_cost": "12.60",
          "eval_cost": "1.80",
          "prefix_cost": "14.40",
          "data_read_per_join": "144"
        },
        "used_columns": [
          "emp_no",
          "dept_no"
        ]
      }
    }
  }
}
1 row in set, 1 warning (0.00 sec)
Note (Code 1003): /* select#1 */ select `employees`.`dept_emp`.`dept_no` AS `dept_no` from `employees`.`dept_emp` group by `employees`.`dept_emp`.`dept_no`

In the listing above, you can see which table was accessed by the 

GROUP BY
 operation, the access type, and if an index for
GROUP BY
 was used.

In case of a simple

JOIN
  of two tables, 
grouping_operation
 is usually a parent for the 
nested_loop
  object (which provides details on how the 
JOIN
  proceeded):

mysql> explain format=json select de.dept_no, count(dm.emp_no) from dept_emp de join dept_manager dm using(emp_no) group by de.dept_noG
*************************** 1. row ***************************
EXPLAIN: {
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "61.50"
    },
    "grouping_operation": {
      "using_temporary_table": true,
      "using_filesort": true,
      "cost_info": {
        "sort_cost": "26.41"
      },
      "nested_loop": [
        {
          "table": {
            "table_name": "dm",
            "access_type": "index",
            "possible_keys": [
              "PRIMARY",
              "emp_no"
            ],
            "key": "emp_no",
            "used_key_parts": [
              "emp_no"
            ],
            "key_length": "4",
            "rows_examined_per_scan": 24,
            "rows_produced_per_join": 24,
            "filtered": "100.00",
            "using_index": true,
            "cost_info": {
              "read_cost": "1.00",
              "eval_cost": "4.80",
              "prefix_cost": "5.80",
              "data_read_per_join": "384"
            },
            "used_columns": [
              "dept_no",
              "emp_no"
            ]
          }
        },
        {
          "table": {
            "table_name": "de",
            "access_type": "ref",
            "possible_keys": [
              "PRIMARY",
              "emp_no",
              "dept_no"
            ],
            "key": "emp_no",
            "used_key_parts": [
              "emp_no"
            ],
            "key_length": "4",
            "ref": [
              "employees.dm.emp_no"
            ],
            "rows_examined_per_scan": 1,
            "rows_produced_per_join": 26,
            "filtered": "100.00",
            "using_index": true,
            "cost_info": {
              "read_cost": "24.00",
              "eval_cost": "5.28",
              "prefix_cost": "35.09",
              "data_read_per_join": "422"
            },
            "used_columns": [
              "emp_no",
              "dept_no"
            ]
          }
        }
      ]
    }
  }
}
1 row in set, 1 warning (0.00 sec)
Note (Code 1003): /* select#1 */ select `employees`.`de`.`dept_no` AS `dept_no`,count(`employees`.`dm`.`emp_no`) AS `count(dm.emp_no)` from `employees`.`dept_emp` `de` join `employees`.`dept_manager` `dm` where (`employees`.`de`.`emp_no` = `employees`.`dm`.`emp_no`) group by `employees`.`de`.`dept_no`

Surprisingly, while many

DISTINCT
 queries can be converted into equivalent queries with the 
GROUP BY
 clause, there is separate member (
duplicates_removal
) for processing it. Let’s see how it works with a simple query that performs the same job as the first one in this blog post:

mysql> explain format=json select distinct dept_no from dept_empG
*************************** 1. row ***************************
EXPLAIN: {
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "14.40"
    },
    "duplicates_removal": {
      "using_filesort": false,
      "table": {
        "table_name": "dept_emp",
        "access_type": "range",
        "possible_keys": [
          "PRIMARY",
          "emp_no",
          "dept_no"
        ],
        "key": "dept_no",
        "used_key_parts": [
          "dept_no"
        ],
        "key_length": "4",
        "rows_examined_per_scan": 9,
        "rows_produced_per_join": 9,
        "filtered": "100.00",
        "using_index_for_group_by": true,
        "cost_info": {
          "read_cost": "12.60",
          "eval_cost": "1.80",
          "prefix_cost": "14.40",
          "data_read_per_join": "144"
        },
        "used_columns": [
          "emp_no",
          "dept_no"
        ]
      }
    }
  }
}
1 row in set, 1 warning (0.00 sec)
Note (Code 1003): /* select#1 */ select distinct `employees`.`dept_emp`.`dept_no` AS `dept_no` from `employees`.`dept_emp`

You can see that the plan is almost same, but parent element for the plan is

duplicates_removal
.

The reason there are differences between these members can be seen if we change the second, more complicated query to use

DISTINCT
 in place of
GROUP BY
:

mysql> explain format=json select distinct de.dept_no, count(dm.emp_no) from dept_emp de join dept_manager dm using(emp_no)G
ERROR 1140 (42000): In aggregated query without GROUP BY, expression #1 of SELECT list contains nonaggregated column 'employees.de.dept_no'; this is incompatible with sql_mode=only_full_group_by

This example shows that 

DISTINCT
 is not exactly same as
GROUP BY
, and can be used together  if we want to count the number of managers in each department (grouped by the year when the manager started working in the department). In this case, however, we are interested only in unique pairs of such dates and don’t want to see duplicates. Duplicates will appear if one person managed same department more than two years.

mysql> explain format=json select distinct de.dept_no, count(dm.emp_no) from dept_emp de join dept_manager dm using(emp_no) group by de.dept_no, year(de.from_date)G
*************************** 1. row ***************************
EXPLAIN: {
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "61.63"
    },
    "duplicates_removal": {
      "using_temporary_table": true,
      "using_filesort": false,
      "grouping_operation": {
        "using_temporary_table": true,
        "using_filesort": true,
        "cost_info": {
          "sort_cost": "26.53"
        },
        "nested_loop": [
          {
            "table": {
              "table_name": "dm",
              "access_type": "index",
              "possible_keys": [
                "PRIMARY",
                "emp_no"
              ],
              "key": "emp_no",
              "used_key_parts": [
                "emp_no"
              ],
              "key_length": "4",
              "rows_examined_per_scan": 24,
              "rows_produced_per_join": 24,
              "filtered": "100.00",
              "using_index": true,
              "cost_info": {
                "read_cost": "1.00",
                "eval_cost": "4.80",
                "prefix_cost": "5.80",
                "data_read_per_join": "384"
              },
              "used_columns": [
                "dept_no",
                "emp_no"
              ]
            }
          },
          {
            "table": {
              "table_name": "de",
              "access_type": "ref",
              "possible_keys": [
                "PRIMARY",
                "emp_no"
              ],
              "key": "PRIMARY",
              "used_key_parts": [
                "emp_no"
              ],
              "key_length": "4",
              "ref": [
                "employees.dm.emp_no"
              ],
              "rows_examined_per_scan": 1,
              "rows_produced_per_join": 26,
              "filtered": "100.00",
              "cost_info": {
                "read_cost": "24.00",
                "eval_cost": "5.31",
                "prefix_cost": "35.11",
                "data_read_per_join": "424"
              },
              "used_columns": [
                "emp_no",
                "dept_no",
                "from_date"
              ]
            }
          }
        ]
      }
    }
  }
}
1 row in set, 1 warning (0.00 sec)
Note (Code 1003): /* select#1 */ select distinct `employees`.`de`.`dept_no` AS `dept_no`,count(`employees`.`dm`.`emp_no`) AS `count(dm.emp_no)` from `employees`.`dept_emp` `de` join `employees`.`dept_manager` `dm` where (`employees`.`de`.`emp_no` = `employees`.`dm`.`emp_no`) group by `employees`.`de`.`dept_no`,year(`employees`.`de`.`from_date`)

In this case, the member

grouping_operation
 is a child of
duplicates_removal
 and the temporary table used to store the result of
GROUP BY
  before removing the duplicates. A temporary table was also used to perform a filesort for the grouping operation itself.

Compare this with regular

EXPLAIN
 output.
EXPLAIN
 only shows that a temporary table was used, but does not provide insights on the operations for which it was used:

mysql> explain select distinct de.dept_no, count(dm.emp_no) from dept_emp de join dept_manager dm using(emp_no) group by de.dept_no, year(de.from_date)G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: dm
   partitions: NULL
         type: index
possible_keys: PRIMARY,emp_no
          key: emp_no
      key_len: 4
          ref: NULL
         rows: 24
     filtered: 100.00
        Extra: Using index; Using temporary; Using filesort
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: de
   partitions: NULL
         type: ref
possible_keys: PRIMARY,emp_no
          key: PRIMARY
      key_len: 4
          ref: employees.dm.emp_no
         rows: 1
     filtered: 100.00
        Extra: NULL
2 rows in set, 1 warning (0.01 sec)
Note (Code 1003): /* select#1 */ select distinct `employees`.`de`.`dept_no` AS `dept_no`,count(`employees`.`dm`.`emp_no`) AS `count(dm.emp_no)` from `employees`.`dept_emp` `de` join `employees`.`dept_manager` `dm` where (`employees`.`de`.`emp_no` = `employees`.`dm`.`emp_no`) group by `employees`.`de`.`dept_no`,year(`employees`.`de`.`from_date`)

Conclusion: 

EXPLAIN FORMAT=JSON
 contains all the details about the 
GROUP BY
 and
DISTINCT
  optimizations.

by Sveta Smirnova at January 05, 2016 01:45 AM

January 04, 2016

Peter Zaitsev

Percona Server for MongoDB 3.0.8-1.2 is now available

psmdb-logo

Percona Server for MongoDBPercona is pleased to announce the release of Percona Server for MongoDB 3.0.8-1.2 on January 4, 2016. Download the latest version from the Percona web site or from the Percona Software Repositories.

Percona Server for MongoDB is an enhanced, open source, fully compatible, highly scalable, zero-maintenance downtime database supporting the MongoDB v3.0 protocol and drivers. Based on MongoDB 3.0.8, it extends MongoDB with MongoRocks and PerconaFT storage engines, as well as features like external authentication and audit logging. Percona Server for MongoDB requires no changes to MongoDB applications or code.

 


New Features:

  • Added support for Ubuntu 15.10 (Wily Werewolf)
  • Contains all changes and fixes from MongoDB 3.0.8

Percona Server for MongoDB 3.0.8-1.2 release notes are available in the official documentation.

by Alexey Zhebel at January 04, 2016 07:37 PM

January 03, 2016

Daniël van Eeden

The performance of TLS with MySQL Connector/Python

I've ran a simple test to see the performance impact of TLS on MySQL connections with MySQL Connector/Python

The test results are in this Jupyter notebook.

TL;DR:
  • Try to reuse connections if you use TLS
  • Establishing TLS connections is expensive (server & client)
  • Improved performance might be possible in the future by using TLS Tickets
Not tested:
  • Difference between YaSSL and OpenSSL
  • Difference between Ciphersuites
  • Performance of larger resultsets and queries

by Daniël van Eeden (noreply@blogger.com) at January 03, 2016 10:52 AM

December 31, 2015

Valeriy Kravchuk

New Year Wishes for Providers of MySQL Support Services

Three years ago I shared my wishes for customers of Oracle's MySQL Support Services. There I basically asked them to report any problem that they suspect to be caused by the bug in MySQL software at http://bugs.mysql.com.This year I want to share wishes mostly for myself (and other providers of MySQL Support services).

I have a job of MySQL Support Engineer for almost 10.5 years. I did it in MySQL AB, Sun, Oracle and Percona. I had enough opportunities to see all kinds of approaches, types, kinds and qualities of services. But I still have some dreams in this area that I'd like to see fulfilled for both myself as a provider of service and for customers of such a service:
  1. I wish to see MySQL Support mostly done in an asynchronous way, via emails and (when absolutely needed and possible) remote login sessions.

    In most cases it's enough for customer to know that she will get a detailed, best possible answer to any her initial question (or problem statement) or any followups question or request in a predictable, well defined time. There is no need for engineer and customer to always work in sync, by talking on phone, chatting or doing a shared screen sessions.

    Support should work the same way UNIX operating system does: by sharing all available resources (engineers) among all tasks (support requests) at hand, allocating resources for the task for some small amount of time and then forcing the resource to switch to other task, either when time unit allocated is ended or immediately when we have to wait for something to complete. Surely this mode is beneficial for support providers (because of ability to work for more customers concurrently than they have engineers online), but customers also get clear benefits. They can move on and work on something else until they get email back (or time to get a reply passes), and they may get a reply based on concurrent (but asynchronous) work of several engineers ("fan-out").

  2. At the same time, I wish each support provider to have a well defined SLA (time of getting a guaranteed useful technical reply, either a solution, suggestion or further question) not only for the initial reply (as we can see here and, honestly, almost everywhere), but also for the followups, for each and every customer email.

    Ideally both sides should be able to negotiate the date(time) of the next reply (even if it's different from formal official SLA), and then make sure to meet this deadline in 100% of cases. Some steps towards this goal are visible here, but so far no well know Support provider is perfect with followups in time, based on my knowledge.

  3. I wish Support engineers to never be involved in phone conferences with customers without a clearly defined agenda related to MySQL and limited time to be spent on phone (see item 1 above for the reasons).

    Sometimes somebody from "services" side should be "there", in case of questions during some long discussion. I think this is a job for customer's TAM (technical assistance manager), Sales Engineer (if the topic is related to purchasing some service or software) or anyone who is paid per hour (like Consultant).

  4. I wish Support engineers, no matter what Support provider they work for, to always report upstream MySQL bugs at http://bugs.mysql.com/ and fork-specific bugs at their public bug trackers, as openly available (public) to all MySQL users.

    Some bugs may be repeatable only with customer-specific and confidential data, and some bugs may have security implications. Ideally, Support engineers should always work on a repeatable test case or otherwise well grounded bug report NOT containing customer data. As for security problems, there is always a way to explain in public important details of the possible security attack vector and list versions affected, without giving enough details for "script kiddies" to just blindly copy-paste the test case to get unauthorized access or crash well-managed public MySQL server.

  5. I wish Support engineers to present their work and share their experience in public.

    We all should try to share knowledge we have and get while working with customers, not only internally to our colleagues in services or via internal knowledge bases, but also in our own blogs, articles, on public MySQL forums and on MySQL-related conferences.

    MySQL Support providers should encourage support engineers to make the results of their work public whenever possible. Not only bugs, but problem solving approaches, code written (if any), experience gained should be shared with MySQL community. This will give us all customers who known more about MySQL and will help us not to re-invent the wheel.
 
To summarize, I wish our customers in the New Year of 2016 to get a simple, but well-defined, responsible, and reliable 24x7 Support service provided by the engineers who are well known to the Community based on their public work on MySQL (via blog posts, bug reports and conference presentations). I wish all MySQL Support Service providers to deliver what they promise (or more) in 100% of cases. I wish myself to work for MySQL Support Provider that cares about my wishes and tries to help me to see my dreams expressed here coming true.

Happy New Year, MySQL Community!

by Valeriy Kravchuk (noreply@blogger.com) at December 31, 2015 05:42 PM

December 30, 2015

Peter Zaitsev

Database Performance Webinar: Tired of MySQL Making You Wait?

Performance

database performance

Too often developers and DBAs struggle to pinpoint the root cause of MySQL database performance issues, and then spend too much time in trying to fix them. Wouldn’t it be great to bypass wasted guesswork and get right to the issue?

In our upcoming webinar Tired of MySQL Making You Wait? we’re going to help you discover how to significantly increase the performance of your applications and reduce database response time.

In this webinar, Principal Architect Alexander Rubin and Database Evangelist Janis Griffin will provide the key steps needed to identify, prioritize, and improve query performance.

They will discuss the following topics:

  • Wait time analytics using Performance / Information schemas
  • Monitoring for performance using DPA
  • Explaining plan operations focusing on temporary tables and filesort
  • Using indexes to optimize your queries
  • Using loose and tight index scans in MySQL

WHEN:

Thursday, January 7, 2016 10:00am Pacific Standard Time (UTC – 8)

PRESENTERS:

Alexander RubinPrincipal Consultant, Percona

Janis GriffinDatabase Evangelist, SolarWinds

Register now!

Percona is the only company that delivers enterprise-class software, support, consulting and managed services solutions for both MySQL and MongoDB® across traditional and cloud-based platforms that maximize application performance while streamlining database efficiencies.

Percona’s industry-recognized performance experts can maximize your database, server and application performance, lower infrastructure costs, and provide capacity and scalability planning for future growth.

by Emily Ikuta at December 30, 2015 08:05 PM

Jean-Jerome Schmidt

s9s Tools and Resources: Momentum Highlights for MySQL, PostgreSQL, MongoDB and more!

Check Out Our Latest Technical Resources for MySQL, MariaDB, PostgreSQL and MongoDB

This is our last s9s Tools & Resources communication in 2015 and as we prepare to kick off 2016, we’d like to take this opportunity to thank you for your support in the year gone by and to wish you a successful start to the new year!

This is a summary of all the resources we recently published. Please do check it out and let us know if you have any comments or feedback.

Momentum Highlights

Severalnines breaks records on MongoDB, MySQL & PostgreSQL

  • Over 100% sales growth achieved early in first half of 2015
  • 150+ enterprise customers, 8,000+ community users - thank you for joining us!
  • New enterprise accounts wins such as the European Broadcast Union, European Gravitational Observatory, BT Expedite and French national scientific research centre, CNRS
  • Hired Gerry Treacy, former MongoDB executive, as Vice President of Sales
  • Added support for PostgreSQL to ClusterControl alongside MySQL and MongoDB

Read the full momentum release here

Technical Webinar Replay

Polyglot Persistence for the MongoDB, MySQL & PostgreSQL DBA

During our last webinar of the year, Art van Scheppingen discussed the four major operational challenges for MySQL, MongoDB & PostgreSQL and demonstrated, using ClusterControl, how Polyglot Persistence for datastores can be managed from one single control centre.

View the replay and read the slides here

Customer Case Studies

From small businesses to Fortune 500 companies, customers have chosen Severalnines to deploy and manage MySQL, MongoDB and PostgreSQL.  

View our Customer page to discover companies like yours who have found success with ClusterControl.

Partnership Announcement

Percona & Severalnines expand partnership to include MongoDB

Peter Zaitsev, Co-founder and CEO of Percona, had this to say about this new announcement with our long-term partner: “We are very pleased to expand our relationship with Severalnines to bring enhanced management, scalability, and industry-leading expertise to Percona Server for MongoDB deployments. With ClusterControl by Severalnines, organizations can now truly afford to monitor, manage and scale the highly available database infrastructures they need to stay competitive in an information-driven economy.”

Read the full announcement here

ClusterControl Blogs

Our series of blogs focussing on how to use ClusterControl continues. Do check them out!

View all ClusterControl blogs here

The MySQL DBA Blog Series

We’re on the 18th installment of our popular ‘Become a MySQL DBA’ series and you can view all of these blogs here. Here are the latest ones in the series:

View all the ‘Become a MySQL DBA’ blogs here

Additional Technical Blogs & Resources

Events

The Percona Live Data Performance Conference (for MySQL and MongoDB users and more) is coming up in just a few months and we’ve been busy with talk submissions for the conference. Two of our talks have already been selected and you can find the ful list of the talks we submitted here. We hope to see you in Santa Clara!

We trust these resources are useful. If you have any questions on them or on related topics, please do contact us!

All our best wishes for the new year,
Your Severalnines Team

Blog category:

by Severalnines at December 30, 2015 02:14 PM

December 29, 2015

Peter Zaitsev

2016 Percona Live Tutorials Schedule is UP!

PL16-Logo-Vert-Full-Opt1

percona live tutorialsWe are excited to announce that the tutorial schedule for the Percona Live Data Performance Conference 2016 is up!

The schedule shows all the details for each of our informative and enlightening Percona Live tutorial sessions, including insights into InnoDB, MySQL 5.7, MongoDB 3.2 and RocksDB. These tutorials are a must for any data performance professional!

The Percona Live Data Performance Conference is the premier open source event for the data performance ecosystem. It is the place to be for the open source community as well as businesses that thrive in the MySQL, NoSQL, cloud, big data and Internet of Things (IoT) marketplaces. Attendees include DBAs, sysadmins, developers, architects, CTOs, CEOs, and vendors from around the world.

The sneak peek schedule for Percona Live 2016 has also been posted! The Conference will feature a variety of formal tracks and sessions related to MySQL, NoSQL and Data in the Cloud. With over 150 slots to fill, there will be no shortage of great content this year.

The Percona Live Data Performance Conference will be April 18-21 at the Hyatt Regency Santa Clara & The Santa Clara Convention Center.

Just a reminder to everyone out there: our Super Saver discount rate for the Percona Live Data Performance and Expo 2016 is only available ‘til December 31 11:30pm PST! This rate gets you all the excellent and amazing opportunities that Percona Live offers, at the lowest price possible!

Become a conference sponsor! We have sponsorship opportunities available for this annual MySQL, NoSQL and Data in the Cloud event. Sponsors become a part of a dynamic and growing ecosystem and interact with more than 1,000 DBAs, sysadmins, developers, CTOs, CEOs, business managers, technology evangelists, solutions vendors, and entrepreneurs who attend the event.

Click through to the tutorial link right now, look them over, and pick which sessions you want to attend!

by Kortney Runyan at December 29, 2015 09:24 PM

EXPLAIN FORMAT=JSON: order_by_subqueries, group_by_subqueries details on subqueries in ORDER BY and GROUP BY

EXPLAIN FORMAT=JSON

EXPLAIN FORMATAnother post in the EXPLAIN FORMAT=JSON is Cool! series! In this post, we’ll discuss how the EXPLAIN FORMAT=JSON provides optimization details for 

ORDER BY
 and  
GROUP BY
 operations in conjunction with 
order_by_subqueries
 and  
group_by_subqueries

EXPLAIN FORMAT=JSON
 can print details on how a subquery in
ORDER BY
 is optimized:

mysql> explain format=json select emp_no, concat(first_name, ' ', last_name) f2 from employees order by (select emp_no limit 1)G
*************************** 1. row ***************************
EXPLAIN: {
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "60833.60"
    },
    "ordering_operation": {
      "using_filesort": true,
      "table": {
        "table_name": "employees",
        "access_type": "ALL",
        "rows_examined_per_scan": 299843,
        "rows_produced_per_join": 299843,
        "filtered": "100.00",
        "cost_info": {
          "read_cost": "865.00",
          "eval_cost": "59968.60",
          "prefix_cost": "60833.60",
          "data_read_per_join": "13M"
        },
        "used_columns": [
          "emp_no",
          "first_name",
          "last_name"
        ]
      },
      "order_by_subqueries": [
        {
          "dependent": true,
          "cacheable": false,
          "query_block": {
            "select_id": 2,
            "message": "No tables used"
          }
        }
      ]
    }
  }
}
1 row in set, 2 warnings (0.00 sec)
Note (Code 1276): Field or reference 'employees.employees.emp_no' of SELECT #2 was resolved in SELECT #1
Note (Code 1003): /* select#1 */ select `employees`.`employees`.`emp_no` AS `emp_no`,concat(`employees`.`employees`.`first_name`,' ',`employees`.`employees`.`last_name`) AS `f2` from `employees`.`employees` order by (/* select#2 */ select `employees`.`employees`.`emp_no` limit 1)

The above code shows member

ordering_operation
 of
query_block
  (which includes the 
order_by_subqueries
 array) with information on how the subquery in
ORDER BY
  was optimized.

This is a simple example. In real life you can have larger subqueries in the 

ORDER BY
  clause. For example, take this more complicated and slightly crazy query:

select emp_no, concat(first_name, ' ', last_name) f2 from employees order by (select dept_no as c from salaries join dept_emp using (emp_no) group by dept_no)

Run a regular

EXPLAIN
 on it. If we imagine this is a regular subquery, we won’t know if it can be cached or would be executed for each row sorted.

mysql> explain  select emp_no, concat(first_name, ' ', last_name) f2 from employees order by (select dept_no as c from salaries join dept_emp using (emp_no) group by dept_no)G
*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: employees
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 299843
     filtered: 100.00
        Extra: NULL
*************************** 2. row ***************************
           id: 2
  select_type: SUBQUERY
        table: dept_emp
   partitions: NULL
         type: index
possible_keys: PRIMARY,emp_no,dept_no
          key: dept_no
      key_len: 4
          ref: NULL
         rows: 331215
     filtered: 100.00
        Extra: Using index
*************************** 3. row ***************************
           id: 2
  select_type: SUBQUERY
        table: salaries
   partitions: NULL
         type: ref
possible_keys: PRIMARY,emp_no
          key: emp_no
      key_len: 4
          ref: employees.dept_emp.emp_no
         rows: 10
     filtered: 100.00
        Extra: Using index
3 rows in set, 1 warning (0.00 sec)
Note (Code 1003): /* select#1 */ select `employees`.`employees`.`emp_no` AS `emp_no`,concat(`employees`.`employees`.`first_name`,' ',`employees`.`employees`.`last_name`) AS `f2` from `employees`.`employees` order by (/* select#2 */ select `employees`.`dept_emp`.`dept_no` AS `c` from `employees`.`salaries` join `employees`.`dept_emp` where (`employees`.`salaries`.`emp_no` = `employees`.`dept_emp`.`emp_no`) group by `employees`.`dept_emp`.`dept_no`)

EXPLAIN FORMAT=JSON
  provides a completely different picture:

mysql> explain format=json select emp_no, concat(first_name, ' ', last_name) f2 from employees order by (select dept_no as c from salaries join dept_emp using (emp_no) group by dept_no)G
*************************** 1. row ***************************
EXPLAIN: {
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "60833.60"
    },
    "ordering_operation": {
      "using_filesort": false,
      "table": {
        "table_name": "employees",
        "access_type": "ALL",
        "rows_examined_per_scan": 299843,
        "rows_produced_per_join": 299843,
        "filtered": "100.00",
        "cost_info": {
          "read_cost": "865.00",
          "eval_cost": "59968.60",
          "prefix_cost": "60833.60",
          "data_read_per_join": "13M"
        },
        "used_columns": [
          "emp_no",
          "first_name",
          "last_name"
        ]
      },
      "optimized_away_subqueries": [
        {
          "dependent": false,
          "cacheable": true,
          "query_block": {
            "select_id": 2,
            "cost_info": {
              "query_cost": "1082124.21"
            },
            "grouping_operation": {
              "using_filesort": false,
              "nested_loop": [
                {
                  "table": {
                    "table_name": "dept_emp",
                    "access_type": "index",
                    "possible_keys": [
                      "PRIMARY",
                      "emp_no",
                      "dept_no"
                    ],
                    "key": "dept_no",
                    "used_key_parts": [
                      "dept_no"
                    ],
                    "key_length": "4",
                    "rows_examined_per_scan": 331215,
                    "rows_produced_per_join": 331215,
                    "filtered": "100.00",
                    "using_index": true,
                    "cost_info": {
                      "read_cost": "673.00",
                      "eval_cost": "66243.00",
                      "prefix_cost": "66916.00",
                      "data_read_per_join": "5M"
                    },
                    "used_columns": [
                      "emp_no",
                      "dept_no"
                    ]
                  }
                },
                {
                  "table": {
                    "table_name": "salaries",
                    "access_type": "ref",
                    "possible_keys": [
                      "PRIMARY",
                      "emp_no"
                    ],
                    "key": "emp_no",
                    "used_key_parts": [
                      "emp_no"
                    ],
                    "key_length": "4",
                    "ref": [
                      "employees.dept_emp.emp_no"
                    ],
                    "rows_examined_per_scan": 10,
                    "rows_produced_per_join": 3399374,
                    "filtered": "100.00",
                    "using_index": true,
                    "cost_info": {
                      "read_cost": "335333.33",
                      "eval_cost": "679874.87",
                      "prefix_cost": "1082124.21",
                      "data_read_per_join": "51M"
                    },
                    "used_columns": [
                      "emp_no",
                      "from_date"
                    ]
                  }
                }
              ]
            }
          }
        }
      ]
    }
  }
}
1 row in set, 1 warning (0.00 sec)
Note (Code 1003): /* select#1 */ select `employees`.`employees`.`emp_no` AS `emp_no`,concat(`employees`.`employees`.`first_name`,' ',`employees`.`employees`.`last_name`) AS `f2` from `employees`.`employees` order by (/* select#2 */ select `employees`.`dept_emp`.`dept_no` AS `c` from `employees`.`salaries` join `employees`.`dept_emp` where (`employees`.`salaries`.`emp_no` = `employees`.`dept_emp`.`emp_no`) group by `employees`.`dept_emp`.`dept_no`)

We see that the subquery was optimized away: member

optimized_away_subqueries
 exists, but there is no
order_by_subqueries
 in the
ordering_operation
 object. We can also see that the subquery was cached:
"cacheable": true
.

EXPLAIN FORMAT=JSON
 also provides information about subqueries in the 
GROUP BY
 clause. It uses the 
group_by_subqueries
 array in the 
grouping_operation
  member for this purpose.

mysql> explain format=json select count(emp_no) from salaries group by salary > ALL (select s/c as avg_salary from (select dept_no, sum(salary) as s, count(emp_no) as c from salaries join dept_emp using (emp_no) group by dept_no) t)G
*************************** 1. row ***************************
EXPLAIN: {
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "3412037.60"
    },
    "grouping_operation": {
      "using_temporary_table": true,
      "using_filesort": true,
      "cost_info": {
        "sort_cost": "2838638.00"
      },
      "table": {
        "table_name": "salaries",
        "access_type": "ALL",
        "rows_examined_per_scan": 2838638,
        "rows_produced_per_join": 2838638,
        "filtered": "100.00",
        "cost_info": {
          "read_cost": "5672.00",
          "eval_cost": "567727.60",
          "prefix_cost": "573399.60",
          "data_read_per_join": "43M"
        },
        "used_columns": [
          "emp_no",
          "salary",
          "from_date"
        ]
      },
      "group_by_subqueries": [
        {
          "dependent": true,
          "cacheable": false,
          "query_block": {
            "select_id": 2,
            "cost_info": {
              "query_cost": "881731.00"
            },
            "table": {
              "table_name": "t",
              "access_type": "ALL",
              "rows_examined_per_scan": 3526884,
              "rows_produced_per_join": 3526884,
              "filtered": "100.00",
              "cost_info": {
                "read_cost": "176354.20",
                "eval_cost": "705376.80",
                "prefix_cost": "881731.00",
                "data_read_per_join": "134M"
              },
              "used_columns": [
                "dept_no",
                "s",
                "c"
              ],
              "attached_condition": "((<cache>(`employees`.`salaries`.`salary`) <= (`t`.`s` / `t`.`c`)) or isnull((`t`.`s` / `t`.`c`)))",
              "materialized_from_subquery": {
                "using_temporary_table": true,
                "dependent": false,
                "cacheable": true,
                "query_block": {
                  "select_id": 3,
                  "cost_info": {
                    "query_cost": "1106758.94"
                  },
                  "grouping_operation": {
                    "using_filesort": false,
                    "nested_loop": [
                      {
                        "table": {
                          "table_name": "dept_emp",
                          "access_type": "index",
                          "possible_keys": [
                            "PRIMARY",
                            "emp_no",
                            "dept_no"
                          ],
                          "key": "dept_no",
                          "used_key_parts": [
                            "dept_no"
                          ],
                          "key_length": "4",
                          "rows_examined_per_scan": 331215,
                          "rows_produced_per_join": 331215,
                          "filtered": "100.00",
                          "using_index": true,
                          "cost_info": {
                            "read_cost": "673.00",
                            "eval_cost": "66243.00",
                            "prefix_cost": "66916.00",
                            "data_read_per_join": "5M"
                          },
                          "used_columns": [
                            "emp_no",
                            "dept_no"
                          ]
                        }
                      },
                      {
                        "table": {
                          "table_name": "salaries",
                          "access_type": "ref",
                          "possible_keys": [
                            "PRIMARY",
                            "emp_no"
                          ],
                          "key": "PRIMARY",
                          "used_key_parts": [
                            "emp_no"
                          ],
                          "key_length": "4",
                          "ref": [
                            "employees.dept_emp.emp_no"
                          ],
                          "rows_examined_per_scan": 10,
                          "rows_produced_per_join": 3526884,
                          "filtered": "100.00",
                          "cost_info": {
                            "read_cost": "334466.14",
                            "eval_cost": "705376.80",
                            "prefix_cost": "1106758.95",
                            "data_read_per_join": "53M"
                          },
                          "used_columns": [
                            "emp_no",
                            "salary",
                            "from_date"
                          ]
                        }
                      }
                    ]
                  }
                }
              }
            }
          }
        }
      ]
    }
  }
}
1 row in set, 1 warning (0.01 sec)
Note (Code 1003): /* select#1 */ select count(`employees`.`salaries`.`emp_no`) AS `count(emp_no)` from `employees`.`salaries` group by <not>(<in_optimizer>(`employees`.`salaries`.`salary`,<exists>(/* select#2 */ select 1 from (/* select#3 */ select `employees`.`dept_emp`.`dept_no` AS `dept_no`,sum(`employees`.`salaries`.`salary`) AS `s`,count(`employees`.`salaries`.`emp_no`) AS `c` from `employees`.`salaries` join `employees`.`dept_emp` where (`employees`.`salaries`.`emp_no` = `employees`.`dept_emp`.`emp_no`) group by `employees`.`dept_emp`.`dept_no`) `t` where ((<cache>(`employees`.`salaries`.`salary`) <= (`t`.`s` / `t`.`c`)) or isnull((`t`.`s` / `t`.`c`))) having <is_not_null_test>((`t`.`s` / `t`.`c`)))))

Again, this output gives a clear view of query optimization: subquery in

GROUP BY
 itself cannot be optimized, cached or converted into temporary table, but the subquery inside the subquery (
select dept_no, sum(salary) as s, count(emp_no) as c from salaries join dept_emp using (emp_no) group by dept_no
) could be materialized into a temporary table and cached.

A regular

EXPLAIN
 command does not provide such details:

mysql> explain select count(emp_no) from salaries group by salary > ALL (select s/c as avg_salary from (select dept_no, sum(salary) as s, count(emp_no) as c from salaries join dept_emp using (emp_no) group by dept_no) t)G
*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: salaries
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 2838638
     filtered: 100.00
        Extra: Using temporary; Using filesort
*************************** 2. row ***************************
           id: 2
  select_type: DEPENDENT SUBQUERY
        table: <derived3>
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 3526884
     filtered: 100.00
        Extra: Using where
*************************** 3. row ***************************
           id: 3
  select_type: DERIVED
        table: dept_emp
   partitions: NULL
         type: index
possible_keys: PRIMARY,emp_no,dept_no
          key: dept_no
      key_len: 4
          ref: NULL
         rows: 331215
     filtered: 100.00
        Extra: Using index
*************************** 4. row ***************************
           id: 3
  select_type: DERIVED
        table: salaries
   partitions: NULL
         type: ref
possible_keys: PRIMARY,emp_no
          key: PRIMARY
      key_len: 4
          ref: employees.dept_emp.emp_no
         rows: 10
     filtered: 100.00
        Extra: NULL
4 rows in set, 1 warning (0.01 sec)
Note (Code 1003): /* select#1 */ select count(`employees`.`salaries`.`emp_no`) AS `count(emp_no)` from `employees`.`salaries` group by <not>(<in_optimizer>(`employees`.`salaries`.`salary`,<exists>(/* select#2 */ select 1 from (/* select#3 */ select `employees`.`dept_emp`.`dept_no` AS `dept_no`,sum(`employees`.`salaries`.`salary`) AS `s`,count(`employees`.`salaries`.`emp_no`) AS `c` from `employees`.`salaries` join `employees`.`dept_emp` where (`employees`.`salaries`.`emp_no` = `employees`.`dept_emp`.`emp_no`) group by `employees`.`dept_emp`.`dept_no`) `t` where ((<cache>(`employees`.`salaries`.`salary`) <= (`t`.`s` / `t`.`c`)) or isnull((`t`.`s` / `t`.`c`))) having <is_not_null_test>((`t`.`s` / `t`.`c`)))))

Most importantly, we cannot guess from the output if the 

DERIVED
 subquery can be cached.

Conlcusion:

EXPLAIN FORMAT=JSON
  provides details on how subqueries in
ORDER BY
 and
GROUP BY
 clauses are optimized.

by Sveta Smirnova at December 29, 2015 08:39 PM

December 28, 2015

Federico Razzoli

SQL Games

I have just created a GitHub repository called sql_games. It contains games implemented as stored procedures, that can run on MariaDB or (with some changes) on Percona Server or Oracle MySQL.

You play the games via the command-line client. You call a procedure to make your move, then a text or an ASCII image appears.

Of course the same call should produce a different effect, depending on the game’s current state. To remember the state I use both user variables and temporary tables.

Why did I do that? Mainly because it was funny. I can’t explain why. And I can’t tell that it is funny for anyone: perhaps, where I’ve found interesting challenges, someone else would find a cause a frustration. But yes, it has been funny for me.

Also, I did it because I could. This means that others can do it. Stored procedures are not a useless and nasty feature that users should ignore – they are useful tools. Yes, several times I’ve complained that they need important improvements. But I complain because I like them.

Currently, three games are implemented:

  • Anagram – The anagram game. See a randomly generated anagram and try to guess the word. You can choose a language, and a min/max word length.
  • Bulls And Cows – If you don’t know the game, take a look at the Wikipedia page.
  • Hangman –  Try to guess letters and see an ASCII hangman materializing slowly while you fail.

Each game will be installed in a different database. CALL help() in the proper database to see how to play.


by Federico at December 28, 2015 07:50 PM

Peter Zaitsev

EXPLAIN FORMAT=JSON provides insights on optimizer_switch effectiveness

optimizer_switch

EXPLAIN FORMAT=JSONThe previous post in the EXPLAIN FORMAT=JSON is Cool! series showed an example of the query

select dept_name from departments where dept_no in (select dept_no from dept_manager where to_date is not null)
, where the subquery was materialized into a temporary table and then joined with the outer query. This is known as a semi-join optimization. But what happens if we turn off this optimization?

EXPLAIN FORMAT=JSON
 can help us with this investigation too.

First lets look at the original output again:

mysql> explain format=json select dept_name from departments where dept_no in (select dept_no from dept_manager where to_date is not null)G
*************************** 1. row ***************************
EXPLAIN: {
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "16.72"
    },
    "nested_loop": [
      {
        "table": {
          "table_name": "departments",
          <skipped>
      },
      {
        "table": {
          "table_name": "<subquery2>",
          "access_type": "eq_ref",
          "key": "<auto_key>",
          "key_length": "4",
          "ref": [
            "employees.departments.dept_no"
          ],
          "rows_examined_per_scan": 1,
          "materialized_from_subquery": {
            "using_temporary_table": true,
            "query_block": {
              "table": {
                "table_name": "dept_manager",
                "access_type": "ALL",
                "possible_keys": [
                  "dept_no"
                ],
                "rows_examined_per_scan": 24,
                "rows_produced_per_join": 21,
                "filtered": "90.00",
                "cost_info": {
                  "read_cost": "1.48",
                  "eval_cost": "4.32",
                  "prefix_cost": "5.80",
                  "data_read_per_join": "345"
                },
                "used_columns": [
                  "dept_no",
                  "to_date"
                ],
                "attached_condition": "(`employees`.`dept_manager`.`to_date` is not null)"
              }
            }
          }
        }
      }
    ]
  }
}
1 row in set, 1 warning (0.00 sec)
Note (Code 1003): /* select#1 */ select `employees`.`departments`.`dept_name` AS `dept_name` from `employees`.`departments` semi join (`employees`.`dept_manager`) where ((`<subquery2>`.`dept_no` = `employees`.`departments`.`dept_no`) and (`employees`.`dept_manager`.`to_date` is not null))

To repeat what happened here: the subquery was materialized into a  temporary table, then  joined with the departments table. Semi-join optimization is ON by default (as would be most likely without intervention).

What happens if we temporarily turn semi-join optimization OFF?

mysql> set optimizer_switch="semijoin=off";
Query OK, 0 rows affected (0.00 sec)

And then execute

EXPLAIN
 one more time:

mysql> explain format=json select dept_name from departments where dept_no in (select dept_no from dept_manager where to_date is not null) G
*************************** 1. row ***************************
EXPLAIN: {
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "2.80"
    },
    "table": {
      "table_name": "departments",
      "access_type": "index",
      "key": "dept_name",
      "used_key_parts": [
        "dept_name"
      ],
      "key_length": "42",
      "rows_examined_per_scan": 9,
      "rows_produced_per_join": 9,
      "filtered": "100.00",
      "using_index": true,
      "cost_info": {
        "read_cost": "1.00",
        "eval_cost": "1.80",
        "prefix_cost": "2.80",
        "data_read_per_join": "432"
      },
      "used_columns": [
        "dept_no",
        "dept_name"
      ],
      "attached_condition": "<in_optimizer>(`employees`.`departments`.`dept_no`,`employees`.`departments`.`dept_no` in ( <materialize> (/* select#2 */ select `employees`.`dept_manager`.`dept_no` from `employees`.`dept_manager` where (`employees`.`dept_manager`.`to_date` is not null) ), <primary_index_lookup>(`employees`.`departments`.`dept_no` in <temporary table> on <auto_key> where ((`employees`.`departments`.`dept_no` = `materialized-subquery`.`dept_no`)))))",
      "attached_subqueries": [
        {
          "table": {
            "table_name": "<materialized_subquery>",
            "access_type": "eq_ref",
            "key": "<auto_key>",
            "key_length": "4",
            "rows_examined_per_scan": 1,
            "materialized_from_subquery": {
              "using_temporary_table": true,
              "dependent": true,
              "cacheable": false,
              "query_block": {
                "select_id": 2,
                "cost_info": {
                  "query_cost": "5.80"
                },
                "table": {
                  "table_name": "dept_manager",
                  "access_type": "ALL",
                  "possible_keys": [
                    "dept_no"
                  ],
                  "rows_examined_per_scan": 24,
                  "rows_produced_per_join": 21,
                  "filtered": "90.00",
                  "cost_info": {
                    "read_cost": "1.48",
                    "eval_cost": "4.32",
                    "prefix_cost": "5.80",
                    "data_read_per_join": "345"
                  },
                  "used_columns": [
                    "dept_no",
                    "to_date"
                  ],
                  "attached_condition": "(`employees`.`dept_manager`.`to_date` is not null)"
                }
              }
            }
          }
        }
      ]
    }
  }
}
1 row in set, 1 warning (0.00 sec)
Note (Code 1003): /* select#1 */ select `employees`.`departments`.`dept_name` AS `dept_name` from `employees`.`departments` where <in_optimizer>(`employees`.`departments`.`dept_no`,`employees`.`departments`.`dept_no` in ( <materialize> (/* select#2 */ select `employees`.`dept_manager`.`dept_no` from `employees`.`dept_manager` where (`employees`.`dept_manager`.`to_date` is not null) ), <primary_index_lookup>(`employees`.`departments`.`dept_no` in <temporary table> on <auto_key> where ((`employees`.`departments`.`dept_no` = `materialized-subquery`.`dept_no`)))))

Now the picture is completely different. There is no

nested_loop
 member, and instead there is an 
attached_subqueries
 array containing a single member: the temporary table materialized from the subquery
select dept_no from dept_manager where to_date is not null
 (including all the details of this materialization).

Conclusion: We can experiment with the value of

optimizer_switch
 and use
EXPLAIN FORMAT=JSON
 to examine how a particular optimization affects our queries.

by Sveta Smirnova at December 28, 2015 07:20 PM

December 26, 2015

Daniël van Eeden

The performance of MySQL Connector/Python with C Extension

The source of this post is in this gist on nbviewer.

After reading about the difference between MySQL Connector/Python and MySQLdb on this blog post I wondered how the C Extension option in Connector/Python would perform.

If you want to run the code yourself you'll need: Jupyter/IPython, Python 3, Requests, MySQLdb, Connector/Python, Matplotlib, Pandas and MySQL.

In [1]:
%matplotlib notebook
In [2]:
import random
import gzip
import time

import pandas as pd
import matplotlib.pyplot as plt
import requests
import mysql.connector
import MySQLdb
for imp in [mysql.connector, MySQLdb]:
print('Using {imp} {version}'.format(imp=imp.__name__, version=imp.__version__))
print('C Extension for MySQL Connector/Python available: %s' % mysql.connector.HAVE_CEXT)
Using mysql.connector 2.1.3
Using MySQLdb 1.3.7
C Extension for MySQL Connector/Python available: True

Make sure the C Extension is available. This needs MySQL Connector/Python 2.1 or newer. On Fedora you might need to install this with dnf install mysql-connector-python3-cext if you have the mysql-connectors-community repository installed. If you compile from source then make sure to use the --with-mysql-capi option.

In [3]:
worlddb_url = 'https://downloads.mysql.com/docs/world.sql.gz'
worlddb_req = requests.get(worlddb_url)
if worlddb_req.status_code == 200:
worldsql = gzip.decompress(worlddb_req.content).decode('iso-8859-15')
In [4]:
config = {
'host': '127.0.0.1',
'port': 5710,
'user': 'msandbox',
'passwd': 'msandbox',
}

The above is my config to connect to a MySQL Sandbox running MySQL Server 5.7.10.

Note: you might hit MySQL Bug #79780 when loading the world database into MySQL with Connector/Python with the C Extension enabled.

In [5]:
c1 = mysql.connector.connect(use_pure=False, **config)
cur1 = c1.cursor()
cur1.execute('DROP SCHEMA IF EXISTS world')
cur1.execute('CREATE SCHEMA world DEFAULT CHARACTER SET latin1')
cur1.execute('USE world')
result = [x for x in cur1.execute(worldsql, multi=True)]
cur1.close()
c1.close()
In [6]:
config['db'] = 'world'
In [7]:
perfdata = pd.DataFrame(columns=['connpy','connpy_cext','MySQLdb'], index=range(10000))

Now we're going to run 10000 queries with a random primary key between 1 and 8000. This does not use the C Extension as use_pure is set to True.

In [8]:
c1 = mysql.connector.connect(use_pure=True, **config)
cur1 = c1.cursor()
for it in range(10000):
city_id = random.randint(1,8000)
start = time.perf_counter()
cur1.execute("SELECT * FROM City WHERE ID=%s", (city_id,))
cur1.fetchone()
perfdata.ix[it]['connpy'] = time.perf_counter() - start

Next up is Connector/Python with the C Extension (use_pure=False and HAVE_CEXT indicates we have the C Extension available)

In [9]:
c1 = mysql.connector.connect(use_pure=False, **config)
cur1 = c1.cursor()
for it in range(10000):
city_id = random.randint(1,8000)
start = time.perf_counter()
cur1.execute("SELECT * FROM City WHERE ID=%s", (city_id,))
cur1.fetchone()
perfdata.ix[it]['connpy_cext'] = time.perf_counter() - start

And last, but not least, MySQLdb.

In [10]:
c2 = MySQLdb.connect(**config)
cur2 = c2.cursor()
for it in range(10000):
city_id = random.randint(1,8000)
start = time.perf_counter()
cur2.execute("SELECT * FROM City WHERE ID=%s", (city_id,))
cur2.fetchone()
perfdata.ix[it]['MySQLdb'] = time.perf_counter() - start

Now let's have a look to what our data looks like

In [11]:
perfdata.head()
Out[11]:
connpy connpy_cext MySQLdb
0 0.00145918 0.000354935 0.000353173
1 0.000907707 0.000243508 0.000249597
2 0.000468397 0.000277101 0.000207893
3 0.000595066 0.000241349 0.00020754
4 0.000641848 0.000258027 0.000193182

Now let's plot that

In [12]:
plt.style.use('ggplot')
plt.scatter(perfdata.index, perfdata.connpy, s=1, c='r',
label='Connector/Python Pure')
plt.scatter(perfdata.index, perfdata.connpy_cext, s=1, c='g',
label='Connector/Python C Ext')
plt.scatter(perfdata.index, perfdata.MySQLdb, s=1, c='b',
label='MySQLdb')
plt.ylim(ymin=0, ymax=0.001)
plt.xlim(xmin=0, xmax=10000)
plt.xlabel('Run #')
plt.ylabel('Runtime in seconds')
plt.legend()
Out[12]:
<matplotlib.legend.Legend at 0x7f47bd8c2518>

The performance of MySQL Connector/Python 2.1 with the C Extension is much closer to MySQLdb.

There is one serious drawback of using the C Extension: Prepared statements are not yet supported.

But there is more that performance alone. I like the MySQL Connector/Python API over the MySQLdb API and this shows that Connector/Python is flexible and can be almost as fast.

The pure Python implementation also has advantages: Easier installation (no C compiler required) and the option to use alternative implementations like PyPy

by Daniël van Eeden (noreply@blogger.com) at December 26, 2015 03:44 PM

December 24, 2015

MariaDB Foundation

The State of SSL in MariaDB

Usually when one says “SSL” or “TLS” it means not a specific protocol but a family of protocols. Wikipedia article has the details, but in short — SSL 2.0 and SSL 3.0 are deprecated and should not be used anymore (the well-known POODLE vulnerability exploits the flaw in SSL 3.0). TLS 1.0 is sixteen years […]

The post The State of SSL in MariaDB appeared first on MariaDB.org.

by Sergei at December 24, 2015 09:42 AM

MariaDB 10.1.10 and MariaDB Galera Cluster 5.5.47 and 10.0.23 now available

The MariaDB project is pleased to announce the immediate availability of MariaDB 10.1.10, MariaDB Galera Cluster 5.5.47, and MariaDB Galera Cluster 10.0.23. See the release notes and changelogs for details on each release. Download MariaDB 10.1.10 Release Notes Changelog What is MariaDB 10.1? MariaDB APT and YUM Repository Configuration Generator Download MariaDB Galera Cluster 5.5.47 […]

The post MariaDB 10.1.10 and MariaDB Galera Cluster 5.5.47 and 10.0.23 now available appeared first on MariaDB.org.

by Daniel Bartholomew at December 24, 2015 05:39 AM

December 23, 2015

Peter Zaitsev

Percona Server for MongoDB storage engines in iiBench insert workload

storage engine

storage enginesWe recently released the GA version of Percona Server for MongoDB, which comes with a variety of storage engines: RocksDB, PerconaFT and WiredTiger.

Both RocksDB and PerconaFT are write-optimized engines, so I wanted to compare all engines in a workload oriented to data ingestions.

For a benchmark I used iiBench-mongo (https://github.com/mdcallag/iibench-mongodb), and I inserted one billion (bln) rows into a collection with three indexes. Inserts were done in ten parallel threads.

For memory limits, I used a 10GB as the cache size, with a total limit of 20GB available for the mongod process, limited with cgroups (so the extra 10GB of memory was available for engine memory allocation and OS cache).

For the storage I used a single Crucial M500 960GB SSD. This is a consumer grade SATA SSD. It does not provide the best performance, but it is a great option price/performance wise.

Every time I mention WiredTiger, someone in the comments asks about the LSM option for WiredTiger. Even though LSM is still not an official mode in MongoDB 3.2, I added WiredTiger-LSM from MongoDB 3.2 into the mix. It won’t have the optimal settings, as there is no documentation how to use LSM in WiredTiger.

First, let me show a combined graph for all engines:
engines-timeline

And now, let’s zoom in on the individual engines.

WiredTiger:

wt-3.0

RocksDB + PerconaFT:

rocks-perconaft-3.0

UPDATE on 12/30/15
With an input from RocksDB developers at Facebook, after extra tuning of RocksDB (add delayed_write_rate=12582912;soft_rate_limit=0;hard_rate_limit=0; to config) I am able to get much better result for RocksDB:
rocks-3.0-dyn12M

What conclusions can we make?

  1. WiredTiger’s memory (about the first one million (mln) rows) performed extremely well, achieving over 100,000 inserts/sec. As data grows and exceeds memory size, WiredTiger behaved as a traditional B-Tree engine (which is no surprise).
  2. PerconaFT and RocksDB showed closer to constant throughput, with RocksDB being overall better, However, with data growth both engines start to experience challenges. For PerconaFT, the throughput varies more with more data, and RocksDB shows more stalls (which I think is related to a compaction process).
  3. WiredTiger LSM didn’t show as much variance as a B-Tree, but it still had a decline related to data size, which in general should not be there (as we see with RocksDB, also LSM based).

Inserting data is only one part of the equation. Now we also need to retrieve data from the database (which we’ll cover in another blog post).

Configuration for PerconaFT:

numactl --interleave=all ./mongod --dbpath=/mnt/m500/perconaft --storageEngine=PerconaFT --PerconaFTEngineCacheSize=$(( 10*1024*1024*1024 )) --syncdelay=900 --PerconaFTIndexFanout=128 --PerconaFTCollectionFanout=128 --PerconaFTIndexCompression=quicklz --PerconaFTCollectionCompression=quicklz --PerconaFTIndexReadPageSize=16384 --PerconaFTCollectionReadPageSize=16384

Configuration for RocksDB:

storage.rocksdb.configString:
 "bytes_per_sync=16m;max_background_flushes=3;max_background_compactions=12;max_write_buffer_number=4;max_bytes_for_level_base=1500m;target_file_size_base=200m;level0_slowdown_writes_trigger=12;write_buffer_size=400m;compression_per_level=kSnappyCompression:kSnappyCompression:kSnappyCompression:kSnappyCompression:kSnappyCompression:kSnappyCompression:kSnappyCompression;optimize_filters_for_hits=true"

Configuration for WiredTiger-3.2 LSM:

storage.wiredTiger.collectionConfig.configString:
 "type=lsm"
 storage.wiredTiger.indexConfig.configString:
 "type=lsm"

Load parameters for iibench:

TEST_RUN_ARGS_LOAD="1000000000 6000 1000 999999 10 256 3 0

by Vadim Tkachenko at December 23, 2015 05:39 PM

Shlomi Noach

Orchestrator progress

This comes mostly to reassure, having moved into GitHub: orchestrator development continues.

I will have the privilege of working on this open source solution in GitHub. There are a few directions we can take orchestrator to, and we will be looking into the possibilities. We will continue to strengthen the crash recovery process, and in fact I've got a couple ideas on drastically shortening Pseudo-GTID recovery time as well as other debts. We will look into yet other directions, which we will share. My new and distinguished team will co-work on/with orchestrator and will no doubt provide useful and actionable input.

Orchestrator continues to be open for pull requests, with a temporal latency in response time (it's the Holidays, mostly).

Some Go(lang) limitations (namely the import path, I'll blog more about it) will most probably imply some changes to the code, which will be well communicated to existing collaborators.

Most of all, we will keep orchestrator a generic solution, while keeping focus on what we think is most important - and there's some interesting vision here. Time will reveal as we make progress.

 

by shlomi at December 23, 2015 04:01 PM

Jean-Jerome Schmidt

Webinar Replay & Slides: Polyglot Persistence for the MongoDB, MySQL & PostgreSQL DBA

Many thanks to everyone who took the time yesterday to participate in our last webinar of 2015! Here are the recording of the session as well as the slides that were used by our colleague Art van Scheppingen, Senior Support Engineer, on the topic of Polyglot Persistence for the MongoDB, MySQL & PostgreSQL DBA.

Webinar Replay:

Webinar Slides:

Agenda

During this webinar, Art discussed the four major operational challenges for MySQL, MongoDB & PostgreSQL:

  • Deployment
  • Management
  • Monitoring
  • Scaling
  • And how to deal with them

And demonstrated, using ClusterControl, how Polyglot Persistence for datastores can be managed from one central control centre. To find out more, feel free to contact us.

"

Speaker

Art van Scheppingen is a Senior Support Engineer at Severalnines. He’s a pragmatic MySQL and Database expert with over 15 years experience in web development. He previously worked at Spil Games as Head of Database Engineering, where he kept a broad vision upon the whole database environment: from MySQL to Couchbase, Vertica to Hadoop and from Sphinx Search to SOLR. He regularly presents his work and projects at various conferences (Percona Live, FOSDEM) and related meetups.

This webinar is based upon the experience Art had while writing our How to become a ClusterControl DBA blog series and implementing multiple storage backends to ClusterControl. To view all the blogs of the ‘Become a ClusterControl DBA’ series visit: http://severalnines.com/blog-categories/clustercontrol

Thank you for your interest and participation in our webinars this year; we look forward to seeing you in 2016!

Blog category:

by Severalnines at December 23, 2015 12:53 PM