Planet MariaDB

August 20, 2017

Valeriy Kravchuk

Fun with Bugs #55 - On Some Public Bugs Fixed in MySQL 8.0.2

I do not care much about MySQL 8.0.x at the moment, as it's far from being GA and is still work in progress. It is not yet used by the customers I have to support. But I know about many interesting changes and improvements there that, eventually, are going to influence all main forks and kinds of MySQL. So, it would not be wise to ignore MySQL 8.0.x entirely, even for me.

For this post I decided to briefly check what community reported bugs were fixed in the recent release, 8.0.2, based on release notes. For me it's a measure of community interest in MySQL 8.0.x and Oracle's interest in further working with MySQL Community. I ended up with the following, short enough list of bug fixes in the categories I usually care about (InnoDB, partitioning, replication and optimizer):
  • The very first InnoDB bug mentioned in the release notes, Bug #85043, is private. I fail to see any valid reason for a bug in a version that is still under development and not declared GA to remain private after the fix is released, especially when it also affects GA versions, and this is the case: the bug is fixed in 5.7.19 as well, as you can see in my previous post.
  • Another bug related to both InnoDB and the optimizer is Bug #81031. It was also fixed in MySQL 5.6.37 and 5.7.19.
  • Bug #84038 - "Errors when restarting MySQL after FLUSH TABLES FOR EXPORT, RENAME and DROP", was also fixed in MySQL 5.7.19. I am actually surprised that at this stage we still have the older InnoDB internal data dictionary tables in MySQL 8.0.x.
  • Group replication related Bug #85667, Bug #85047, Bug #84728 and Bug #84733 were also listed as fixed in MySQL 5.7.19.
  • The same is true for normal async replication bugs: Bug #83184, Bug #82283, Bug #81232, Bug #77406 etc. It's expected to see fixes applied to the oldest affected version and then merged into newer versions.
  • The first really unique fix in 8.0.2 that I found was Bug #85639 - "XA transactions are 'unsafe' for RPL using SBR". It was reported by João Gramacho (who probably works for Oracle) originally for MySQL 5.7 and is going to be fixed in MySQL 5.7.20 also.
  • Replication-related Bug #85739 is still private. Release notes say:
    "Issuing SHOW SLAVE STATUS FOR CHANNEL 'group_replication_recovery' following a restart of a server using group replication led to an unplanned shutdown."
  • Other private replication bugs fixed in 8.0.2 are: Bug #85405, Bug #85084, Bug #84646, Bug #84471, Bug #82467 and Bug #80368. I do not know who reported them, what versions are affected (though I suspect 5.7.x as well) or why they remain private after being fixed in 8.0.2.
  • The first 8.0.x specific public bug report I've found in the release notes was reported by Andrey Hristov and verified by Umesh Shastry. It is Bug #85937 - "Unchecked read after allocated buffer".
  • Bug #86120 - "Upgrading a MySQL instance with a table name of >64 chars results in breakage", was reported by Daniël van Eeden and verified by Umesh Shastry. I expect a lot of problem reports when users start to upgrade to 8.0... See also Bug #84889 - "MYSQL 8.0.1 - MYSQLD ERRORLOG UPGRADE ERRORS AT SERVER START LIVE UPGRADE", by Susan Koerner.
  • Bug #85994 - "Out-of-bounds read in mysqld.cc fix_paths", was reported by Laurynas Biveinis and verified by Umesh Shastry. Percona seems to care a lot about improving MySQL 8.0 (as well as other Oracle MySQL GA versions). See also Bug #85678 by Laurynas and Bug #85059 by Roel Van de Paar.
  • Jon Olav Hauglid seems to care about the new data dictionary code quality, so he had reported related public bugs: Bug #85811, Bug #85800, and Bug #83473.
  • Bug #85704 - "mysql 8.0.x crashes when old-style trigger misses the "created" line in .TRG", was reported by Shane Bester. A workaround was also suggested by Jesper Krogh. Shane also reported this nice regression, Bug #83019 - "queries in "show processlist" oscillate with constant times higher each day", that is now fixed in 8.0.2.
  • Bug #85614 - "alter table fails when default character set changes to utf8mb4", was reported by Tor Didriksen. He had also reported Bug #85224 - "Illegal mix of collations for time/varchar".
  • Bug #85518 - "Distinct operations on temp tables allocate too little memory for sort keys", was reported by Steinar Gunderson. He had also reported Bug #85487 - "num_tmp_files in filesort optimizer trace is nonsensical".
  • Bug #85179 - "Assert in sql/field.cc:... virtual String* Field_varstring::val_str", was reported by Matthias Leich.
I skipped several bugs that are also fixed in older versions. Many of them were already discussed in my posts. I also skipped all build/compilation/packaging bugs for now.

To summarize, while the total number of public bug reports fixed in MySQL 8.0.2 is notable, many of these bugs were reported by the few Oracle engineers who are still brave enough to report bugs in public. From the Community side, it seems mostly Percona and Booking.com engineers care to check MySQL 8.0.x at this early stage and report bugs. I am especially concerned with the number of private bug reports mentioned in the release notes of 8.0.2...

by Valeriy Kravchuk (noreply@blogger.com) at August 20, 2017 02:56 PM

August 19, 2017

MariaDB Foundation

MariaDB 10.2.8 and MariaDB Galera Cluster 10.0.32 now available

The MariaDB project is pleased to announce the availability of MariaDB 10.2.8 and MariaDB Galera Cluster 10.0.32. See the release notes and changelogs for details. Download MariaDB 10.2.8 Release Notes Changelog What is MariaDB 10.2? MariaDB APT and YUM Repository Configuration Generator Download MariaDB Galera Cluster 10.0.32 Release Notes Changelog What is MariaDB Galera Cluster? […]

The post MariaDB 10.2.8 and MariaDB Galera Cluster 10.0.32 now available appeared first on MariaDB.org.

by Ian Gilfillan at August 19, 2017 05:56 PM

August 18, 2017

MariaDB AB

MariaDB Server 10.2.8 now available


The MariaDB project is pleased to announce the immediate availability of MariaDB Server 10.2.8. See the release notes and changelog for details and visit mariadb.com/downloads to download.

Download MariaDB Server 10.2.8

Release Notes Changelog What is MariaDB Server 10.2?


MariaDB Galera Cluster 10.0.32 is also now available for download.

Download MariaDB Galera Cluster 10.0.32

Release Notes Changelog What is MariaDB Galera Cluster?


by dbart at August 18, 2017 06:06 PM

Peter Zaitsev

This Week in Data with Colin Charles: Percona Live Europe!


Join Percona Chief Evangelist Colin Charles as he covers happenings, gives pointers and provides musings on the open source database community.

Has a week passed already? Welcome back to the second column. A lot of time has been spent neck deep in getting speakers accepted and scheduled for Percona Live Open Source Database Conference Europe 2017 in Dublin, as well as organizing the conference sponsors.

Percona Live Europe Dublin

At the time of writing, we are six weeks away from the conference, so a little over a month! Have you registered yet?

We have 12 tutorials that cover a wide range of topics: ProxySQL (from the author Rene Cannao), Orchestrator (from the author Shlomi Noach) and practical Couchbase, to name a few. If we did a technology word cloud, the coverage would include MongoDB, Docker, Elastic, Percona Monitoring and Management (PMM), Percona XtraDB Cluster 5.7, MySQL InnoDB Cluster and Group Replication.

In addition to that, if you’re a MySQL beginner (or thinking of a career change) there is a six-hour boot camp titled MySQL in a Nutshell (Part 1 and Part 2)! Come prepared with your laptop, and leave a MySQL DBA!

Sessions are scheduled, and most of the content is already online: check out day 1, and day 2. We have 104 sessions scheduled, so there’s plenty to choose from.

Remember that you have till 7:00 a.m. UTC-1, August 16th, 2017 to book the group rate at the event venue for €250/night. Use code PERCON.

Releases

  • orchestrator/raft: Pre-release 3.0 is available. I’m a huge fan of Orchestrator, and now you can set up high availability for orchestrator via the Raft consensus protocol.
  • MariaDB 10.0.32 is out, and it comes with a new Percona XtraDB, Percona TokuDB and a new InnoDB. You’ll want this release if you’re using TokuDB, as it merges from TokuDB 5.6.36-82.1 (which fixes the two known issues).
  • If you encountered the TokuDB problems above, you’ll want to look at MariaDB 10.1.26. One surprise hidden in the release notes: MariaDB Backup is now a stable/GA release. Have you used it yet?

Link List

I look forward to feedback/tips via e-mail at colin.charles@percona.com or I’m @bytebot on Twitter.

by Colin Charles at August 18, 2017 04:36 PM

August 17, 2017

Peter Zaitsev

IMDb Data in a Graph Database


In this first of its kind, Percona welcomes Dehowe Feng, Software Developer from Bitnine, as a guest blogger. In his blog post, Dehowe discusses how importing IMDb data into a graph database (AgensGraph) lets you quickly see how data nodes relate to each other. This blog echoes a talk given by Bitnine at the Percona Live Open Source Database Conference 2017.

Graphs help illustrate the relationships between entities through nodes, drawing connections between people and objects. Relationships in IMDb are inherently visual. Seeing how things are connected gives us a better understanding of the underlying context. By importing IMDb data as graph data, you simplify the schema and can obtain key insights.

In this post, we will examine how importing IMDb into a graph database (in this case, AgensGraph) allows us to look at data relationships in a much more visual way, providing more intuitive insights into the nature of related data.

For install instructions to the importing scripts, go here.

The Internet Movie Database (IMDb), owned by Amazon.com, is one of the largest movie databases. It contains 4.1 million titles and 7.7 million personalities (https://en.wikipedia.org/wiki/IMDb).

Relational Schema for IMDb


Relational Schema of IMDb Info

Picture courtesy of user ofthelit on StackOverflow, https://goo.gl/SpS6Ca

Because IMDb’s file format is not easy to read and parse, rather than importing the files directly we use an additional step to load them into relational tables. For this project, we used IMDbpy to load the relational data into AgensGraph in relational form. The figure above shows the relational schema that IMDbpy created. This schema is somewhat complicated, but essentially there are four basic entities: Production, Person, Company and Keyword. Because there are many N-to-N relationships between these entities, the relational schema has more tables than the number of entities, which makes the schema harder to understand. For example, a person can be related to many movies and a movie can have many characters.

Concise Graph Modeling

From there, we developed our own graph schema using Production, Person, Company and Keyword as our nodes (or end data points).

Productions lie at the “center” of the graph, with everything leading to them. Keywords describing Productions, Persons and Companies are credited for their contributions to Productions. Productions are linked to other productions as well.


Simplified Graph Database Schema

With the data in graph form, one can easily see the connections between all the nodes. The data can be visualized as a network and querying the data with Cypher allows users to explore the connections between entities.

Compared to the relational schema of IMDb, the graph schema is much simpler to understand. By merging related information for the main entities into nodes, we can access all information relevant to a node through the node itself, rather than having to match IDs across tables to get the information that we need. If we want to examine how a node relates to another node, we can query its edges to see the connections it forms. Being able to visually “draw a connection” from one node to another helps to illustrate how they are connected.

Furthermore, the labels of the edges describe how the nodes are connected. Edge labels in the IMDb graph describe what kind of connection is formed, and pertinent information may be stored as attributes on the edges. For example, for the connections ACTOR_IN and ACTRESS_IN, we store role data, such as the character name and character id.
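For instance, a query that pulls those edge attributes might look like the sketch below (the r.role property name is an assumption for illustration; the actual attribute names depend on how the import scripts store the role data):

MATCH (a:Person)-[r:ACTOR_IN]->(b:Production)
WHERE b.title = 'Night at the Museum'
RETURN a.name, r.role;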

Data Migration

To build the vertex and edge properties, we use “views” that join the related tables. The data is migrated into graph format by querying the relational data with selects and joins into a single table containing the information needed to create each node.

For example, here is the SQL query used to create the jsonb_keyword view:

CREATE VIEW jsonb_keyword AS
SELECT row_to_json(row(keyword)) AS data
FROM keyword;

We use a view to make importing queries simpler. Once this view is created, its content can be migrated into the graph. After the graph is created, the graph_path is set, and the VLABEL is created, we can use the convenient LOAD keyword to load the JSON values from the relational table into the graph:

LOAD FROM jsonb_keyword AS keywords
CREATE (a:Keyword = data(keywords) );

Note that here LOAD is used to load data in from a relational table, but LOAD can also be used to load data from external sources as well.

Creating edges is a similar process. We load edges from the tables that store the id tuples linking the entities, after creating their ELABELs:

LOAD FROM movie_keyword AS rel_key_movie
MATCH (a:Keyword), (b:Production)
WHERE a.id::int = (rel_key_movie).keyword_id AND
b.id::int = (rel_key_movie).movie_id
CREATE (a)-[:KEYWORD_OF]->(b);

As you can see, AgensGraph is not restricted to the CSV format when importing data. We can import relational data into its graph portion using the LOAD feature and SQL statements to refine our data sets.

How is information stored?

Most of the pertinent information is held in the nodes (vertexes). Nodes are labeled as Productions, Persons, Companies or Keywords, and their related information is stored as JSON. Since IMDb information is constantly updated, many fields for certain entities are left incomplete. Since JSON is semi-structured, if an entity does not have a certain piece of information the field simply does not exist, rather than being present and marked as NULL.

We also use nested JSON arrays to store data that may have multiple values, such as quotes that persons might have said or alternate titles of productions. This makes it possible to store “duplicate” fields in each node.
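As a purely illustrative sketch (these field names are invented for this example and are not taken from the actual import), a Person node's document might be shaped like this:

{
  "name": "Stiller, Ben",
  "quotes": [
    "A first hypothetical quote.",
    "A second hypothetical quote."
  ],
  "alternate_names": ["Ben Stiller"]
}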

How can this information be used?

In the IMDb graph database, querying between entities is very easy to learn. Using the Cypher query language, a user can find things such as all actors that acted in a certain production, all productions that a person has worked on, or all companies that have worked with a certain company on any production. A graph database's strength is the simplicity of visualizing the data. There are many ways you can query a graph database to find what you need!

Find the name of all actors that acted in Night at the Museum:

MATCH (a:Person)-[:ACTOR_IN]->(b:Production)
WHERE b.title = 'Night at the Museum'
RETURN a.name,b.title;

Result:

name | title
-----------------------+---------------------
Asprinio, Stephen | Night at the Museum
Blais, Richard | Night at the Museum
Bougere, Teagle F. | Night at the Museum
Bourdain, Anthony | Night at the Museum
Cherry, Jake | Night at the Museum
Cheng, Paul Chih-Ping | Night at the Museum
...
(56 rows)

Find all productions that Ben Stiller worked on:

MATCH (a:Person)-[b]->(c:Production)
WHERE a.name = 'Stiller, Ben'
RETURN a.name,label(b),c.title;

Result:

name | label | title
-------------+-------------+-----------------------------------------------
...
Stiller, Ben | actor_in | The Heartbreak Kid: The Egg Toss
Stiller, Ben | producer_of | The Hardy Men
Stiller, Ben | actor_in | The Heartbreak Kid: Ben & Jerry
Stiller, Ben | producer_of | The Polka King
Stiller, Ben | actor_in | The Heartbreak Kid
Stiller, Ben | actor_in | The Watch
Stiller, Ben | actor_in | The History of 'Walter Mitty'
Stiller, Ben | producer_of | The Making of 'The Pick of Destiny'
Stiller, Ben | actor_in | The Making of 'The Pick of Destiny'
...
(901 rows)

Find all actresses that worked with Sarah Jessica Parker:

MATCH (a:Person)-[b:ACTRESS_IN]->(c:Production)<-[d:ACTRESS_IN]-(e:Person)
WHERE a.name = 'Parker, Sarah Jessica'
RETURN DISTINCT e.name;

Result:

name
---------------------------------
Aaliyah
Aaron, Caroline
Aaron, Kelly
Abascal, Nati
Abbott, Diane
Abdul, Paula
...
(3524 rows)

Summary

The most powerful aspects of a graph database are flexibility and visualization capabilities.

In the future, we plan to implement a one-step importing script. Currently, the importing script is two-phased: the first step is to load into relational tables and the second step is to load into the graph. Additionally, AgensGraph has worked with Gephi to release a data import plugin. The Gephi Connector allows for graph visualization and analysis. For more information, please visit www.bitnine.net and www.agensgraph.com.

by Dehowe Feng at August 17, 2017 07:36 PM

August 16, 2017

Peter Zaitsev

Percona Monitoring and Management 1.2.1 is Now Available


Percona announces the release of Percona Monitoring and Management 1.2.1 on August 16, 2017.

For install and upgrade instructions, see Deploying Percona Monitoring and Management.

This hotfix release improves memory consumption.

Changes in PMM Server

We’ve introduced the following changes in PMM Server 1.2.1:

Bug fixes

  • PMM-1280: PMM Server affected by nginx CVE-2017-7529. An integer overflow exploit could result in a DoS (Denial of Service) for the affected nginx service when the max_ranges directive is not set. This problem is solved by setting the max_ranges directive to 1 in the nginx configuration, as in the snippet below.
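For reference, the relevant directive looks like this inside an nginx server block (an illustrative excerpt only, not the full configuration shipped with PMM Server):

server {
    listen 80;
    # Mitigation for CVE-2017-7529: allow at most one byte range per request
    max_ranges 1;
}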

Improvements

  • PMM-1232: Update the default value of the METRICS_MEMORY configuration setting. Previous versions of PMM Server used a METRICS_MEMORY value that allowed Prometheus to use up to 768MB of memory. PMM Server 1.2.0 switched to the storage.local.target-heap-size setting, whose default value is 256MB. Unintentionally, this reduced the amount of memory that Prometheus could use, and its performance suffered as a result. To improve the performance of Prometheus, the default for storage.local.target-heap-size has now been set to 768MB; see the illustrative container restart below.
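If you manage the container yourself, upgrading and pinning the memory limit looks roughly like this (a sketch only: the pmm-server/pmm-data names follow the conventional PMM setup, and METRICS_MEMORY is assumed to be expressed in kilobytes, so 768 MB = 786432; verify against the PMM documentation for your version):

docker stop pmm-server && docker rm pmm-server
docker run -d \
  --name pmm-server \
  --volumes-from pmm-data \
  -p 80:80 \
  -e METRICS_MEMORY=786432 \
  percona/pmm-server:1.2.1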

About Percona Monitoring and Management

Percona Monitoring and Management (PMM) is an open-source platform for managing and monitoring MySQL and MongoDB performance. Percona developed it in collaboration with experts in the field of managed database services, support and consulting.

PMM is a free and open-source solution that you can run in your own environment for maximum security and reliability. It provides thorough time-based analysis for MySQL and MongoDB servers to ensure that your data works as efficiently as possible.

A live demo of PMM is available at pmmdemo.percona.com.

We’re always happy to help! Please provide your feedback and questions on the PMM forum.

If you would like to report a bug or submit a feature request, please use the PMM project in JIRA.

by Borys Belinsky at August 16, 2017 05:31 PM

Upcoming Webinar Thursday, August 17: Efficient CRUD Queries in MongoDB

Join Percona’s Senior Technical Operations Architect Tim Vaillancourt as he presents “Efficient CRUD Queries in MongoDB” on Thursday, August 17, 2017, at 10:00 am PDT / 1:00 pm EDT (UTC-7).

MongoDB has its own commands and function structures that ask the database to do work. In this talk, we will discuss how queries, updates, deletes and inserts work. However, we will go beyond these actions and also review what operators you should and shouldn’t use, and how they might actually drive your schema choices. Then we will talk about operationally sound ways for bulk deleting and inserting when you want to limit the impact on production (if other tools are too aggressive).

Register for the webinar here.

Timothy Vaillancourt, Sr. Technical Operations Architect for MongoDB

Tim joined Percona in 2016 as Sr. Technical Operations Architect for MongoDB, with the goal of making MongoDB operations as smooth as possible. With experience operating infrastructures in industries such as government, online marketing/publishing, SaaS and gaming, combined with experience tuning systems from the hard disk all the way up to the end-user, Tim has spent time in nearly every area of the modern IT stack (with many lessons learned). Tim is based in Amsterdam, NL and enjoys traveling, coding and music. Before Percona, Tim was the Lead MySQL DBA of Electronic Arts’ DICE studios, helping launch and operate some of the largest games in the world (“Battlefield” series, “Mirrors Edge” series, “Star Wars: Battlefront”) smoothly. At the same time, he also led the automation of MongoDB deployments for EA systems. Before the role of DBA at EA’s DICE studio, Tim served as a subject matter expert in NoSQL databases, queues and search on the Online Operations team at EA SPORTS. Prior to moving to the gaming industry, Tim served as a Database/Systems Admin operating a large MySQL-based SaaS infrastructure at AbeBooks/Amazon Inc.

by Emily Ikuta at August 16, 2017 04:02 PM

August 15, 2017

Peter Zaitsev

Upcoming Webinar Wednesday August 16: Lock, Stock and Backup – Data Guaranteed


Join Percona’s Technical Services Manager Jervin Real as he presents Lock, Stock and Backup: Data Guaranteed on Wednesday, August 16, 2017 at 7:00 am PDT / 10:00 am EDT (UTC-7).

Backups are crucial in a world where data is digital and uptime is revenue. Environments are no longer bound to traditional data centers, and span multiple cloud providers and many heterogeneous environments. We need bulletproof backups and impeccable recovery processes. This talk aims to answer the question “How should I backup my MySQL databases?” by providing 3-2-1 backup designs, best practices and real-world solutions leveraging key technologies, automation techniques and major cloud provider services.

Register for the webinar here.

Jervin Real

As Technical Services Manager, Jervin partners with Percona’s customers on building reliable and highly performant MySQL infrastructures while also doing other fun stuff like watching cat videos on the internet. Jervin joined Percona in April 2010. Starting as a PHP programmer, Jervin quickly learned the LAMP stack. He has worked on several high-traffic sites and a number of specialized web applications (such as mobile content distribution). Before joining Percona, Jervin also worked with several hosting companies, providing care for customer hosted services and data on both Linux and Windows.

by Emily Ikuta at August 15, 2017 04:45 PM

Jean-Jerome Schmidt

What’s New With MySQL Replication in MySQL 8.0

Replication in MySQL has been around for a long time, and has been steadily improving over the years. It has been more evolution than revolution. This is perfectly understandable, as replication is an important feature that many depend on - it has to work.

In the last MySQL versions, we’ve seen improvements in replication performance through support for applying transactions in parallel. In MySQL 5.6, parallelization was done at the schema level - all transactions executed against separate schemas could be applied at once. This was a nice improvement for workloads with multiple schemas on a single server and a load distributed more or less evenly across the schemas.

In MySQL 5.7, another parallelization method was added, the so-called “logical clock”. It allows some level of concurrency on a slave even if all your data is stored in a single schema. It is based, in short, on the fact that some transactions commit together because of latency added by the hardware. You can even add that latency manually, to achieve better parallelization on the slaves, using binlog_group_commit_sync_delay.

This solution was really nice, but not without drawbacks. Every delay in committing a transaction can eventually affect user-facing parts of the application. Sure, you can set delays within a range of several milliseconds, but even then, it’s additional latency that slows down the app.

Replication performance improvements in MySQL 8.0

MySQL 8.0, which as of now (August 2017) is still in beta, brings some nice improvements to replication. Originally they were developed for Group Replication (GR), but as GR uses regular replication under the hood, “normal” MySQL replication benefits from them too. The improvement we mentioned is the dependency tracking information stored in the binary log. What happens is that MySQL 8.0 now has a way to store information about which rows were affected by a given transaction (the so-called writeset), and it compares the writesets of different transactions. This makes it possible to identify transactions that did not work on the same subset of rows and which, therefore, may be applied in parallel. This may increase the parallelization level several times compared to the implementation in MySQL 5.7. What you need to keep in mind is that, eventually, a slave will see a different view of the data, one that never appeared on the master. This is because transactions may be applied in a different order than on the master. This should not be a problem, though. The current implementation of multithreaded replication in MySQL 5.7 may also cause this issue unless you explicitly enable slave-preserve-commit-order.

To control this new behavior, a variable binlog_transaction_dependency_tracking has been introduced. It can take three values (a configuration sketch follows the list):

  • COMMIT_ORDER: this is the default; it uses the mechanism already available in MySQL 5.7.
  • WRITESET: this enables better parallelization, and the master starts to store writeset data in the binary log.
  • WRITESET_SESSION: this ensures that transactions are executed on the slave in order, eliminating the issue of the slave seeing a state of the database that never existed on the master. It reduces parallelization, but it can still provide better throughput than the default setting.
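A minimal configuration sketch for an 8.0 master (assuming appropriate privileges; transaction_write_set_extraction must be XXHASH64 for writeset tracking to work, and it may already be the default in 8.0):

-- On the master
SET GLOBAL transaction_write_set_extraction = XXHASH64;
SET GLOBAL binlog_transaction_dependency_tracking = WRITESET;

-- On the slaves, the usual multithreaded replication settings still apply
-- (stop the replication threads before changing them)
SET GLOBAL slave_parallel_type = LOGICAL_CLOCK;
SET GLOBAL slave_parallel_workers = 8;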

Benchmark

In July, on mysqlhighavailability.com, Vitor Oliveira wrote a post where he tried to measure the performance of the new modes. He used the best-case scenario - no durability whatsoever - to showcase the difference between the old and new modes. We decided to use the same approach, this time in a more real-world setup: binary log enabled with log_slave_updates. Durability settings were left at their defaults (so sync_binlog=1 - the new default in MySQL 8.0 - doublewrite buffer enabled, InnoDB checksums enabled, etc.). The only exception to durability was innodb_flush_log_at_trx_commit set to 2.

We used m4.2xlarge instances with 32GB of RAM and 8 cores (so slave_parallel_workers was set to 8). We also used sysbench with the oltp_read_write.lua script. 16 million rows in 32 tables were stored on a 1000GB gp2 volume (that's 3000 IOPS). We tested the performance of all of the modes for 1, 2, 4, 8, 16 and 32 concurrent sysbench connections. The process was as follows: stop the slave, execute 100k transactions, start the slave and measure how long it takes to clear the slave lag.
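For reference, a sysbench 1.0 invocation approximating that workload could look roughly like the following (hostname and credentials are placeholders; 16 million rows across 32 tables works out to 500,000 rows per table, and the thread count was varied between 1 and 32):

# after an initial 'prepare' run to create and fill the tables
sysbench oltp_read_write \
  --mysql-host=master.example.com --mysql-user=sbtest --mysql-password=secret \
  --mysql-db=sbtest \
  --tables=32 --table-size=500000 \
  --threads=16 --time=300 \
  run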

First of all, we don’t really know what happened when sysbench was executed using 1 thread only. Each test was executed five times after a warmup run. This particular configuration was tested two times - results are stable: single-threaded workload was the fastest. We will be looking into it further to understand what happened.

Other than that, the rest of the results are in line with what we expected. COMMIT_ORDER is the slowest one, especially for low traffic (2-8 threads). WRITESET_SESSION typically performs better than COMMIT_ORDER, but it’s slower than WRITESET for low-concurrency traffic.

How can it help me?

The first advantage is obvious: if your workload is on the slow side yet your slaves have a tendency to fall behind in replication, they can benefit from improved replication performance as soon as the master is upgraded to 8.0. Two notes here: first, this feature is backward compatible and 5.7 slaves can also benefit from it. Second, a reminder that 8.0 is still in beta: we don't encourage you to use beta software in production, although in dire need this is an option to test. This feature can help you not only when your slaves are lagging. They may be fully caught up, but when you create a new slave or reprovision an existing one, that slave will be lagging. Having the ability to use the “WRITESET” mode will make the process of provisioning a new host much faster.

All in all, this feature will have a much bigger impact than you may think. Given all of the benchmarks showing regressions in performance when MySQL handles low-concurrency traffic, anything that can help to speed up replication in such environments is a huge improvement.

If you use intermediate masters, this is also a feature to look for. Any intermediate master adds some serialization into how transactions are handled and executed - in the real world, the workload on an intermediate master will almost always be less parallel than on the master. Utilizing writesets to allow better parallelization not only improves parallelization on the intermediate master, it can also improve parallelization on all of its slaves. It is even possible (although it would require serious testing to verify that all the pieces fit correctly) to use an 8.0 intermediate master to improve the replication performance of your slaves (please keep in mind that a MySQL 5.7 slave can understand writeset data and use it even though it cannot generate it on its own). Of course, replicating from 8.0 to 5.7 sounds quite tricky (and not only because 8.0 is still beta). Under some circumstances this may work, and it can improve CPU utilization on your 5.7 slaves.

Other changes in MySQL replication

The introduction of writesets, while the most interesting, is not the only change to MySQL replication in MySQL 8.0. Let’s go through some other important changes. If you happen to use a master older than MySQL 5.0, 8.0 won’t support its binary log format. We don’t expect to see many such setups, but if you use some very old MySQL with replication, it’s definitely time to upgrade.

Default values have changed to make sure that replication is as crash-safe as possible: master_info_repository and relay_log_info_repository are now set to TABLE. expire_logs_days has also been changed - the default value is now 30. In addition to expire_logs_days, a new variable has been added, binlog_expire_logs_seconds, which allows for a more fine-grained binlog rotation policy. Some additional timestamps have been added to the binary log to improve observability of replication lag, introducing microsecond granularity.
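A quick, read-only way to confirm these defaults on an 8.0 test instance (a sketch; it assumes binlog_expire_logs_seconds is present in your 8.0 build, as it was added early in the 8.0 series):

SELECT @@master_info_repository,
       @@relay_log_info_repository,
       @@expire_logs_days,
       @@binlog_expire_logs_seconds;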

By all means, this is not a full list of changes and features related to MySQL replication. If you’d like to learn more, you can check the MySQL changelogs. Make sure you reviewed all of them - so far, features have been added in all 8.0 versions.

As you can see, MySQL replication is still changing and becoming better. As we said at the beginning, it has to be a slow-paced process but it’s really great to see what is ahead. It’s also nice to see the work for Group Replication trickling down and reused in the “regular” MySQL replication.

by krzysztof at August 15, 2017 09:29 AM

August 14, 2017

MariaDB AB

Sensitive Data Masking with MariaDB MaxScale


Protecting personal and sensitive data, and complying with security and privacy regulations, is high priority for organizations. This includes personally identifiable information (PII), protected health information (PHI), payment card information (subject to PCI-DSS regulation) and intellectual property (subject to ITAR and EAR regulations). In many cases, if not most, it needs to be redacted or masked when accessed (internally and/or externally).

Data redaction obfuscates all or part of the data, reducing unnecessary exposure of sensitive data while at the same time maintaining its usability. Various terms such as data masking, data obfuscation and data anonymization are used to describe this functionality in databases. Data redaction allows an organization to:

  • Meet regulations

  • Protect against insider threat

  • Use production data for non-production use cases (e.g., testing and training)

MariaDB TX and MariaDB AX are complete solutions for high performance transactional and analytical workloads, respectively. Included in both of these solutions is MariaDB MaxScale, a next generation database proxy. In addition to load balancing and query routing, this database proxy provides enhanced security, like encryption of data in flight and masking of sensitive end user data.  

In this blog, we show you how to redact data using the masking filter in MariaDB MaxScale.

Data Masking

With the masking filter, the value of a particular column returned by a query can be obfuscated. For instance, suppose there is a table patients that, among other columns, contains the column ssn where the social security number or national identification number of a patient is stored. With the masking filter, it is possible to specify that when the ssn field is queried, a masked value is returned instead of the actual ssn.

For example:

> SELECT name, ssn FROM patients;

Without masking of the ssn column, the query result would be:

+-------+-------------+
+ name  | ssn         |
+-------+-------------+
| Alice | 721-07-4426 |
| Bob   | 435-22-3267 |
…

And with masking of the ssn column, the query result would be:

+-------+-------------+
+ name  | ssn         |
+-------+-------------+
| Alice | XXXXXXXXXXX |
| Bob   | XXXXXXXXXXX |
...

An example masking filter configuration in MaxScale looks like the following:

[MyMasking]
type=filter
module=masking
warn_type_mismatch=always
large_payload=abort
rules=masking_rules.json

[MyService]
type=service
...
filters=MyMasking

And masking_rules.json will look like this:

{
   "rules": [
       {
           "replace": {
               "column": "ssn"
           },
           "with": {
               "fill": "X"
           }
       }
   ]
}

Now the following queries return masked data for the ssn column:

> select name, ssn from patients;
+---------------+--------------+
| name          | ssn          |
+---------------+--------------+
| John Doe      | XXXXXXXXXXX  |
| Jack Smith    | XXXXXXXXXXX  |
| Jane Richards | XXXXXXXXXXX  |
+---------------+--------------+

> select * from patients;
+------+---------------+-------------+--------+------+
| id   | name          | ssn         | gender | age  |
+------+---------------+-------------+--------+------+
|    1 | John Doe      | XXXXXXXXXXX | M      |   55 |
|    2 | Jack Smith    | XXXXXXXXXXX | M      |   38 |
|    3 | Jane Richards | XXXXXXXXXXX | F      |   48 |
+------+---------------+-------------+--------+------+

> select name, ssn as xyz from patients;
+---------------+--------------+
| name          | xyz          |
+---------------+--------------+
| John Doe      | XXXXXXXXXXX  |
| Jack Smith    | XXXXXXXXXXX  |
| Jane Richards | XXXXXXXXXXX  |
+---------------+--------------+

We have additional enhancements planned for the masking filter in the next release (MariaDB MaxScale 2.2), including partial data masking. 

For details on how to setup the rules, please see MariaDB MaxScale masking filter guide.

If you would like to prevent users from applying functions to columns that are to be masked, you can use the database firewall filter to block such queries; a rough sketch follows. For details on how to configure black-listing and white-listing rules with the firewall filter, please see the MariaDB MaxScale Database Firewall filter guide.
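A minimal sketch of what this could look like, reusing the [MyService] service from the masking example above (the exact rule keywords, in particular uses_function, and the file path are assumptions to be verified against the firewall filter guide for your MaxScale version):

[MyFirewall]
type=filter
module=dbfwfilter
rules=/etc/maxscale.d/firewall_rules.txt

[MyService]
type=service
...
filters=MyFirewall|MyMasking

And /etc/maxscale.d/firewall_rules.txt could contain:

rule protect_ssn deny uses_function ssn
users %@% match any rules protect_ssn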

To learn more about all the security features of MariaDB TX solution, attend this webinar to learn about advanced security features for data protection.


by Dipti Joshi at August 14, 2017 01:11 PM

August 12, 2017

Valeriy Kravchuk

More on Studying MySQL Hashes in gdb, and How P_S Code May Help

I have to get back to the topic of checking user variables in gdb to clarify a few more details. In his comment, Shane Bester kindly noted that calling functions defined in the MySQL code is not going to work when a core dump is studied. So, I ended up needing to check what the my_hash_element() function I used really does, to be ready to repeat that step by step manually. Surely I could skip that and use Python, as Shane himself did, but structures of the HASH type are widely used in MySQL, so I'd better know how to investigate them manually than blindly use existing code.

Quick search with grep for my_hash_element shows:
[root@centos mysql-server]# grep -rn my_hash_element *
include/hash.h:94:uchar *my_hash_element(HASH *hash, ulong idx);
mysys/hash.c:734:uchar *my_hash_element(HASH *hash, ulong idx)
plugin/keyring/hash_to_buffer_serializer.cc:34:      if(store_key_in_buffer(reinterpret_cast<const IKey *>(my_hash_element(keys_hash, i)),
plugin/version_token/version_token.cc:135:  while ((token_obj= (version_token_st *) my_hash_element(&version_tokens_hash, i)))
plugin/version_token/version_token.cc:879:    while ((token_obj= (version_token_st *) my_hash_element(&version_tokens_hash, i)))
sql/sql_base.cc:1051:    TABLE_SHARE *share= (TABLE_SHARE *)my_hash_element(&table_def_cache, idx);
sql/sql_base.cc:1262:        share= (TABLE_SHARE*) my_hash_element(&table_def_cache, idx);
sql/sql_udf.cc:277:    udf_func *udf=(udf_func*) my_hash_element(&udf_hash,idx);
...
sql/table_cache.cc:180:      (Table_cache_element*) my_hash_element(&m_cache, idx);
sql/rpl_gtid.h:2123:        Node *node= (Node *)my_hash_element(hash, i);
sql/rpl_gtid.h:2274:          node= (Node *)my_hash_element(hash, node_index);
sql/rpl_tblmap.cc:168:    entry *e= (entry *)my_hash_element(&m_table_ids, i);
sql/rpl_master.cc:238:    SLAVE_INFO* si = (SLAVE_INFO*) my_hash_element(&slave_list, i);
sql-common/client.c:3245:        LEX_STRING *attr= (LEX_STRING *) my_hash_element(attrs, idx);
storage/perfschema/table_uvar_by_thread.cc:76:    sql_uvar= reinterpret_cast<user_var_entry*> (my_hash_element(& thd->user_vars, index));
storage/ndb/include/util/HashMap.hpp:155:    Entry* entry = (Entry*)my_hash_element(&m_hash, (ulong)i);
storage/ndb/include/util/HashMap.hpp:169:    Entry* entry = (Entry*)my_hash_element((HASH*)&m_hash, (ulong)i);
[root@centos mysql-server]#
That is, the HASH structure is used everywhere in MySQL, from the keyring to UDFs and the table cache, to replication and NDB Cluster, with everything in between. If I can navigate to each HASH element and dump/print it, I can better understand a lot of code when needed. In case anyone cares, HASH is defined in a very simple way in include/hash.h:
typedef struct st_hash {
  size_t key_offset,key_length;         /* Length of key if const length */
  size_t blength;
  ulong records;
  uint flags;
  DYNAMIC_ARRAY array;                          /* Place for hash_keys */
  my_hash_get_key get_key;
  void (*free)(void *);
  CHARSET_INFO *charset;
  my_hash_function hash_function;
  PSI_memory_key m_psi_key;
} HASH;
It relies on DYNAMIC_ARRAY to store keys.

The code of the my_hash_element function in mysys/hash.c is very simple:
uchar *my_hash_element(HASH *hash, ulong idx)
{
  if (idx < hash->records)
    return dynamic_element(&hash->array,idx,HASH_LINK*)->data;
  return 0;
}
Quick search for dynamic_element shows that it's actually a macro:
[root@centos mysql-server]# grep -rn dynamic_element *
client/mysqldump.c:1608:    my_err= dynamic_element(&ignore_error, i, uint *);
extra/comp_err.c:471:      tmp= dynamic_element(&tmp_error->msg, i, struct message*);
extra/comp_err.c:692:    tmp= dynamic_element(&err->msg, i, struct message*);
extra/comp_err.c:803:  first= dynamic_element(&err->msg, 0, struct message*);
include/my_sys.h:769:#define dynamic_element(array,array_index,type) \
mysys/hash.c:126:  HASH_LINK *data= dynamic_element(&hash->array, 0, HASH_LINK*);
...
that is defined in include/my_sys.h as follows:
#define dynamic_element(array,array_index,type) \
  ((type)((array)->buffer) +(array_index))
So, now it's clear what to do in gdb, keeping in mind which array we use. Let me start the session, find the thread I am interested in and try to check the elements one by one:
(gdb) thread 2
[Switching to thread 2 (Thread 0x7fc3a037b700 (LWP 3061))]#0  0x00007fc3d2cb3383 in poll () from /lib64/libc.so.6
(gdb) p do_command::thd->m_thread_id
$1 = 5
(gdb) p do_command::thd->user_vars
$2 = {key_offset = 0, key_length = 0, blength = 4, records = 3, flags = 0,
  array = {buffer = 0x7fc3ba2b3560 "\377\377\377\377", elements = 3,
    max_element = 16, alloc_increment = 32, size_of_element = 16,
    m_psi_key = 38},
  get_key = 0xc63630 <get_var_key(user_var_entry*, size_t*, my_bool)>,
  free = 0xc636c0 <free_user_var(user_var_entry*)>, charset = 0x1ded740,
  hash_function = 0xeb6990 <cset_hash_sort_adapter>, m_psi_key = 38}
(gdb) set $uvars=&(do_command::thd->user_vars)
(gdb) p $uvars
$3 = (HASH *) 0x7fc3b9fad280
...
(gdb) p &($uvars->array)
$5 = (DYNAMIC_ARRAY *) 0x7fc3b9fad2a8
(gdb) p ((HASH_LINK*)((&($uvars->array))->buffer) + (0))
$6 = (HASH_LINK *) 0x7fc3ba2b3560
(gdb) p ((HASH_LINK*)((&($uvars->array))->buffer) + (0))->data
$7 = (uchar *) 0x7fc3b9fc80e0 "H\201\374\271\303\177"
(gdb) p (user_var_entry *)(((HASH_LINK*)((&($uvars->array))->buffer) + (0))->data)
$8 = (user_var_entry *) 0x7fc3b9fc80e0
(gdb) p *(user_var_entry *)(((HASH_LINK*)((&($uvars->array))->buffer) + (0))->data)
$9 = {static extra_size = 8, m_ptr = 0x7fc3b9fc8148 "bbb", m_length = 3,
  m_type = STRING_RESULT, m_owner = 0x7fc3b9fad000, m_catalog = {
    str = 0x100000000 <Address 0x100000000 out of bounds>,
    length = 416611827727}, entry_name = {m_str = 0x7fc3b9fc8150 "a",
    m_length = 1}, collation = {collation = 0x1ded740,
    derivation = DERIVATION_IMPLICIT, repertoire = 3}, update_query_id = 25,
  used_query_id = 25, unsigned_flag = false}
...
(gdb) p *(user_var_entry *)(((HASH_LINK*)((&($uvars->array))->buffer) + (2))->data)
$11 = {static extra_size = 8, m_ptr = 0x7fc3b9e6e220 "\002", m_length = 64,
  m_type = DECIMAL_RESULT, m_owner = 0x7fc3b9fad000, m_catalog = {str = 0x0,
    length = 0}, entry_name = {m_str = 0x7fc3b9fc8290 "c", m_length = 1},
  collation = {collation = 0x1ded740, derivation = DERIVATION_IMPLICIT,
    repertoire = 3}, update_query_id = 25, used_query_id = 25,
  unsigned_flag = false}
(gdb)
I tried to highlight the important details above. With gdb variables it's a matter of proper type casts and dereferencing. In general, I was printing the content of the item (a user variable in this case) with index N as *(user_var_entry *)(((HASH_LINK*)((&($uvars->array))->buffer) + (N))->data).
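To avoid typing that cast over and over, the same loop can be wrapped into a small user-defined gdb command. This is just a convenience sketch that mirrors the casts above and hard-codes thd->user_vars from the current do_command() frame; adjust the hash and the element type for other HASH instances:

define dump_user_vars
  set $h = &(do_command::thd->user_vars)
  set $i = 0
  while ($i < $h->records)
    p *(user_var_entry *)(((HASH_LINK*)((&($h->array))->buffer) + $i)->data)
    set $i = $i + 1
  end
end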

Now, back to printing the variables.  Let's see how this is done in performance_schema, in storage/perfschema/table_uvar_by_thread.cc:
     74   for (;;)
     75   {
     76     sql_uvar= reinterpret_cast<user_var_entry*> (my_hash_element(& thd->user_vars, index));
     77     if (sql_uvar == NULL)
     78       break;
...
     98     /* Copy VARIABLE_NAME */
     99     const char *name= sql_uvar->entry_name.ptr();
    100     size_t name_length= sql_uvar->entry_name.length();
    101     DBUG_ASSERT(name_length <= sizeof(pfs_uvar.m_name));
    102     pfs_uvar.m_name.make_row(name, name_length);
    103
    104     /* Copy VARIABLE_VALUE */
    105     my_bool null_value;
    106     String *str_value;
    107     String str_buffer;
    108     uint decimals= 0;
    109     str_value= sql_uvar->val_str(& null_value, & str_buffer, decimals);
    110     if (str_value != NULL)
    111     {
    112       pfs_uvar.m_value.make_row(str_value->ptr(), str_value->length());
    113     }
    114     else
    115     {
    116       pfs_uvar.m_value.make_row(NULL, 0);
    117     }
    118
    119     index++;
    120   }

So, there we check elements by index until there is no element with such index, and apply the val_str() function of the class. While debugging live server we can do the same, but if we care to see how it works step by step, here is the code from sql/item_func.cc:
String *user_var_entry::val_str(my_bool *null_value, String *str,
                                uint decimals) const
{
  if ((*null_value= (m_ptr == 0)))
    return (String*) 0;

  switch (m_type) {
  case REAL_RESULT:
    str->set_real(*(double*) m_ptr, decimals, collation.collation);
    break;
  case INT_RESULT:
    if (!unsigned_flag)
      str->set(*(longlong*) m_ptr, collation.collation);
    else
      str->set(*(ulonglong*) m_ptr, collation.collation);
    break;
  case DECIMAL_RESULT:
    str_set_decimal((my_decimal *) m_ptr, str, collation.collation);
    break;
  case STRING_RESULT:
    if (str->copy(m_ptr, m_length, collation.collation))
      str= 0;                                   // EOM error
  case ROW_RESULT:
    DBUG_ASSERT(1);                             // Impossible
    break;
  }
  return(str);
}
For INT_RESULT and REAL_RESULT it's all clear, and Shane did essentially the same in his Python code. For strings we have to copy the proper bytes into a zero-terminated string, or use methods of the String class if we debug a live server, to get the entire string data. For DECIMAL_RESULT I checked the implementation of str_set_decimal(), which eventually relies on decimal2string(), and that... looks somewhat complicated (check for yourself in strings/decimal.c). So, for any practical purpose I'd better just print the my_decimal structure in gdb instead of re-implementing this function in Python.

To summarize, the HASH structure is widely used in MySQL and it is easy to dump any of these hashes in gdb, item by item, in the same way as, for example, the Performance Schema in MySQL 5.7 does for user variables. Getting a string representation of my_decimal "manually" is complicated.

by Valeriy Kravchuk (noreply@blogger.com) at August 12, 2017 04:45 PM

August 11, 2017

MariaDB AB

Extending the Power of MariaDB ColumnStore with User Defined Functions


Introduction

MariaDB ColumnStore 1.0 supports User Defined Functions (UDF) for query extensibility. This allows you to create custom filters and transformations to suit any need. This blog outlines adding support for distributed JSON query filtering. 

An important MariaDB ColumnStore concept to grasp is that there are distributed and non-distributed functions. Distributed functions are executed on the PM nodes, supporting query execution scale-out. Non-distributed functions are MariaDB Server functions that are executed within the UM node. As a result, MariaDB ColumnStore requires two distinct implementations of any function.

The next release, MariaDB ColumnStore 1.1, will bring support for User Defined Aggregate Functions and User Defined Window Functions.

Getting Started

Developing a User Defined Function requires familiarity with C, C++ and Linux. At a high level, the process involves:

  • Setting up a development environment to build MariaDB ColumnStore from source
  • Implementing the MariaDB Server UDF interface
  • Implementing the MariaDB ColumnStore Distributed UDF interface

You can find more information in the MariaDB knowledge base: User Defined Functions.

Example - JSON Pointer Query

In this example, I will create a simple UDF for JSON pointer queries using the RapidJSON library from Tencent. It will have the following syntax:

json_ptr(<json string>, <json pointer path>)

The first argument is the JSON string. The second argument is the JSON pointer string.

Build Files

RapidJSON is a C++ header only library, so it must be included with the source code. I’ve chosen to add the RapidJSON source code as a submodule in the mariadb-columnstore-engine source code repository:

git submodule add https://github.com/miloyip/rapidjson utils/udfsdk/rapidjson

This will create a submodule rapidjson under the utils/udfsdk directory where the UDF SDK code is defined. 

The rapidjson include directory is added to the cmake file utils/udfsdk/CMakeLists.txt to make it available for use (last line below):

include_directories( ${ENGINE_COMMON_INCLUDES}
                     ../../dbcon/mysql
                     rapidjson/include)


MariaDB UDF SDK 

To implement the MariaDB Server UDF SDK, three functions must be implemented in utils/udfsdk/udfmysql.cpp:

  • <udf name>_init : initialization / parameter validation
  • <udf name> : the function implementation
  • <udf name>_deinit : any clean up needed, for example freeing of allocated memory

The init function, in this example, will validate the argument count:

my_bool json_ptr_init(UDF_INIT* initid, UDF_ARGS* args, char* message)
{
    if (args->arg_count != 2)
    {
        strcpy(message,"json_ptr() requires two arguments: expression, path");
        return 1;
    }

    return 0;
}

The function implementation will take the JSON string, parse this into an object structure, execute the JSON pointer path and return a string representation of the results:

 1 string json_ptr(UDF_INIT *initid, UDF_ARGS *args, char *is_null, char *error)
 2 {
 3     string json = cvtArgToString(args->arg_type[0], args->args[0]);
 4     string path = cvtArgToString(args->arg_type[1], args->args[1]);
 5     Document d;
 6     d.Parse(json.c_str());
 7    if (Value *v = Pointer(path.c_str()).Get(d)) {
 8         rapidjson::StringBuffer sb;
 9         Writer<StringBuffer> writer(sb);
10         v->Accept(writer);
11         return string(sb.GetString());
12     }
13     else {
14         return string();
15     }
16 }

Here is a brief explanation of the code:

  • Lines 3-4 : read the 2 arguments and convert to string.
  • Lines 5-6 : Parse the JSON string into a RapidJSON Document object .
  • Line 7 : Execute the JSON pointer path against the JSON document into the Value object.
  • Lines 8-11 : If the Value object is not null then serialize Value to a string. The Value class also offers strongly typed accessors that may also be used. Note that string values will be serialized with surrounding double quotes.
  • Line 14: If the Value object is null then return the empty string.

The following RapidJSON includes are required at the top of the file:

#include "rapidjson/document.h"
#include "rapidjson/pointer.h"
#include "rapidjson/stringbuffer.h"
#include "rapidjson/writer.h"

using namespace rapidjson;


The json_ptr_deinit function must be declared, but it is empty – there is nothing we need to do for clean up.

void json_ptr_deinit(UDF_INIT* initid)
{
}

MariaDB ColumnStore Distributed UDF SDK

The MariaDB ColumnStore distributed UDF SDK requires defining a class and registering an instance of it for the function to be usable.

First the class must be declared in utils/udfsdk/udfsdk.h. The simplest approach is to clone one of the existing reference implementations:

class json_ptr : public funcexp::Func
{
  public:
    json_ptr() : Func("json_ptr") {}

    virtual ~json_ptr() {}

    execplan::CalpontSystemCatalog::ColType operationType(          
      funcexp::FunctionParm& fp, 
      execplan::CalpontSystemCatalog::ColType& resultType);

    virtual int64_t getIntVal(
      rowgroup::Row& row,
      funcexp::FunctionParm& fp,
      bool& isNull,
      execplan::CalpontSystemCatalog::ColType& op_ct);

    virtual double getDoubleVal(
      rowgroup::Row& row,
      funcexp::FunctionParm& fp,
      bool& isNull,
      execplan::CalpontSystemCatalog::ColType& op_ct);

     virtual float getFloatVal(
       rowgroup::Row& row,
       funcexp::FunctionParm& fp,
       bool& isNull,
       execplan::CalpontSystemCatalog::ColType& op_ct);

     virtual std::string getStrVal(
       rowgroup::Row& row,
       funcexp::FunctionParm& fp,
       bool& isNull,
       execplan::CalpontSystemCatalog::ColType& op_ct);

     virtual bool getBoolVal(
       rowgroup::Row& row,
       funcexp::FunctionParm& fp,
       bool& isNull,
       execplan::CalpontSystemCatalog::ColType& op_ct);

      virtual execplan::IDB_Decimal getDecimalVal(
        rowgroup::Row& row,
        funcexp::FunctionParm& fp,
        bool& isNull,                                                    
        execplan::CalpontSystemCatalog::ColType& op_ct);
   
      virtual int32_t getDateIntVal(
        rowgroup::Row& row,
        funcexp::FunctionParm& fp,
        bool& isNull,
        execplan::CalpontSystemCatalog::ColType& op_ct);

      virtual int64_t getDatetimeIntVal(
        rowgroup::Row& row,
        funcexp::FunctionParm& fp,
        bool& isNull,
        execplan::CalpontSystemCatalog::ColType& op_ct);

  private:
    void log_debug(std::string arg1, std::string arg2);

  };
}

It can be seen that the following methods are defined:

  • operationType which is used to indicate the return type of the function, which could be dynamic.
  • A number of getVal methods, which perform the operation for a given return type.
  • An optional private log_debug method, which could be used for debug logs in your implementation.

Next the class is implemented in utils/udfsdk/udfsdk.cpp. First, an entry must be added to register the class by name (in lower case). This is added to the UDFMap function:

FuncMap UDFSDK::UDFMap() const
{
    FuncMap fm;
    fm["mcs_add"] = new MCS_add();
    fm["mcs_isnull"] = new MCS_isnull();
    fm["json_ptr"] = new json_ptr(); -- new entry for json_ptr function
    return fm;
}

The class is then implemented; for brevity, only a subset of the methods is shown below:

    CalpontSystemCatalog::ColType json_ptr::operationType (
      FunctionParm& fp,                                                           
      CalpontSystemCatalog::ColType& resultType) {
        assert (fp.size() == 2);
        return fp[0]->data()->resultType();
    }

    void json_ptr::log_debug(string arg1, string arg2) {
        logging::LoggingID lid(28); // 28 = primproc
        logging::MessageLog ml(lid);

        logging::Message::Args args;
        logging::Message message(2);
        args.add(arg1);
        args.add(arg2);
        message.format( args );
        ml.logDebugMessage( message );
    }

    string json_ptr::getStrVal(Row& row,
                               FunctionParm& parm,
                               bool& isNull,
                               CalpontSystemCatalog::ColType& op_ct) {
        string json = parm[0]->data()->getStrVal(row, isNull);
        string path = parm[1]->data()->getStrVal(row, isNull);

        Document d;
        d.Parse(json.c_str());
        if (Value *v = Pointer(path.c_str()).Get(d)) {
            StringBuffer sb;
            Writer<StringBuffer> writer(sb);
            v->Accept(writer);
            return string(sb.GetString());
        }
        else {
            return string();
        }
    }

    double json_ptr::getDoubleVal(Row& row,
                                 FunctionParm& parm,
                                 bool& isNull,
                                 CalpontSystemCatalog::ColType& op_ct)
    {
        throw logic_error("Invalid API called json_ptr::getDoubleVal");
    }


The following methods are implemented:

  • operationType, which validates that there are exactly 2 arguments and specifies the return type (here, the type of the first argument, i.e. a string).
  • Optional log_debug method which illustrates how to log to the debug log.
  • getStrVal which performs the JSON evaluation. This is similar to the MariaDB UDF implementation with the exception of being strongly typed and the arguments are retrieved differently.
  • getDoubleVal illustrates throwing an error stating that this is not a supported operation. The method could be implemented of course but for simplicity this was not done.

The complete set of changes can be seen in github.

Building and Using the Example

This example can be built from a branch created from the 1.0.10 code. Once you have cloned the mariadb-columnstore-engine source tree, use the following instructions to build and install on the same server:

git checkout json_ptr_udf
git submodule update --init
cmake .
make -j4
cd utils/udfsdk/
sudo cp libudf_mysql.so.1.0.0 libudfsdk.so.1.0.0 /usr/local/mariadb/columnstore/lib

For a multi-server setup, the library files should be copied to the same location on each server. After this, restart the MariaDB ColumnStore instance to start using the functions.

mcsadmin restartSystem

The function is registered using the create function syntax:

create function json_ptr returns string soname 'libudf_mysql.so';

Now a simple example will show the user defined function in action:

create table animal(
id int not null, 
creature varchar(30) not null, 
name varchar(30), 
age decimal(18) ,
attributes varchar(1000)
) engine=columnstore;

insert into animal(id, creature, name, age, attributes) 
values (1, 'tiger', 'roger', 10, '{"fierce": true, "colors": ["white", "orange", "black"]}'), 
       (2, 'tiger', 'sally', 2, '{"fierce": true, "colors": ["white", "orange", "black"]}'), 
       (3, 'lion', 'michael', 56, '{"fierce": false, "colors": ["grey"], "weight" : 9000}');

select name, json_ptr(attributes, '/fierce') is_fierce 
from animal 
where json_ptr(attributes, '/weight') = '9000';
+---------+-----------+
| name    | is_fierce |
+---------+-----------+
| michael | false     |
+---------+-----------+
1 row in set (0.05 sec)

In the example, the WHERE clause uses the json_ptr function to filter on the weight element of the attributes JSON column, and the fierce element is then retrieved in the SELECT clause.

I hope this inspires you to further enhance this example or come up with your own user defined functions to extend the query capabilities of MariaDB ColumnStore.

Getting started? Download MariaDB ColumnStore today and learn how you can get started with MariaDB ColumnStore in 10 minutes.
 


by david_thompson_g at August 11, 2017 10:40 PM

Peter Zaitsev

Learning MySQL 5.7: Q & A

In this post I’ll answer questions I received in my Wednesday, July 19, 2017, webinar Learning MySQL 5.7!

First, thank you all who attended the webinar. The link to the slides and the webinar recording can be found here.

I received a number of interesting questions in the webinar that I’ve followed up with below.

Would there be a big difference on passing from 5.1 to 5.6 before going to 5.7 or, at this point, would it be roughly the same?

The biggest risk of jumping between versions, in this case 5.1 to 5.6, is reverting in case of problems. Rollbacks don’t happen often, but they do happen, and you have to make sure you have the infrastructure in place whenever you decide to execute one. These upgrade steps are not officially supported by Oracle, nor even recommended here at Percona. Having said that, as long as your tests (checksums, pt-upgrade) and rollback plan work, this shouldn’t be a problem.

One unforgettable issue I have personally encountered is an upgrade from 5.1, via dump and reload, to 5.6. The 5.6 version ran with the ROW binlog format, preventing replication back to 5.1 because of a limitation with TIMESTAMP columns. Similarly, downgrading without replication means you have to deal with changes to the MySQL system schema, which obviously requires some form of downtime.

Additionally, replication from 5.7 to 5.5 will not work because of the additional metadata information that 5.7 creates (i.e., GTID even when GTID is disabled).

After an in-place upgrade of a Percona XtraDB Cluster from 5.5 to 5.7 (through 5.6), innodb_file_per_table is enabled by default and the database is now almost twice the size. It was a 40 GB database, now it’s 80 GB, because every table has its own file, but ibdata1 is still 40 GB. Is there any solution for this (that doesn’t involve mysqldump and dropping tables), and how can this be avoided in future upgrades?

The reason this might be the case is that after upgrading, a number (or possibly all) of tables were [re]created. This would obviously create separate tablespaces for each. One way I can think of reclaiming that disk space is through a familiar upgrade path:

  1. Detach one of the nodes and make it an async replica of the remaining nodes in the cluster
  2. Dump and reload data from this node, then resume replication
  3. Join the other nodes from the cluster as additional nodes of a new cluster using the async replica
  4. Once there is only one node remaining in the original cluster, you can switch to the new cluster for production
  5. Rejoin the last node from the original cluster into the new cluster to complete the process

Depending on the semantics of your switch, it may or may not involve downtime. For example, if you use ProxySQL, this should be a transparent operation.

One way to avoid this problem is by testing. Testing the upgrade process in a lab will expose this kind of information even before deploying the new version into production, allowing you to adjust your process accordingly.

What is a possible impact on upgrades going from the old table format to Barracuda?

So far I am not aware of any negative impact – except if you upgrade and need to downgrade but have since created indexes with prefixes larger than what was supported on the previous version (see the innodb_large_prefix and Barracuda documentation).

Upgrading to Barracuda and one of the supported row formats specifically allows memory-constrained systems to save a little more. With BLOB/TEXT columns stored off the page, they will not fill the buffer pool unless they are needed.
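As a minimal, hedged sketch of what that looks like in practice (the test.t1 table name is only an illustration; the variable and syntax are standard in MySQL/Percona Server 5.6 and 5.7):

-- Check which file format and row formats are currently in use
SHOW GLOBAL VARIABLES LIKE 'innodb_file_format';
SELECT TABLE_NAME, ROW_FORMAT FROM information_schema.TABLES WHERE TABLE_SCHEMA = 'test';
-- Rebuild a table with a Barracuda row format so long BLOB/TEXT values are stored off-page
ALTER TABLE test.t1 ROW_FORMAT=DYNAMIC;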

How do you run mysql_upgrade in parallel?

Good question, I actually wrote about it here.

Can you elaborate on ALTER progress features, and is it also applicable to “Optimization ” query?

I was not able to get more details on the “Optimization” part of this question. I can only assume this was meant to be a table rebuild via OPTIMIZE TABLE. First, I would like to point out that OPTIMIZE has been an online DDL operation since 5.6 (with a few limitations). As such, there is almost no point in monitoring it. Also, for the cases where online DDL does not apply to OPTIMIZE, under the hood this is ALTER TABLE .. FORCE – a full table rebuild.

Now, for the actual ALTER process doing a table copy/rebuild, MySQL 5.7 provides some form of progress indication as to how much work has been done. However, it does not necessarily provide an estimate of the actual time it would take to complete. Each ALTER process has different phases which can vary under different conditions. Alternatively, you can also employ other ways of monitoring progress as described in the post.
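For the built-in 5.7 progress indication, here is a minimal, hedged sketch using Performance Schema stage events (the instrument and table names are standard Performance Schema objects; enable them only if they are not already on):

-- Enable the ALTER-related stage instruments and the stage consumers
UPDATE performance_schema.setup_instruments
   SET ENABLED = 'YES', TIMED = 'YES'
 WHERE NAME LIKE 'stage/innodb/alter%';
UPDATE performance_schema.setup_consumers
   SET ENABLED = 'YES'
 WHERE NAME LIKE 'events_stages_%';
-- While the ALTER TABLE runs, compare the work counters for a rough progress figure
SELECT EVENT_NAME, WORK_COMPLETED, WORK_ESTIMATED
  FROM performance_schema.events_stages_current;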

We migrated from Percona Server 5.7.11 to 5.7.17 and are facing “Column 1 of table 'x.x' cannot be converted from type 'varchar(100)' to type 'varchar(100)'”.

This is interesting – what we have seen so far are errors with different data types or sizes, which most likely means an inconsistency between the table structures if the error is coming from replication. We will need more information on what steps were taken during the upgrade to tell what happened here. Our forums would be the best place to continue this conversation. To begin with, perhaps slave_type_conversions might help if the table structures in replication are the same.
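A hedged sketch of how that setting could be applied on the replica (slave_type_conversions is a standard 5.6/5.7 variable; whether it actually helps depends on the real table definitions):

STOP SLAVE;
-- ALL_NON_LOSSY allows only safe (widening) conversions; ALL_LOSSY would also allow truncation
SET GLOBAL slave_type_conversions = 'ALL_NON_LOSSY';
START SLAVE;
SHOW GLOBAL VARIABLES LIKE 'slave_type_conversions';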

Is the Boost Geometry almost on par with Postgres GIS functions?

I cannot answer this with authority or certainty. I’ve used GIS functions in MySQL, but not developed code for it. Boost::Geometry was chosen because of its well-designed API, rapid development and license compatibility, but that does not necessarily mean it is more mature than PostGIS (which is widely adopted).

What is the best bulk insert method for MySQL 5.7?

The best option can be different in many situations, so we have to put it in context. For this reason, let me give some example scenarios and what might work best:

  • On an upgrade process where you are doing a full dump and reload, parallelizing the process by using mydumper/myloader or mysqlpump will save a lot of time, depending on the hardware resources available.
  • Bulk INSERTs from your application that happen at regular intervals – multi-row inserts are always ideal to reduce disk writes per insert. LOAD DATA INFILE is also a popular option if you can use it (see the sketch below).
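A minimal sketch of both approaches (the t1 table and the /tmp/t1.csv path are hypothetical; LOAD DATA INFILE also requires the appropriate privileges and a path allowed by secure_file_priv):

-- Multi-row INSERT: one statement, fewer round trips and fewer writes per row
INSERT INTO t1 (id, payload) VALUES (1, 'a'), (2, 'b'), (3, 'c');
-- LOAD DATA INFILE for flat files
LOAD DATA INFILE '/tmp/t1.csv' INTO TABLE t1
  FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n'
  (id, payload);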

Again, thank you for attending the webinar – if you have additional questions head on out to the Percona Forums!

by Jervin Real at August 11, 2017 06:00 PM

August 10, 2017

MariaDB Foundation

MariaDB 10.1.26 now available

The MariaDB project is pleased to announce the immediate availability of MariaDB 10.1.26. See the release notes and changelogs for details. Download MariaDB 10.1.26 Release Notes Changelog What is MariaDB 10.1? MariaDB APT and YUM Repository Configuration Generator Thanks, and enjoy MariaDB!

The post MariaDB 10.1.26 now available appeared first on MariaDB.org.

by Ian Gilfillan at August 10, 2017 07:21 PM

MariaDB AB

MariaDB Server 10.1.26 now available


The MariaDB project is pleased to announce the immediate availability of MariaDB Server 10.1.26. See the release notes and changelog for details and visit mariadb.com/downloads to download.

Download MariaDB Server 10.1.26

Release Notes Changelog What is MariaDB Server 10.1?


by dbart at August 10, 2017 03:52 PM

Peter Zaitsev

This Week in Data: Thoughts from Percona Chief Evangelist Colin Charles

Welcome to a new weekly column on the Percona blog. My name is Colin Charles, Percona Chief Evangelist, and I have been involved in MySQL, MariaDB Server and the open source community for over a decade. Now I am at Percona, and this is my weekly column.

When you start a column, you have to ask yourself what you’ll be writing about. Keeping the focus on the reader is what’s crucial. With this in mind, I plan to cover happenings, pointers and maybe even musings in this column. It’s August, and while many are away on summer vacations, there’s still plenty happening in the database world. So maybe this will be a little like the now-defunct Weekly MySQL News. It will be broader than just MySQL, however, and focus on open source databases (after all, Percona’s mission is to champion unbiased open source database solutions).

So let’s get started! I look forward to feedback/tips via comments, or you can email me directly at colin.charles@percona.com. Feel free to socialize with me! I’m @bytebot on Twitter.

Percona Live Europe Dublin

Percona Live Open Source Database Conference Europe 2017 started in London, moved to Amsterdam (where last year it sold out at 400+ attendees) and is now docking itself in Dublin. Dublin, famous for Guinness, is also now famous as a European tech hub. With our expanded conference focus beyond just the MySQL ecosystem, Percona Live Europe also includes MongoDB, PostgreSQL and other open source databases.

Where are we at with the event? The sneak peek schedule is out, and we aim to have a more or less full conference schedule by the second week of August. The conference committee is at its most diverse, with two MongoDB Masters to ensure tighter content around MongoDB, and also two members who are prominent in the DevOps world.

Naturally, evolution is good because you are now getting the “best of the best” talks, as there are fewer slots to compete for when it comes to topics! Registration is open, and you’ll want to sign up as soon as possible to lock in the best available rates.

Percona Live Europe in Dublin is also a great place to be a sponsor as a smaller, intimate event helps ensure that people pop by your expo hall booths. This is great for promoting your products, hiring new folks and so on. Find out more about sponsorship here.

Releases

Link List

In coming posts, I expect to cover upcoming events that I’m participating in, and also thoughts about ones that I’ve been to. See you soon!

by Colin Charles at August 10, 2017 01:58 PM

August 09, 2017

Peter Zaitsev

How to Configure Aurora RDS Parameters

In this blog post, we’ll look at some tips on how to configure Aurora RDS parameters.

I was recently deploying a few Aurora RDS instances, a process very similar to configuring a regular RDS instance. I noticed a few minor differences in the way you configure Aurora RDS parameters, and very few articles on how the commands should be structured (for RDS as well as Aurora). The only real literature available is the official Amazon RDS documentation.

This blog provides a concise “how-to” guide to quickly change Aurora RDS parameters using the AWS CLI. Aurora retains the parameter group model introduced with RDS, with new instances having the default read only parameter groups. For a new instance, you need to create and allocate a new parameter group (this requires a DB reboot). After that, you can apply changes to dynamic variables immediately. In other words, the first time you add the DB parameter group you’ll need to reboot even if the variable you are configuring is dynamic. It’s best to create a new DB parameter group when initializing your clusters. Nothing stops you from adding more than one host to the same DB Parameter Group rather than creating one per instance.

In addition to the DB Parameter Group, each instance is also allocated a DB Cluster Parameter Group. The DB Parameter Group is used for instance-level parameters, while the DB Cluster Parameter Group is used for cluster-level parameters (and applies to all instances in a cluster). You’ll find some of the MySQL engine variables can only be found in the DB Cluster Parameter Group. Here you will find a handy reference of all the DB cluster and DB instance parameters that are viewable or configurable for Aurora instances.

To run these commands, you’ll need to have the “aws” cli tool installed and configured. Note that the force-failover option used for RDS instances doesn’t apply to Aurora. You should perform either a controlled failover or let Aurora handle this. Also, the group family to use for Aurora is “oscar5.6”. The commands to set this up are as follows:

aws rds create-db-parameter-group \
    --db-parameter-group-name percona-opt \
    --db-parameter-group-family oscar5.6 \
    --description "Percona Optimizations"
aws rds modify-db-parameter-group \
    --db-parameter-group-name percona-opt \
    --parameters "ParameterName=max_connections,ParameterValue=5000,ApplyMethod=immediate"
# For each instance-name:
aws rds modify-db-instance --db-instance-identifier <instance-name> \
    --db-parameter-group-name=percona-opt
aws rds reboot-db-instance \
    --db-instance-identifier <instance-name>

Once you create the initial DB parameter group, configure the variables as follows:

aws rds modify-db-parameter-group \
    --db-parameter-group-name <instance-name> \
    --parameters "ParameterName=max_connect_errors,ParameterValue=999999,ApplyMethod=immediate"
aws rds modify-db-parameter-group \
    --db-parameter-group-name <instance-name> \
    --parameters "ParameterName=max_connect_errors,ParameterValue=999999,ApplyMethod=immediate"
## Verifying change:
aws rds describe-db-parameters \
      --db-parameter-group-name aurora-instance-1 \
      | grep -B7 -A2 'max_connect_errors'

Please keep in mind, it can take a few seconds to propagate changes to nodes. Give it a moment before checking the values with “show global variables”. You can configure the DB Cluster Parameter group similarly, for example:

# Create a new db cluster parameter group
aws rds create-db-cluster-parameter-group --db-cluster-parameter-group-name percona-cluster --db-parameter-group-family oscar5.6 --description "new cluster group"
# Tune a variable on the db cluster parameter group
aws rds modify-db-cluster-parameter-group --db-cluster-parameter-group-name percona-cluster --parameters "ParameterName=innodb_flush_log_at_trx_commit,ParameterValue=2,ApplyMethod=immediate"
# Allocate the new db cluster parameter to your cluster
aws rds modify-db-cluster --db-cluster-identifier <cluster_identifier> --db-cluster-parameter-group-name=percona-cluster
# And of course, for viewing the cluster parameters
aws rds describe-db-cluster-parameters --db-cluster-parameter-group-name=percona-cluster
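Once the changes have propagated, a quick check from the MySQL side (connecting to the instance endpoint with a regular client) confirms the values; a minimal sketch:

SHOW GLOBAL VARIABLES LIKE 'max_connect_errors';
SHOW GLOBAL VARIABLES LIKE 'innodb_flush_log_at_trx_commit';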

I hope you find this article useful, please make sure to share with the community!

by Nik Vyzas at August 09, 2017 01:51 PM

August 08, 2017

Peter Zaitsev

Avoiding the “An optimized (without redo logging) DDL operation has been performed” Error with Percona XtraBackup

This blog discusses newly added options for Percona XtraBackup 2.4.8 and how they can impact your database backups.

To avoid issues with MySQL 5.7 skipping the redo log for DDL, Percona XtraBackup has implemented three new options (xtrabackup --lock-ddl, xtrabackup --lock-ddl-timeout and xtrabackup --lock-ddl-per-table) that can be used to place MDL locks on tables while they are copied.

So why do we need those options? Let’s discuss the process used to get there.

Originally, we found problems while running DDLs: Percona XtraBackup produced corrupted backups as described in two reports:

After experimenting, it was clear that the core cause of those failures was MySQL 5.7 skipping redo logging for some DDLs. This is a newly added feature to MySQL named Sorted Index Builds. You can read more at the following links:

To prevent this, we introduced a solution: when Percona XtraBackup detects that the redo log has been skipped, it aborts the backup to prevent creating a corrupted backup.

The scary error message you get with this fix is:

[FATAL] InnoDB: An optimized (without redo logging) DDL operation has been performed. All modified pages may not have been flushed to the disk yet.
Percona XtraBackup will not be able to take a consistent backup. Retry the backup operation

We need to avoid aborting backup with this message. So how do we do that? Let’s create a test case first and reproduce the issue.

Prepare two tables:

sysbench /usr/share/sysbench/oltp_insert.lua --db-driver=mysql --mysql-db=db1 --mysql-user=msandbox --mysql-password=msandbox --table-size=2000000 --mysql-socket=/tmp/mysql_sandbox20393.sock prepare
sysbench /usr/share/sysbench/oltp_insert.lua --db-driver=mysql --mysql-db=db2 --mysql-user=msandbox --mysql-password=msandbox --table-size=2000000 --mysql-socket=/tmp/mysql_sandbox20393.sock prepare

Create a test.sh file and place it in the sandbox:

#!/bin/bash
echo "drop table if exists db1.sb1"|./use
echo "create table sb1 as select id,c from sbtest1 where id < 150000;"|./use db1
echo "create unique index ix on sb1 (id)"|./use db1
sleep 1
echo "drop table if exists db2.sb1"|./use
echo "create table sb1 as select id,c from sbtest1 where id < 150000;"|./use db2
echo "create unique index ix on sb1 (id)"|./use db2

Run the script in a loop while the backup is taken:

$ while true; do bash test.sh; done

Try to take a backup:

xtrabackup --defaults-file=/home/shahriyar.rzaev/sandboxes/rsandbox_Percona-Server-5_7_18/master/my.sandbox.cnf \
--user=msandbox --password='msandbox'  --target-dir=/home/shahriyar.rzaev/backup_dir/ps_5.7_master/full/2017-07-26_11-11-45 \
--backup --host=127.0.0.1 --port=20393 --binlog-info=AUTO --galera-info --parallel 4 \
--check-privileges --no-version-check

You will likely get something like:

InnoDB: An optimized (without redo logging) DDL operation has been performed. All modified pages may not have been flushed to the disk yet.
PXB will not be able to take a consistent backup. Retry the backup operation

Ok, now we have reproduced the error. To avoid this error, XtraBackup has the new options as mentioned above.

Using --lock-ddl:

xtrabackup --defaults-file=/home/shahriyar.rzaev/sandboxes/rsandbox_Percona-Server-5_7_18/master/my.sandbox.cnf \
--user=msandbox --password='msandbox'  --target-dir=/home/shahriyar.rzaev/backup_dir/ps_5.7_master/full/2017-07-26_11-16-56 \
--backup --host=127.0.0.1 --port=20393 --binlog-info=AUTO --galera-info --parallel 4 \
--check-privileges --no-version-check --lock-ddl

The new thing you should notice is:

170726 11:16:56 Executing LOCK TABLES FOR BACKUP...

And the backup status:

xtrabackup: Transaction log of lsn (2808294311) to (2808304872) was copied.
170726 11:20:42 completed OK!

Another new option is --lock-ddl-per-table:

xtrabackup --defaults-file=/home/shahriyar.rzaev/sandboxes/rsandbox_Percona-Server-5_7_18/master/my.sandbox.cnf \
--user=msandbox --password='msandbox'  --target-dir=/home/shahriyar.rzaev/backup_dir/ps_5.7_master/full/2017-07-26_11-31-56 \
--backup --host=127.0.0.1 --port=20393 --binlog-info=AUTO --galera-info --parallel 4 --check-privileges --no-version-check  --lock-ddl-per-table

The new output will look like this:

170726 11:32:33 [01] Copying ./ibdata1 to /home/shahriyar.rzaev/backup_dir/ps_5.7_master/full/2017-07-26_11-31-56/ibdata1
170726 11:32:33 Locking MDL for db1.sb1
170726 11:32:33 [02] Copying ./db1/sb1.ibd to /home/shahriyar.rzaev/backup_dir/ps_5.7_master/full/2017-07-26_11-31-56/db1/sb1.ibd
170726 11:32:33 Locking MDL for db1.sbtest1
170726 11:32:33 Locking MDL for db2.sb1
170726 11:32:33 [03] Copying ./db1/sbtest1.ibd to /home/shahriyar.rzaev/backup_dir/ps_5.7_master/full/2017-07-26_11-31-56/db1/sbtest1.ibd
170726 11:32:33 [04] Copying ./db2/sb1.ibd to /home/shahriyar.rzaev/backup_dir/ps_5.7_master/full/2017-07-26_11-31-56/db2/sb1.ibd
170726 11:32:33 [04]        ...done
170726 11:32:33 >> log scanned up to (2892754398)
170726 11:32:34 Locking MDL for db2.sbtest1

The result of the backup:

170726 11:35:45 Unlocking MDL for all tables
xtrabackup: Transaction log of lsn (2871333326) to (2892754764) was copied.
170726 11:35:45 completed OK!

Another thing I should add here is about using --lock-ddl with non-Percona Server for MySQL servers. For example, using it with MariaDB:

2017-07-26 12:08:32 ERROR    FULL BACKUP FAILED!
2017-07-26 12:08:37 ERROR    170726 12:08:32 Connecting to MySQL server host: 127.0.0.1, user: msandbox, password: set, port: 10207, socket: /tmp/mysql_sandbox10207.sock
Using server version 10.2.7-MariaDB
170726 12:08:32 Error: LOCK TABLES FOR BACKUP is not supported.

But you can use --lock-ddl-per-table with any server. Use --lock-ddl-per-table with caution: it can block updates to tables on highly loaded servers under some circumstances. Let’s explore one:

Table: CREATE TABLE t1 (a INT AUTO_INCREMENT PRIMARY KEY, b TEXT);
  Cases:
  connection 1:
  - BEGIN; SELECT * FROM sb1 LIMIT 1; <--- MDL
  connection 2:
  - UPDATE sb1 SET c = '288' WHERE id = 34;    <--- completes OK
  connection 3:
  - CREATE INDEX sb1_1 ON sb1 (c(10));         <--- WAITING for MDL
  connection 2:
  - UPDATE sb1 SET c = '288' WHERE id = 34;    <--- WAITING for MDL
  connection 1:
  - COMMIT;
  connection 2 and 3 are able to complete now

If one connection holds an MDL lock, and another connection does ALTER TABLE (CREATE INDEX is mapped to an ALTER TABLE statement to create indexes), then updates to that table are blocked.

Testing this with the backup process is quite easy:

Sample table:

CREATE TABLE `sb1` (
  `id` int(11) NOT NULL DEFAULT '0',
  `c` char(120) NOT NULL DEFAULT '',
  UNIQUE KEY `ix` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
select count(*) from sb1;
+----------+
| count(*) |
+----------+
|   149999 |
+----------+
select * from sb1 limit 3;
+----+-------------------------------------------------------------------------------------------------------------------------+
| id | c                                                                                                                       |
+----+-------------------------------------------------------------------------------------------------------------------------+
|  1 | 83868641912-28773972837-60736120486-75162659906-27563526494-20381887404-41576422241-93426793964-56405065102-33518432330 |
|  2 | 38014276128-25250245652-62722561801-27818678124-24890218270-18312424692-92565570600-36243745486-21199862476-38576014630 |
|  3 | 33973744704-80540844748-72700647445-87330233173-87249600839-07301471459-22846777364-58808996678-64607045326-48799346817 |
+----+-------------------------------------------------------------------------------------------------------------------------+

So our “connection 1” is an xtrabackup command:

xtrabackup --defaults-file=/home/shahriyar.rzaev/sandboxes/rsandbox_Percona-Server-5_7_18/master/my.sandbox.cnf \
--user=msandbox --password='msandbox'  --target-dir=/home/shahriyar.rzaev/backup_dir/ps_5.7_master/full/2017-07-28_07-55-30 \
--backup --host=127.0.0.1 --port=20393 --binlog-info=AUTO --galera-info --parallel 4 --check-privileges --no-version-check \
--lock-ddl-per-table

So after running the backup command and doing the same steps for “connection 2” and “connection 3,” the result is something like this in processlist:

show processlist;
+----+----------+-----------------+------+---------+------+---------------------------------+----------------------------------------+-----------+---------------+
| Id | User     | Host            | db   | Command | Time | State                           | Info                                   | Rows_sent | Rows_examined |
+----+----------+-----------------+------+---------+------+---------------------------------+----------------------------------------+-----------+---------------+
|  4 | root     | localhost       | db1  | Sleep   |   28 |                                 | NULL                                   |         0 |             1 |
| 10 | root     | localhost       | db1  | Query   |   26 | Waiting for table metadata lock | CREATE INDEX sb1_1 ON sb1 (c(10))      |         0 |             0 |
| 11 | root     | localhost       | db1  | Query   |    6 | Waiting for table metadata lock | UPDATE sb1 SET c = '288' WHERE id = 34 |         0 |             0 |
| 12 | root     | localhost       | NULL | Query   |    0 | starting                        | show processlist                       |         0 |             0 |
| 13 | msandbox | localhost:36546 | NULL | Sleep   |   31 |                                 | NULL                                   |         1 |           116 |
| 14 | msandbox | localhost:36550 | NULL | Sleep   |   17 |                                 | NULL                                   |         1 |             1 |
+----+----------+-----------------+------+---------+------+---------------------------------+----------------------------------------+-----------+---------------+
6 rows in set (0.00 sec)

Updates are only completed after the backup, as described. It should be clear now why you should use it with caution.

The last thing we should discuss is what you can do if you do not want to use any “hacks” with xtrabackup; you can do things on the MySQL side, such as:

  1. Avoiding bad DDLs 🙂
  2. Enabling old_alter_table. When this variable is enabled, the server does not use the optimized method of processing an ALTER TABLE operation. It reverts to using a temporary table, copying over the data and then renaming the temporary table to the original, as MySQL 5.0 and earlier did. In other words, it uses ALGORITHM=COPY for the ALTER (see the sketch below).
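A minimal sketch of that second option (db1.sb1 is the table from the test case above; old_alter_table and ALGORITHM=COPY are standard MySQL syntax):

-- Revert to the old table-copy ALTER behaviour, globally or per session
SET GLOBAL old_alter_table = ON;
-- Equivalently, request the copy algorithm explicitly for a single statement
ALTER TABLE db1.sb1 ADD INDEX sb1_1 (c(10)), ALGORITHM=COPY;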

In conclusion, we do not have a panacea for this issue, but you can use some of the described tricks to work around the problem. Thanks for reading!

by Shahriyar Rzayev at August 08, 2017 01:51 PM

August 07, 2017

Valeriy Kravchuk

How to Find Values of User Variables With gdb

In his comment to my announcement of the previous post, Shane Bester kindly suggested to consider pretty printing the information about user variables from gdb. I tried to do this tonight, after a long working day, while working with the same Percona server 5.7.x on CentOS 6.9, and found several interesting details to share even before getting to the pretty printing part (that I'd surely try to avoid doing with Python anyway, as I am lazy and not in a mood to use that programming language for a decade already). So, I decided to share them in a separate post.

To begin with, I checked whether using gdb for this task is actually needed by anyone other than those poor souls who have to study core dumps after a crash. This time I do not want Mark Leith to step in with a comment that I had better use <some cool P_S feature> (instead of attaching gdb to a properly working production server), so I did my homework...

One can surely use the user_variables_by_thread table in the Performance Schema to get these details if the server still works, P_S is enabled and it's possible to connect to it to run SQL queries. There is one minor problem (the type of the user variable is missing there) that I was happy to report immediately as Bug #87341. By the way, I had found an old feature request, Bug #20307, that I verified 11 years ago and that should now be properly closed because of this nice feature of current MySQL.
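For the record, a minimal query against that table looks like this (the column names are those of the 5.7 Performance Schema table; as noted above, the type of the value is not exposed):

SELECT THREAD_ID, VARIABLE_NAME, VARIABLE_VALUE
  FROM performance_schema.user_variables_by_thread
 ORDER BY THREAD_ID, VARIABLE_NAME;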

So, there is a (small) reason to use gdb even before any crashes, and I did it immediately after setting values for a few user variables in the connection with id 16, with the following statements:
set @a := 'aaa', @b := 254, @c := sysdate();
set @e = 10.5;
I also want to show how one may use a (relatively) new column in the threads table of Performance Schema while studying a live server with gdb. This was also a hint from Mark Leith in a comment on my older post, to check the THREAD_OS_ID column:
mysql> select * from threads\G
...
*************************** 27. row ***************************
          THREAD_ID: 42
               NAME: thread/sql/one_connection
               TYPE: FOREGROUND
     PROCESSLIST_ID: 16
   PROCESSLIST_USER: root
   PROCESSLIST_HOST: localhost
     PROCESSLIST_DB: test
PROCESSLIST_COMMAND: Sleep
   PROCESSLIST_TIME: 195
  PROCESSLIST_STATE: NULL
   PROCESSLIST_INFO: NULL
   PARENT_THREAD_ID: NULL
               ROLE: NULL
       INSTRUMENTED: YES
            HISTORY: YES
    CONNECTION_TYPE: Socket
       THREAD_OS_ID: 2224
...
My first naive attempt to check the details about user variables with gdb ended up as follows:
(gdb) thread 3
[Switching to thread 3 (Thread 0x7fc5997bc700 (LWP 2224))]#0  0x00007fc5cc299383 in poll () from /lib64/libc.so.6
(gdb) p do_command::thd->m_thread_id
$1 = 16
(gdb) p do_command::thd->user_vars
$2 = {key_offset = 0, key_length = 0, blength = 8, records = 4, flags = 0,
  array = {buffer = 0x7fc5aabf1560 "\001", elements = 4, max_element = 16,
    alloc_increment = 32, size_of_element = 16, m_psi_key = 38},
  get_key = 0xc63630 <get_var_key(user_var_entry*, size_t*, my_bool)>,
  free = 0xc636c0 <free_user_var(user_var_entry*)>, charset = 0x1ded740,
  hash_function = 0xeb6990 <cset_hash_sort_adapter>, m_psi_key = 38}
I see 4 elements for the thread I am interested in. Looks good. I also assume from the above that they are of the user_var_entry type. So, given some buffer, I tried to cast it to this type and check (that was stupid, especially the later pointer increment attempt - if that were an array of structures, imagine how long it would take to find some variable by name!):
(gdb) p do_command::thd->user_vars->array->buffer
$3 = (uchar *) 0x7fc5aabf1560 "\001"
(gdb) p (user_var_entry *)(do_command::thd->user_vars->array->buffer)
$4 = (user_var_entry *) 0x7fc5aabf1560
(gdb) p sizeof(user_var_entry)
$5 = 104
(gdb) set $vars = (user_var_entry *)(do_command::thd->user_vars->array->buffer)
(gdb) p $vars
$6 = (user_var_entry *) 0x7fc5aabf1560
(gdb) p *($vars)
$7 = {static extra_size = 8, m_ptr = 0x1 <Address 0x1 out of bounds>,
  m_length = 140486683628224, m_type = 4294967295
, m_owner = 0x7fc59adf90e0,
  m_catalog = {str = 0xffffffff <Address 0xffffffff out of bounds>,
    length = 140486683628064}, entry_name = {
    m_str = 0xffffffff <Address 0xffffffff out of bounds>,
    m_length = 140486683627904}, collation = {collation = 0x0,
    derivation = DERIVATION_EXPLICIT, repertoire = 0}, update_query_id = 0,
  used_query_id = 0, unsigned_flag = false}
(gdb) set $vars = $vars + 1
(gdb) p *($vars)
$8 = {static extra_size = 8, m_ptr = 0x0, m_length = 0,
  m_type = STRING_RESULT, m_owner = 0x0, m_catalog = {str = 0x0, length = 0},
  entry_name = {m_str = 0x0, m_length = 0}, collation = {collation = 0x0,
    derivation = DERIVATION_EXPLICIT, repertoire = 0}, update_query_id = 0,
  used_query_id = 0, unsigned_flag = false}
...
So, while eventually I saw something expected (STRING_RESULT as the type) and the pointer trick "worked", the content I get is obviously bullshit - addresses are out of bounds or zero, etc. This leads us nowhere at best.

I had to dig into the code to see how this structure is actually used. This is easy with grep:
[root@centos openxs]# grep -rn user_vars git/mysql-server/*
git/mysql-server/sql/item_func.cc:6017:  HASH *hash= & thd->user_vars;
...
git/mysql-server/sql/sql_prepare.cc:2182:    entry= (user_var_entry*)my_hash_search(&thd->user_vars,...
git/mysql-server/storage/perfschema/table_uvar_by_thread.cc:76:    sql_uvar= reinterpret_cast<user_var_entry*> (my_hash_element(& thd->user_vars, index));
...
git/mysql-server/sql/sql_class.cc:1209:  my_hash_init(&user_vars, system_charset_info, USER_VARS_HASH_SIZE, 0, 0,
git/mysql-server/sql/sql_class.cc:1650:  my_hash_init(&user_vars, system_charset_info, USER_VARS_HASH_SIZE, 0, 0,
...
git/mysql-server/sql/rpl_binlog_sender.cc:654:    (user_var_entry*) my_hash_search(&m_thd->user_vars, (uchar*) name.str,
...
git/mysql-server/sql/sql_class.h:1576:  HASH    user_vars;                     // hash for user variables...
git/mysql-server/storage/perfschema/table_uvar_by_thread.cc:76:    sql_uvar= reinterpret_cast<user_var_entry*> (my_hash_element(& thd->user_vars, index));[root@centos openxs]#
So, user_vars is actually a HASH, and we see functions to find an item in this hash by name, my_hash_search(), and to get an item by index, my_hash_element(). I immediately proceeded to use the latter, as I wanted to see all the variables:
...
(gdb) thread 10
[Switching to thread 10 (Thread 0x7fc5997bc700 (LWP 2224))]#0  0x00007fc5cc299383 in poll () from /lib64/libc.so.6
(gdb) p do_command::thd->m_thread_id
$1 = 16
(gdb) p do_command::thd->user_vars
$2 = {key_offset = 0, key_length = 0, blength = 8, records = 4, flags = 0,
  array = {buffer = 0x7fc5aabf1560 "\001", elements = 4, max_element = 16,
    alloc_increment = 32, size_of_element = 16, m_psi_key = 38},
  get_key = 0xc63630 <get_var_key(user_var_entry*, size_t*, my_bool)>,
  free = 0xc636c0 <free_user_var(user_var_entry*)>, charset = 0x1ded740,
  hash_function = 0xeb6990 <cset_hash_sort_adapter>, m_psi_key = 38}
(gdb) p my_hash_element(&(do_command::thd->user_vars), 1)
$3 = (uchar *) 0x7fc59adf90e0 "H\221ъ \305\177"
(gdb) set $uvar = (user_var_entry *)(my_hash_element(&(do_command::thd->user_vars), 1))
[New Thread 0x7fc59946f700 (LWP 2447)]
[Thread 0x7fc59973a700 (LWP 2396) exited]
[New Thread 0x7fc59932a700 (LWP 2448)]
(gdb) p $uvar
$4 = (user_var_entry *) 0x7fc59adf90e0
(gdb) p *$uvar
$5 = {static extra_size = 8, m_ptr = 0x7fc59adf9148 "aaa", m_length = 3,
  m_type = STRING_RESULT
, m_owner = 0x7fc59accc000, m_catalog = {
    str = 0x41272c275441434e <Address 0x41272c275441434e out of bounds>,
    length = 6075168230325441358}, entry_name = {m_str = 0x7fc59adf9150 "a",
    m_length = 1}, collation = {collation = 0x1ded740,
    derivation = DERIVATION_IMPLICIT, repertoire = 3}, update_query_id = 430,
  used_query_id = 426, unsigned_flag = false}
...
(gdb) set $uvar = (user_var_entry *)(my_hash_element(&(do_command::thd->user_vars), 3))
[Thread 0x7fc599636700 (LWP 2400) exited]
(gdb) p *$uvar
$7 = {static extra_size = 8, m_ptr = 0x7fc59adf91e8 "\376", m_length = 8,
  m_type = INT_RESULT, m_owner = 0x7fc59accc000, m_catalog = {
    str = 0x655f79625f726573 <Address 0x655f79625f726573 out of bounds>,
    length = 7881702179129419126}, entry_name = {m_str = 0x7fc59adf91f0 "b",
    m_length = 1}, collation = {collation = 0x1ded740,
    derivation = DERIVATION_IMPLICIT, repertoire = 3}, update_query_id = 430,
  used_query_id = 426, unsigned_flag = false}
(gdb) set $uvar = (user_var_entry *)(my_hash_element(&(do_command::thd->user_vars), 4))
(gdb) p *$uvar
Cannot access memory at address 0x0
We see that this experimenting was performed on a server under some load: we see threads created and gone, we are able to get proper details by index, but we should take extra care to check addresses before accessing them, and should note that items are numbered starting from zero, not 1 (there is no item 4, we get a NULL pointer). If anyone cares, the item with index 0 was this:
$8 = {static extra_size = 8, m_ptr = 0x7fc5b23b41c0 "\002", m_length = 64,
  m_type = DECIMAL_RESULT, m_owner = 0x7fc59accc000, m_catalog = {
    str = 0x7665602e60616d65 <Address 0x7665602e60616d65 out of bounds>,
    length = 7593481698565975653}, entry_name = {m_str = 0x7fc59adf9330 "e",
    m_length = 1}, collation = {collation = 0x1ded740,
    derivation = DERIVATION_IMPLICIT, repertoire = 3}, update_query_id = 444,
  used_query_id = 444, unsigned_flag = false}
As this is a hash table, the order of items is determined by the hash, not by the order of assignments to the variables.

Now, if I know the name of the variable, I can access it in the same way as it's done in the server code (one of the examples found by grep):
    entry= (user_var_entry*)my_hash_search(&thd->user_vars,
                                           (uchar*)lex->prepared_stmt_code.str,
                                           lex->prepared_stmt_code.length);
Basically, we need to pass the variable name and the length of this name:
(gdb) set $uvar=(user_var_entry *)(my_hash_search(&(do_command::thd->user_vars), "e", strlen("e")))
[Thread 0x7fc5995b4700 (LWP 2848) exited]
(gdb) p *$uvar
$1 = {static extra_size = 8, m_ptr = 0x7fc5b23b41c0 "\002", m_length = 64,
  m_type = DECIMAL_RESULT, m_owner = 0x7fc59accc000, m_catalog = {
    str = 0x7665602e60616d65 <Address 0x7665602e60616d65 out of bounds>,
    length = 7593481698565975653}, entry_name = {m_str = 0x7fc59adf9330 "e",
    m_length = 1}, collation = {collation = 0x1ded740,
    derivation = DERIVATION_IMPLICIT, repertoire = 3}, update_query_id = 444,
  used_query_id = 444, unsigned_flag = false}

...
(gdb) p *((my_decimal *)$uvar->m_ptr)
$4 = {<st_decimal_t> = {intg = 2, frac = 1, len = 9, sign = 0 '\000',
    buf = 0x7fc5b23b41d8}, buffer = {10, 500000000, 0, 0, 0, 0, 257, 0,
    -1697445128}}
I highlighted some parts that eventually should help to pretty print decimal values. With just strings it's also not that trivial:
...
$5 = {static extra_size = 8, m_ptr = 0x7fc59adf9148 "aaa", m_length = 3,
  m_type = STRING_RESULT, m_owner = 0x7fc59accc000, m_catalog = {
    str = 0x41272c275441434e <Address 0x41272c275441434e out of bounds>,
    length = 6075168230325441358}, entry_name = {m_str = 0x7fc59adf9150 "a",
    m_length = 1}, collation = {collation = 0x1ded740,
    derivation = DERIVATION_IMPLICIT, repertoire = 3}, update_query_id = 430,
  used_query_id = 426, unsigned_flag = false}
(gdb) p *$uvar->m_ptr
$6 = 97 'a'
We cannot simply print the pointer and hope for the best, as this is not a null-terminated string.

I am too tired to create a useful summary right now, but you can see that essentially it's easy (even if not trivial) to get all the details about the user variables defined in any session with gdb, either from a working process or from a core dump. Enjoy!

by Valeriy Kravchuk (noreply@blogger.com) at August 07, 2017 08:58 PM

Peter Zaitsev

Webinar Wednesday August 9, 2017: MongoDB Security – Making Things Secure by Default

Join Percona’s Senior Technical Services Engineer, Adamo Tonete, as he presents MongoDB Security: Making Things Secure by Default on Wednesday, August 9, 2017 at 11:00 am PDT / 2:00 pm EDT (UTC-7).

MongoDB security breaches have regularly been in the news. What isn’t mentioned, however, is that it’s all been avoidable. Simple mistakes add up to giant headaches, and loss of management and customer confidence in you (the DBA). We will cut through the marketing hype and give you a real-world playbook on what to do and how to do it when securing your system. This includes enabling authentication, understanding and building custom roles and users, knowing how to encrypt your data, using LDAP/AD and more.

Register for the webinar here.

Adamo Tonete, Senior Technical Services Engineer

Adamo joined Percona in 2015, after working as a MongoDB/MySQL Database Administrator for three years. As the main database admin of a startup, he was responsible for suggesting the best architecture and data flows for a worldwide company in a 24/7 environment. Before that, he worked as a Microsoft SQL Server DBA for a large e-commerce company, mainly on performance tuning and automation. Adamo has almost eight years of experience working as a DBA, and in the past three years he has moved to NoSQL technologies without giving up relational databases. He plays videogames and studies everything related to engines. Adamo lives with his wife in São Paulo, Brazil.

 

by Emily Ikuta at August 07, 2017 07:01 PM

MariaDB AB

What’s New in MariaDB Connector/J 2.1


We are pleased to announce the general availability (GA) of MariaDB Connector/J 2.1, the newest version of MariaDB Connector/J for Java 8+. 

MariaDB Connector/J 2.1 is fully compatible with MariaDB Connector/J 2.0, so version 2.0.3 was the last maintenance release for 2.0 and will be replaced by version 2.1 moving forward. MariaDB Connector/J 2.1 includes the following new enhancements:

Security

MariaDB Connector/J 2.1 adds hostname verification for SSL certificates. When using SSL, the driver will check the hostname against the server's identity as presented in the server's certificate (checking subject alternative names or the certificate CN) to prevent man-in-the-middle attacks.

A new option "disableSslHostnameVerification" can be used to deactivate this new verification, for cases where the old connector behaviour is required.

High Availability

MariaDB Connector/J has provided its own high availability and load balancing functionality for Master-Slave and Multi-Master (Galera) environments since version 1.2.

With the new release, when configured for Multi-Master, the function “Connection.isValid()” not only validates the connection, it also validates the state of the Galera Cluster node (@@wsrep_cluster_status). If the cluster status of the node is not “Primary”, the connection is not considered valid and will be removed from the connection pool.
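For reference, a hedged sketch of the equivalent manual check on a Galera node (wsrep_cluster_status is a standard Galera status variable):

SHOW GLOBAL STATUS LIKE 'wsrep_cluster_status';
-- a node is considered usable only when the value is 'Primary'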

Performance

Batched INSERTS

Batched INSERTs will now be used for better performance where possible. This enhancement requires MariaDB Server 10.2.7.

A new option "useBulkStmts" can be used to control the use of this protocol enhancement.

Updateable resultset

Updateable resultsets can now be used with “ResultSet.CONCUR_UPDATABLE”.


Example:

Statement stmt = con.createStatement(
                            ResultSet.TYPE_SCROLL_INSENSITIVE,
                            ResultSet.CONCUR_UPDATABLE);
ResultSet rs = stmt.executeQuery("SELECT age FROM TABLE2");
// rs will be scrollable, will not show changes made by others,
// and will be updatable
while(rs.next())
{
     // Retrieve by column index
     int newAge = rs.getInt(1) + 5;
     rs.updateInt(1, newAge);
     rs.updateRow();
}

Download the MariaDB Connector now and learn about the newest evolution of MariaDB Connector/J 2.1.
 

Download Release Notes Knowledge Base

 


by RalfGebhardt at August 07, 2017 05:27 PM

MariaDB Foundation

MariaDB 10.0.32 and Connectors now available

The MariaDB project is pleased to announce the immediate availability of MariaDB 10.0.32, as well as the recent releases of as MariaDB Connector/ODBC 3.0.1 (beta), MariaDB Connector/J 2.1.0 and MariaDB Connector/J 1.6.3. See the release notes and changelog for details. Download MariaDB 10.0.32 Release Notes Changelog What is MariaDB 10.0? MariaDB APT and YUM Repository […]

The post MariaDB 10.0.32 and Connectors now available appeared first on MariaDB.org.

by Ian Gilfillan at August 07, 2017 04:25 PM

MariaDB AB

MariaDB 10.0.32, and updated connectors now available


The MariaDB project is pleased to announce the immediate availability of MariaDB 10.0.32. In the past week, MariaDB Connector/J and MariaDB Connector/ODBC both received updates. See the release notes and changelogs for details and visit mariadb.com/downloads to download.

Download MariaDB 10.0.32

Release Notes Changelog What is MariaDB 10.0?


Download MariaDB Connector/ODBC 3.0.1 Beta

Release Notes Changelog About MariaDB Connector/ODBC


Download MariaDB Connector/J 2.1.0 GA

Release Notes Changelog About MariaDB Connector/J


Download MariaDB Connector/J 1.6.3 GA

Release Notes Changelog About MariaDB Connector/J


by dbart at August 07, 2017 03:23 PM

August 06, 2017

Valeriy Kravchuk

How to Find Values of Session Variables With gdb

Usually I make notes while working on customer issues, reading the Slack channels or even scrolling through my Facebook news feed. Some of them are of the "ToDo" kind, and this week, while watching my colleagues work on a very complicated crash/bug, I noted that one of the problems they were discussing was how to find out from a core dump whether the session behind the crashing thread had mrr=ON in the optimizer_switch.

I had already written more than once about how to check MySQL threads one by one in gdb, depending on the version and your real goal. For this post I decided to concentrate on checking the values of session variables and, specifically, individual switchable optimizations. This is actually easy, compared to checking MySQL plugin variables...

I used Percona Server 5.7.x and executed set session optimizer_switch='mrr=off'; in one of the sessions. I checked threads one by one in gdb until I found the one I needed, where the thd variable is defined in some frame:
...
(gdb) thread 3
[Switching to thread 3 (Thread 0x7f7171fbc700 (LWP 2186))]#0  0x00007f71a4be3383 in poll () from /lib64/libc.so.6
(gdb) p do_command::thd->m_thread_id
$1 = 7
Local session variables are kept in the field called variables in the THD structure:
(gdb) p do_command::thd->variables
$2 = {dynamic_variables_version = 61,
  dynamic_variables_ptr = 0x7f717359c820 "\001", dynamic_variables_head = 384,
  dynamic_variables_size = 392, dynamic_variables_allocs = 0x7f71735f9040,
  max_heap_table_size = 16777216, tmp_table_size = 16777216,
  long_query_time = 0, end_markers_in_json = 0 '\000',
  optimizer_switch = 523711, optimizer_trace = 0,
  optimizer_trace_features = 15, optimizer_trace_offset = -1,
  optimizer_trace_limit = 1, optimizer_trace_max_mem_size = 16384,
  sql_mode = 1075838976, option_bits = 2147748608,
...

(gdb) p do_command::thd->variables->optimizer_switch
$3 = 523711
...
(gdb) p global_system_variables->optimizer_switch
$9 = 523775
Now, this is surely cool, but how do I interpret the value I see? It must be some bitmask. I printed the value of the global variable as well (it's in global_system_variables), so by comparing the values I can assume that the mrr setting is controlled by the bit with value 64 in this bitmask. But we have to check this in the code.

Quick search with grep helps to find out that individual switches are defined in the sql/sql_const.h header file as follows:
/* @@optimizer_switch flags. These must be in sync with optimizer_switch_typelib */
#define OPTIMIZER_SWITCH_INDEX_MERGE               (1ULL << 0)
#define OPTIMIZER_SWITCH_INDEX_MERGE_UNION         (1ULL << 1)
#define OPTIMIZER_SWITCH_INDEX_MERGE_SORT_UNION    (1ULL << 2)
#define OPTIMIZER_SWITCH_INDEX_MERGE_INTERSECT     (1ULL << 3)
#define OPTIMIZER_SWITCH_ENGINE_CONDITION_PUSHDOWN (1ULL << 4)
#define OPTIMIZER_SWITCH_INDEX_CONDITION_PUSHDOWN  (1ULL << 5)
/** If this is off, MRR is never used. */
#define OPTIMIZER_SWITCH_MRR                       (1ULL << 6)
/**
   If OPTIMIZER_SWITCH_MRR is on and this is on, MRR is used depending on a
   cost-based choice ("automatic"). If OPTIMIZER_SWITCH_MRR is on and this is
   off, MRR is "forced" (i.e. used as long as the storage engine is capable of
   doing it).
*/
#define OPTIMIZER_SWITCH_MRR_COST_BASED            (1ULL << 7)
#define OPTIMIZER_SWITCH_BNL                       (1ULL << 8)
#define OPTIMIZER_SWITCH_BKA                       (1ULL << 9)
#define OPTIMIZER_SWITCH_MATERIALIZATION           (1ULL << 10)
#define OPTIMIZER_SWITCH_SEMIJOIN                  (1ULL << 11)
#define OPTIMIZER_SWITCH_LOOSE_SCAN                (1ULL << 12)
#define OPTIMIZER_SWITCH_FIRSTMATCH                (1ULL << 13)
#define OPTIMIZER_SWITCH_DUPSWEEDOUT               (1ULL << 14)
#define OPTIMIZER_SWITCH_SUBQ_MAT_COST_BASED       (1ULL << 15)
#define OPTIMIZER_SWITCH_USE_INDEX_EXTENSIONS      (1ULL << 16)
#define OPTIMIZER_SWITCH_COND_FANOUT_FILTER        (1ULL << 17)
#define OPTIMIZER_SWITCH_DERIVED_MERGE             (1ULL << 18)
#define OPTIMIZER_SWITCH_LAST                      (1ULL << 19)
So, to check whether mrr is set at the session level, we should do a bitwise AND with 1<<6:
(gdb) thread 7
[Switching to thread 7 (Thread 0x7f7171e77700 (LWP 2257))]#0  0x00007f71a4be3383 in poll () from /lib64/libc.so.6
(gdb) p do_command::thd->variables->optimizer_switch
$1 = 523775
(gdb) p do_command::thd->variables->optimizer_switch & (1 << 6)
$2 = 64
...
(gdb) p /t do_command::thd->variables->optimizer_switch
$6 = 1111111110111111111
Note that if you know the bit positions of the individual switches by heart, printing with the /t modifier can quickly show you all the bits that are set.
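For completeness, on a live server (no core dump involved) the same check does not need gdb at all; a minimal sketch:

SELECT @@session.optimizer_switch LIKE '%mrr=off%' AS mrr_off_in_session;
SELECT @@global.optimizer_switch LIKE '%mrr=on%' AS mrr_on_globally;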

by Valeriy Kravchuk (noreply@blogger.com) at August 06, 2017 12:31 PM

August 04, 2017

MariaDB AB

MariaDB Server 10.2: JSON/GeoJSON & GIS

MariaDB Server 10.2: JSON/GeoJSON & GIS Bruno Šimić Fri, 08/04/2017 - 13:02

Starting with MariaDB Server 10.2, new JSON/GeoJSON functionality was introduced. Virtual columns also received improvements. With version 10.2 tons of new functionality was added and many of the limitations previously experienced were lifted.

Here, I’ll show how these new MariaDB Server 10.2 features can empower your business.

Side note: one additional piece of information before we start. You cannot imagine how much I hated mathematics back in my school days, especially geometry and all the Sin, Cos and Radians stuff. Now I’m older and (eventually) smarter, and I want to thank my teachers for not giving up!

For today’s task, we’ll create a MariaDB table with JSON objects. It will contain public data about meteorite landings, to see how safe the area I (or you) live in really is. The dataset is publicly available from NASA’s open data portal.

A little bit of geometry background before we go any further.

Latitudes and longitudes are numerical values and they are expressed in degrees. Latitude defines how far north (positive) or south (negative) of the equator a point is located. All points on the equator have a latitude value of zero. The north pole has a latitude of 90, and the south pole has a latitude of -90. So, that’s why all northern hemisphere locations have positive latitude and southern-hemisphere locations have negative values.

By definition, the Greenwich Observatory near London has a longitude of zero. Negative longitude means that this point is west of Greenwich, positive means that the point is east.

We will provide geographic coordinates (latitude and longitude) and by using GIS functions we will get the information about meteorite impacts in a circle of XY km (or miles) around those coordinates.

To get started, I created a table with the following properties:

CREATE TABLE gis_json (
  id INTEGER NOT NULL PRIMARY KEY AUTO_INCREMENT,
  jsonfield VARCHAR(1024),
  name VARCHAR(255) as (JSON_VALUE(jsonfield,'$.name')),
  gis_point POINT as ( ST_GeomFromGeoJSON(JSON_QUERY( jsonfield, '$.geolocation')) ) PERSISTENT,
  rlat VARCHAR(20) as (JSON_VALUE(jsonfield,'$.reclat')),
  rlong VARCHAR(20) as (JSON_VALUE(jsonfield,'$.reclong')),
  KEY jsonkey (name),
  CHECK (JSON_VALID(jsonfield))
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

I’m using jsonfield to save the actual JSON formatted data.

The field name is a virtual column; it matches the value of the JSON key ‘name’.

The field gis_point is a virtual column too, but here I’m using a persistent method to save its content. It is calculated from the JSON field ‘geolocation’ using the GeoJSON function ST_GeomFromGeoJSON. All geolocation data in the JSON file are of the Point type.

I enabled a CHECK constraint to be sure that the JSON data I read is valid and properly formatted. I also defined an index on the field name.

Here is just a sample of one JSON data record:

{
  "fall":"Fell",
  "geolocation":{
    "type":"Point","coordinates":[50.775,6.08333]
  },
  "Id":"1",
  "Mass":"21",
  "name":"Aachen",
  "nametype":"Valid",
  "recclass":"L5",
  "reclat":"50.775000",
  "reclong":"6.083330",
  "year":"1880-01-01"
}

As you can see, some of the data is redundant, but I decided to keep this record for the sake of simplicity.
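Before loading the full dataset, the JSON and GeoJSON functions can be tried directly against that sample document. A minimal sketch (the @doc user variable is just for illustration, assuming MariaDB Server 10.2):

SET @doc := '{"name":"Aachen","geolocation":{"type":"Point","coordinates":[50.775,6.08333]},"reclat":"50.775000","reclong":"6.083330"}';
SELECT JSON_VALUE(@doc, '$.name') AS name,
       JSON_QUERY(@doc, '$.geolocation') AS geolocation,
       ST_AsText(ST_GeomFromGeoJSON(JSON_QUERY(@doc, '$.geolocation'))) AS gis_point;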

Now, we can create the database and the table, and insert the data:

Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 31
Server version: 10.2.6-MariaDB Homebrew

Copyright (c) 2000, 2017, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> create database geodata;
Query OK, 1 row affected (0.00 sec)

MariaDB [(none)]> use geodata;
Database changed

MariaDB [geodata]> source json_gis_sample.sql;
Query OK, 0 rows affected, 1 warning (0.00 sec)
Query OK, 0 rows affected (0.03 sec)
Query OK, 38401 rows affected (0.67 sec)
Records: 38401  Duplicates: 0  Warnings: 0

After we have imported the whole dataset (38401 records), we can execute our query:

SELECT name, impactyear, mass_g, Y(gis_point) as latit, X(gis_point) as longit, round(distance,2) as distance_km 
  FROM ( 
         SELECT name, JSON_VALUE(jsonfield,'$.year') as impactyear, JSON_VALUE(jsonfield,'$.mass') as mass_g, gis_point, r , 
                units * DEGREES( ACOS( COS(RADIANS(latpoint)) * 
                                       COS(RADIANS(X(gis_point))) * 
                                       COS(RADIANS(longpoint) - RADIANS(Y(gis_point))) + 
                                       SIN(RADIANS(latpoint)) * 
                                       SIN(RADIANS(X(gis_point))))) AS distance 
           FROM gis_json
           JOIN ( 
                  SELECT 48.139 AS latpoint, 11.566 AS longpoint, 
                         100.0 AS r, 111.045 AS units 
                ) AS p 
           WHERE MbrContains(GeomFromText( 
                        CONCAT('LINESTRING(', latpoint-(r/units),' ',
                                              longpoint-(r /(units* COS(RADIANS(latpoint)))), 
                                              ',', 
                                              latpoint+(r/units) ,' ', 
                                              longpoint+(r /(units * COS(RADIANS(latpoint)))), ')')),
                        gis_point) 
       ) AS d 
 WHERE distance <= r 
 ORDER BY distance;

+----------------+------------+--------+----------+----------+-------------+
| name           | impactyear | mass_g | latit    | longit   | distance_km |
+----------------+------------+--------+----------+----------+-------------+
| Mässing        | 1803-01-01 | 1600   | 12.61667 | 48.13333 |       77.86 |
| Schönenberg    | 1846-01-01 | 8000   | 10.46667 | 48.11667 |       81.52 |
| Eichstädt      | 1785-01-01 | 3000   | 11.21667 |     48.9 |       88.32 |
| Neuschwanstein | 2002-01-01 | 6189   | 10.80833 |   47.525 |       88.54 |
| Mühlau         | 1877-01-01 | 5      | 11.41667 | 47.28333 |       95.67 |
+----------------+------------+--------+----------+----------+-------------+
5 rows in set (0.06 sec)

As you can see with impactyear and mass_g, we can grab the JSON data directly from the jsonfield, using the JSON_VALUE() function. I’m also using MbrContains, which is a Minimum Bounding Rectangle (MBR) geographical function.

The result shows that (statistically) Munich is a safe (and great!) place to live. It’s much easier to drown in good beer than to be hit by a meteor.   

You can adapt the following variables to fit your needs:

latpoint 48.139

longpoint 11.566

As a starting point, I used the center of Munich.


r 100 -> the distance from the starting point you want to search within. Since the query computes a circular distance, this is the circle radius.

units 111.045 -> if you want to search in kilometers, one degree of latitude is about 111.045 km. For miles, use 69.0. If you are at sea and want to know how far away Roman Abramovich's yacht is, use 60.0 (nautical miles) as the units value. Please note that you should add or subtract around a mile, depending on whether Roman was standing at the bow or the stern when he shared his geo position with you.
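To make the r and units mechanics more tangible, this is just the corner arithmetic the query performs for the 100 km search rectangle around Munich, evaluated on its own (a standalone sketch, not part of the original post):

SELECT 48.139 - (100/111.045)                          AS lat_min,
       48.139 + (100/111.045)                          AS lat_max,
       11.566 - (100/(111.045 * COS(RADIANS(48.139)))) AS lon_min,
       11.566 + (100/(111.045 * COS(RADIANS(48.139)))) AS lon_max;
-- roughly 47.24 .. 49.04 latitude and 10.22 .. 12.92 longitude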

To search within a broader radius of 250 km, I’ll just change the r value from 100 to 250 and restart the query:

+-----------------+------------+--------+----------+----------+-------------+
| name            | impactyear | mass_g | latit    | longit   | distance_km |
+-----------------+------------+--------+----------+----------+-------------+
| Mässing         | 1803-01-01 | 1600   | 12.61667 | 48.13333 |       77.86 |
| Schönenberg     | 1846-01-01 | 8000   | 10.46667 | 48.11667 |       81.52 |
| Eichstädt       | 1785-01-01 | 3000   | 11.21667 |     48.9 |       88.32 |
| Neuschwanstein  | 2002-01-01 | 6189   | 10.80833 |   47.525 |       88.54 |
| Mühlau          | 1877-01-01 | 5      | 11.41667 | 47.28333 |       95.67 |
| Unter-Mässing   | 1920-01-01 | 80000  | 11.33333 | 49.09028 |      107.01 |
| Mauerkirchen    | 1768-01-01 | 19000  | 13.13333 | 48.18333 |      116.20 |
| Ischgl          | 1976-01-01 | 724    | 10.27333 | 47.02633 |      156.97 |
| Prambachkirchen | 1932-01-01 | 2125   | 13.94083 |  48.3025 |      176.63 |
| Bohumilitz      | 1829-01-01 | 59000  | 13.76667 |    49.05 |      190.66 |
| Langwies        | 1985-01-01 | 16     |  9.71667 | 46.81667 |      202.04 |
| Teplá           | 1909-01-01 | 17000  | 12.86667 | 49.98333 |      225.60 |
| Barcis          | 1950-01-01 | 87     |    12.35 |     46.1 |      234.04 |
| Elbogen         | 1399-12-24 | 107000 | 12.73333 | 50.18333 |      242.31 |
| Pribram         | 1959-01-01 | 5555   | 14.03333 | 49.66667 |      247.39 |
| Ybbsitz         | 1977-01-01 | 15000  |    14.89 |    47.96 |      247.53 |
+-----------------+------------+--------+----------+----------+-------------+
16 rows in set (0.06 sec)


With this set of functions, you can start enjoying a non-relational structure combined with the benefits of a powerful relational database system.

As an example, these functions can be used to build geotargeting API services for your business, showing your customers where your nearest point of sale is located, or checking how far away the friends who shared their position with you are.

A nice Open Source Framework for mapping is OpenLayers. Additionally, you can find a big list of free JSON datasets that don’t require authentication here.

If you want to give this code a try, download the complete SQL file (DDL and data).

Want more? Join our webinar on August 10 to dive deeper into geospatial data with MariaDB. In the webinar, we will cover how to build location-based services with MariaDB Server and SQL, using spatial data types, indexes and functions. Register for the MariaDB geospatial webinar now.

Happy geo positioning!

Starting with MariaDB Server 10.2, new JSON/GeoJSON functionality was introduced. Virtual columns also received improvements. With version 10.2 tons of new functionality was added and many of the limitations previously experienced were lifted. In this blog, I’ll show how these new MariaDB Server 10.2 features can empower your business.


by Bruno Šimić at August 04, 2017 05:02 PM

Peter Zaitsev

Saturation Metrics in PMM 1.2.0


One of the new graphs added to Percona Monitoring and Management (PMM) is saturation metrics. This blog post explains how to use the information provided by these graphs.

You might have heard about Brendan Gregg’s USE Method  (Utilization-Saturation-Errors) as a way to analyze the performance of any system. Our goal in PMM is to support this method fully over time, and these graphs take us one step forward.

When it comes to utilization, there are many graphs available in PMM. There is the CPU Usage graph:

Saturation Metrics 1

There is also Disk IO Utilization:

Saturation Metrics 2

And there is Network Traffic:

Saturation Metrics 3

If you would like to look at saturation-type metrics, there is the classic Load Average graph:

Saturation Metrics 4

While Load Average is helpful for understanding system saturation in general, it does not really distinguish whether it is the CPU or the disk that is saturated. Load Average, as the name says, is already averaged, so we can't really observe short saturation spikes with it: it is averaged over at least one minute. Finally, Load Average does not take the number of CPU cores/threads into account. Suppose I have a CPU-bound Load Average of 16, for example. That is quite a load, and it will cause high saturation and queueing if you have two CPU threads. But if you have 64 threads, then 16 becomes a trivial load with no saturation at all.

Let’s take a look at the Saturation Metrics graph:

Saturation Metrics 5

It provides us with two metrics: one showing the CPU load and another showing the IO load. These values roughly correspond to the “r” and “b” columns in VMSTAT output:

Saturation Metrics 6

These are sampled every second and then averaged over the reporting interval.

We also normalize the CPU load by dividing the raw number of runnable processes by the number of threads available. “Rocky” has 56 threads, which is why the normalized CPU load is about one even though the number of runnable processes shown by VMSTAT is around 50.

We do not normalize the IO load, as systems can have multiple IO devices, and the number of requests they can handle in parallel is largely unknown. If you want to understand specific IO device performance, you should check out the Disk Performance Dashboard.

Testing Saturation Metrics in Practice

Let’s see if saturation graphs indeed show us when CPU saturation is the issue. I will use a sysbench CPU test for illustration, run as:

sysbench cpu  --cpu-max-prime=100000 --threads=1 --time=60 run

This will use the specified number of threads to execute compute jobs, each of which will compute the specified number of prime numbers. If we have enough CPU resources available, with no saturation, the latency of executing such requests should be about the same. When we overload the system, so that there are not enough CPU execution units to process everything in parallel, the average latency should increase.

root@ts140i:/mnt/data# sysbench cpu  --cpu-max-prime=100000 --threads=1 --time=300 run
sysbench 1.0.7 (using bundled LuaJIT 2.1.0-beta2)
Running the test with following options:
Number of threads: 1
Initializing random number generator from current time
Prime numbers limit: 100000
Initializing worker threads...
Threads started!
General statistics:
   total time:                          300.0234s
   total number of events:              12784
Latency (ms):
        min:                                 23.39
        avg:                                 23.47
        max:                                 28.07
        95th percentile:                     23.52
        sum:                             300018.06

As we can see with one thread working, the average time it takes to handle a single request is 23ms. Obviously, there is no saturation happening in this case:

Saturation Metrics 7

“Ts140i” has four CPU cores, and as you can see the Normalized CPU load stays below one. You may wonder why it isn't closer to 0.25 in this case, with one active thread and four cores available. The reason is that at exactly the time the metrics are captured, there often happen to be an additional two to three threads active to facilitate the process. They are only active for a few milliseconds at a time, so they do not produce much load, but they tend to skew the number a little bit.

Let's now run with four threads. The number of threads matches the number of CPU cores available (and they are true cores in this case, no hyperthreading). In this case, we shouldn't expect much of an increase in the event processing time.

root@ts140i:/mnt/data# sysbench cpu  --cpu-max-prime=100000 --threads=4 --time=300 run
sysbench 1.0.7 (using bundled LuaJIT 2.1.0-beta2)
Running the test with following options:
Number of threads: 4
Initializing random number generator from current time
Prime numbers limit: 100000
Initializing worker threads...
Threads started!
General statistics:
   total time:                          300.0215s
   total number of events:              48285
Latency (ms):
        min:                                 24.19
        avg:                                 24.85
        max:                                 43.61
        95th percentile:                     24.83
        sum:                            1200033.93

As you can see, the test confirms the theory: average latency increased by only about 6%, with the Normalized CPU load in the saturation metrics mostly hovering between 1 and 2:

Saturation Metrics 8

Let's now do the test with 16 threads, four times the number of available CPU cores. We should see the latency increase dramatically due to CPU overload (or saturation). The same will happen to your CPU-bound MySQL queries if you have more concurrency than available CPUs.

root@ts140i:/mnt/data# sysbench cpu  --cpu-max-prime=100000 --threads=16 --time=300 run
sysbench 1.0.7 (using bundled LuaJIT 2.1.0-beta2)
Running the test with following options:
Number of threads: 16
Initializing random number generator from current time
Prime numbers limit: 100000
Initializing worker threads...
Threads started!
General statistics:
   total time:                          300.0570s
   total number of events:              48269
Latency (ms):
        min:                                 27.83
        avg:                                 99.44
        max:                                189.05
        95th percentile:                    121.08
        sum:                            4799856.52

We can see it takes about four times longer to process each request due to CPU overload and queueing. Let’s see what saturation metrics tell us:

Saturation Metrics 9

As you can see, Normalized CPU Load floats between four and five on the graph, consistent with saturation we’re observing.

You may ask whether the CPU utilization graph helps us here. Not really. You will see 100% CPU usage for both the run with four threads and the run with 16 threads, while the request latencies are completely different.

Summary

As we can see from our test, Normalized CPU Load is very helpful for understanding when the CPU is overloaded. An overloaded CPU causes response times to increase and performance to degrade. Furthermore, you can use it to (roughly) see how serious the overload is. As a rule of thumb, if you see Normalized CPU saturation over two, it indicates your CPUs are overloaded.

by Peter Zaitsev at August 04, 2017 01:23 PM

August 03, 2017

Peter Zaitsev

Percona Live Europe 2017 Sneak Peek Schedule Up Now! See Available Sessions!


We are excited to announce that the sneak peek schedule for the Percona Live Open Source Database Conference Europe 2017 is up! The Percona Live Open Source Database Conference Europe 2017 is September 25 – 27, at the Radisson Blu Royal Hotel.

The theme of Percona Live Europe 2017 is Championing Open Source Databases, with sessions on MySQL, MariaDB, MongoDB and other open source database technologies, including time series databases, PostgreSQL and RocksDB. This year’s conference will feature one day of tutorials and two days of keynote talks and breakout sessions related to open source databases and software. Tackling subjects such as analytics, architecture and design, security, operations, scalability and performance, Percona Live Europe provides in-depth discussions for your high-availability, IoT, cloud, big data and other changing business needs.

Below are some of our top picks for MySQL, MongoDB and open source database sessions:

Tutorials

Breakout Talks

MySQL:

MongoDB:

Other Open Source Database Topics:

Registration Prices Increase August 9, 2017 – Get Tickets Now for the Best Price!

Just a reminder to everyone out there that the Early Bird discount rate for the Percona Live Open Source Database Conference Europe 2017 ends August 8! The price increases as of August 9, so buy now. The Early Bird rate gets you all the excellent and amazing opportunities that Percona Live Europe offers, at a very reasonable price! Get your tickets as soon as possible for the best price.

Percona Live Europe 2017 Open Source Database Conference will be held at the Radisson Blu Royal Hotel, at Golden Lane 8, Dublin, Ireland.

The Radisson Blu Royal Hotel is in a prime location in the heart of Dublin. Enjoy this spacious venue with complimentary WiFi, expert on-site staff and three great restaurants offering a wide variety of meals. Staying for a couple of extra days? Take time to enjoy the different tourist attractions, like traditional beer pubs and 12th-century castles, located minutes away.

A special hotel rate of EUR 250.00 is available for Percona Live Europe 2017 until August 14, 2017.

You can reserve a room by booking through the Radisson Blu’s reservation site.

  1. Click BOOK NOW at the top right.
  2. Enter your preferred check-in and check-out dates, and how many rooms.
  3. From the drop-down “Select Rate Type,” choose Promotional Code.
  4. Enter the code PERCON to get the discount

This special deal includes breakfast each morning! The group rate only applies if used within the Percona Live Europe group block dates (September 25-27, 2017).

Sponsor Percona Live

Become a conference sponsor! We have sponsorship opportunities available for this annual MySQL, MongoDB and open source database event. Sponsors become a part of a dynamic and growing ecosystem and interact with hundreds of DBAs, sysadmins, developers, CTOs, CEOs, business managers, technology evangelists, solutions vendors, and entrepreneurs who attend the event.

by Dave Avery at August 03, 2017 02:42 PM

Shlomi Noach

orchestrator/raft: Pre-Release 3.0

orchestrator 3.0 Pre-Release is now available. Most notable are Raft consensus, SQLite backend support, and orchestrator-client, a no-binary-required client script.

TL;DR

You may now set up high availability for orchestrator via raft consensus, without the need to set up high availability for orchestrator's backend MySQL servers (such as Galera/InnoDB Cluster). In fact, you can run an orchestrator/raft setup using the embedded SQLite backend DB. Read on.

orchestrator still supports the existing shared backend DB paradigm; nothing dramatic changes if you upgrade to 3.0 and do not configure raft.

orchestrator/raft

Raft is a consensus protocol, supporting leader election and consensus across a distributed system.  In an orchestrator/raft setup orchestrator nodes talk to each other via raft protocol, form consensus and elect a leader. Each orchestrator node has its own dedicated backend database. The backend databases do not speak to each other; only the orchestrator nodes speak to each other.

No MySQL replication setup needed; the backend DBs act as standalone servers. In fact, the backend server doesn't have to be MySQL, and SQLite is supported. orchestrator now ships with SQLite embedded, no external dependency needed.

In an orchestrator/raft setup, all orchestrator nodes talk to each other. One and only one is elected as leader. To become a leader a node must be part of a quorum. On a 3 node setup, it takes 2 nodes to form a quorum. On a 5 node setup, it takes 3 nodes to form a quorum.

Only the leader will run failovers. This much is similar to the existing shared-backend DB setup. However, in an orchestrator/raft setup each node is independent, and each orchestrator node runs discoveries. This means a MySQL server in your topology will be routinely visited and probed by not one orchestrator node, but by all 3 (or 5, or what have you) nodes in your raft cluster.

Any communication to orchestrator must take place through the leader. One may not tamper directly with the backend DBs anymore, since the leader is the one authoritative entity to replicate and announce changes to its peer nodes. See orchestrator-client section following.

For details, please refer to the documentation:

The orchestrator/raft setup comes to solve several issues, the most obvious being high availability for the orchestrator service: in a 3 node setup any single orchestrator node can go down and orchestrator will reliably continue probing, detecting failures and recovering from failures.

Another issue solved by orchestrator/raft is network isolation, in particular cross-DC isolation, also referred to as fencing. Some visualization will help describe the issue.

Consider this 3 data-center replication setup. The master, along with a few replicas, resides on DC1. Two additional DCs have intermediate masters, aka local-masters, that relay replication to local replicas.

We place 3 orchestrator nodes in a raft setup, each in a different DC. Note that traffic between orchestrator nodes is very low, and cross DC latencies still conveniently support the raft communication. Also note that backend DB writes have nothing to do with cross-DC traffic and are unaffected by latencies.

Consider what happens if DC1 gets network isolated: no traffic in or out of DC1.

Each orchestrator node operates independently, and each will see a different state. DC1's orchestrator will see all servers in DC2 and DC3 as dead, but figure the master itself is fine, along with its local DC1 replicas:

However both orchestrator nodes in DC2 and DC3 will see a different picture: they will see all DC1's servers as dead, with local masters in DC2 and DC3 having broken replication:

Who gets to choose?

In the orchestrator/raft setup, only the leader runs failovers. The leader must be part of a quorum. Hence the leader will be an orchestrator node in either DC2 or DC3. DC1's orchestrator will know it is isolated, that it isn't part of the quorum, hence will step down from leadership (that's the premise of the raft consensus protocol), hence will not run recoveries.

There will be no split brain in this scenario. The orchestrator leader, be it in DC2 or DC3, will act to recover and promote a master from within DC2 or DC3. A possible outcome would be:

What if you only have 2 data centers?

In such case it is advisable to put two orchestrator nodes, one in each of your DCs, and a third orchestrator node as a mediator, in a 3rd DC, or in a different availability zone. A cloud offering should do well:

The orchestrator/raft setup plays nice and allows one to nominate a preferred leader.

SQLite

Suggested and requested by many was to remove orchestrator's own dependency on a MySQL backend. orchestrator now supports a SQLite backend.

SQLite is a transactional, relational, embedded database, and as of 3.0 it is embedded within orchestrator, no external dependency required.

SQLite doesn't replicate, doesn't support client/server protocol. As such, it cannot work as a shared database backend. SQLite is only available on:

  • A single node setup: good for local dev installations, testing server, CI servers (indeed, SQLite now runs in orchestrator's CI)
  • orchestrator/raft setup, where, as noted above, backend DBs do not communicate with each other in the first place and are each dedicated to their own orchestrator node.

It should be pointed out that SQLite is a great transactional database; however, MySQL is more performant. Load on the backend DB is directly (and mostly linearly) affected by the number of probed servers. Whether you have 50 servers in your topologies or 500 servers, that matters. The probing frequency of course also matters for the write frequency on your backend DB. I would suggest that if you have thousands of backend servers, you stick with MySQL. If dozens, SQLite should be good to go. In between is a gray zone, and in any case, run your own experiments.

At this time SQLite is configured to commit to file; there is a different setup where SQLite places data in memory, which makes execution faster, with occasional dumps to disk required for durability. orchestrator may support this mode in the future.

orchestrator-client

You install orchestrator as a service on a few boxes; but then how do you access it from other hosts?

  • Either curl the orchestrator API
  • Or, as most do, install orchestrator-cli package, which includes the orchestrator binary, everywhere.

The latter implies:

  • Having the orchestrator binary installed everywhere, hence updated everywhere.
  • Having the /etc/orchestrator.conf.json deployed everywhere, along with credentials.

The orchestrator/raft setup does not support running orchestrator in command-line mode. Reason: in this mode orchestrator talks directly to the shared backend DB. There is no shared backend DB in the orchestrator/raft setup, and all communication must go through the leader service. This is a change of paradigm.

So, back to curling the HTTP API. Enter orchestrator-client which mimics the command line interface, while running curl | jq requests against the HTTP API. orchestrator-client, however, is just a shell script.

orchestrator-client will work well on either orchestrator/raft or on your existing non-raft setups. If you like, you may replace your remote orchestrator installations and your /etc/orchestrator.conf.json deployments with this script. You will need to provide the script with a hint: the $ORCHESTRATOR_API environment variable should be set to point to the orchestrator HTTP API.

Here's the fun part:

  • You will either have a proxy on top of your orchestrator service cluster, and you would export ORCHESTRATOR_API=http://my.orchestrator.service/api
  • Or you will provide orchestrator-client with all orchestrator node identities, as in export ORCHESTRATOR_API="https://orchestrator.host1:3000/api https://orchestrator.host2:3000/api https://orchestrator.host3:3000/api" .
    orchestrator-client will figure the identity of the leader and will forward requests to the leader. At least scripting-wise, you will not require a proxy.

Status

orchestrator 3.0 is a Pre-Release. We are running a mostly-passive orchestrator/raft setup in production. It is mostly-passive in that it is not in charge of failovers yet. Otherwise it probes and analyzes our topologies, as well as runs failure detection. We will continue to improve operational aspects of the orchestrator/raft setup (see this issue).

 

by shlomi at August 03, 2017 08:41 AM

August 02, 2017

Peter Zaitsev

Percona Toolkit 3.0.4 is Now Available


Percona announces the release of Percona Toolkit 3.0.4 on August 2, 2017.

Percona Toolkit is a collection of advanced command-line tools that perform a variety of MySQL and MongoDB server and system tasks too difficult or complex for DBAs to perform manually. Percona Toolkit, like all Percona software, is free and open source.

You can download Percona Toolkit packages from the web site or install them from the official repositories.

This release includes the following changes:

New Features

  • PT-90: Added collection of information about prepared statements by pt-stalk when Performance Schema is enabled. For more information, see #1642750.
  • PT-91: Added the --preserve-triggers option for pt-online-schema-change to support AFTER triggers.
  • PT-138: Added --output-format option for pt-mongodb-summary to choose between JSON format and the default plain text.
  • PT-141: Added the --output-format=csv parameter for pt-archiver to archive rows in CSV format.
  • PT-142: Added the --only-same-schema-fks option for pt-online-schema-change to check foreign keys only on tables with the same schema as the original table. This should speed up the tool’s execution, but keep in mind that if you have foreign keys referencing tables in other schemas, they won’t be detected. For more information, see #1690122.
  • PT-153: Added the --check-unique-key-change option for pt-online-schema-change to abort if the specified statement for --alter is trying to add a unique index. This is supposed to avoid adding duplicate keys that might lead to silently losing data.
  • PT-173: Added the --truncate-replicate-table option for pt-table-checksum to ensure stale data is removed.

Bug fixes

  • PT-136: Fixed pt-table-checksum to support tables that have columns with different collations or charsets. For more information, see #1674266.
  • PT-143: Fixed primary key handling by pt-archiver. For more information, see #1691630.
  • PT-144: Limited constraint name in the new table when running pt-online-schema-change. For more information, see #1491674.
  • PT-146: Fixed the --no-check-binlog-format option for pt-table-checksum to work as expected.
  • PT-148: Fixed the use of uninitialized value in printf() for pt-online-schema-change. For more information, see #1693614.
  • PT-151: Fixed pt-table-sync to prevent the POINT field type from being treated as DECIMAL.
  • PT-154: Reverted PT-116 to remove the --use-insert-ignore option from pt-online-schema-change.
  • PT-161: Fixed the --skip-check-slave-lag feature for pt-table-checksum to safely check for undefined values.
  • PT-178: Fixed regression in --check-slave-lag option for pt-online-schema-change.
  • PT-180: Fixed regression in --skip-check-slave-lag option for pt-online-schema-change.
  • PT-181: Fixed syntax error in pt-online-schema-change.

Other Improvements

  • PT-162: Updated list of tables ignored by pt-table-checksum.

You can find release details in the release notes. Report bugs in Toolkit’s launchpad bug tracker.

by Alexey Zhebel at August 02, 2017 04:44 PM

Percona Server for MongoDB 3.4.6-1.7 is Now Available


Percona announces the release of Percona Server for MongoDB 3.4.6-1.7 on August 2, 2017. Download the latest version from the Percona web site or the Percona Software Repositories.

Percona Server for MongoDB is an enhanced, open source, fully compatible, highly-scalable, zero-maintenance downtime database supporting the MongoDB v3.4 protocol and drivers. It extends MongoDB with Percona Memory Engine and MongoRocks storage engine, as well as several enterprise-grade features:

Percona Server for MongoDB requires no changes to MongoDB applications or code.

NOTE: Red Hat Enterprise Linux 5 (including CentOS 5 and other derivatives), Ubuntu 12.04 and older versions are no longer supported by Percona software.

This release is based on MongoDB 3.4.6 and includes the following additional bug fix:

  • #PSMDB-155: Fixed mongod startup on NUMA systems.

by Alexey Zhebel at August 02, 2017 01:58 PM

MariaDB Foundation

Arrangements for the 2017-2 MariaDB Developers Unconference (Shenzhen, China)

(The original English version of this post is available here). The 2nd 2017 MariaDB Developers Unconference will soon take place in Shenzhen. This is the first Developers Unconference the MariaDB community has held in Asia. The schedule for the event is as follows: 13 November – New Contributor Day; 14-15 November – Developers Unconference; 16-17 November – Patch Review Days.

Shannon Systems (宝存科技) is organizing the event. If you would like to attend, please register on the event's Meetup.com page. All of the events are free to attend. We recommend staying at a hotel close to the venue.

New Contributor Day: 13 November is a day reserved for new contributors. If you are interested in becoming a MariaDB contributor but don't know where to start, or if you have already made some contributions and would like a deeper understanding of the MariaDB development process, this day is worth attending. Attendees should have a C/C++ background, and bring your laptop! The sessions will accommodate both English and Mandarin.

Developers Unconference: 14 and 15 November are the traditional Developers Unconference, open to anyone interested in contributing to the MariaDB open source project, whether with code or in any other form. You do not have to be a core developer to take part; this is an open community, and we welcome new contributors who are willing to learn and get involved in open source. Over these two days we will present and discuss a wide range of topics related to MariaDB development. The event is an open unconference with plenty of opportunities to discuss and collaborate on various topics, such as performance, new features, connectors, packaging and documentation, and you can help us decide on some of these. Many MariaDB core developers will attend and join the discussions. The goal is to give new and long-time developers a face-to-face platform to tackle hard problems together or to plan new features.

Patch Review Days: after the Unconference, with so much presented and discussed, it is time to get to work. These two days will be used for patch review (code review); face-to-face discussion is more efficient than the normal online review process. It is also a good opportunity to turn the ideas discussed earlier into solid designs and implementations while they are still fresh, working directly with the other developers.

If you have any questions about the event, or about how to contribute to MariaDB, feel free to raise them on the MariaDB developers mailing list or the MariaDB discuss mailing list. Please register here for the MariaDB Developers Unconference.

The post Arrangements for the 2017-2 MariaDB Developers Unconference (Shenzhen, China) appeared first on MariaDB.org.

by Ian Gilfillan at August 02, 2017 11:04 AM

2017-2 Developers Unconference and Related Events, Shenzhen

(A Chinese version of this post is available here). The 2nd 2017 MariaDB Developers Unconference will be our first in Asia, and will take place in Shenzhen, China, along with two related events: 13 November – New contributor day 14-15 November – Developers Unconference 16-17 November – Patch review days Provisional Schedule Shannon are kindly […]

The post 2017-2 Developers Unconference and Related Events, Shenzhen appeared first on MariaDB.org.

by Ian Gilfillan at August 02, 2017 11:02 AM

Peter Zaitsev

Percona Server for MySQL 5.6.36-82.1 is Now Available


Percona is glad to announce the release of Percona Server for MySQL 5.6.36-82.1 on August 1, 2017 (Downloads are available here and from the Percona Software Repositories).

Based on MySQL 5.6.36, including all the bug fixes in it, Percona Server for MySQL 5.6.36-82.1 is the current GA release in the Percona Server for MySQL 5.6 series. All of Percona‘s software is open-source and free; all the details of the release can be found in the 5.6.36-82.1 milestone at Launchpad.

Please note that RHEL 5, CentOS 5 and Ubuntu versions 12.04 and older will not be supported in future releases of Percona Server for MySQL, and no further packages will be added for these distributions.

New Features

  • Percona Server for MySQL can now be built with support of OpenSSL 1.1.
  • Percona Server for MySQL is now available on Debian 9 (stretch). The support only covers the amd64 architecture.
  • TokuDB now allows killing a query that is waiting for an FT locktree lock.

Bugs Fixed

  • Row counts in TokuDB could be lost intermittently after restarts. Bug fixed #2.
  • In TokuDB, two races in the fractal tree lock manager could significantly affect transactional throughput for some applications that used a small number of concurrent transactions. These races manifested as transactions unnecessarily waiting for an available lock. Bug fixed #3.
  • TokuDB could assert when opening a dictionary with no useful information to error log. Bug fixed #23.
  • TokuDB could assert for various reasons deserializing nodes with no useful error output. Bug fixed #24.
  • Percona Server could crash when running a query over a partitioned table that uses an index to read a range of rows if this range was not covered by any existing partition. Bug fixed #1657941 (upstream #76418).
  • With two client connections to a server (debug server build), the server could crash after one of the clients set the global option userstat and flushed the client statistics (FLUSH CLIENT_STATISTICS) and then both clients were closed. Bug fixed #1661488.
  • TokuDB did not pass cmake flags on to snappy cmake. Bug fixed #41.
  • The progress status for partitioned TokuDB table ALTERs was misleading. Bug fixed #42.
  • When a client application is connecting to the Aurora cluster end point using SSL (--ssl-verify-server-cert or --ssl-mode=VERIFY_IDENTITY option), wildcard and SAN enabled SSL certificates were ignored. See also Compatibility Matrix. Note that the --ssl-verify-server-cert option is deprecated in Percona Server 5.7. Bug fixed #1673656 (upstream #68052).
  • Killing a stored procedure execution could result in an assert failure on a debug server build. Bug fixed #1689736 (upstream #86260).
  • It was not possible to build Percona Server on Debian 9 (stretch) due to issues with OpenSSL 1.1. Bug fixed #1702903 (upstream #83814).
  • The SET STATEMENT .. FOR statement changed the global instead of the session value of a variable if the statement occurred immediately after the SET GLOBAL or SHOW GLOBAL STATUS command. Bug fixed #1385352.
  • The synchronization between the LRU manager and page cleaner threads was not done at shutdown. Bug fixed #1689552.

Other bugs fixed: #6, #44, #65, #1160986, #1676740, #1689989, #1689998, #1690012, #1699788, and #1684601 (upstream #86016).

Compatibility Matrix

Feature                  | YaSSL | OpenSSL < 1.0.2 | OpenSSL >= 1.0.2
‘commonName’ validation  | Yes   | Yes             | Yes
SAN validation           | No    | Yes             | Yes
Wildcards support        | No    | No              | Yes

by Borys Belinsky at August 02, 2017 12:41 AM

August 01, 2017

MariaDB AB

M|18 MariaDB User Conference Call for Papers—Now Open!


We are excited to announce MariaDB’s annual user conference, M|18, will take place February 26-27, 2018 in New York City at the Conrad Hotel. Over the course of two days, we’ll offer in-depth looks at security, high availability, performance tuning and more with MariaDB; how to run your database on cloud and/or container infrastructure; new functionality for developers like JSON and window functions; plus tons more.

The MariaDB M|18 call for papers is now open! If you are a leader in your space with expertise in operations and/or development and would like to share your knowledge with a global MariaDB audience, we invite you to submit a proposal to speak. We’re interested in the challenges you’ve faced, the lessons you’ve learned, and the best practices you’ve developed – use cases, success stories and strategies too.

If you’ve been in the trenches and led the way forward, submit a session proposal for one of the following tracks to share your experience with MariaDB:

Technology: the latest product and community innovations
Examples: extensions, features, plugins, storage engines

Infrastructure: a foundation for modern database deployments
Examples: automated provisioning, cloud, containers, DBaaS

Operations: the processes and tools for managing databases
Examples: high availability, security, performance tuning

Development: building modern applications with new features
Examples: geospatial, JSON, modeling, SQL

Analytics: unleash the power of data at the speed of now
Examples: data visualization, Kafka and Spark integration

Submitting a proposal is easy. Gather your talk title and abstract, then submit it here by October 31, 2017. We will contact you by mid-December, once proposals have been reviewed and the agenda finalized.

Sponsorships

Did you know that MariaDB is included in almost every major distribution? Chances are that your customers and partners are using MariaDB. M|18 is your chance to directly engage with the rapidly growing MariaDB community. Contact us to take advantage of early sponsorship opportunities. Booth space is limited.

We look forward to seeing you in New York for M|18!



by Shane Johnson at August 01, 2017 07:54 PM

Peter Zaitsev

Group Replication: The Sweet and the Sour


In this blog, we’ll look at group replication and how it deals with flow control (FC) and replication lag. 

Overview

In the last few months, we had two main actors in the MySQL ecosystem: ProxySQL and Group-Replication (with the evolution to InnoDB Cluster). 

While I have extensively covered the first, my last serious work on Group Replication dates back to a lab version from years past.

Given that Oracle decided to declare it GA, and Percona’s decision to provide some level of Group Replication support, I decided it was time for me to take a look at it again.

We've already seen a lot of coverage of many Group Replication topics. There are articles about Group Replication and performance, Group Replication and basic functionalities (or the lack of them, like automatic node provisioning), Group Replication and ProxySQL, and so on.

But one question kept coming up over and over in my mind. If Group Replication and InnoDB Cluster have to work as an alternative to other (virtually) synchronous replication mechanisms, what changes do our customers need to consider if they want to move from one to the other?

Solutions using Galera (like Percona XtraDB Cluster) must take into account a central concept: clusters are data-centric. What matters is the data and the data state. Both must be the same on each node at any given time (commit/apply). To guarantee this, Percona XtraDB Cluster (and other solutions) use a set of data validation and Flow Control processes that work to ensure a consistent cluster data set on each node.

The upshot of this principle is that an application can query ANY node in a Percona XtraDB Cluster and get the same data, or write to ANY node and know that the data is visible everywhere in the cluster at (virtually) the same time.

Last but not least, inconsistent nodes should be excluded and either rebuilt or fixed before rejoining the cluster.

If you think about it, this is very useful. Guaranteeing consistency across nodes allows you to transparently split write/read operations, failover from one node to another with very few issues, and more.

When I conceived of this blog on Group Replication (or InnoDB Cluster), I put myself in the customer's shoes. I asked myself: “Aside from all the other things we know (see above), what is the real impact of moving from Percona XtraDB Cluster to Group Replication/InnoDB Cluster for my application? Since Group Replication still (basically) uses replication with binlogs and relay logs, is there also a Flow Control mechanism?” An alarm bell started to ring in my mind.

My answer is: “Let’s do a proof of concept (PoC), and see what is really going on.”

The POC

I setup a simple set of servers using Group Replication with a very basic application performing writes on a single writer node, and (eventually) reads on the other nodes. 

You can find the schema definition here. Mainly I used the four tables from my windmills test suite — nothing special or specifically designed for Group Replication. I've used this test a lot for Percona XtraDB Cluster in the past, so it was a perfect fit.

Test Definition

The application will do very simple work, and I wanted to test four main cases:

  1. One thread performing one insert at each transaction
  2. One thread performing 50 batched inserts at each transaction
  3. Eight threads performing one insert to each transaction
  4. Eight threads performing 50 batched inserts at each transaction

As you can see, a pretty simple set of operations. Then I decided to test it using the following four conditions on the servers:

  1. One slave worker FC as default
  2. One slave worker FC set to 25
  3. Eight slave workers FC as default
  4. Eight slave workers FC set to 25

Again nothing weird or strange from my point of view. I used four nodes:

  1. Gr1 Writer
  2. Gr2 Reader
  3. Gr3 Reader minimal latency (~10ms)
  4. Gr4 Reader minimal latency (~10ms)

Finally, I had to be sure I measured the lag in a way that allowed me to reference it consistently on all nodes. 

I think we can safely say that the incoming GTID (Received_transaction_set from replication_connection_status) is definitely the last change applied to the master that the slave node knows about. More recent changes could have occurred, but network delay can prevent them from being “received.” The other point of reference is GTID_EXECUTED, which refers to the latest GTID processed on the node itself.

The closest query that can track the distance will be:

select @last_exec:=SUBSTRING_INDEX(SUBSTRING_INDEX(SUBSTRING_INDEX(@@global.GTID_EXECUTED,':',-2),':',1),'-',-1) last_executed;
select @last_rec:=SUBSTRING_INDEX(SUBSTRING_INDEX(SUBSTRING_INDEX(Received_transaction_set,':',-2),':',1),'-',-1) last_received
  FROM performance_schema.replication_connection_status WHERE Channel_name = 'group_replication_applier';
select (@last_rec - @last_exec) as real_lag;

Or in the case of a single worker:

select @last_exec:=SUBSTRING_INDEX(SUBSTRING_INDEX(@@global.GTID_EXECUTED,':',-1),'-',-1) last_executed;
select @last_rec:=SUBSTRING_INDEX(SUBSTRING_INDEX(Received_transaction_set,':',-1),'-',-1) last_received
  FROM performance_schema.replication_connection_status WHERE Channel_name = 'group_replication_applier';
select (@last_rec - @last_exec) as real_lag;

The result will be something like this:

+---------------+
| last_executed |
+---------------+
| 23607         |
+---------------+
+---------------+
| last_received |
+---------------+
| 23607         |
+---------------+
+----------+
| real_lag |
+----------+
|        0 |
+----------+

The whole set of tests can be found here, with all the commands you need to run the application (you can find it here) and replicate the tests. I will focus on the results (otherwise this blog post would be far too long), but I invite you to see the details.

The Results

Efficiency on Writer by Execution Time and Rows/Sec

Using the raw data from the tests (Excel spreadsheet available here), I was interested in identifying if and how the Writer is affected by the use of Group Replication and flow control.

Reviewing the graph, we can see that the Writer has a linear increase in the execution time (when using default flow control) that matches the increase in the load. Nothing there is concerning, and all-in-all we see what is expected if the load is light. The volume of rows at the end justifies the execution time.

It’s a different scenario if we use flow control. The execution time increases significantly in both cases (single worker/multiple workers). In the worst case (eight threads, 50 inserts batch) it becomes four times higher than the same load without flow control.

What happens to the inserted rows? In the application, I traced the rows inserted/sec. It is easy to see what is going on there:

We can see that the Writer with flow control activated inserts less than a third of the rows it processes without flow control. 

We can definitely say that flow control has a significant impact on the Writer performance. To clarify, let’s look at this graph:

Without flow control, the Writer processes a high volume of rows in a limited amount of time (results from the test of eight workers, eight threads, 50 insert batch). With flow control, the situation changes drastically. The Writer takes a long time processing a significantly smaller number of rows/sec. In short, performance drops significantly.

But hey, I'm OK with that if it means having a consistent data set across all nodes. In the end, Percona XtraDB Cluster and similar solutions pay a significant performance price to match the data-centric principle.

Let's see what happens on the other nodes.

Entries Lag

Well, this scenario is not so good:

When NOT using flow control, the nodes lag behind the writer significantly. Remember that by default flow control in Group Replication is set to 25000 entries (I mean 25K of entries!!!).
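The post does not show the exact commands used for the “FC set to 25” test condition, but assuming the stock Group Replication flow-control variables (whose defaults are 25000), it would look something like this sketch:

-- Hedged sketch: lower both flow-control thresholds from the 25000 default to 25
SET GLOBAL group_replication_flow_control_applier_threshold   = 25;
SET GLOBAL group_replication_flow_control_certifier_threshold = 25;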

What happens is that as soon as I put some salt (see load) on the Writer, the slave nodes start to lag. When using the default single worker, that has a significant impact. While using multiple workers, we see that the lag happens mainly on the node(s) with the minimal (10ms) network latency. The sad thing is that the lag is not really going down compared to the single-thread worker, indicating that even a minimal latency of 10ms is enough to affect replication.

Time to activate the flow control and have no lag:

Unfortunately, this is not the case. As we can see, the lag of the single worker remains high for Gr2 (154 entries). While using multiple workers, the Gr3/4 nodes can perform much better, with significantly less lag (but still high at ~1k entries).

It is important to remember that at this time the Writer is processing one-third or less of the rows it is normally able to. It is also important to note that I set the entry limit in flow control to 25, and the Gr3 (and Gr4) nodes are still lagging more than 1K entries behind.

To clarify, let's check the two graphs below:

Using the Writer (Master) as a baseline in entry #N, without flow control, the nodes (slaves) using Group Replication start to significantly lag behind the writer (even with a light load).

The distance in this PoC ranged from very minimal (with 58 entries), up to much higher loads (3849 entries):

Using flow control, the Writer (Master) diverges less, as expected. But while it suffers a significant drop in performance (to one-third or less), the nodes still lag. The worst case is up to 1363 entries.

I need to underline here that we have no further way (that I am aware of, anyway) to tune the lag and prevent it from happening.

This means an application cannot transparently split writes/reads and expect consistency. The gap is too high.

A Graph That Tells Us a Story

I used Percona Monitoring and Management (PMM) to keep an eye on the nodes while doing the tests. One of the graphs really showed me that Group Replication still has some “limits” as the replication mechanism for a cluster:

This graph shows the MySQL queries executed on all four nodes, in the test using eight threads with 50-insert batches and flow control.

As you can see, the Gr1 (Writer) is the first one to take off, followed by Gr2. Nodes Gr3 and Gr4 require a bit more, given the binlog transmission (and 10ms delay). Once the data is there, they match (inconsistently) the Gr2 node. This is an effect of flow control asking the Master to slow down. But as previously seen, the nodes will never match the Writer. When the load test is over, the nodes continue to process the queue for additional ~130 seconds. Considering that the whole load takes 420 seconds on the Writer, this means that one-third of the total time on the Writer is spent syncing the slave AFTERWARDS.

The above graph shows the same test without flow control. It is interesting to see how the Writer moved above 300 queries/sec, while Gr2 stayed around 200 and Gr3/4 far below. The Writer was able to process the whole load in ~120 seconds instead of 420, while Gr3/4 continued to process the load for an additional ~360 seconds.

This means that without flow control set, the nodes lag around 360 seconds behind the Master. With flow control set to 25, they lag 130 seconds.

This is a significant gap.

Conclusions

Going back to the reason why I started this PoC: it looks like my application(s) are not a good fit for Group Replication, given that I have Percona XtraDB Cluster set up to scale out the reads and efficiently move my writer to another node when I need to.

Group Replication is still based on asynchronous replication (as my colleague Kenny said). It makes sense in many other cases, but it doesn’t compare to solutions based on virtually synchronous replication. It still requires a lot of refinement.

On the other hand, for applications that can afford to have a significant gap between writers and readers it is probably fine. But … doesn’t standard replication already cover that? 

Reviewing the Oracle documentation (https://dev.mysql.com/doc/refman/5.7/en/group-replication-background.html), I can see why Group Replication as part of the InnoDB cluster could help improve high availability when compared to standard replication.

But I also think it is important to understand that Group Replication (and derived solutions like InnoDB cluster) is not comparable to, or a replacement for, data-centric solutions such as Percona XtraDB Cluster. At least not yet.

Good MySQL to everyone.

by Marco Tusa at August 01, 2017 03:54 PM

Platform End of Life (EOL) Announcement for RHEL 5 and Ubuntu 12.04 LTS


Upstream platform vendors have announced the general end of life (EOL) for Red Hat Enterprise Linux 5 (RHEL 5) and its derivatives, as well as Ubuntu 12.04 LTS. With this announcement come some implications for the support of Percona software running on these operating systems.

RHEL 5 was EOL as of March 31st, 2017 and Ubuntu 12.04 LTS was end of life as of April 28th, 2017. Pursuant to our end of life policies, we are announcing that these EOLs will go into effect for Percona software on August 1st, 2017. As of this date, we will no longer be producing new packages, binary builds, hotfixes, or bug fixes for Percona software on these platforms.

We generally align our platform end of life dates with those of the upstream platform vendor. The platform end of life dates are published in advance on our website under the page Supported Linux Platforms and Versions.

Per our policies, Percona will continue to provide operational support for your databases on EOLed platforms. However, we will be unable to provide any bug fixes, builds or OS-level assistance if you encounter an issue outside the database itself.

Each platform vendor has a supported migration or upgrade path to their next major release.  Please reach out to us if you need assistance in migrating your database to your vendor’s supported platform – Percona will be happy to assist you.

by Dave Avery at August 01, 2017 12:28 AM

July 31, 2017

Peter Zaitsev

Webinar Wednesday August 2, 2017: MySQL Disk Encryption with LUKS


Join Percona’s Senior Architect, Matthew Boehm, as he presents MySQL Disk Encryption with LUKS on Wednesday, August 2, 2017, at 1:00 pm PDT / 4:00 pm EDT (UTC-7).

Clients require strong security measures for PCI, HIPAA or PHI. You must encrypt MySQL data “at rest” to satisfy the requirements of these standards. InnoDB’s built-in encryption features work, but there are some caveats to that solution.

In this talk, you’ll see how to encrypt your entire disk to protect everything from data, redo logs and binary logs.

Register for the webinar here.

Matthew Boehm, Architect

Matthew joined Percona in the fall of 2012 as a MySQL consultant. His areas of knowledge include the traditional Linux/Apache/MySQL/PHP stack, memcached, MySQL Cluster, massive sharding topologies, PHP development and a bit of MySQL-C-API development. Previously, Matthew DBAed for the 5th largest MySQL installation at eBay/PayPal, and also hails from managed hosting environments. During his off-hours, Matthew is a nationally-ranked competitive West Coast Swing dancer, and travels to competitions around the US. He enjoys working out, camping, biking and playing MMOs with his son.

by Dave Avery at July 31, 2017 10:12 PM

MariaDB AB

What’s New in MariaDB Connector/C 3.0


We are pleased to announce the general availability (GA) of MariaDB Connector/C 3.0. MariaDB Connector/C 3.0.2 is the newest version of MariaDB Connector/C. This release is compatible with MariaDB Connector/C 2.3 - no code changes necessary to upgrade.

MariaDB Connector/C 3.0 includes new security enhancements, plugins and API functions.

Security

In addition to OpenSSL, MariaDB Connector/C 3.0 now supports:

  • GnuTLS
  • Windows SChannel: removes dependencies on external libraries
  • Windows SChannel: becomes the default for SSL on Windows
  • TLSv1.1 and TLSv1.2 support
  • Passphrase protected private keys

Plugins

  • All plugins can either be linked statically or built as shared objects (or dynamic link libraries on Windows)
  • Pluggable Virtual IO (PVIO) for communication via socket, named pipe and shared memory
  • Connection plugins, e.g for Aurora failover or replication (master write, slave read)
  • Remote IO plugin, which allows access to remote files (via http, https, ftp, ldap, ...)
  • Trace plugin (for analyzing and dumping network traffic)
  • New GSSAPI authentication plugin

New API Functions

MariaDB Connector/C 3.0 is introducing the following new API functions:

  • Bulk operations (array binding) for prepared statements (insert, update, delete).
  • support for extended client/server capabilities (requires MariaDB 10.2 or newer)
  • mariadb_get_charset_by_name and mariadb_get_charset_by_nr, which return charset information for a given internal number or name of character set. These functions have been previously used internally by MariaDB Connector/ODBC and are now exported, so they can be used also within plugins.
  • mariadb_stmt_execute_direct prepares and executes in one step (mainly used by MariaDB ODBC driver)
  • mariadb_cancel aborts a connection immediately by making all subsequent read/write operations fail
  • mysql_get_option and mysql_get_optionv (variable argument list) for obtaining option values for a given connection.
  • mysql_reconnect which was used internally before (if the option MYSQL_OPT_RECONNECT was set) is now part of the API and can be used by applications and plugins to re-establish a failing connection
  • mysql_reset_connection resets the current connection and clears session state
  • mysql_stmt_warning_count returns warnings per statement
  • Functions for obtaining session state changes:
    • mysql_session_track_get_first
    • mysql_session_track_get_next
  • Added tls_version support for SChannel. tls_version has to be specified via mysql_options(mysql, MARIADB_OPT_TLS_VERSION, ...)

Download the MariaDB Connector now and learn about the newest evolution of MariaDB Connector/C 3.0.




by RalfGebhardt at July 31, 2017 09:41 AM

July 29, 2017

Valeriy Kravchuk

How to Find Processlist Thread id in gdb

I was involved in a discussion on some complex MySQL-related problem where we had to study the backtraces of all threads in gdb (produced by the thread apply all bt command, if you ever forget it) in the hope of finding out why MySQL hangs. In the process, the question came up of how to find the thread id for each thread to match it against previously collected outputs of SHOW PROCESSLIST and SHOW ENGINE INNODB STATUS.

I assumed I knew the answer, as I had to find this out recently enough for this blog post (and before that, for a real customer case). The idea is simple. Find a frame where the function has a parameter of THD * type (usually named thd), like this:
#10 0x0000000000cb47fe in do_command (thd=0x7f32512b7000)
    at /usr/src/debug/percona-server-5.7.18-15/percona-server-5.7.18-15/sql/sql_parse.cc:960
and check the thread_id item of this structure.

In that blog post it looked as simple as just referring to thd in do_command's frame, without even checking much:
(gdb) thread 2
[Switching to thread 2 (Thread 0x7f7f5ce02b00 (LWP 9232))]
#0  pthread_cond_timedwait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
238     ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S: No such file or directory.
(gdb) p do_command::thd->thread_id
$9 = 14
I prefer to double-check my suggestions before making them, so I immediately tried this on my CentOS 6.9 VM running a recent Percona Server 5.7.x (installed by default since the times when I worked at Percona):
[root@centos ~]# gdb -p `pidof mysqld`
GNU gdb (GDB) Red Hat Enterprise Linux (7.2-92.el6)
...
Loaded symbols for /usr/lib64/mysql/plugin/tokudb_backup.so
0x00007f550ad35383 in poll () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.209.el6_9.2.x86_64 jemalloc-3.6.0-1.el6.x86_64 keyutils-libs-1.4-5.el6.x86_64 krb5-libs-1.10.3-65.el6.x86_64 libaio-0.3.107-10.el6.x86_64 libcom_err-1.41.12-23.el6.x86_64 libgcc-4.4.7-18.el6.x86_64 libselinux-2.0.94-7.el6.x86_64 libstdc++-4.4.7-18.el6.x86_64 nss-softokn-freebl-3.14.3-23.3.el6_8.x86_64 numactl-2.0.9-2.el6.x86_64 openssl-1.0.1e-57.el6.x86_64 zlib-1.2.3-29.el6.x86_64
(gdb) thread 1
[Switching to thread 1 (Thread 0x7f550d2b2820 (LWP 1978))]#0  0x00007f550ad35383 in poll () from /lib64/libc.so.6
(gdb) p do_command::thd->thread_id
No frame is currently executing in block do_command(THD*).
(gdb) thread 2
[Switching to thread 2 (Thread 0x7f54d837b700 (LWP 2183))]#0  0x00007f550ad35383 in poll () from /lib64/libc.so.6
(gdb) p do_command::thd->thread_id
Cannot take address of method thread_id.
(gdb) call do_command::thd->thread_id()
Cannot evaluate function -- may be inlined
As you can see, I started to check threads one by one and apply that good old trick. Thread 1 had no frame executing do_command(), but I did not give up and proceeded to the next thread, as I knew I had at least one active connection (I checked the output of SHOW PROCESSLIST). There I had a surprise: there was no way to get the thread_id of thd. I used tab completion, so I knew that thread_id (a variable or method) exists, but the attempt to call it also failed, as you can see.

This is a problem with using gdb-based "tricks" over code that evolves and changes over time. The last time I used p do_command::thd->thread_id it was probably for MariaDB 10.1.x, and the item was there. But in MySQL 5.7 (and all forks based on it) there were many code changes, so we should be ready for changes in unexpected places.

I did not add more comments on finding the thread id to that discussion, made a note to myself and then, later, decided to check the source code of MySQL 5.7 (I did not have the Percona 5.7 source at hand, but they hardly differ in such basic details) to find out what had changed in the THD structure so that thread_id is no longer just a variable. From the past I expected the structure to be defined in sql/sql_class.h, but grep will help to find this out even if that is no longer the case:
[root@centos mysql-server]# grep -n "class THD" sql/*.h
sql/debug_sync.h:27:class THD;
sql/derror.h:24:class THD;
sql/event_data_objects.h:40:class THD;
...
sql/sql_class.h:1412:class THD :public MDL_context_owner,
sql/sql_class.h:4175:    raise_error() or raise_warning() methods provided by class THD.
sql/sql_cmd.h:25:class THD;
...
 I found the following there:
class THD :public MDL_context_owner,
           public Query_arena,
           public Open_tables_state
{
...
private:
  my_thread_id  m_thread_id;

public:
...
  /**
    Assign a value to m_thread_id by calling
    Global_THD_manager::get_new_thread_id().
  */
  void set_new_thread_id();
  my_thread_id thread_id() const { return m_thread_id; }
...
So, in MySQL 5.7 thread_id() is, indeed, a method that was inlined, and essentially it returns the private m_thread_id item. Benefits of C++... I also highlighted the Global_THD_manager singleton, as during my next gdb sessions I found out that the simple global list of threads is also gone, and in 5.7 everything is done via that Global_THD_manager. This is a topic for some other post, though.

At least now I know what to do in gdb:
...
(gdb) thread 7
[Switching to thread 7 (Thread 0x7f54d8236700 (LWP 2275))]#0  0x00007f550ad35383 in poll () from /lib64/libc.so.6
(gdb) p do_command::thd->m_thread_id
$1 = 86
(gdb) p do_command::thd->m_main_security_ctx
$3 = {m_user = {m_ptr = 0x7f5500fdaf90 "myuser", m_length = 6,
    m_charset = 0x1ded640, m_alloced_length = 8, m_is_alloced = true},
  m_host = {m_ptr = 0x7f54d98ab090 "localhost", m_length = 9,
    m_charset = 0x1ded640, m_alloced_length = 16, m_is_alloced = true},
  m_ip = {m_ptr = 0x7f54f0eb0210 "127.0.0.1", m_length = 9,
    m_charset = 0x1ded640, m_alloced_length = 16, m_is_alloced = true},
  m_host_or_ip = {m_ptr = 0x7f54d98ab090 "localhost", m_length = 9,
    m_charset = 0x1ded640, m_alloced_length = 0, m_is_alloced = false},
  m_external_user = {m_ptr = 0x15167ab "", m_length = 0,
    m_charset = 0x1ded640, m_alloced_length = 0, m_is_alloced = false},
  m_priv_user = "myuser", '\000' <repeats 89 times>, m_priv_user_length = 6,
  m_proxy_user = '\000' <repeats 161 times>, m_proxy_user_length = 0,
  m_priv_host = "localhost", '\000' <repeats 51 times>,
  m_priv_host_length = 9, m_master_access = 1589248, m_db_access = 0,
  m_password_expired = false}
...
So, I know that specific thread 7 was for a session with Id 86 in the output of SHOW PROCESSLIST, and (from m_main_security_ctx, also a new name for old things in 5.7) I know it was a session of myuser connecting locally.
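
By the way, if you need the thread ids of many threads at once, gdb can apply the same command to each of them. A quick sketch (it will just print an error for threads whose stacks do not contain the do_command frame, and depending on the gdb version you may still have to fall back to checking threads one by one):
(gdb) thread apply all print do_command::thd->m_thread_id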

To summarize, there were notable changes in MySQL 5.7 in the THD structure and in thread management-related code in general, so make sure to re-check your "old gdb tricks" when you start working with 5.7. Reading the code helps.

Unfortunately (for gdb beginners like me) a lot of C++ approaches were introduced, including singletons and template-based iterators instead of simple doubly linked lists, so one has to work hard to adapt to these. I hope to discuss some of my further findings and new "C++ specific" and "MySQL 5.7 specific" approaches to studying MySQL in gdb in upcoming posts.

by Valeriy Kravchuk (noreply@blogger.com) at July 29, 2017 03:54 PM

July 28, 2017

Peter Zaitsev

Percona Server for MySQL 5.7.18-16 Is Now Available


Percona is glad to announce the GA release of Percona Server for MySQL 5.7.18-16 on July 28, 2017 (Downloads are available here and from the Percona Software Repositories).

Based on MySQL 5.7.18, including all the bug fixes in it, Percona Server for MySQL 5.7.18-16 is the current GA release in the Percona Server for MySQL 5.7 series. All of Percona's software is open-source and free, and you can find all the release details in the 5.7.18-16 milestone at Launchpad.

Please note that RHEL 5, CentOS 5 and Ubuntu versions 12.04 and older are not supported in future releases of Percona Server, and no further packages will be added for these distributions.

New Features:

  • Percona Server for MySQL is now available on Debian 9 (stretch). The support only covers the amd64 architecture.
  • Percona Server for MySQL can now be built with the support of OpenSSL 1.1.
  • MyRocks storage engine has been merged into Percona Server.
  • TokuDB now makes it possible to kill a query that is waiting on an FT locktree lock.
  • TokuDB now enables using the MySQL DEBUG_SYNC facility within Percona FT.

Bugs Fixed:

  • Row counts in TokuDB could be lost intermittently after restarts. Bug fixed #2.
  • In TokuDB, two races in the fractal tree lock manager could significantly affect transactional throughput for some applications that used a small number of concurrent transactions. These races manifested as transactions unnecessarily waiting for an available lock. Bug fixed #3.
  • Percona FT could assert when opening a dictionary with no useful information to an error log. Bug fixed #23.
  • Percona FT could assert for various reasons deserializing nodes with no useful error output. Bug fixed #24.
  • It was not possible to build Percona Server on Debian 9 (stretch) due to issues with OpenSSL 1.1. Bug fixed #1702903 (upstream #83814).
  • Packaging was using the dpkg --verify command which is not available on wheezy/precise. Bug fixed #1694907.
  • Enabling and disabling the slow query log rotation spuriously added the version suffix to the next slow query log file name. Bug fixed #1704056.
  • With two client connections to a server (debug server build), the server could crash after one of the clients set the global option userstat and flushed the client statistics (FLUSH CLIENT_STATISTICS) and then both clients were closed. Bug fixed #1661488.
  • Percona FT did not pass cmake flags on to snappy cmake. Bug fixed #41.
  • The progress status for partitioned TokuDB table ALTERs was misleading. Bug fixed #42.
  • When a client application connected to the Aurora cluster endpoint using SSL (the --ssl-verify-server-cert or --ssl-mode=VERIFY_IDENTITY option), wildcard and SAN-enabled SSL certificates were ignored. Note that the --ssl-verify-server-cert option is deprecated in Percona Server 5.7. Bug fixed #1673656 (upstream #68052).
  • Killing a stored procedure execution could result in an assert failure on a debug server build. Bug fixed #1689736 (upstream #86260).
  • The SET STATEMENT .. FOR statement changed the global instead of the session value of a variable if the statement occurred immediately after the SET GLOBAL or SHOW GLOBAL STATUS command. Bug fixed #1385352.
  • When running SHOW ENGINE INNODB STATUS, the Buffer pool size, bytes entry contained 0. Bug fixed #1586262.
  • The synchronization between the LRU manager and page cleaner threads was not done at shutdown. Bug fixed #1689552.
  • Spurious lock_wait_timeout_thread wakeup in lock_wait_suspend_thread() could occur. Bug fixed #1704267 (upstream #72123).

Other bugs fixed: #1686603, #6, #44, #65, #1160986, #1686934, #1688319, #1689989, #1690012, #1691682, #1697700, #1699788, #1121072, and #1684601 (upstream #86016).

The release notes for Percona Server for MySQL 5.7.18-16 are available in the online documentation. Please report any bugs on the launchpad bug tracker.

Note

Due to a new package dependency, Ubuntu/Debian users should use apt-get dist-upgrade or apt-get install percona-server-server-5.7 to upgrade.

by Borys Belinsky at July 28, 2017 06:49 PM

Percona Server for MongoDB 3.4.6-1.6 is Now Available


Percona announces the release of Percona Server for MongoDB 3.4.6-1.6 on July 27, 2017. Download the latest version from the Percona web site or the Percona Software Repositories.

Percona Server for MongoDB is an enhanced, open source, fully compatible, highly-scalable, zero-maintenance downtime database supporting the MongoDB v3.4 protocol and drivers. It extends MongoDB with the Percona Memory Engine and MongoRocks storage engine, as well as several enterprise-grade features.

Percona Server for MongoDB requires no changes to MongoDB applications or code.

This release is based on MongoDB 3.4.6 and does not include any additional changes.

by Alexey Zhebel at July 28, 2017 05:37 PM

July 27, 2017

MariaDB AB

MariaDB Galera Cluster 5.5.57 now available


The MariaDB project is pleased to announce the immediate availability of MariaDB Galera Cluster 5.5.57. See the release notes and changelog for details.

Download MariaDB Galera Cluster 5.5.57




by dbart at July 27, 2017 05:17 PM

Peter Zaitsev

What is MySQL Partitioning?


In this blog, we’ll quickly look at MySQL partitioning.

Partitioning is a way in which a database (MySQL in this case) splits its actual data into separate physical tables that still get treated as a single table by the SQL layer.

When partitioning, it’s a good idea to find a natural partition key. You want to ensure that table lookups go to the correct partition or group of partitions. This means that all SELECT, UPDATE and DELETE statements should include that column in the WHERE clause. Otherwise, the storage engine does a scatter-gather and queries ALL partitions in a UNION that is not concurrent.

Generally, you must add the partition key into the primary key along with the auto increment, i.e., PRIMARY KEY (part_id,id). If you don’t have well-designed and small columns for this composite primary key, it could enlarge all of your secondary indexes.

You can partition by range or hash. Range is great because you have groups of known IDs in each table, and it helps when querying across partition IDs. This still can create hotspots in the newest partition, as all new inserts go there. Partitioning by hash “load balances” the table, and allows you to write to partitions more concurrently. This makes range queries on the partition key a bad idea.
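
As a minimal sketch of what this looks like in DDL (the table and column names here are made up purely for illustration), a range-partitioned InnoDB table with the composite primary key described above could be defined like this:

CREATE TABLE orders (
  part_id INT NOT NULL,               -- natural partition key
  id BIGINT NOT NULL AUTO_INCREMENT,  -- auto increment, first column of its own key
  created DATE NOT NULL,
  PRIMARY KEY (part_id, id),          -- the partition key must be part of every unique key
  KEY (id)
) ENGINE=InnoDB
PARTITION BY RANGE (part_id) (
  PARTITION p1 VALUES LESS THAN (1000),
  PARTITION p2 VALUES LESS THAN (2000),
  PARTITION pmax VALUES LESS THAN MAXVALUE
);

A hash-partitioned variant would simply replace the PARTITION BY RANGE clause with something like PARTITION BY HASH (part_id) PARTITIONS 8, trading convenient range pruning for more evenly spread writes.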

In MySQL 5.7, partitioning became native to the storage engine, and the old method where MySQL itself had to handle the partitions was deprecated. This means InnoDB partitions (and a larger number of partitions) are a better choice than in the past.

As with all features and recommendations, this only makes sense if it helps your data and workload!

by Manjot Singh at July 27, 2017 03:39 PM

July 26, 2017

MariaDB AB

High Availability with Multi-Source Replication in MariaDB Server


Multi-source replication is a handy way to simplify your high availability database setup – especially as compared to the regular MySQL replication approach. Let's look at how it works.

(Note: This post is a revision of one from 2014, with minor updates made.)

The Basics

For our example scenario, we'll use a three-server topology, which is commonly employed for simple failover. (I used the same setup in my post on how to enable GTIDs in MariaDB Server.) A is the active master, B is a standby master set up to replicate from A, and C is a multi-purpose slave replicating from A. I also set up A to replicate from B. This way, if A fails, the system will start writing to B, and once A comes back up it will recover all the transactions from B through replication. Typically this setup is used in combination with a tool like Master High Availability Manager (MHA) to migrate the slave from one master to the next.

For this particular case, assuming that the application(s) write to the active master only, we're going to set the domain ID for all servers to 1.

SET GLOBAL gtid_domain_id = 1;

We'll do the same in the my.cnf files:

gtid-domain-id=1

Set Up the Slave to Replicate from Two Masters

With the basics taken care of, we now will set up server C to replicate from both server A (active master) and server B (standby master). Keep in mind that now each transaction that reaches server C will come from two different sources. For example, if you issue an INSERT on the active master A, it will replicate to C and B. Since C is also B's slave, the same transaction will replicate from B to C. By default, C will apply both transactions, which is redundant and can lead to data inconsistencies and errors. To avoid this, it is necessary to set the gtid_ignore_duplicates variable to ON.

SET GLOBAL gtid_ignore_duplicates=ON;

And do the same in the my.cnf files:

gtid-ignore-duplicates=ON

This way, when C receives the same transaction through two different slave connections, it will verify the GTID and only apply it once.
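
To double-check that these settings are in effect on server C before adding the second connection, a quick sanity check (assuming MariaDB 10.0 or later) could look like this:

SHOW GLOBAL VARIABLES LIKE 'gtid_ignore_duplicates';
SHOW GLOBAL VARIABLES LIKE 'gtid_domain_id';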

The next step is to set the second slave connection. In my previous article, I set up the gtid_slave_pos global variable. Since server C is already replicating successfully from server A, this variable already has the proper value and there is no need to set it up manually. We only need to define the new connection using CHANGE MASTER TO:

CHANGE MASTER 's-102' TO  MASTER_HOST='192.168.56.102', 
 MASTER_USER='repl', MASTER_PASSWORD='repl',
 master_use_gtid=slave_pos;

Notice that we add the label 's-102' after the MASTER reserved word. This string identifies the second replication connection. This identifier needs to be used for every replication command. So, to start replicating from server B, you would use START SLAVE like this:

START SLAVE 's-102';

In the same way, to check the status of this replication connection you would use:

SHOW SLAVE 's-102' STATUS\G

You can also issue the commands to all replication connections by using the keyword ALL; for example:

SHOW ALL SLAVES STATUS\G

STOP ALL SLAVES;

START ALL SLAVES;

Notice that when using the keyword ALL, you also have to use SLAVES (plural form). The STOP and START commands will show as many warnings as the number of replication connections:

MariaDB [(none)]> START ALL SLAVES;
Query OK, 0 rows affected, 2 warnings (0.02 sec)
MariaDB [(none)]> SHOW WARNINGS;
+-------+------+-----------------------+
| Level | Code | Message               |
+-------+------+-----------------------+
| Note  | 1937 | SLAVE 's-102' started |
| Note  | 1937 | SLAVE '' started      |
+-------+------+-----------------------+
2 rows in set (0.00 sec)

After completing the setup, the resulting topology will look similar to this:

+-----+         +-----+
|  A  | <- - -> |  B  |
+-----+         +-----+
   |               |
   \               /
    \             /
     V           V
        +-----+
        |  C  |
        +-----+

Failures

If server A goes away or the connection between server A and server C fails, server C will still receive all the transactions through the alternate replication connection to server B (labeled 's-102'). Once server A comes back online again, and/or server C can reconnect to it, server C will compare all the GTIDs coming from server A and ignore those which have already been applied.

The advantage of using this approach over traditional single-source replication is that there is no need to re-point the slaves to a new master. When the active master fails and the applications start writing to the standby master, the slaves will continue to replicate from it, simplifying the management of failover scenarios.



by gerrynarvaja at July 26, 2017 11:29 PM

Enabling GTIDs for Server Replication in MariaDB Server 10.2


I originally wrote this post in 2014, after the release of MariaDB Server 10.0. Most of what was in that original post still applies, but I've made some tweaks and updates since replication and high availability (HA) remain among the most popular MariaDB/MySQL features.

Replication first appeared on the MySQL scene more than a decade ago, and as replication implementations became more complex over time, some limitations of MySQL’s original replication mechanisms started to surface. To address those limitations, MySQL v5.6 introduced the concept of global transaction identifiers (GTIDs), which enable some advanced replication features. MySQL DBAs were happy with this, but complained that in order to implement GTIDs you needed to stop all the servers in the replication group and restart them with the feature enabled. There are workarounds – for instance, Booking.com documented a procedure to enable GTIDs with little or no downtime, but it involves more complexity than most organizations are willing to allow. (Check out this blog post for more on how Booking.com handles replication and high availability.)

MariaDB implements GTIDs differently from MySQL, making it possible to enable and disable them with no downtime. Here’s how.

A Simple HA Implementation

Let’s start with a common HA implementation, with three servers running MariaDB 10.0 or higher: One active master (server A), one passive master (server B), and a slave replicating from the active master (server C). The active and passive masters are set up to replicate master-master.

Enabling GTIDs for server replication in MariaDB 10.0

I’m not showing it, but between servers A and B and the application you would want an additional layer to switch database traffic to server B in case A fails. Some organizations might deploy another slave replicating from B, or a mechanism to move server C to replicate from B, such as Master High Availability Manager (MHA), but let’s keep things simple here.

Step 1 – Setting Up the Configuration Files

GTIDs in MariaDB 10 have three parts: server ID, transaction ID, and domain ID. The server ID and transaction ID are similar in concept to those found in MySQL 5.6. In MariaDB, the server ID is a number and not a UUID, and it is taken from the server_id global variable. The domain ID is an important new concept for multisource replication, which you can read more about in the domain ID article in the MariaDB knowledgebase. In our case, this is the only variable we need to set up; the server ID should already be set up if replication is functional. Let’s use 1 for server A's domain ID, 2 for server B's and 3 for server C's by executing commands like the following on each of the servers:

SET GLOBAL gtid_domain_id = 1;

Keep in mind that each session can have its own value for gtid_domain_id, so you'll have to reset all existing connections to properly reset the gtid_domain_id. Finally, persist the values in the corresponding my.cnf files:

# Domain = 1 for active master: server A
gtid-domain-id=1

Step 2 – Changing Replication on the Slave

Running SHOW MASTER STATUS on server A, the active master, shows the current coordinates for its binary log file:

MariaDB [(none)]> SHOW MASTER STATUS\G
*************************** 1. row ***************************
            File: mariadb-bin.000001
        Position: 510
    Binlog_Do_DB:
Binlog_Ignore_DB:
1 row in set (0.00 sec)

We can use these coordinates to find the information we need in order to use GTIDs from the active master by running this command:

SELECT BINLOG_GTID_POS('mariadb-bin.000001', 510);
+--------------------------------------------+
| BINLOG_GTID_POS('mariadb-bin.000001', 510) |
+--------------------------------------------+
| 1-101-1                                    |
+--------------------------------------------+
1 row in set (0.00 sec)

Note that the GTID can be an empty string; for clarity these examples work with non-empty GTID values. The result from the function call is the current GTID, which corresponds to the binary file position on the master. With this value, we can now modify the slave configuration on servers B and C, executing the following statements on each of them:

STOP SLAVE;
SET GLOBAL gtid_slave_pos = '1-101-1';
CHANGE MASTER TO master_use_gtid=slave_pos;
START SLAVE;

Check the slave status to see that the change has taken effect:

MariaDB [mysql]> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.56.101
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mariadb-bin.000001
          Read_Master_Log_Pos: 510
               Relay_Log_File: mysqld-relay-bin.000002
                Relay_Log_Pos: 642
        Relay_Master_Log_File: mariadb-bin.000001
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
...
                   Using_Gtid: Slave_Pos
                  Gtid_IO_Pos: 1-101-1
1 row in set (0.00 sec)

The last two lines of SHOW SLAVE STATUS indicate that the slave is now using GTIDs to track replication.

Conclusion

As you can see, the procedure to enable GTIDs is straightforward and doesn’t require restarting servers or planning for downtime. If you want to revert back to regular replication using binary log positions, you can do so by using RESET SLAVE on the slave and resetting the proper binary log coordinates the traditional way. In fact, once you update your servers to MariaDB Server and review the binary log files with the mysqlbinlog utility, you'll notice that every transaction in the MariaDB binary logs already has the GTID included. For the binary log in the examples used in this article, here's what you see:

# at 314
#140807 14:16:01 server id 101  end_log_pos 352         GTID 1-101-1
/*!100001 SET @@session.gtid_domain_id=1*//*!*/;
/*!100001 SET @@session.server_id=101*//*!*/;
/*!100001 SET @@session.gtid_seq_no=1*//*!*/;
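
As a rough sketch of the revert path mentioned above, reusing the host, user and binary log coordinates from the earlier examples (adjust all of them to your own setup, and double-check the coordinates before running anything like this), switching a slave back to file-and-position replication could look like:

STOP SLAVE;
RESET SLAVE;
CHANGE MASTER TO MASTER_HOST='192.168.56.101', MASTER_USER='repl', MASTER_PASSWORD='repl',
  MASTER_LOG_FILE='mariadb-bin.000001', MASTER_LOG_POS=510, MASTER_USE_GTID=no;
START SLAVE;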

I hope that the ease of implementing GTIDs in MariaDB Server piques your curiosity and encourages you to explore the variety of replication features. For more on replication and other high availability/disaster recovery strategies, check out our white paper, High Availability with MariaDB TX: The Definitive Guide.

Replication has been one of the most popular MySQL features since it made its way into MySQL more than a decade ago. Global transaction IDs were introduced to make handling complex replication setups easier. This blog post explains how MariaDB makes handling GTIDs simpler.


by gerrynarvaja at July 26, 2017 10:56 PM

MariaDB Foundation

MariaDB Galera Cluster 5.5.57 and Connector/C 3.0.2 now available

The MariaDB project is pleased to announce the availability of MariaDB Galera Cluster 5.5.57 as well as MariaDB Connector/C 3.0.2. See the release notes and changelogs for details. […]

The post MariaDB Galera Cluster 5.5.57 and Connector/C 3.0.2 now available appeared first on MariaDB.org.

by Ian Gilfillan at July 26, 2017 07:31 PM

Peter Zaitsev

Percona Server for MongoDB 3.2.15-3.5 is Now Available


Percona announces the release of Percona Server for MongoDB 3.2.15-3.5 on July 26, 2017. Download the latest version from the Percona web site or the Percona Software Repositories.

Percona Server for MongoDB is an enhanced, open-source, fully compatible, highly scalable, zero-maintenance downtime database that supports the MongoDB v3.2 protocol and drivers. It extends MongoDB with MongoRocks, Percona Memory Engine, and PerconaFT storage engine, as well as enterprise-grade features like External Authentication, Audit Logging, Profiling Rate Limiting, and Hot Backup at no extra cost. The software requires no changes to MongoDB applications or code.

NOTE: We deprecated the PerconaFT storage engine. It will not be available in future releases.

This release is based on MongoDB 3.2.15 and does not include any additional changes.

Percona Server for MongoDB 3.2.15-3.5 release notes are available in the official documentation.

by Alexey Zhebel at July 26, 2017 04:42 PM

What is innodb_autoinc_lock_mode and why should I care?


In this blog post, we’ll look at what innodb_autoinc_lock_mode is and how it works.

I was recently discussing innodb_autoinc_lock_mode with some colleagues to address issues at a company I was working with.

This variable defines the lock mode to use for generating auto-increment values. The permissible values are 0, 1 or 2 (for “traditional”, “consecutive” or “interleaved” lock mode, respectively). In most cases, this variable is set to the default of 1.

We recommend setting it to 2 when binlog_format=ROW. With interleaved mode, INSERT statements don’t use the table-level AUTO-INC lock, and multiple statements can execute at the same time. Setting it to 0 or 1 can cause a huge hit in concurrency for certain workloads.

Interleaved (or 2) is the fastest and most scalable lock mode, but it is not safe if using STATEMENT-based replication or recovery scenarios when SQL statements are replayed from the binary log. Another consideration – which you shouldn’t rely on anyway – is that IDs might not be consecutive with a lock mode of 2. That means you could do three inserts and expect IDs 100,101 and 103, but end up with 100, 102 and 104. For most people, this isn’t a huge deal.

If you are only doing simple inserts, this might not help you. I did a sysbench test on MySQL 5.7 in Amazon RDS with 100 threads and found no difference in performance or throughput between lock modes 1 and 2. It helps the most when the number of rows to be inserted can’t be determined in advance, such as with INSERT INTO … SELECT statements.

You can find a longer-form article in the manual, but I highly recommend setting this value to 2 if you are not using STATEMENT-based replication.
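
Keep in mind that innodb_autoinc_lock_mode is a startup option, so it cannot be changed at runtime. A quick way to check the value currently in effect, and a sketch of the corresponding my.cnf change (adjust to your own configuration layout, and plan for a server restart), would be:

SELECT @@GLOBAL.innodb_autoinc_lock_mode;

[mysqld]
binlog_format            = ROW
innodb_autoinc_lock_mode = 2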

by Manjot Singh at July 26, 2017 03:15 PM

Jean-Jerome Schmidt

DevOps Considerations for Production-ready Database Deployments

MySQL is easy to install and use, and it has always been popular with developers and system administrators. On the other hand, deploying a production-ready MySQL environment for a business-critical enterprise workload is a different story. It can be a bit of a challenge, and requires in-depth knowledge of the database. In this blog post, we’ll discuss some of the steps which have to be taken before we can consider our MySQL deployment production-ready.

High Availability

If you belong to the lucky ones who can accept hours of downtime, you can stop reading here and skip to the next paragraph. For 99.999% of business-critical systems, it would not be acceptable. Therefore a production-ready deployment has to include high availability measures. Automated failover of the database instances, as well as a proxy layer which detects changes in topology and the state of MySQL and routes traffic accordingly, would be the main requirements. There are numerous tools which can be used to build such environments, for instance MHA, MRM or ClusterControl.

Proxy layer

Master failure detection, automated failover and recovery - these are crucial when building a production-ready infrastructure. But on their own, they are not enough. There is still an application which will have to adapt to the topology change triggered by the failover. Of course, it is possible to code the application so it is aware of instance failures, but this is a cumbersome and inflexible way of handling topology changes. Here comes the database proxy - a middle layer between application and database. A proxy can hide the complexity of your database layer from the application - all the application does is connect to the proxy, and the proxy takes care of the rest. The proxy will route queries to a database instance, handle topology changes and re-route as necessary. A proxy can also be used to implement read-write split, relieving the application from one more complex case to cover. This creates another challenge - which proxy to use? How to configure it? How to monitor it? How to make it highly available, so it does not become a SPOF?

ClusterControl can assist here. It can be used to deploy different proxies to form a proxy layer: ProxySQL, HAProxy and MaxScale. It preconfigures proxies to make sure they will handle traffic correctly. It also makes it easy to implement any configuration changes if you need to customize proxy setup for your application. Read-write split can be configured using any of the proxies ClusterControl supports. ClusterControl also monitors the proxies, and will recover them in case of failures. The proxy layer can become a single point of failure, as automated recovery might not be enough - to address that, ClusterControl can deploy Keepalived and configure Virtual IP to automate failover.

Backups

Even if you do not need to implement high availability, you probably still have to care about your data. Backup is a must for almost every production database. Nothing other than a backup can save you from an accidental DROP TABLE or DROP SCHEMA (well, maybe a delayed replication slave, but only for some period of time). MySQL offers multiple methods of taking backups - mysqldump, xtrabackup, different types of snapshots (some available only with particular hardware or a cloud provider). It’s not easy to design the correct backup strategy, decide which tools to use and then script the whole process so it executes correctly. It’s not rocket science either, but it requires careful planning and testing. Once a backup is taken, you are not done. Are you sure the backup can be restored, and the data is not garbage? Verifying your backups is time consuming, and perhaps not the most exciting thing on your todo list. But it is still important, and needs to be done regularly.

ClusterControl has extensive backup and restore functionality. It supports mysqldump for logical backup and Percona Xtrabackup for physical backup - those tools can be used in almost every environment, either cloud or on-premises. It is possible to build a backup strategy with a mixture of logical and physical backups, incremental or full, in an online fashion.

Apart from recovery, it also has options to verify a backup - for instance, restoring it on a separate host in order to verify whether the backup process works correctly.

If you’d like to regularly keep an eye on the backups (and you would probably want to do this), ClusterControl has the ability to generate operational reports. The backup report helps you track executed backups, and informs if there were any issues while taking them.


Monitoring and trending

No deployment is production-ready without proper monitoring of the services. You want to make sure you will be alerted if some services become unavailable, so you can take action, investigate or start recovery procedures. Of course, you also want to have a trending solution. It can’t be stressed enough how important it is to have monitoring data for assessing the state of the infrastructure or for any investigation, either post-mortem or while watching the state of services in real time. Metrics are not equal in importance - if you are not very familiar with a particular database product, you most likely won’t know which are the most important metrics to collect and watch. Sure, you might be able to collect everything, but when it comes to reviewing data, it’s hardly possible to go through hundreds of metrics per host - you need to know which of them to focus on.

The open source world is full of tools designed to monitor and collect metrics from different databases - most of them would require you to integrate them with your overall monitoring infrastructure, chatops platform or oncall support tools (like PagerDuty). It might also be required to install and integrate multiple components - storage (some sort of time-series database), presentation layer and data collection tools.

ClusterControl takes a bit of a different approach, as it is a single product with real-time monitoring, trending, and dashboards that show the most important details. Database advisors, which can be anything from simple configuration advice to warnings on thresholds or more complex rules for predictions, generally produce comprehensive recommendations.

Ability to scale-up

Databases tend to grow in size, and it is not unlikely that they will also grow in terms of transaction volume or number of users. The ability to scale out or up can be critical for production. Even if you do a great job of estimating your hardware requirements at the start of the product lifecycle, you will probably have to handle a growth phase - as long as your product is successful, that is (but that’s what we all plan for, right?). You have to have the means to easily scale up your infrastructure to cope with incoming load. For stateless services like webservers, this is fairly easy - you just need to provision more instances using the latest production image or code from your version control tool. For stateful services like databases, it’s trickier. You have to provision new instances using your current production data, and set up replication or some form of clustering between the current and the new instances. This can be a complex process, and to get it right you need in-depth knowledge of the clustering or replication model chosen.

ClusterControl, as the name suggests, provides extensive support for building out clustered or replicated database setups. The methods used are battle tested through thousands of deployments. It comes with a Command Line Interface (CLI) so it can be easily integrated with configuration management systems. Please keep in mind, though, that you might not want to make changes to your pool of databases too often - provisioning a new instance takes time and adds some overhead to existing databases. Therefore you may want to stay a little bit on the “over-provisioned” side, so that you have time to spin up a new instance before your cluster gets overloaded.

All in all, there are several steps you still have to take after initial deployment, to make sure your environment is ready for production. With the right tools, it is much easier to get there.

by krzysztof at July 26, 2017 11:11 AM

July 25, 2017

Peter Zaitsev

Webinar Thursday July 27, 2017: Database Backup and Recovery Best Practices (with a Focus on MySQL)


Join Percona Architect Manjot Singh as he presents Database Backup and Recovery Best Practices (with a Focus on MySQL) on Thursday, July 27, 2017 at 11:00 am PDT / 2:00 pm EDT (UTC-7).

In the case of a failure, do you know how long it will take to restore your database? Do you know how old the backup will be? In this presentation, we will cover the basics of best practices for backup, restoration and business continuity. Don’t put your company on the line due to bad data retention and backup policies.

Register for the webinar here.

Manjot Singh, Architect

Manjot Singh is an Architect with Percona in California. He loves to learn about new technologies and apply them to real-world problems. Manjot is a veteran of startup and Fortune 500 enterprise companies alike, with a few years spent in government, education and hospital IT. Now he consults for Percona with companies around the world on many interesting problems.

by Dave Avery at July 25, 2017 07:46 PM

Jean-Jerome Schmidt

Tips & Tricks - DevOps Database Glossary for the MySQL Novice

When you need to work with a database that you are not 100% familiar with, you can be overwhelmed by the hundreds of metrics available. Which ones are the most important? What should I monitor, and why? What patterns in metrics should ring some alarm bells? In this blog post we will try to introduce you to some of the most important metrics to keep an eye on while running MySQL or MariaDB in production.

Com_* status counters

We will start with Com_* counters - those define the number and types of queries that MySQL executes. We are talking here about query types like SELECT, INSERT, UPDATE and many more. It is quite important to keep an eye on those as sudden spikes or unexpected drops may suggest something went wrong in the system.

Our all-inclusive database management system ClusterControl shows you this data related to the most common query types in the “Overview” section.

Handler_* status counters

A category of metrics you should keep an eye on are the Handler_* counters in MySQL. Com_* counters tell you what kind of queries your MySQL instance is executing, but one SELECT can be totally different from another - a SELECT could be a primary key lookup, or it could be a table scan if an index cannot be used. Handlers tell you how MySQL accesses stored data - this is very useful for investigating performance issues and assessing whether there is a possible gain from query review and additional indexing.

As you can see from the graph above there are many metrics to track (and ClusterControl graphs the most important ones) - we won’t cover all of them here (you can find descriptions in MySQL documentation) but we’d like to highlight the most important ones.

Handler_read_rnd_next - whenever MySQL accesses a row without an index lookup, in sequential order, this counter will be increased. If in your workload handler_read_rnd_next is responsible for a high percentage of the whole traffic, it means that your tables, most likely, could use some additional indexes because MySQL does plenty of table scans.

Handler_read_next and handler_read_prev - those two counters are updated whenever MySQL does an index scan - forward or backward. Handler_read_first and handler_read_last may shed some more light onto what kind of index scans those are - if we are talking about full index scan (forward or backward), those two counters will be updated.

Handler_read_key - this counter, on the other hand, if its value is high, tells you that your tables are well indexed as many of the rows were accessed through an index lookup.
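
As a quick way to eyeball both counter families yourself (a minimal sketch; in practice you would graph the deltas over time rather than read the raw cumulative values), you can query the status variables directly:

SHOW GLOBAL STATUS LIKE 'Com\_%';
SHOW GLOBAL STATUS LIKE 'Handler\_%';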

Replication lag

If you are working with MySQL replication, replication lag is a metric you definitely want to monitor. Replication lag is inevitable and you will have to deal with it, but to deal with it you need to understand why it happens. For that the first step will be to know _when_ it showed up.

Whenever you see a spike of the replication lag, you’d want to check other graphs to get more clues - why has it happened? What might have caused it? Reasons could be different - long, heavy DML’s, significant increase in number of DML’s executed in a short period of time, CPU or I/O limitations.

InnoDB I/O

There are a number of important metrics to monitor that relate to I/O.

In the graph above, you can see a couple of metrics which tell you what kind of I/O InnoDB does - data writes and reads, redo log writes, fsyncs. Those metrics will help you to decide, for example, whether replication lag was caused by a spike of I/O or by some other reason. It’s also important to keep track of those metrics and compare them with your hardware limitations - if you are getting close to the hardware limits of your disks, it may be time to look into this before it has more serious effects on your database performance.


Galera metrics - flow control and queues

If you happen to use Galera Cluster (no matter which flavor you use), there are a couple more metrics you’d want to monitor closely; these are somewhat tied together. The first of them are metrics related to flow control.

Flow control, in Galera, is a means to keep the cluster in sync. Whenever a node stalls and cannot keep up with the rest of the cluster, it starts to send flow control messages asking the remaining cluster nodes to slow down, which allows it to catch up. This reduces the performance of the cluster, so it is important to be able to tell which node started to send flow control messages, and when. This can explain some of the slowdowns experienced by users, or narrow down the time window and host to use for further investigation.

The second set of metrics to monitor are the ones related to the send and receive queues in Galera.

Galera nodes can cache writesets (transactions) if they cannot apply all of them immediately. If needed, they can also cache writesets which are about to be sent to other nodes (if a given node receives writes from the application). Both cases are symptoms of a slowdown that, most likely, will result in flow control messages being sent, and they require some investigation - why did it happen, on which node, and at what time?
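
The corresponding wsrep status counters can be checked directly on each node (again a minimal sketch; trending these values over time is far more useful than a single snapshot):

SHOW GLOBAL STATUS LIKE 'wsrep_flow_control%';
SHOW GLOBAL STATUS LIKE 'wsrep_local_send_queue%';
SHOW GLOBAL STATUS LIKE 'wsrep_local_recv_queue%';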

This is, of course, just the tip of the iceberg when we consider all of the metrics MySQL makes available - still, you can’t go wrong if you start watching those we covered here, in addition to regular OS/hardware metrics like CPU, memory, disk utilization and state of the services.

by krzysztof at July 25, 2017 08:45 AM

July 24, 2017

Peter Zaitsev

Percona XtraBackup 2.4.8 is Now Available


Percona announces the GA release of Percona XtraBackup 2.4.8 on July 24, 2017. You can download it from our download site and apt and yum repositories.

Percona XtraBackup enables MySQL backups without blocking user queries, making it ideal for companies with large data sets and mission-critical applications that cannot tolerate long periods of downtime. Offered free as an open source solution, Percona XtraBackup drives down backup costs while providing unique features for MySQL backups.

New features:
Bugs Fixed:
  • xtrabackup would hang with Waiting for master thread to be suspended message when backup was being prepared. Bug fixed #1671437.
  • xtrabackup would fail to prepare the backup with 6th page is not initialized message in case server didn’t properly initialize the page. Bug fixed #1671722.
  • xbstream could run out of file descriptors while extracting the backup which contains many tables. Bug fixed #1690823.
  • When a table was created with the DATA DIRECTORY option xtrabackup would back up the .frm and .isl files, but not the .ibd file. Due to the missing .ibd files backup then could not be restored. Bug fixed #1701736.
  • Percona XtraBackup incorrectly determined use of master_auto_position on a slave, and thus generated an invalid xtrabackup_slave_info file. Bug fixed #1705193.
  • Percona XtraBackup will now print a warning if it encounters unsupported storage engine. Bug fixed #1394493.
  • Percona XtraBackup would crash while backing up MariaDB 10.2.x with --ftwrl-* options. Bug fixed #1704636.
  • xtrabackup --slave-info didn’t write the correct information into xtrabackup_slave_info file when multi-source replication was used. Bug fixed #1551634.
  • Along with the xtrabackup_checkpoints file, xtrabackup now copies the xtrabackup_info file into the directory specified by the --extra-lsndir option. Bug fixed #1600656.
  • GTID position was not recorded when --binlog-info option was set to AUTO. Bug fixed #1651505.

Release notes with all the bugfixes for Percona XtraBackup 2.4.8 are available in our online documentation. Please report any bugs to the launchpad bug tracker.

by Hrvoje Matijakovic at July 24, 2017 05:59 PM

Percona XtraBackup 2.3.9 is Now Available


Percona announces the release of Percona XtraBackup 2.3.9 on July 24, 2017. Downloads are available from our download site or Percona Software Repositories.

Percona XtraBackup enables MySQL backups without blocking user queries, making it ideal for companies with large data sets and mission-critical applications that cannot tolerate long periods of downtime. Offered free as an open source solution, Percona XtraBackup drives down backup costs while providing unique features for MySQL backups.

This release is the current GA (Generally Available) stable release in the 2.3 series.

New Features
Bugs Fixed:
  • Percona XtraBackup would crash when being prepared if the index compaction was enabled. Bug fixed #1192834.
  • Fixed build failure on Debian Stretch by adding support for building with OpenSSL 1.1. Bug fixed #1678947.
  • xbstream could run out of file descriptors while extracting the backup which contains many tables. Bug fixed #1690823.
  • Percona XtraBackup incorrectly determined use of master_auto_position on a slave, and thus generated an invalid xtrabackup_slave_info file. Bug fixed #1705193.
  • Percona XtraBackup would crash while backing up MariaDB 10.2.x with --ftwrl-* options. Bug fixed #1704636.
  • Along with the xtrabackup_checkpoints file, xtrabackup now copies the xtrabackup_info file into the directory specified by the --extra-lsndir option. Bug fixed #1600656.
  • GTID position was not recorded when --binlog-info option was set to AUTO. Bug fixed #1651505.

Release notes with all the bugfixes for Percona XtraBackup 2.3.9 are available in our online documentation. Bugs can be reported on the launchpad bug tracker.

by Hrvoje Matijakovic at July 24, 2017 05:35 PM

MariaDB Foundation

Community contributions to MariaDB

One of the goals of the MariaDB Foundation is to help new contributors understand the source code and to lower the barrier for new participants. One way to measure this is to look at the number of pull requests received and accepted, as these mostly reflect community contributions. The figures below are for the main […]

The post Community contributions to MariaDB appeared first on MariaDB.org.

by Sergey Vojtovich at July 24, 2017 09:19 AM

July 23, 2017

Valeriy Kravchuk

Why Thread May Hang in "Waiting for table level lock" State - Part I

The last time I had to use gdb to check what was going on in a MySQL server, and found something useful with it to share in a blog post, was back in April 2017, and I miss this kind of experience already. So, today I decided to try it again in the hope of getting some insights in cases where other tools may not apply or may not be as helpful as one could expect. Here is a long enough story, based on a recent customer issue I worked on this week.
* * *
Have you seen anything like this output of the SHOW PROCESSLIST statement:

Id  User        Host               db         Command  Time  State                         Info                                                Progress
...
28  someuser01  xx.xx.xx.xx:39644  somedb001  Sleep    247                                 NULL                                                0.000
29  someuser01  xx.xx.xx.yy:44100  somedb001  Query    276   Waiting for table level lock  DELETE FROM t1 WHERE (some_id = 'NNNNN') AND ...    0.000
...
33  someuser01  xx.xx.zz.tt:35886  somedb001  Query    275   Waiting for table level lock  DELETE FROM t2 WHERE (some_id = 'XXXXX')            0.000
...
38  someuser01  xx.xx.xx.xx:57055  somedb001  Query    246   Waiting for table level lock  DELETE FROM t3 WHERE (some_id in (123456)) AND ...  0.000
...
recently? That is, many threads accessing InnoDB(!) tables and hanging in the "Waiting for table level lock" state for a long time without any obvious reason in the SHOW PROCESSLIST or SHOW ENGINE INNODB STATUS?

I've seen it this week, along with a statement from the customer that the server was basically unusable. Unlike in many other cases, I did not find any obvious problem in the outputs mentioned above (other than high concurrency and normal InnoDB row lock waits etc. for other sessions). There were no unexpected table level locks reported by InnoDB, and the longest running transactions were those in that waiting state.

Unfortunately the fine MySQL manual is not very helpful when describing this thread state:
"Waiting for lock_type lock
The server is waiting to acquire a THR_LOCK lock or a lock from the metadata locking subsystem, where lock_type indicates the type of lock.
This state indicates a wait for a THR_LOCK:
  • Waiting for table level lock
These states indicate a wait for a metadata lock:
  • Waiting for event metadata lock
  • Waiting for global read lock
    ...
    "
What that "THR_LOCK" should tell the reader is beyond me (and probably most of MySQL users). Fortunately, I know from experience that this state usually means that thread is trying to access some MyISAM table. The tables mentioned in currently running statements were all InnoDB, though. Server had performance_schema enabled, but only with default instrumentation and consumers defined, so I had no chance to get anything related to metadata locks or much hope to see all previous statement executed by each thread anyway. I surely insisted on checking the source code etc, but this rarely gives immediate results.

So, to check my assumption and try to prove that MyISAM tables were involved, and to get some hint about which sessions they were locked by, I suggested running mysqladmin debug, a command that is rarely used these days and whose output Oracle explicitly refused to document properly (see my Bug #71346)! I do know what to expect there (from the days when MyISAM was widely used), and when in doubt I can always check the source code.

The mysqladmin debug command outputs a lot of information into the error log. Among other things I found the following section:

Thread  database.table_name     Locked/Waiting   Lock_type

...
928     somedb001.somewhat_seq  Waiting - write  High priority write lock
...
12940   somedb001.somewhat_seq  Locked - write   High priority write lock
where somedb001.somewhat_seq surely was the MyISAM table. There were actually many tables mentioned as locked, with waits on them, and with names ending in the _seq suffix. One can now check what the MySQL thread with id=12940 is doing at all levels, where it comes from, whether it should be killed, etc.

Now it became clearer what was going on. Probably the application developers had tried to implement sequences with MyISAM tables and, probably, triggers to use them when no data are provided for the columns! The idea is quite popular and described in many blog posts. Check this old blog post by Peter Zaitsev, for example. This assumption was confirmed, and essentially they did something like I do in the following deliberately primitive example:
-- this is our sequence table
mysql> show create table misam\G
*************************** 1. row ***************************
       Table: misam
Create Table: CREATE TABLE `misam` (
  `id` int(11) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8mb4
1 row in set (0.00 sec)

mysql> create table inno like misam;
Query OK, 0 rows affected (0.30 sec)

mysql> alter table inno engine=InnoDB;
Query OK, 0 rows affected (2.73 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> show create table inno\G
*************************** 1. row ***************************
       Table: inno
Create Table: CREATE TABLE `inno` (
  `id` int(11) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4
1 row in set (0.03 sec)

mysql> select * from misam;
Empty set (0.00 sec)

-- let's set up the first value for the sequence
mysql> insert into misam values(1);
Query OK, 1 row affected (0.05 sec)

-- now let's create a trigger to insert the value from the sequence
mysql> delimiter //
mysql> create trigger tbi before insert on inno
    -> for each row
    -> begin
    -> if ifnull(new.id,0) = 0 then
    ->   update misam set id=last_insert_id(id+1);
    ->   set new.id = last_insert_id();
    -> end if;
    -> end
    -> //

Query OK, 0 rows affected (0.25 sec)

mysql> delimiter ;
mysql> start transaction;
Query OK, 0 rows affected (0.00 sec)

mysql> insert into inno values(0);
Query OK, 1 row affected (0.20 sec)

mysql> insert into inno values(0);
Query OK, 1 row affected (0.10 sec)

mysql> select * from inno;
+----+
| id |
+----+
|  2 |
|  3 |
+----+
2 rows in set (0.00 sec)

mysql> select * from misam;
+----+
| id |
+----+
|  3 |
+----+
1 row in set (0.00 sec)
So, this is how it is supposed to work, and, you know, it works from two concurrent sessions without any locking noticed etc.

This is the end of the real story, for now. We have a lot of things to discuss with the customer, including how to find out the previous commands executed in each session, what useful details we can get from the performance_schema, etc. The root cause was clear, but we still have to find out why the sessions spend so much time holding the blocking locks on MyISAM tables and what we can do about that, or about an architecture based on such an implementation of sequences (which I kindly ask you to avoid whenever possible! Use auto_increment values, please, if you care about concurrency). I expect more blog posts eventually inspired by that real story, but now it's time to move on to more generic technical details.
* * *
I have two problems with this story if I try to approach it in a more generic way.

First, do we have any other way to see MyISAM table level locks, besides the mysqladmin debug command? (Oracle kindly plans to replace it with something; let me quote Shane Bester's comment in the bug that originates from his internal feature request: "We need to replace it with appropriate performance_schema or information_schema tables.")

Second, how exactly do we reproduce the situation the customer reported? My initial attempts to get such a state with the trigger in place failed: I was not able to capture threads spending any notable time in the "Waiting for table level lock" state even with concurrent single-row inserts and the trigger above in place. I was thinking about an explicit LOCK TABLES misam WRITE etc., but I doubt the real code uses that. Fortunately, the following INSERT running from one session:
mysql> insert into inno values(sleep(100));
Query OK, 1 row affected (1 min 40.20 sec)
allowed me to run an INSERT in another session that got blocked:
mysql> insert into inno values(0);
Query OK, 1 row affected (1 min 36.36 sec)
and while it was blocked I saw the thread state I wanted in the output of SHOW PROCESSLIST:
mysql> show processlist;
+----+------+-----------+------+---------+------+------------------------------+-------------------------------------+-----------+---------------+
| Id | User | Host      | db   | Command | Time | State                        | Info                                | Rows_sent | Rows_examined |
+----+------+-----------+------+---------+------+------------------------------+-------------------------------------+-----------+---------------+
| 16 | root | localhost | test | Query   |   17 | User sleep                   | insert into inno values(sleep(100)) |         0 |             0 |
| 17 | root | localhost | test | Query   |   13 | Waiting for table level lock | insert into inno values(0)          |         0 |             0 |
| 22 | root | localhost | test | Query   |    0 | starting                     | show processlist                    |         0 |             0 |
+----+------+-----------+------+---------+------+------------------------------+-------------------------------------+-----------+---------------+
3 rows in set (0.00 sec)
So, I know exactly how to reproduce the problem and what it can be caused by. Any slow-running single INSERT statement (caused by something complex executed in a trigger, some slow function used in the list of values inserted, or maybe just an INSERT ... SELECT ... from a big table) will give us the desired thread state.

Coming back to the first generic problem mentioned above, is there any way, besides running mysqladmin debug and checking the output against the processlist etc., to identify the thread that holds MyISAM table level locks? One should expect performance_schema to help with this, at least as of 5.7 (I've used Percona Server 5.7.18-15 on my Ubuntu 14.04 netbook for testing today, while the original problem was on MariaDB 10.2.x. Percona is for fun, while MariaDB is for real work...). Indeed, we have table_handles there, and until recently I ignored its existence (it's new in 5.7, and I am not even sure whether MariaDB 10.2.x has it already).

So, I reproduced the problem again and got the following there:
mysql> select * from performance_schema.table_handles;
+-------------+---------------+-------------+-----------------------+-----------------+----------------+---------------+----------------+
| OBJECT_TYPE | OBJECT_SCHEMA | OBJECT_NAME | OBJECT_INSTANCE_BEGIN | OWNER_THREAD_ID | OWNER_EVENT_ID | INTERNAL_LOCK | EXTERNAL_LOCK  |
+-------------+---------------+-------------+-----------------------+-----------------+----------------+---------------+----------------+
| TABLE       | test          | t           |       140392772351024 |               0 |              0 | NULL          | NULL           |
| TABLE       | test          | t1          |       140392772352560 |               0 |              0 | NULL          | NULL           |
| TABLE       | test          | ttime       |       140392772355632 |               0 |              0 | NULL          | NULL           |
| TABLE       | test          | misam       |       140392772358704 |              55 |             10 | WRITE         | WRITE EXTERNAL |
| TABLE       | test          | misam       |       140392896981552 |              54 |             10 | WRITE         | WRITE EXTERNAL |
| TABLE       | test          | inno        |       140392772361776 |              55 |             10 | NULL          | WRITE EXTERNAL |
| TABLE       | test          | inno        |       140392896983088 |              54 |             10 | NULL          | WRITE EXTERNAL |
| TABLE       | test          | inno        |       140392826836016 |               0 |              0 | NULL          | NULL           |
| TABLE       | test          | misam       |       140392826837552 |               0 |              0 | NULL          | NULL           |
+-------------+---------------+-------------+-----------------------+-----------------+----------------+---------------+----------------+
9 rows in set (0.00 sec)
What am I supposed to do with the above to find out the MySQL thread id of the blocking session? I am supposed to join to the performance_schema.threads table, looking for thread_id values that are the same as the owner_thread_id values above, 54 and 55. I get the following (note that processlist_id is what you are looking for in the processlist):
mysql> select * from performance_schema.threads where thread_id in (54, 55)\G
*************************** 1. row ***************************
          THREAD_ID: 54
               NAME: thread/sql/one_connection
               TYPE: FOREGROUND
     PROCESSLIST_ID: 27
   PROCESSLIST_USER: root
   PROCESSLIST_HOST: localhost
     PROCESSLIST_DB: test
PROCESSLIST_COMMAND: Sleep
   PROCESSLIST_TIME: 90
  PROCESSLIST_STATE: NULL
   PROCESSLIST_INFO: NULL
   PARENT_THREAD_ID: NULL
               ROLE: NULL
       INSTRUMENTED: YES
            HISTORY: YES
    CONNECTION_TYPE: Socket
       THREAD_OS_ID: 32252
*************************** 2. row ***************************
          THREAD_ID: 55
               NAME: thread/sql/one_connection
               TYPE: FOREGROUND
     PROCESSLIST_ID: 28
   PROCESSLIST_USER: root
   PROCESSLIST_HOST: localhost
     PROCESSLIST_DB: test
PROCESSLIST_COMMAND: Sleep
   PROCESSLIST_TIME: 101
  PROCESSLIST_STATE: NULL
   PROCESSLIST_INFO: NULL
   PARENT_THREAD_ID: NULL
               ROLE: NULL
       INSTRUMENTED: YES
            HISTORY: YES
    CONNECTION_TYPE: Socket
       THREAD_OS_ID: 27844
2 rows in set (0.10 sec)
I do not see a way to distinguish a lock wait from actually holding the lock. All I know at the moment is that there is an engine-level lock on the misam table (and the inno table, for that matter). If the MySQL thread id is NOT that of the thread that is "Waiting for table level lock", then it must be the thread that holds the lock! Some smart join with a proper WHERE clause would let me find out what I need directly. Maybe in one of the next parts I'll even present it, but writing it off the top of my head in a critical situation is above my current performance_schema skills.
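For the record, here is a rough sketch of the kind of join I have in mind (written calmly after the fact, not under pressure, so treat it as a starting point rather than a ready-made tool): it keeps only the sessions that hold an engine-level lock on test.misam and are not themselves waiting for it.

-- sessions holding (not waiting for) an engine-level lock on test.misam
select t.processlist_id, t.processlist_info, t.processlist_state
  from performance_schema.table_handles h
  join performance_schema.threads t
    on t.thread_id = h.owner_thread_id
 where h.object_schema = 'test'
   and h.object_name = 'misam'
   and h.internal_lock is not null
   and (t.processlist_state is null
        or t.processlist_state <> 'Waiting for table level lock');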
* * *
Now, what should those of us do who do not have performance_schema enabled, or have to use a version before 5.7? Or those with access to gdb and 5 spare minutes?

Surely we should attach gdb to the mysqld process and, if in doubt, read some parts of the source code to know where to start. I started with the following fragments of code that were easy to find (as long as you know that the COM_DEBUG command is what mysqladmin debug actually sends to the server):
openxs@ao756:~/git/percona-server$ grep -rni com_debug *
include/mysql/plugin_audit.h.pp:160:  COM_DEBUG,
include/mysql.h.pp:47:  COM_DEBUG,
include/my_command.h:39:  COM_DEBUG,
libmysql/libmysql.c:899:  DBUG_RETURN(simple_command(mysql,COM_DEBUG,0,0,0));
rapid/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/task_debug.h:62:  "[XCOM_DEBUG] ",
sql/sql_parse.cc:1884:  case COM_DEBUG:
^C
...
# from the line highlighted above...
  case COM_DEBUG:
    thd->status_var.com_other++;
    if (check_global_access(thd, SUPER_ACL))
      break;                                    /* purecov: inspected */
    mysql_print_status();
    query_logger.general_log_print(thd, command, NullS);
    my_eof(thd);
    break;

openxs@ao756:~/git/percona-server$ grep -rni mysql_print_status *
sql/sql_test.h:40:void mysql_print_status();
sql/sql_parse.cc:59:#include "sql_test.h"         // mysql_print_status
sql/sql_parse.cc:1888:    mysql_print_status();
sql/sql_test.cc:456:void mysql_print_status()
sql/mysqld.cc:69:#include "sql_test.h"     // mysql_print_status
sql/mysqld.cc:2387:        mysql_print_status();   // Print some debug info
openxs@ao756:~/git/percona-server$

# from the line highlighted above...

void mysql_print_status()
{
  char current_dir[FN_REFLEN];
  STATUS_VAR current_global_status_var;

  printf("\nStatus information:\n\n");
  (void) my_getwd(current_dir, sizeof(current_dir),MYF(0));
  printf("Current dir: %s\n", current_dir);
  printf("Running threads: %u  Stack size: %ld\n",
         Global_THD_manager::get_instance()->get_thd_count(),
         (long) my_thread_stack_size);
  thr_print_locks();                            // Write some debug info
...
  printf("\nTable status:\n\
...
  display_table_locks();
...
# finally, let's find the code of display_table_locks...
openxs@ao756:~/git/percona-server$ grep -rni display_table_locks *
sql/sql_test.cc:371:static void display_table_locks(void)
sql/sql_test.cc:501:  display_table_locks();
sql/mysqld.cc:9537:  { &key_memory_locked_thread_list, "display_table_locks", PSI_FLAG_THREAD},
openxs@ao756:~/git/percona-server$

static void display_table_locks(void)
{
  LIST *list;
  Saved_locks_array saved_table_locks(key_memory_locked_thread_list);
  saved_table_locks.reserve(table_cache_manager.cached_tables() + 20);

  mysql_mutex_lock(&THR_LOCK_lock);
  for (list= thr_lock_thread_list; list; list= list_rest(list))
  {
    THR_LOCK *lock=(THR_LOCK*) list->data;
    mysql_mutex_lock(&lock->mutex);
    push_locks_into_array(&saved_table_locks, lock->write.data, FALSE,
                          "Locked - write");
    push_locks_into_array(&saved_table_locks, lock->write_wait.data, TRUE,
                          "Waiting - write");
    push_locks_into_array(&saved_table_locks, lock->read.data, FALSE,
                          "Locked - read");
    push_locks_into_array(&saved_table_locks, lock->read_wait.data, TRUE,
                          "Waiting - read");
    mysql_mutex_unlock(&lock->mutex);
  }
...
That's enough to start. We'll finally have to study what THR_LOCK is. There is a global (doubly) linked list of all of them, thr_lock_thread_list, and this is where we can get the details from, in a way similar (let's hope) to what the server does while processing the COM_DEBUG command for us. So, let's attach gdb and let the fun begin:
openxs@ao756:~/git/percona-server$ sudo gdb -p 23841
[sudo] password for openxs:
GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.2) 7.7.1
...
Loaded symbols for /lib/x86_64-linux-gnu/libnss_files.so.2
0x00007fafd93d9c5d in poll () at ../sysdeps/unix/syscall-template.S:81
81      ../sysdeps/unix/syscall-template.S: No such file or directory.
(gdb) set $list=(LIST *)thr_lock_thread_list
(gdb) set $lock=(THR_LOCK*) $list->data
(gdb) p *($lock)
$1 = {list = {prev = 0x0, next = 0x7fafd2fa4780, data = 0x7fafbd545c80},
  mutex = {m_mutex = {__data = {__lock = 0, __count = 0, __owner = 0,
        __nusers = 0, __kind = 3, __spins = 0, __elision = 0, __list = {
          __prev = 0x0, __next = 0x0}},
      __size = '\000' <repeats 16 times>, "\003", '\000' <repeats 22 times>,
      __align = 0}, m_psi = 0x7fafd1f97c80}, read_wait = {data = 0x0,
    last = 0x7fafbd545cc8}, read = {data = 0x0, last = 0x7fafbd545cd8},
  write_wait = {data = 0x0, last = 0x7fafbd545ce8}, write = {
    data = 0x7fafbd69f188, last = 0x7fafbd69f190}
, write_lock_count = 0,
  read_no_write_count = 0, get_status = 0xf84910 <mi_get_status>,
  copy_status = 0xf84a70 <mi_copy_status>,
  update_status = 0xf84970 <mi_update_status>,
  restore_status = 0xf84a50 <mi_restore_status>,
  check_status = 0xf84a80 <mi_check_status>}
In the above we started from the first item in the list. If needed, we can move on to the next one with set $list=(LIST *)($list->next) etc. We then interpret $list->data as a (THR_LOCK *). There we can see the read, read_wait, write and write_wait structures. One of them will have a data item that is not 0x0. This way we can identify whether it is a lock or a lock wait, and what kind of lock it is. In my case I was looking for a write lock, and here it is, the very first one. So, I continue:
(gdb) p *($lock)->write.data
$2 = {owner = 0x7fafbd468d38, next = 0x0, prev = 0x7fafbd545cf8,
  lock = 0x7fafbd545c80, cond = 0x0, type = TL_WRITE,
  status_param = 0x7fafbd69ee20, debug_print_param = 0x7fafbd515020,
  m_psi = 0x7fafa4c06cc0}
We can check the owner field:
(gdb) p *($lock)->write.data.owner
$3 = {thread_id = 16, suspend = 0x7fafbd469370}
and thread_id there is what we were looking for: the MySQL thread id of the blocking thread. If we want to get the details about the locked table, we can do this as follows, by studying the debug_print_param in data:
(gdb) set $table=(TABLE *) &(*($lock)->write.data->debug_print_param)
(gdb) p $table
$4 = (TABLE *) 0x7fafbd515020
(gdb) p *($table)
$5 = {s = 0x7fafbd4d7030, file = 0x7fafbd535230, next = 0x0, prev = 0x0,
  cache_next = 0x0, cache_prev = 0x7fafc4df05a0, in_use = 0x7fafbd468000,
  field = 0x7fafbd4d7440, hidden_field_count = 0, record = {
    0x7fafbd4d7430 "\377", 0x7fafbd4d7438 "\n"}, write_row_record = 0x0,
  insert_values = 0x0, covering_keys = {map = 0}, quick_keys = {map = 0},
  merge_keys = {map = 0}, possible_quick_keys = {map = 0},
  keys_in_use_for_query = {map = 0}, keys_in_use_for_group_by = {map = 0},
  keys_in_use_for_order_by = {map = 0}, key_info = 0x7fafbd4d7508,
  next_number_field = 0x0, found_next_number_field = 0x0, vfield = 0x0,
  hash_field = 0x0, fts_doc_id_field = 0x0, triggers = 0x0,
  pos_in_table_list = 0x7fafbd47a9b8, pos_in_locked_tables = 0x7fafbd4e5030,
  group = 0x0, alias = 0x7fafbbbad120 "misam",
  null_flags = 0x7fafbd4d7430 "\377", bitmap_init_value = 0x0, def_read_set = {
    bitmap = 0x7fafbd4d75a8, n_bits = 1, last_word_mask = 4294967294,
    last_word_ptr = 0x7fafbd4d75a8, mutex = 0x0}, def_write_set = {
    bitmap = 0x7fafbd4d75ac, n_bits = 1, last_word_mask = 4294967294,
    last_word_ptr = 0x7fafbd4d75ac, mutex = 0x0}, tmp_set = {
    bitmap = 0x7fafbd4d75b0, n_bits = 1, last_word_mask = 4294967294,
    last_word_ptr = 0x7fafbd4d75b0, mutex = 0x0}, cond_set = {
    bitmap = 0x7fafbd4d75b4, n_bits = 1, last_word_mask = 4294967294,
    last_word_ptr = 0x7fafbd4d75b4, mutex = 0x0},
  def_fields_set_during_insert = {bitmap = 0x7fafbd4d75b8, n_bits = 1,
    last_word_mask = 4294967294, last_word_ptr = 0x7fafbd4d75b8, mutex = 0x0},
---Type <return> to continue, or q <return> to quit---
  read_set = 0x7fafbd515128, write_set = 0x7fafbd515148,
  fields_set_during_insert = 0x7fafbd5151a8, query_id = 0, quick_rows = {
    0 <repeats 64 times>}, const_key_parts = {0 <repeats 64 times>},
  quick_key_parts = {0 <repeats 64 times>}, quick_n_ranges = {
    0 <repeats 64 times>}, quick_condition_rows = 0, lock_position = 0,
  lock_data_start = 0, lock_count = 1, temp_pool_slot = 0, db_stat = 39,
  current_lock = 1, nullable = 0 '\000', null_row = 0 '\000',
  status = 3 '\003', copy_blobs = 0 '\000', force_index = 0 '\000',
  force_index_order = 0 '\000', force_index_group = 0 '\000',
  distinct = 0 '\000', const_table = 0 '\000', no_rows = 0 '\000',
  key_read = 0 '\000', no_keyread = 0 '\000', locked_by_logger = 0 '\000',
  no_replicate = 0 '\000', locked_by_name = 0 '\000',
  fulltext_searched = 0 '\000', no_cache = 0 '\000',
  open_by_handler = 0 '\000', auto_increment_field_not_null = 0 '\000',
  insert_or_update = 0 '\000', alias_name_used = 0 '\000',
  get_fields_in_item_tree = 0 '\000', m_needs_reopen = 0 '\000',
  created = true, max_keys = 0, reginfo = {join_tab = 0x0, qep_tab = 0x0,
    lock_type = TL_WRITE, not_exists_optimize = false,
    impossible_range = false}, mem_root = {free = 0x7fafbd4d7420,
    used = 0x7fafbd535220, pre_alloc = 0x0, min_malloc = 32, block_size = 992,
    block_num = 6, first_block_usage = 0, max_capacity = 0,
    allocated_size = 2248, error_for_capacity_exceeded = 0 '\000',
    error_handler = 0xc4b240 <sql_alloc_error_handler()>, m_psi_key = 106},
---Type <return> to continue, or q <return> to quit---
  blob_storage = 0x0, grant = {grant_table = 0x0, version = 0,
    privilege = 536870911, m_internal = {m_schema_lookup_done = true,
      m_schema_access = 0x0, m_table_lookup_done = true,
      m_table_access = 0x0}}, sort = {filesort_buffer = {m_next_rec_ptr = 0x0,
      m_rawmem = 0x0, m_record_pointers = 0x0, m_sort_keys = 0x0,
      m_num_records = 0, m_record_length = 0, m_sort_length = 0,
      m_size_in_bytes = 0, m_idx = 0}, io_cache = 0x0, merge_chunks = {
      m_array = 0x0, m_size = 0}, addon_fields = 0x0,
    sorted_result_in_fsbuf = false, sorted_result = 0x0,
    sorted_result_end = 0x0, found_records = 0}, part_info = 0x0,
  all_partitions_pruned_away = false, mdl_ticket = 0x7fafbd4c4f10,
  m_cost_model = {m_cost_model_server = 0x0, m_se_cost_constants = 0x0,
    m_table = 0x0}, should_binlog_drop_if_temp_flag = false}
We already see the table alias, but more details are in the s field (of TABLE_SHARE * type):
(gdb) p $table->s
$26 = (TABLE_SHARE *) 0x7fafbd4d7030
...
(gdb) p $table->s->table_cache_key
$27 = {str = 0x7fafbd4d73b8 "test", length = 11}
...
(gdb) p $table->s->db
$30 = {str = 0x7fafbd4d73b8 "test", length = 4}
(gdb) p $table->s->path
$31 = {str = 0x7fafbd4d73c8 "./test/misam", length = 12}
(gdb) p $table->s->table_name
$32 = {str = 0x7fafbd4d73bd "misam", length = 5}
# to remind you, this is how to get the thread id directly
(gdb) p (*($lock)->write.data.owner).thread_id
$37 = 16
I skipped some of my tests and intermediate results. To summarize, for me it took only a bit more time to search the code and try commands in gdb than to find out that metadata_locks does not help and that one has to use table_handles in performance_schema. Next time I'll do it in gdb instantly, and with far fewer keystrokes :)

Surely, for production environments, I'll have to write, test and note down somewhere the proper queries against performance_schema. For the next attempts with gdb and, maybe, FOSDEM talks, I'll have to read the code to remember the key details about the data structures used above... More posts to come, stay tuned!

by Valeriy Kravchuk (noreply@blogger.com) at July 23, 2017 06:20 PM

July 21, 2017

Peter Zaitsev

Faster Node Rejoins with Improved IST performance


In this blog, we’ll look at how improvements to Percona XtraDB Cluster improved IST performance.

Introduction

Starting with version 5.7.17-29.20, Percona XtraDB Cluster significantly improved performance. Depending on the workload, the increase in throughput is in the range of 3-10x (more details here). These optimization fixes also helped improve IST (Incremental State Transfer) performance. This blog is aimed at studying the impact on IST.

IST

IST stands for incremental state transfer. When a node leaves the cluster for a short period of time and then rejoins, it needs to catch up with the cluster state. As part of this sync process, an existing node of the cluster (aka DONOR) donates the missing write-sets to the rejoining node (aka JOINER). In short, the flow involves applying the missing write-sets on the JOINER, just as during active workload replication.

Percona XtraDB Cluster / Galera already can apply write-sets in parallel using multiple applier threads. Unfortunately, due to commit contention, the commit action was serialized. This was fixed in the above Percona XtraDB Cluster release, allowing commits to proceed in parallel.

IST uses the same path for applying write-sets, except that it is more like a batch operation.

IST Performance

Let’s look at IST performance before and now.

Setup

  1. Two-node cluster (node-1 and node-2), with gcache configured large enough to avoid purging, as we need IST
  2. Start workload against node-1 for 30 seconds
  3. Shutdown node-2
  4. Start workload that performs 4M requests against node-1. Workload produces ~3.5M write-sets that are cached in gcache and used later for IST
  5. Start node-2 with N-applier threads
  6. Wait until IST is done
  7. Repeat steps 3-6 with different values of N (a sketch of how N can be varied follows right after this list).
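As a minimal sketch of step 5 (assuming the number of applier threads is controlled with the standard Galera variable wsrep_slave_threads, which is usually also set in my.cnf so that it survives the restart), varying N and watching the joiner catch up might look like this:

SET GLOBAL wsrep_slave_threads = 8;                   -- the value of N for this run
SHOW GLOBAL STATUS LIKE 'wsrep_local_state_comment';  -- anything other than 'Synced' means the node is still catching up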

Observations:

  • IST is 4x faster with PXC-5.7.17 (compared to previous releases)
  • Improved performance means a quicker node rejoin, and an overall increase in cluster productivity, as the joiner node is available to process the workload more quickly

Conclusion

Percona XtraDB Cluster 5.7.17 significantly improved IST performance. A faster rejoin of the node effectively means better cluster productivity and more flexibility in planning maintenance windows. So what are you waiting for? Upgrade to Percona XtraDB Cluster 5.7.17 or the latest Percona XtraDB Cluster 5.7 release and experience the power!

by Krunal Bauskar at July 21, 2017 04:01 PM

Valeriy Kravchuk

Fun with Bugs #54 - On Some Bugs Fixed in MySQL 5.7.19

More than 3 months after 5.7.18, we've recently got MySQL 5.7.19 released. This is my quick review of the release notes, with interesting fixed bugs (reported in public) highlighted in the areas I am usually interested in.

Let's start with InnoDB. The following bug fixes attracted my attention:
  • Bug #85043 is still private. You know how much I hate those. At least we can see it was about the fact that "The server allocated memory unnecessarily for an operation that rebuilt the table." Let's hope this is no longer the case.
  • Bug #84038 - "Errors when restarting MySQL after FLUSH TABLES FOR EXPORT, RENAME and DROP.", was reported by Jean-François Gagné and verified by Umesh Shastry. Basically, InnoDB failed to update INNODB_SYS_DATAFILES internal data dictionary table properly while renaming tables.
  • Bug #80788 - "Reduce the time of looking for MLOG_CHECKPOINT during crash recovery". It was reported by  Zhai Weixiang (who had provided a patch) and quickly verified by Umesh Shastry.
  • Bug #83470 - "Reverse scan on a partitioned table does ICP check incorrectly, causing slowdown". This report, related to InnoDB, partitioning and the optimizer, was created by my colleague from MariaDB, Sergey Petrunya, who had also suggested a patch under the BSD license. It seems Bug #84107 was considered a duplicate of this one.
There were some public bugs fixed in group replication. While I do not really care, yet, about Group Replication, it's nice to see bugs in this area fixed so fast:
  • Bug #85667 - "GR node is in RECOVERING state if binlog_checksum configured on running server". This bug was reported by Ramana Yeruva.
  • Bug #85047 - "Secondaries allows transactions through ASYNC setup into the group". Yet another group replication bug recently fixed. It was reported by Narendra Chauhan who probably works for Oracle.
  • Bug #84728 - "Not Possible To Avoid MySQL From Starting When GR Cannot Start", was reported by Kenny Gryp from Percona.
  • Bug #84329 - "Some Class B private address is not permitted automatically". This funny bug was reported by Satoshi Mitani and verified by Umesh Shastry.
  • Bug #84153 - "ASSERT "!contains_gtid(gtid)" rpl_gtid_owned.cc:94 Owned_gtids::add_gtid_owner". It was reported by Narendra Chauhan.
Usual async replication was also improved in 5.7.19, with the following related bugs fixed:
  • Bug #83184 - "Can't set slave_skip_errors > 3000". This bug was reported by Tsubasa Tanaka (who had provided a patch) and verified by Umesh Shastry.
  • Bug #82283 - "main.mysqlbinlog_debug fails with a LeakSanitizer error". Laurynas Biveinis from Percona found it and provided patches. The bug was verified by Umesh Shastry.
  • Bug #81232 - "Changing master_delay after stop slave results in loss of events". This regression bug (5.6.x was not affected) was reported by Parveez Baig. Bug #84249 (reported by Sergio Roysen) was declared a duplicate of this one.
  • Bug #77406 - "MTS must be able handle large packets with small slave_pending_jobs_size_max", was reported by my colleague Andrii Nikitin while he worked at Oracle.
  • Bug #84471 is still private. Why could that be, when the bug is supposed to be about "...the master could write to the binary log a last_committed value which was smaller than it should have been. This could cause the slave to execute in parallel transactions which should not have been, leading to inconsistencies or other errors." Let's hide the details about all bugs that may lead to data corruption, why not?
  • Bug #83186 - "Request for slave_skip_errors = ER_INCONSISTENT_ERROR". Thanks to  Tsubasa Tanaka, who had provided a patch, we can do this now.
  • Bug #82848 - "Restarting a slave after seeding it with a mysqldump loses it's position". This serious problem with GTID-based replication was noted and resolved (with a patch) by Daniël van Eeden.
  • Bug #82209 is also private... Now, I am tired of them; just go read the release notes and hope they describe the problem and solution right. Great for any open source software, isn't it?
  • Bug #81119 - "multi source replication slave sql running error 1778". This serious bug (where GTIDs also play role) was reported by Mohamed Atef. Good to see it fixed finally.
I know that, long term, the query cache is doomed and Oracle is not going to work on it any more, but in the meantime, please check this important bug fix, Bug #86047 - "FROM sub query with 'group by' is cached by mistake", by Meiji Kimura. If you still use the query cache and have complex queries with derived tables, please consider an upgrade. You might have been getting wrong results all this time...

If you use xtrabackup or Oracle's MySQL Enterprise Backup in an environment with XA transactions, please check Bug #84442 - "XA PREPARE inconsistent with XTRABACKUP", by David Zhao, who had also provided a patch. Your backups may be inconsistent until you upgrade...

Now, on some optimizer bugs fixed:
  • Bug #81031 - "count(*) on innodb sometimes returns 0". This happened when index merge was used by the optimizer. The bug was reported by Ashraf Amayreh and verified by Shane Bester.
  • Bug #84320 - "DISTINCT clause does not work in GROUP_CONCAT". It was reported by Varun Gupta and verified by Bogdan Kecman. See Bug #68145 also (that is still "Verified").
  • Bug #79675 - "index_merge_intersection optimization causes wrong query results". This bug was reported by Saverio M and verified by Miguel Solorzano.
There were a lot more bugs fixed in 5.7.19. I'd say that if one uses replication (especially GTID-based or group replication), an upgrade is a must, but I have not tried this version myself yet.

by Valeriy Kravchuk (noreply@blogger.com) at July 21, 2017 10:32 AM

July 20, 2017

MariaDB AB

Do You Need Five 9s?


Finding the right high availability strategy for your business

When it comes to database high availability, how many 9s do you really need? The minimum for most organizations is three 9s. Will that suffice for you, or do you need four? Or is five 9s—less than 1 second of downtime per day—a must?

Finding the right mix of reliability and infrastructure complexity is vital to make the best use of your resources. Let’s take a high-level look at three common approaches, spanning from 99.9% to 99.999% uptime.

Master-slave replication: 99.9% or 99.99%

Master-slave replication is one of the most common methods for avoiding service disruption after a server fails. It allows you to mirror the contents of one or more master servers to one or more slave servers. Here’s how it works, in a nutshell: All database changes are written into the binary log as binlog events. The slaves then read the binary log from each master to access the data for replication. A slave server keeps track of the master binlog location for the most recent event applied on the slave, allowing for smooth failover, where the slave resumes from where the master left off.

Failover can be handled manually or automatically. For non–mission-critical applications, which require only three 9s, a DBA or sysadmin can manually promote one of the slaves to master, but most organizations now choose automatic failover. Automating failover is faster and many database management systems provide tools to make it easy. Replication with automatic failover can grant you four 9s.
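For illustration only, here is a minimal sketch of such a manual promotion with classic binlog-position replication (the host name, replication user and binlog coordinates below are made-up placeholders; the real coordinates have to be read from the promoted server):

-- on the slave chosen as the new master
STOP SLAVE;
RESET SLAVE ALL;                 -- stop treating the old master as a source
SET GLOBAL read_only = OFF;      -- start accepting application writes

-- on every remaining slave, repoint replication to the promoted server
CHANGE MASTER TO
  MASTER_HOST = 'new-master.example.com',
  MASTER_USER = 'repl',
  MASTER_PASSWORD = '********',
  MASTER_LOG_FILE = 'mysql-bin.000123',  -- taken from SHOW MASTER STATUS on the new master
  MASTER_LOG_POS = 4;
START SLAVE;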

The MariaDB MaxScale database proxy supports automatic failover by guaranteeing that all your slaves have the same information and are therefore equally equipped to take over for the master. Additionally, because sudden spikes in the number of connections or queries can lead to unplanned downtime, MaxScale includes load-balancing features such as read/write splitting, connection-rate limiting and result-set limiting.

Master-slave replication can be asynchronous or semi-synchronous. We won’t go into detail on those topics here, but they’re a main focus of our Best Practices for High Availability with MariaDB webinar; you can watch the recording anytime. That webinar also delves into multi-master synchronous replication with the MariaDB Cluster, which we’ll take a quick look at now.

MariaDB Cluster: 99.999%

The MariaDB Cluster uses Galera technology to enable multi-master clustering solutions. It is a synchronous multi-master cluster that keeps all your server nodes in sync. When you conduct a transaction on one node, that transaction's data is immediately available on all the other nodes in the cluster. Because there's no slave lag and no chance of lost transactions (as is sometimes possible with simple replication), this scheme allows for very fast failover and five 9s of uptime.

There’s so much more!

Now that you know the basic options for high availability, you need to know the pros and cons of each approach. Our definitive guide to high availability with MariaDB TX covers that topic plus use cases, various topologies and HA tuning tips.



by Amy Krishnamohan at July 20, 2017 09:12 PM

Peter Zaitsev

Where Do I Put ProxySQL?


In this blog post, we’ll look at how to deploy ProxySQL.

ProxySQL is a high-performance proxy, currently for MySQL and its forks (like Percona Server for MySQL and MariaDB). It acts as an intermediary for client requests seeking resources from the database. It was created for DBAs by René Cannaò, as a means of solving complex replication topology issues. When bringing up ProxySQL with my clients, I always get questions about where it fits into the architecture. This post should clarify that.

Before continuing, you might want to know why you should use this software. The features that are of interest include:

  • MySQL firewall
  • Connection pooling
  • Shard lookup and automated routing
  • Ability to read/write split
  • Automatically switch to another master in case of active master failure
  • Query cache
  • Performance metrics
  • Other neat features!

Initial Configuration

In general, you install it on nodes that do not have a running MySQL database. You manage it via the MySQL command line on another port, usually 6032. Once it is started, the configuration in /etc is not used, and you do everything within the CLI. The backend database is actually SQLite, and the db file is stored in /var/lib/proxysql.

There are many guides out there on initializing and installing it, so I won’t cover those details here. It can be as simple as:

apt-get install proxysql
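Once it is running, you can get a feel for the admin interface by connecting to port 6032 with the regular mysql client. The statements below are only a sketch (the backend host name and hostgroup number are made up for illustration; the default admin credentials depend on your setup): the usual pattern is to change the in-memory configuration, load it to runtime, then save it to disk.

-- connect with something like: mysql -u admin -p -h 127.0.0.1 -P 6032
INSERT INTO mysql_servers (hostgroup_id, hostname, port)
VALUES (10, 'db1.example.com', 3306);   -- register a hypothetical backend
LOAD MYSQL SERVERS TO RUNTIME;          -- make the change effective immediately
SAVE MYSQL SERVERS TO DISK;             -- persist it in the SQLite config database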

ProxySQL Architecture

While most first think to install ProxySQL on a standalone node between the application and database, this has the potential to affect query performance due to the additional latency from network hops.

 

[diagram: ProxySQL deployed on a standalone node between the application servers and the MySQL servers]

To have minimal impact on performance (and avoid the additional network hop), many recommend installing ProxySQL on the application servers. The application then connects to ProxySQL (acting as a MySQL server) on localhost, using a Unix domain socket, and avoiding extra latency. It would then use its routing rules to reach out and talk to the actual MySQL servers with its own connection pooling. The application doesn't have any idea what happens beyond its connection to ProxySQL.

[diagram: ProxySQL running on each application server, connecting out to the MySQL servers]

Reducing Your Network Attack Surface

Another consideration is reducing your network attack surface. This means attempting to control all of the possible vulnerabilities in your network’s hardware and software that are accessible to unauthenticated users.

Percona generally suggests that you put a ProxySQL instance on each application host, like in the second image above. This suggestion is certainly valid for reducing latency in your database environment (by limiting network jumps). But while this is good for performance, it can be bad for security.

Every instance must be able to talk to:

  • Every master
  • Every slave

As you can imagine, this is a security nightmare. With every instance, you have x many more connections spanning your network. That’s x many more connections an attacker might exploit.

Instead, it can be better to have one or more ProxySQL instances that are between your application and MySQL servers (like the first image above). This provides a reasonable DMZ-type setup that prevents opening too many connections across the network.

That said, both architectures are valid production configurations – depending on your requirements.

by Manjot Singh at July 20, 2017 06:57 PM

July 19, 2017

Peter Zaitsev

Blog Poll: What Operating System Do You Run Your Development Database On?


In this post, we'll use a blog poll to find out what operating system you use to run your development database servers.

In our last blog poll, we looked at what OS you use for your production database. Now we would like to see what you use for your development database.

As databases grow to meet more challenges and expanding application demands, they must try and get the maximum amount of performance out of available resources. How they work with an operating system can affect many variables, and help or hinder performance. The operating system you use for your database can impact consumable choices (such as hardware and memory). The operating system you use can also impact your choice of database engine as well (or vice versa).

When starting new projects, new applications or services, or when testing new architecture solutions, it makes sense to create a development environment in order to test and run scenarios before they hit production. Do you use the same OS in your development environment as you do in your production environment?

Please let us know what operating system you use to run your development database. For this blog poll, we’re asking which operating system you use to actually run your development database server (not the base operating system).

If you’re running virtualized Linux on Windows, please select Linux as the OS used for development. Pick up to three that apply. Add any thoughts or other options in the comments section:

Note: There is a poll embedded within this post, please visit the site to participate in this post's poll.

Thanks in advance for your responses – they will help the open source community determine how database environments are being deployed.

by Dave Avery at July 19, 2017 10:05 PM

Multi-Threaded Slave Statistics


In this blog post, I'll talk about the multi-threaded slave statistics printed in the MySQL error log file.

MySQL version 5.6 and later allows you to execute replicated events using parallel threads. This feature is called Multi-Threaded Slave (MTS), and to enable it you need to modify the slave_parallel_workers variable to a value greater than 1.
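As a minimal sketch (assuming MySQL 5.7 and the LOGICAL_CLOCK parallelization type; the value 4 is just an example), enabling MTS on a running slave could look like this:

STOP SLAVE SQL_THREAD;
SET GLOBAL slave_parallel_type = 'LOGICAL_CLOCK';  -- 5.7 only; 5.6 parallelizes per database
SET GLOBAL slave_parallel_workers = 4;             -- any value greater than 1 enables MTS
START SLAVE SQL_THREAD;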

Recently, a few customers asked about the meaning of some new statistics printed in their error log files when they enable MTS. These error messages look similar to the example stated below:

[Note] Multi-threaded slave statistics for channel '': seconds elapsed = 123; events assigned = 57345; worker queues filled over overrun level = 0; waited due a Worker queue full = 0; waited due the total size = 0; waited at clock conflicts = 0 waited (count) when Workers occupied = 0 waited when Workers occupied = 0

The MySQL reference manual doesn't show information about these statistics. I've filed a bug report asking Oracle to add information about these statistics to the MySQL documentation. I reported this bug as #85747.

Before they update the documentation, we can use the MySQL code to get insight into the meaning of these statistics. We can also determine how often they are printed in the error log file. Looking into the rpl_slave.cc file, we find that when you enable MTS – and the log-warnings variable is greater than 1 (log-error-verbosity greater than 2 for MySQL 5.7) – the interval for printing these statistics to the MySQL error log is 120 seconds. It is determined by a hard-coded constant. The code below shows this:

/*
  Statistics go to the error log every # of seconds when --log-warnings > 1
*/
const long mts_online_stat_period= 60 * 2;

Does this mean that MTS prints statistics to your MySQL error log every 120 seconds (if enabled)? The answer is no. MTS prints statistics in the mentioned period depending on the level of activity of your slave. The following line in the MySQL code checks the slave's level of activity before printing the statistics:

if (rli->is_parallel_exec() && rli->mts_events_assigned % 1024 == 1)

From the above code, you need MTS enabled and the result of the modulo operation between the mts_events_assigned variable and 1024 equal to 1 in order to print the statistics. The mts_events_assigned variable stores the number of events assigned to the parallel queue. If you're replicating a low level of events, or not replicating at all, MySQL won't print the statistics in the error log. On the other hand, if you're replicating a high number of events all the time, and the mts_events_assigned variable keeps increasing until the remainder of the division between this variable and 1024 is 1, MySQL prints MTS statistics in the error log almost every 120 seconds.

You can find the explanation of these statistics below (collected from information in the source code):

  1. Worker queues filled over overrun level: MTS tends to load balance events between all parallel workers, and the slave_parallel_workers variable determines the number of workers. This statistic shows the level of saturation that workers are suffering. If a parallel worker queue is close to full, this counter is incremented and the worker replication event is delayed in order to avoid reaching worker queue limits.
  2. Waited due to a Worker queue full: This statistic is incremented when the coordinator thread must wait because the worker queue gets overfull.
  3. Waited due to the total size: This statistic shows the number of times that the coordinator thread slept due to reaching the limit of the memory available in the worker queue to hold unapplied events. If this statistic keeps increasing when printed in your MySQL error log, you should resize the slave_pending_jobs_size_max variable to a higher value to avoid the coordinator thread waiting time.
  4. Waited at clock conflicts: In the case of possible dependencies between transactions, this statistic shows the wait time corresponding to logical timestamp conflict detection and resolution.
  5. Waited (count) when Workers occupied: A counter of how many times the coordinator saw the Workers filled up with "enough" assignments. The definition of "enough" depends on the scheduler type (per-database or clock-based).
  6. Waited when Workers occupied: This statistic measures the coordinator thread's waiting time for any worker to become available, and applies solely to the commit-clock scheduler.
Conclusion

Multi-threaded slave is an exciting feature that allows you to replicate events faster and keep in sync with master instances. By changing the log-warnings variable to a value greater than 1, you can get information in the slave error log file about how multi-threaded replication is performing.

by Juan Arruti at July 19, 2017 05:02 PM

MariaDB AB

MariaDB 5.5.57 now available


The MariaDB project is pleased to announce the immediate availability of MariaDB 5.5.57. See the release notes and changelog for details.

Download MariaDB 5.5.57

Release Notes | Changelog | What is MariaDB 5.5?



by dbart at July 19, 2017 04:56 PM

July 18, 2017

MariaDB Foundation

MariaDB 5.5.57 now available

The MariaDB project is pleased to announce the immediate availability of MariaDB 5.5.57. This is a stable (GA) release. See the release notes and changelog for details.

Download MariaDB 5.5.57 | Release Notes | Changelog | What is MariaDB 5.5? | MariaDB APT and YUM Repository Configuration Generator

Thanks, and enjoy MariaDB!

The post MariaDB 5.5.57 now available appeared first on MariaDB.org.

by Ian Gilfillan at July 18, 2017 09:13 PM

Peter Zaitsev

Backups and Disaster Recovery


In this post, we'll look at strategies for backups and disaster recovery.

Note: I am giving a talk on Backups and Disaster Recovery Best Practices on July 27th.

When discussing disaster recovery, it’s important to take your business’ continuity plan into consideration. Backup and recovery processes are a critical part of any application infrastructure.

A well-tested backup and recovery system can be the difference between a minor outage and the end of your business.

You will want to take three things into consideration when planning your disaster recovery strategy: recovery time objective, recovery point objective and risk mitigation.

Recovery time objective (RTO) is how long it takes to restore your backups. Recovery point objective (RPO) is what point in time you want to recover (in other words, how much data you can afford to lose after recovery). Finally, you need to understand what risks you are trying to mitigate. Risks to your data include (but are not limited to) bad actors, data corruption, user error, host failure and data center failure.

Recommended Backup Strategies

We recommend that you use both physical (Percona XtraBackup, RDS/LVM Snapshots, MySQL Enterprise Backup) and logical backups (mysqldump, mydumper, mysqlpump). Logical backups protect against the loss of single data points, while physical backups protect against total data loss or host failure.

The best practice is running Percona XtraBackup nightly, followed by mysqldump (or in 5.7+, mysqlpump). Percona XtraBackup enables you to quickly restore a server, and mysqldump enables you to quickly restore data points. These address recovery time objectives.

For point-in-time recovery, it is recommended that you download binlogs on a regular basis (once an hour, for example).

Another option is binlog streaming. You can find more information on binlog streaming in our blog: Backing up binary log files with mysqlbinlog.

There is also a whitepaper that is the basis of my webinar here: MySQL Backup and Recovery Best Practices.

Delayed Slave

One way to save on operational overhead is to create a 24-hour delayed slave. This takes the place of the logical backup (mysqldump) as well as the binlog streaming. You want to ensure that you stop the delayed slave immediately following any issues. This ensures that the data does not get corrupted on the backup as well.

A delayed slave is created in 5.6 and above with:

CHANGE MASTER TO MASTER_DELAY = N;

After a disaster, you would issue:

STOP SLAVE;

Then, in order to recover to a point in time, you can use:

START SLAVE UNTIL MASTER_LOG_FILE = 'log_name', MASTER_LOG_POS = log_pos;
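As a quick sanity check (just a sketch; the field names come from the standard SHOW SLAVE STATUS output in 5.6+), you can confirm the delay is really in effect by looking at the SQL_Delay and SQL_Remaining_Delay fields:

SHOW SLAVE STATUS\G
-- SQL_Delay should report the configured delay in seconds (86400 for a 24-hour delayed slave)
-- SQL_Remaining_Delay shows how much longer the next event will be held back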

Restore

It is a good idea to test your backups at least once a quarter. Backups do not exist unless you know you can restore them. There are some recent high-profile cases where developers dropped tables or schemas, or data was corrupted in production, and in one case five different backup types were not viable for restoring the data.

The best case scenario is an automated restore test that runs after your backup, and gives you information on how long it takes to restore (RTO) and how much data you can restore (RPO).

For more details on backups and disaster recovery, come to my webinar.

by Manjot Singh at July 18, 2017 05:06 PM

Upcoming Webinar Wednesday, July 19, 2017: Learning MySQL 5.7


Join Percona's Technical Services Manager, Jervin Real, as he presents Learning MySQL 5.7 on Wednesday, July 19, 2017 at 10:00 am PDT / 1:00 pm EDT (UTC-7).

MySQL 5.7 has a lot of new features. If you’ve dabbled with an older version of MySQL, it is worth learning what new features exist and how to use them. This webinar teaches you about new features such as multi-source replication, global transaction IDs (GTIDs), security improvements and more.

We’ll also discuss logical decoding. Logical decoding is one of the features under the BDR implementation, which allows bidirectional streams of data between Postgres instances. Also, it allows you to stream data outside Postgres into many other data systems.

Register for the webinar here.

Jervin Real, Technical Services Manager

Jervin is a member of the Technical Services team at Percona, where he partners with customers to build reliable and highly performant MySQL infrastructures. He also does other fun stuff like watching cat videos on the internet, reading bugs for novels and playing around with servers for numbers. Jervin joined Percona in April of 2010.

by Dave Avery at July 18, 2017 01:13 PM

Percona Live Europe 2017 Call for Papers Deadline Extended to July 24, 2017


We are extending the Percona Live Open Source Database Conference Europe 2017 call for papers deadline to Monday, July 24, 2017. Get your submissions in now!

Between our Conference Committee working hard to review all the outstanding talk ideas, and the many community requests for more time, we didn’t want to shortchange any of our applicants. If you procrastinated too long, didn’t complete your talk submission or just plain forgot to submit, this is your reprieve! You have one extra week to pull together a proposal for a talk on open source databases at Percona Live Europe 2017.

The theme of Percona Live Europe 2017 is "Championing Open Source Databases." There are sessions on MySQL, MariaDB, MongoDB and other open source database technologies, including time series databases, PostgreSQL and RocksDB. Are you:

  • Working with MongoDB as a developer?
  • Creating a new MySQL-variant time series database?
  • Deploying MariaDB in a novel way?
  • Using open source database technology to solve a business issue?

Share your open source database experiences with peers and professionals in the open source community! We invite you to submit your speaking proposal for breakout, tutorial or lightning talk sessions:

  • Breakout Session. Broadly cover a technology area using specific examples. Sessions should be either 25 minutes or 50 minutes in length (including Q&A).
  • Tutorial Session. Present a technical session that aims for a level between a training class and a conference breakout session. Encourage attendees to bring and use laptops for working on detailed and hands-on presentations. Tutorials will be three or six hours in length (including Q&A).
  • Lightning Talk. Give a five-minute presentation focusing on one key point that interests the open source community: technical, lighthearted or entertaining talks on new ideas, a successful project, a cautionary story, a quick tip or demonstration.

Submit your talk ideas now.

PLEASE NOTE: We have a new call for papers system that requires everyone to register for a new account. You can save proposals as drafts and edit them later. Once a proposal is submitted, it is final. You can NOT change a proposal once submitted.

Register for Percona Live Europe 2017 now! Early Bird registration lasts until August 8.

Last year’s Percona Live Europe sold out, and we’re looking to do the same at this year’s conference. Don’t miss your chance to get your ticket at its most affordable price. Click here to register.

Percona Live 2017 sponsorship opportunities are available now

Percona Live Europe 2017 is just around the corner. Have you secured your sponsorship yet? Last year’s event sold out! Booth selection for our limited sponsorship opportunities is on a first-come, first-served basis. Click here to find out how to sponsor.

Don’t miss out on a special room rate at the Percona Live Europe 2017 venue

This year, Percona Live Europe is being held at the Radisson Blu Royal Hotel, Dublin. You can get a special room rate when you attend the conference. But hurry, rooms are going fast! To get the special room rate:

  1. Visit https://www.radissonblu.com/en/royalhotel-dublin.
  2. Click BOOK NOW at the top right.
  3. Enter your preferred check-in and check-out dates, and how many rooms.
  4. From the drop-down “Select Rate Type,” choose Promotional Code.
  5. Enter the code PERCON to get the discount.

This special deal includes breakfast each morning! The group rate only applies if used within the Percona Live Europe group block dates (September 25-27, 2017). The deadline for booking with this rate is July 24, 2017.

by Dave Avery at July 18, 2017 03:21 AM

July 17, 2017

Peter Zaitsev

Percona Server for MongoDB 3.2.14-3.4 is Now Available


Percona announces the release of Percona Server for MongoDB 3.2.14-3.4 on July 17, 2017. Download the latest version from the Percona web site or the Percona Software Repositories.

Percona Server for MongoDB is an enhanced, open-source, fully compatible, highly scalable, zero-maintenance downtime database that supports the MongoDB v3.2 protocol and drivers. It extends MongoDB with MongoRocks, Percona Memory Engine, and PerconaFT storage engine, as well as enterprise-grade features like External Authentication, Audit Logging, Profiling Rate Limiting, and Hot Backup at no extra cost. Percona Server for MongoDB requires no changes to MongoDB applications or code.

NOTE: This release deprecates the PerconaFT storage engine. It will not be available in future releases.

This release is based on MongoDB 3.2.14 and includes the following additional changes:

New Features

Bugs Fixed

  • #PSMDB-67: Fixed mongod service status messages.

Percona Server for MongoDB 3.2.14-3.4 release notes are available in the official documentation.

by Alexey Zhebel at July 17, 2017 07:47 PM

Valeriy Kravchuk

Fun with Bugs #53 - On Some Percona Software Bugs I've Reported

So far in this series I had written only/mostly about MySQL server bugs. Does it mean that there are no unique or interesting bugs in other MySQL-related software or MySQL forks and projects that use MySQL code?

Surely no, it doesn't. So, today I'd like to present a short list of currently active (that is, new or probably not yet resolved) bugs that I have reported for Percona's software at Launchpad over the years I have used it. The list is based on this link, but I excluded minor reports and those that are probably just forgotten, even though the problem was fixed a long time ago.

Let me start with bugs in XtraBackup, the tool that plays a key role in most MySQL environments that are not using Oracle's Enterprise subscriptions:
  • lp:1704636 - "xtrabackup 2.4.7 crashes while backing up MariaDB 10.2.x with ftwrl-* options". I reported it yesterday. Probably some unique background thread of MariaDB 10.2.x is not taken into account, and some calculations are performed for it that lead to a crash. Sorry, but there is no way to use the --ftwrl-* options of XtraBackup with MariaDB 10.2.x.
  • lp:1418150 - "Provide a way for xtrabackup to communicate with storage managers via XBSA API". This is a feature request created based on a customer's request to store backups in Tivoli Storage Manager. A similar request was recently noted from a MariaDB customer as well. MySQL's Enterprise Backup allows working with storage managers, even if not in the most direct way. I've used TSM a lot in the past in production as an Informix DBA, so I care about this feature more than about many others...
Let's continue with another unique software package from Percona, Percona Toolkit. I still use some of the tools every day (and do not hesitate to write about this), and would like to see more attention to the toolkit in areas other than MongoDB. I've reported the following two bugs in the past that are still active:
  • lp:1566939 - "pt-stalk should not call mysqladmin debug by default". Having in mind documentation Bug #71346 that Oracle just refuses to fix, I'd say there may be a dozen people in the world who can make good use of the output even when it's correct, useful and really needed. For the rest of us it just pollutes error logs and pt-stalk data collections.
  • lp:1533271 - "pt-online-schema-change does not report anything when it waits for slave to replicate new table creation". The problem is clear enough from the description. Without PTDEBUG=1 there is nothing useful to help find out the reason for the hang in this case.
Finally, let's check several Percona Server bugs. Note that in many cases I had to file a bug for Percona Server even if upstream MySQL was also affected, in the hope that Percona might fix the problem faster (and sometimes this happened). The following bugs were not inherited from MySQL directly:
  • lp:1455432 - "Audit log plugin should allow to include or exclude logging for specific users". This was request for the same functionality that was originally provided in MariaDB's implementation. Actually the bug was fixed in Percona Server 5.6.32-78.0 and 5.7.14-7, but remains in the list as it is still just "Triaged" for 5.5.x branch. Time to close it completely, I'd say.
  • lp:1408971 - "Manual does not describe Innodb_current_row_locks in any useful way". Nothing changed, check the manual.
  • lp:1401528 - "Add option to switch to "old" get_lock() behavior". It's hard to introduce new features in a minor release during the GA period, and do it right. Somebody had to think about backward compatibility, upgrades and downgrades, and about the completeness of the implementation (see lp:1396336 about that). In this specific case Percona failed to do it right.
  • lp:1396035 - "Percona-Server-test-56 RPM package does NOT check /usr/share/percona-server directory and thus is unusable "as is". Packaging is another thing that is hard to do right. I was not lazy today, so I re-checked this on CentOS 6.x with 5.7.18, and ./mtr is somewhat useful there:

    [root@centos mysql-test]# ls -l ./mtr
    lrwxrwxrwx. 1 root root 19 Jun  7 18:20 ./mtr -> ./mysql-test-run.pl
    [root@centos mysql-test]# perl ./mtr analyze
    Logging: ./mtr  analyze
    MySQL Version 5.7.18
    Checking supported features...
     - SSL connections supported
    Collecting tests...
    Checking leftover processes...
    Removing old var directory...
    Creating var directory '/usr/share/mysql-test/var'...
    Installing system database...
    Using parallel: 1

    ==============================================================================

    TEST                                      RESULT   TIME (ms) or COMMENT
    --------------------------------------------------------------------------

    worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 13000..13009
    worker[1] mysql-test-run: WARNING: running this script as _root_ will cause some tests to be skipped
    main.analyze                             [ pass ]  10920
    --------------------------------------------------------------------------
    The servers were restarted 0 times
    Spent 10.920 of 58 seconds executing testcases

    Completed: All 1 tests were successful.
  • lp:1263968 - "DOCS: http://www.percona.com/doc/percona-server/5.6/installation.html does not provide enough details on further steps after installing .deb". I'd still say that some details requested are missing. Writing proper manuals is hard.
I had surely reported a lot more bugs, but many of them have related upstream MySQL reports, so there is no reason to discuss them separately now. Also, note that in Percona I was involved in setting priorities for the bugs, and mostly reported them based on customer issues or complaints. So, no wonder that my bug reports were fixed fast, and not so many active ones are left by now.

There was yet another Percona tool that I spent a lot of time on: building it from current sources, using it and reporting problems with it. It's Percona Playback. Based on it, they seem to be working on another, similar tool now. Still, I consider this request useful for any tool that allows replaying a specific load: lp:1365538.

by Valeriy Kravchuk (noreply@blogger.com) at July 17, 2017 09:40 AM

July 14, 2017

Peter Zaitsev

Percona Monitoring and Management 1.2.0 is Now Available


Percona announces the release of Percona Monitoring and Management 1.2.0 on July 14, 2017.

For installation instructions, see the Deployment Guide.


Changes in PMM Server

PMM Server 1.2.0 introduced the following changes:

New Features

  • PMM-737: New graphs in System Overview dashboard:
      • Memory Advanced Details
      • Saturation Metrics

  • PMM-1090: Added ESXi support for PMM Server virtual appliance.

UI Fixes

  • PMM-707: Fixed QPS metric in MySQL Overview dashboard to always show queries per second regardless of the selected interval.
  • PMM-708: Fixed tooltips for graphs that displayed incorrectly.
  • PMM-739, PMM-797: Fixed PMM Server update feature on the landing page.
  • PMM-823: Fixed arrow padding for collapsible blocks in QAN.
  • PMM-887: Disabled the Add button when no table is specified for showing query info in QAN.
  • PMM-888: Disabled the Apply button in QAN settings when nothing is changed.
  • PMM-889: Fixed the switch between UTC and local time zone in the QAN time range selector.
  • PMM-909: Added message No query example when no example for a query is available in QAN.
  • PMM-933: Fixed empty tooltips for Per Query Stats column in the query details section of QAN.
  • PMM-937: Removed the percentage of total query time in query details for the TOTAL entry in QAN (because it is 100% by definition).
  • PMM-951: Fixed the InnoDB Page Splits graph formula in the MySQL InnoDB Metrics Advanced dashboard.
  • PMM-953: Enabled stacking for graphs in MySQL Performance Schema dashboard.
  • PMM-954: Renamed Top Users by Connections graph in MySQL User Statistics dashboard to Top Users by Connections Created and added the Connections/sec label to the Y-axis.
  • PMM-957: Refined titles for Client Connections and Client Questions graphs in ProxySQL Overview dashboard to mention that they show metrics for all host groups (not only the selected one).
  • PMM-961: Fixed the formula for Client Connections graph in ProxySQL Overview dashboard.
  • PMM-964: Fixed the gaps for high zoom levels in MySQL Connections graph on the MySQL Overview dashboard.
  • PMM-976: Fixed Orchestrator handling by supervisorctl.
  • PMM-1129: Updated the MySQL Replication dashboard to support new connection_name label introduced in mysqld_exporter for multi-source replication monitoring.
  • PMM-1054: Fixed typo in the tooltip for the Settings button in QAN.
  • PMM-1055: Fixed link to Query Analytics from Metrics Monitor when running PMM Server as a virtual appliance.
  • PMM-1086: Removed HTML code that showed up in the QAN time range selector.

Bug Fixes

  • PMM-547: Added warning page to Query Analytics app when there are no PMM Clients running the QAN service.
  • PMM-799: Fixed Orchestrator to show correct version.
  • PMM-1031: Fixed initialization of Query Profile section in QAN that broke after upgrading Angular.
  • PMM-1087: Fixed QAN package building.

Other Improvements

  • PMM-348: Added daily log rotation for nginx.
  • PMM-968: Added Prometheus build information.
  • PMM-969: Updated the Prometheus memory usage settings to leverage new flag. For more information about setting memory consumption by PMM, see FAQ.

Changes in PMM Client

PMM Client 1.2.0 introduced the following changes:

New Features

  • PMM-1114: Added PMM Client packages for Debian 9 (“stretch”).

Bug Fixes

  • PMM-481, PMM-1132: Fixed fingerprinting for queries with multi-line comments.
  • PMM-623: Fixed mongodb_exporter to display correct version.
  • PMM-927: Fixed bug with empty metrics for MongoDB query analytics.
  • PMM-1126: Fixed promu build for node_exporter.
  • PMM-1201: Fixed node_exporter version.

Other Improvements

  • PMM-783: Directed mongodb_exporter log messages to stderr and excluded many generic messages from the default INFO logging level.
  • PMM-756: Merged upstream node_exporter version 0.14.0.
    PMM deprecated several collectors in this release:

    • gmond – Out of scope.
    • megacli – Requires forking, to be moved to textfile collection.
    • ntp – Out of scope.

    It also introduced the following breaking change:

    • Collector errors are now a separate metric: node_scrape_collector_success, not a label on node_exporter_scrape_duration_seconds
  • PMM-1011: Merged upstream mysqld_exporter version 0.10.0.
    This release introduced the following breaking change:

    • mysql_slave_... metrics now include an additional connection_name label to support MariaDB multi-source replication.

About Percona Monitoring and Management

Percona Monitoring and Management (PMM) is an open-source platform for managing and monitoring MySQL and MongoDB performance. Percona developed it in collaboration with experts in the field of managed database services, support and consulting.

Percona Monitoring and Management is a free and open-source solution that you can run in your own environment for maximum security and reliability. It provides thorough time-based analysis for MySQL and MongoDB servers to ensure that your data works as efficiently as possible.

A live demo of PMM is available at pmmdemo.percona.com.

Please provide your feedback and questions on the PMM forum.

If you would like to report a bug or submit a feature request, use the PMM project in JIRA.

by Alexey Zhebel at July 14, 2017 05:51 PM

A Little Trick Upgrading to MySQL 5.7

Upgrading to MySQL 5.7

In this blog post, I’ll look at a trick we use at Percona when upgrading to MySQL 5.7.

I’ll be covering this subject (and others) in my webinar Learning MySQL 5.7 on Wednesday, July 19, 2017.

We’ve been doing upgrades for quite a while here at Percona, and we try to optimize, standardize and improve this process to save time. When upgrading to MySQL 5.7, more often than not you need to run REPAIR or ALTER via mysql_upgrade on a number of MySQL tables. Sometimes a few hundred, sometimes hundreds of thousands.

One way to cut some time from testing or executing mysql_upgrade is to combine it with mysqlcheck. This identifies tables that need to be rebuilt or repaired. The first step is to capture the output of this process:

revin@acme:~$ mysqlcheck --check-upgrade --all-databases > mysql-check.log

This provides a lengthy output of what needs to be done to successfully upgrade our tables. On my test data, I get error reports like the ones below. I’ll need to take the specified action against them:

ads.agency
error    : Table upgrade required. Please do "REPAIR TABLE `agency`" or dump/reload to fix it!
store.categories
error    : Table rebuild required. Please do "ALTER TABLE `categories` FORCE" or dump/reload to fix it!

Before we run through this upgrade, let’s get an idea of how long it would take for a regular mysql_upgrade to complete on this dataset:

revin@acme:~$ time mysql_upgrade
Enter password:
Checking if update is needed.
Checking server version.
Running queries to upgrade MySQL server.
Checking system database.
mysql.columns_priv                                 OK
mysql.db                                           OK
...
mysql.user                                         OK
Upgrading the sys schema.
Checking databases.
ads.account_preference_assoc         OK
...
Repairing tables
...
ads.agency
Note     : TIME/TIMESTAMP/DATETIME columns of old format have been upgraded to the new format.
status   : OK
...
`store`.`categories`
Running  : ALTER TABLE `store`.`categories` FORCE
status   : OK
...
Upgrade process completed successfully.
Checking if update is needed.
real	25m57.482s
user	0m0.024s
sys	0m0.072s

On a cold server, my baseline above took about 25 minutes.

The second step in our time-saving process is to identify the tables that need some action (in this case, REPAIR and ALTER … FORCE), generate the SQL statements to run them, and put them into a single SQL file:

revin@acme:~$ for t in $(cat mysql-check.log |grep -B1 REPAIR | egrep -v 'REPAIR|--');
	do echo "mysql -e 'REPAIR TABLE $t;'" >> upgrade.sql; done
revin@acme:~$ for t in $(cat mysql-check.log |grep -B1 ALTER | egrep -v 'ALTER|--');
	do echo "mysql -e 'ALTER TABLE $t FORCE;'" >> upgrade.sql; done

My upgrade.sql file will have something like this:

mysql -e 'ALTER TABLE store.categories FORCE;'
mysql -e 'REPAIR TABLE ads.agency;'

Now we should be ready to run these commands in parallel as the third step in the process:

revin@acme:~$ time parallel -j 4 -- < upgrade.sql
...
real	17m31.448s
user	0m1.388s
sys	0m0.616s

Getting some parallelization is not bad, and the process improved by about 38%. If we are talking about multi-terabyte data sets, then it is already a big gain.

On the other hand, my dataset has a few tables that are bigger than the rest. Since the statements follow mysqlcheck's output order, one of the parallel workers ended up processing most of the big tables instead of the work being spread out evenly by size. To fix this, we need to have an idea of the size of each table we will be processing. We can use a query against INFORMATION_SCHEMA.TABLES for this purpose:

revin@acme:~$ for t in $(cat mysql-check.log |grep -B1 ALTER | egrep -v 'ALTER|--');
	do d=$(echo $t|cut -d'.' -f1); tbl=$(echo $t|cut -d'.' -f2);
	s=$(mysql -BNe "select sum(index_length+data_length) from information_schema.tables where table_schema='$d' and table_name='$tbl';");
	echo "$s |mysql -e 'ALTER TABLE $t FORCE;'" >> table-sizes.sql; done
revin@acme:~$ for t in $(cat mysql-check.log |grep -B1 REPAIR | egrep -v 'REPAIR|--');
	do d=$(echo $t|cut -d'.' -f1); tbl=$(echo $t|cut -d'.' -f2);
	s=$(mysql -BNe "select sum(index_length+data_length) from information_schema.tables where table_schema='$d' and table_name='$tbl';");
	echo "$s |mysql -e 'REPAIR TABLE $t;'" >> table-sizes.sql; done

Now my table-sizes.sql file will have contents like below, which I can sort and pass to the parallel command again and cut even more time!

32768 |mysql -e 'REPAIR TABLE ads.agency;'
81920 |mysql -e 'ALTER TABLE store.categories FORCE;'

revin@acme:~$ cat table-sizes.sql |sort -rn|cut -d'|' -f2 > upgrade.sql
revin@acme:~$ time parallel -j 4 -- < upgrade.sql
...
real	8m1.116s
user	0m1.260s
sys	0m0.624s

This time around, my total execution time is 8 minutes – a good 65% improvement. To wrap it up, we need to run mysql_upgrade one last time so that the system tables are also upgraded and the tables are checked again, and then restart the MySQL server as instructed by the manual:

revin@acme:~$ time mysql_upgrade --force

The whole process should be easy to automate and script, depending on your preference. Lastly: YMMV. If you have one table that is more than half the size of your total data set, there might not be big gains.
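If you want a starting point for automating it, here is a rough sketch that strings the steps above together. It assumes client credentials are available non-interactively (for example via ~/.my.cnf), that GNU parallel is installed, and it reuses the mysql-check.log, table-sizes.sql and upgrade.sql file names from the examples above; treat it as an outline, not a finished tool:

#!/bin/bash
# Rough sketch: ties together the mysqlcheck / size-sort / parallel steps from this post.
# Assumes credentials in ~/.my.cnf and GNU parallel installed.

JOBS=4   # number of parallel workers; tune for your hardware

# 1. Find tables that need REPAIR or ALTER ... FORCE
mysqlcheck --check-upgrade --all-databases > mysql-check.log

# 2. Build "size |command" lines so the biggest tables can be scheduled first
: > table-sizes.sql
for t in $(grep -B1 REPAIR mysql-check.log | egrep -v 'REPAIR|--'); do
    d=${t%%.*}; tbl=${t#*.}
    s=$(mysql -BNe "select sum(index_length+data_length) from information_schema.tables where table_schema='$d' and table_name='$tbl';")
    echo "$s |mysql -e 'REPAIR TABLE $t;'" >> table-sizes.sql
done
for t in $(grep -B1 ALTER mysql-check.log | egrep -v 'ALTER|--'); do
    d=${t%%.*}; tbl=${t#*.}
    s=$(mysql -BNe "select sum(index_length+data_length) from information_schema.tables where table_schema='$d' and table_name='$tbl';")
    echo "$s |mysql -e 'ALTER TABLE $t FORCE;'" >> table-sizes.sql
done

# 3. Largest tables first, then run the statements in parallel
sort -rn table-sizes.sql | cut -d'|' -f2 > upgrade.sql
parallel -j "$JOBS" -- < upgrade.sql

# 4. Finish with mysql_upgrade so the system tables are upgraded as well
mysql_upgrade --force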

If you want to learn more about upgrading to MySQL 5.7, come to my webinar on Wednesday, July 19: Learning MySQL 5.7. This process is only one of the phases in a multi-step upgrade process when moving to 5.7. I will discuss them in more detail next week. Register now from the link below, and I’ll talk to you soon!

by Jervin Real at July 14, 2017 05:20 PM

July 13, 2017

Peter Zaitsev

Setting Up Percona PAM with Active Directory for External Authentication

Percona PAM

In this blog post, we’ll look at how to set up Percona PAM with Active Directory for external authentication.

In my previous article on Percona PAM, I demonstrated how to use Samba as a domain, and how easy it is to create domain users and groups via the samba-tool. Then we configured nss-pam-ldapd and nscd to enumerate user and group information via LDAP calls, and authenticate users from this source.

This time around, I will demonstrate two other ways of using Active Directory for external authentication by joining the domain via SSSD or Winbind. System Security Services Daemon (SSSD) allows you to configure access to several authentication hosts such as LDAP, Kerberos, Samba and Active Directory and have your system use this service for all types of lookups. Winbind, on the other hand, pulls data from Samba or Active Directory only. If you’re mulling over using SSSD or Winbind, take a look at this article on what SSSD or Winbind support.

For both methods, we’ll use realmd. That makes it easy to join a domain and enumerate users from it.

My testbed environment consists of two machines:

Samba PDC
OS: CentOS 7
IP Address: 172.16.0.10
Hostname: samba-10.example.com
Domain name: EXAMPLE.COM
DNS: 8.8.8.8(Google DNS), 8.8.4.4(Google DNS), 172.16.0.10(Samba)
Firewall: none

Note: Please follow the steps in the last article for setting up the Samba PDC environment.

Percona Server 5.7 with LDAP authentication via SSS or WinBind
OS: CentOS 7
IP Address: 172.16.0.21
Hostname: ps-ldap-21.example.com
DNS: 172.16.0.10(Samba PDC)

Installing realmd and Its Dependencies

  1. First, we need to make sure that the time is in sync (since this is a requirement for joining domains). Install NTP and make sure that it starts up at boot time:
    [root@ps-ldap-21 ~]# yum -y install ntp
    * * *
    Installed:
    ntp.x86_64 0:4.2.6p5-25.el7.centos.2
    * * *
    [root@ps-ldap-21 ~]# ntpdate 0.centos.pool.ntp.org
    3 Jul 03:48:35 ntpdate[3708]: step time server 202.90.132.242 offset 1.024550 sec
    [root@ps-ldap-21 ~]# systemctl enable ntpd.service
    Created symlink from /etc/systemd/system/multi-user.target.wants/ntpd.service to /usr/lib/systemd/system/ntpd.service.
    [root@ps-ldap-21 ~]# systemctl start ntpd.service
  2. Install realmd and its dependencies for SSSD or Winbind.
    For SSSD:
    yum -y install realmd oddjob oddjob-mkhomedir sssd adcli samba-common-tools

    For Winbind:
    yum -y install realmd oddjob oddjob-mkhomedir samba-winbind-clients samba-winbind samba-common-tools

Joining the Domain via SSSD and Preparing It for Percona PAM

  1. Run realm discover domain for realmd to discover what type of server it’s connecting to and what package dependencies need to be installed:
    [root@ps-ldap-21 ~]# realm discover example.com
    example.com
    type: kerberos
    realm-name: EXAMPLE.COM
    domain-name: example.com
    configured: no
    server-software: active-directory
    client-software: sssd
    required-package: oddjob
    required-package: oddjob-mkhomedir
    required-package: sssd
    required-package: adcli
    required-package: samba-common-tools

    Our Samba PDC is detected as an Active Directory Controller, and the packages required have been installed previously.
  2. The next step is to join the domain by running realm join domain. If you want to get more information, add the
    --verbose option
    . You could also add the
    -U user
     option if you want to use a different administrator account.
    [root@ps-ldap-21 ~]# realm join example.com --verbose
     * Resolving: _ldap._tcp.example.com
     * Performing LDAP DSE lookup on: 172.16.0.10
     * Successfully discovered: example.com
    Password for Administrator:
     * Required files: /usr/sbin/oddjobd, /usr/libexec/oddjob/mkhomedir, /usr/sbin/sssd, /usr/bin/net
     * LANG=C LOGNAME=root /usr/bin/net -s /var/cache/realmd/realmd-smb-conf.DM6W2Y -U Administrator ads join example.com
    Enter Administrator's password:
    Using short domain name -- EXAMPLE
    Joined 'PS-LDAP-21' to dns domain 'example.com'
     * LANG=C LOGNAME=root /usr/bin/net -s /var/cache/realmd/realmd-smb-conf.DM6W2Y -U Administrator ads keytab create
    Enter Administrator's password:
     * /usr/bin/systemctl enable sssd.service
    Created symlink from /etc/systemd/system/multi-user.target.wants/sssd.service to /usr/lib/systemd/system/sssd.service.
     * /usr/bin/systemctl restart sssd.service
     * /usr/bin/sh -c /usr/sbin/authconfig --update --enablesssd --enablesssdauth --enablemkhomedir --nostart && /usr/bin/systemctl enable oddjobd.service && /usr/bin/systemctl start oddjobd.service
     * Successfully enrolled machine in realm

    As you can see from the command above, the realm command simplifies SSSD configuration and uses existing tools such as net and authconfig to join the domain and use it as an identity provider.
  3. Let’s test whether we can enumerate existing accounts by using the
    id
     command:
    [root@ps-ldap-21 ~]# id jervin
    id: jervin: no such user
    [root@ps-ldap-21 ~]# id jervin@example.com
    uid=343401115(jervin@example.com) gid=343400513(domain users@example.com) groups=343400513(domain users@example.com),343401103(support@example.com)

    As you can see, the user can be queried if the domain is specified. So if you want to log in as ‘jervin@example.com’, in Percona Server for MySQL you’ll need to create the user as ‘jervin@example.com’ and not ‘jervin’. For example:
    # Creating user 'jervin@example.com'
    CREATE USER 'jervin@example.com'@'%' IDENTIFIED WITH auth_pam;
    # Logging in as 'jervin@example.com'
    mysql -u 'jervin@example.com'

    If you want to omit the domain name when logging in, you’ll need to change “use_fully_qualified_names = True” to “use_fully_qualified_names = False” in /etc/sssd/sssd.conf and then restart SSSD. If you do this, the user can be found without providing the domain:
    [root@ps-ldap-21 ~]# id jervin
    uid=343401115(jervin) gid=343400513(domain users) groups=343400513(domain users),343401103(support)
    [root@ps-ldap-21 ~]# id jervin@example.com
    uid=343401115(jervin) gid=343400513(domain users) groups=343400513(domain users),343401103(support)

    When you create the MySQL user, you don’t need to include the domain anymore:
    # Creating user 'jervin'
    CREATE USER 'jervin'@'%' IDENTIFIED WITH auth_pam;
    # Logging in as 'jervin'
    mysql -u jervin
  4. Optionally, you can specify which users and groups can log in by adding these settings to SSSD:
    Domain access filter
    Under “[domain/example.com]” in /etc/sssd/sssd.conf, you can add the following to specify that only users that are members of support and dba are allowed to use SSSD. For example:
    ad_access_filter = (|(memberOf=CN=dba,CN=Users,DC=example,DC=com)(memberOf=CN=support,CN=Users,DC=example,DC=com))

    Simple filters
    You can use
    realm permit
     or
    realm permit -g
     to allow particular users or groups. For example:
    realm permit jervin
    realm permit -g support
    realm permit -g dba

    You can check sssd.conf to see how these ACLs are implemented:
    access_provider = simple
    simple_allow_groups = support, dba
    simple_allow_users = jervin
  5. Finally, configure Percona Server for MySQL to authenticate to SSSD by creating /etc/pam.d/mysqld with this content:
    auth required pam_sss.so
    account required pam_sss.so
  6. Done. All you need to do now is to install Percona Server for MySQL, enable the auth_pam and auth_pam_compat plugins, and add PAM users. You can then check for authentication errors at /var/log/secure for troubleshooting. You could also get verbose logs by adding debug_level=[1-9] to [nss], [pam], or [domain] and then restarting SSSD. You can view the logs from /var/log/sssd.
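For reference, here is a minimal sketch of that last step. It assumes the PAM plugin libraries shipped with Percona Server are named auth_pam.so and auth_pam_compat.so (check your installation) and reuses the jervin@example.com account from the examples above:

# Sketch only: enable the PAM plugins and create a PAM-authenticated account.
mysql -u root -p -e "INSTALL PLUGIN auth_pam SONAME 'auth_pam.so';"
mysql -u root -p -e "INSTALL PLUGIN auth_pam_compat SONAME 'auth_pam_compat.so';"
mysql -u root -p -e "CREATE USER 'jervin@example.com'@'%' IDENTIFIED WITH auth_pam;"
# Test the login; the password is validated against Active Directory through SSSD.
mysql -u 'jervin@example.com' -p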

Joining the Domain via Winbind and Preparing it for Percona PAM

  1. The
    realm
     command assumes that SSSD is used. To change the client software, use
    --client-software=winbind
     instead:
    [root@ps-ldap-21 ~]# realm --client-software=winbind discover example.com
    example.com
        type: kerberos
        realm-name: EXAMPLE.COM
        domain-name: example.com
        configured: no  
        server-software: active-directory
        client-software: winbind
        required-package: oddjob-mkhomedir
        required-package: oddjob
        required-package: samba-winbind-clients
        required-package: samba-winbind
        required-package: samba-common-tools
  2. Since the required packages have already been installed, we can now attempt to join this host to the domain:
    [root@ps-ldap-21 ~]# realm --verbose --client-software=winbind join example.com
     * Resolving: _ldap._tcp.example.com
     * Performing LDAP DSE lookup on: 172.16.0.10
     * Successfully discovered: example.com
    Password for Administrator:
     * Required files: /usr/libexec/oddjob/mkhomedir, /usr/sbin/oddjobd, /usr/bin/wbinfo, /usr/sbin/winbindd, /usr/bin/net
     * LANG=C LOGNAME=root /usr/bin/net -s /var/cache/realmd/realmd-smb-conf.9YEO2Y -U Administrator ads join example.com
    Enter Administrator's password:
    Using short domain name -- EXAMPLE
    Joined 'PS-LDAP-21' to dns domain 'example.com'
     * LANG=C LOGNAME=root /usr/bin/net -s /var/cache/realmd/realmd-smb-conf.9YEO2Y -U Administrator ads keytab create
    Enter Administrator's password:
     * /usr/bin/systemctl enable winbind.service
    Created symlink from /etc/systemd/system/multi-user.target.wants/winbind.service to /usr/lib/systemd/system/winbind.service.
     * /usr/bin/systemctl restart winbind.service
     * /usr/bin/sh -c /usr/sbin/authconfig --update --enablewinbind --enablewinbindauth --enablemkhomedir --nostart && /usr/bin/systemctl enable oddjobd.service && /usr/bin/systemctl start oddjobd.service
     * Successfully enrolled machine in realm
  3. Let’s test whether we can enumerate existing accounts by using the
    id
     command:
    [root@ps-ldap-21 ~]# id jervin
    id: jervin: no such user
    [root@ps-ldap-21 ~]# id jervin@example.com
    uid=10000(EXAMPLEjervin) gid=10000(EXAMPLEdomain users) groups=10000(EXAMPLEdomain users),10001(EXAMPLEsupport)

    Unfortunately for Winbind, users identified with their domains cannot log in to Percona Server for MySQL. We need to disable the domain prefix in the Samba config (performed in the next step).
  4. Edit /etc/samba/smb.conf, and change “winbind use default domain = no” to “winbind use default domain = yes”. Restart the Winbind service. For example:
    vi /etc/samba/smb.conf
    #Look for:
    "winbind use default domain = no"
    #Change to:
    "winbind use default domain = yes"
    systemctl restart winbind.service

    Try running
    id
     again:
    [root@ps-ldap-21 ~]# id jervin
    uid=10000(jervin) gid=10000(domain users) groups=10000(domain users),10001(support)
    [root@ps-ldap-21 ~]# id jervin@example.com
    id: jervin@example.com: no such user

    When you create the MySQL user, do not include the domain name. For example:
    # Creating user 'jervin'
    CREATE USER 'jervin'@'%' IDENTIFIED WITH auth_pam;
    # Logging in as 'jervin'
    mysql -u jervin
  5. Finally, configure Percona Server for MySQL to authenticate to Winbind by creating /etc/pam.d/mysqld with this content:
    auth required pam_winbind.so
    account required pam_winbind.so

You can debug authentication attempts by reviewing the logs at /var/log/secure. You may also change “auth required pam_winbind.so” to “auth required pam_winbind.so debug” in /etc/pam.d/mysqld to get verbose logging in the same file.

As for filtering who can authenticate with Winbind, you can add

require_membership_of=group_name
 under the [global] section of /etc/security/pam_winbind.conf.

You’ll need to restart the winbind daemon to apply the changes.
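As a quick sketch (using the example support group from this post), the edit and restart would look like this:

vi /etc/security/pam_winbind.conf
#Under the existing [global] section, add:
#  require_membership_of = support
systemctl restart winbind.service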

Conclusion

Thanks to realmd, it’s easier to set up Active Directory as an identity provider. With minimal configuration tweaks, you can use the identity provider to authenticate MySQL users.

by Jaime Sicam at July 13, 2017 07:17 PM

MariaDB AB

Analysis of Financial Time Series Data Using MariaDB ColumnStore

Analysis of Financial Time Series Data Using MariaDB ColumnStore Satoru Goto Thu, 07/13/2017 - 11:31

MariaDB ColumnStore is a GPLv2 open source columnar database built on MariaDB Server. It is a fork and evolution of the former InfiniDB. It can be deployed in the cloud (optimized for Amazon Web Services) or on a local cluster of Linux servers using either local or networked storage. In this blog post, I will show some examples of analysis of financial time series data using MariaDB ColumnStore.

Free Forex Historical Data at HistData.com

First of all, we will download forex historical data (GBPUSD M1 Full 2016 Year Data), which is freely available on HistData.com. The HistData.com forex historical data looks like this:

20160103 170000;1.473350;1.473350;1.473290;1.473290;0
20160103 170100;1.473280;1.473360;1.473260;1.473350;0
20160103 170200;1.473350;1.473350;1.473290;1.473290;0
20160103 170300;1.473300;1.473330;1.473290;1.473320;0

The first column is the timestamp of the currency rate for each minute, but its format needs to be converted to fit the ColumnStore DATETIME data type. I modified the timestamp format using the following simple Ruby script:

convert.rb:
#!/usr/bin/env ruby
id = 0
while line = gets
  timestamp, open, high, low, close = line.split(";")
  year, month, day, hour, minute, second = timestamp.unpack("a4a2a2xa2a2a2")
  id+= 1
  print "#{id},#{year}-#{month}-#{day} #{hour}:#{minute},”
  puts  [open, high, low, close].join(“,”)
end

For example, you can convert the historical data as follows:

ruby convert.rb DAT_ASCII_GBPUSD_M1_2016.csv > gbpusd2016.csv

The converted forex historical data will be:

1,2016-01-03 17:00,1.473350,1.473350,1.473290,1.473290
2,2016-01-03 17:01,1.473280,1.473360,1.473260,1.473350
3,2016-01-03 17:02,1.473350,1.473350,1.473290,1.473290
4,2016-01-03 17:03,1.473300,1.473330,1.473290,1.473320

This demo was performed on the following setup:

  • CPU: Intel Core i7-7560U 2.4 GHz 2 physical cores / 4 logical cores
  • OS: CentOS 7.3 ( virtualized on VMware Workstation 12 Pro on Windows 10 Pro)
  • Memory: 4GB
  • ColumnStore: 1 UM & 1 PM on single node

Next, we create a database and a table using the MariaDB monitor:

# mcsmysql
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 4
Server version: 10.1.23-MariaDB Columnstore 1.0.9-1

Copyright (c) 2000, 2017, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> CREATE DATABASE forex;
MariaDB [(none)]> USE forex;
MariaDB [forex]> CREATE TABLE gbpusd (
    id INT, 
    time DATETIME,
    open DOUBLE,
    high DOUBLE, 
    low DOUBLE, 
    close DOUBLE
    ) engine=ColumnStore default character set=utf8;

Notes:
  • With a standard ColumnStore installation, the MariaDB monitor command is mcsmysql (aliased to /usr/local/mariadb/columnstore/mysql/bin/mysql).
  • The storage engine must be ColumnStore.

Now it’s time to import the CSV data with the corrected timestamps. For bulk import, we use the cpimport command that comes with ColumnStore. With the -s (separator) option, we specify the delimiter character (in this example, a comma).

# cpimport -s ',' forex gbpusd gbpusd2016.csv
Locale is : C
Column delimiter : ,

Using table OID 3163 as the default JOB ID
Input file(s) will be read from : /home/vagrant/histdata
Job description file : /usr/local/mariadb/columnstore/data/bulk/tmpjob/3163_D20170624_T103843_S950145_Job_3163.xml

…

2017-06-24 10:38:45 (29756) INFO : For table forex.gbpusd: 372480 rows processed and 372480 rows inserted.
2017-06-24 10:38:46 (29756) INFO : Bulk load completed, total run time : 2.11976 seconds

We imported 372,480 rows of forex data in about 2 seconds. LOAD DATA LOCAL INFILE can be used with ColumnStore as well.
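If you prefer the SQL interface, a rough sketch of the same load via LOAD DATA LOCAL INFILE is shown below. It assumes LOCAL INFILE is permitted on both the client and the server; for large files, cpimport remains the faster option:

# Sketch only: load the converted CSV through the mysql client instead of cpimport.
mcsmysql --local-infile=1 forex -e "LOAD DATA LOCAL INFILE 'gbpusd2016.csv' INTO TABLE gbpusd FIELDS TERMINATED BY ',';"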

Sample queries using functions

The reason why the GBPUSD M1 2016 data was chosen is to track the very quick moves of GBPUSD before and after the Brexit vote, which was held on 23rd June 2016.

gbpusd-h4.png

 

Now we look at the maximum, minimum and drop-off rate during 23rd – 24th June 2016, using aggregate functions in ColumnStore.

MariaDB [forex]> SELECT MAX(close) FROM gbpusd WHERE time BETWEEN TIMESTAMP ('2016-06-23') AND TIMESTAMP ('2016-06-25');
+------------+
| MAX(close) |
+------------+
|    1.50153 |
+------------+
1 row in set (0.03 sec)

MariaDB [forex]> SELECT MIN(close) FROM gbpusd WHERE time BETWEEN TIMESTAMP ('2016-06-23') AND TIMESTAMP ('2016-06-25');
+------------+
| MIN(close) |
+------------+
|    1.32322 |
+------------+
1 row in set (0.03 sec)

MariaDB [forex]> SELECT MAX(close)/MIN(close) FROM gbpusd WHERE time BETWEEN TIMESTAMP ('2016-06-23') AND TIMESTAMP ('2016-06-25');
+-----------------------+
| MAX(close)/MIN(close) |
+-----------------------+
|    1.1347546399658233 |
+-----------------------+
1 row in set (0.03 sec)

GBPUSD fell from 1.50 to 1.32 (about -12%) in 2 days (actually in half a day).

Correlation GBPUSD – USDJPY on 23rd June 2016

The next graph shows a scatter plot of GBPUSD vs. USDJPY during 23rd – 24th June 2016. (The USDJPY M1 2016 data was imported the same way as GBPUSD.)

scatter-plot-gbpusd-usdjpy.png

Note: GBPUSD is multiplied by 100 and then 130 is subtracted; for USDJPY, 110 is subtracted.

Now we calculate the Pearson correlation coefficient using statistical aggregation functions. The following SELECT statement was used:

SELECT
  (AVG(gbpusd.close*usdjpy.close) - AVG(gbpusd.close)*AVG(usdjpy.close)) / 
  (STDDEV(gbpusd.close) * STDDEV(usdjpy.close))
AS correlation_coefficient_population
FROM gbpusd
JOIN usdjpy ON gbpusd.time = usdjpy.time
WHERE
  gbpusd.time BETWEEN TIMESTAMP('2016-06-23') AND TIMESTAMP('2016-06-25');

Query result:

+------------------------------------+
| correlation_coefficient_population |
+------------------------------------+
|                 0.9647697302087848 |
+------------------------------------+
1 row in set (0.57 sec)

GBPUSD and USDJPY were highly correlated during the UK EU membership referendum.

Moving Average GBPUSD 23rd - 24th June 2016

In Forex trading, a moving average is often used to smooth out price fluctuations. The following query computes a moving average over a sliding 13-row window.

SELECT 
  time,
  close,
  AVG(close) OVER ( 
               ORDER BY time ASC
                 ROWS BETWEEN 
                   6 PRECEDING AND
                   6 FOLLOWING ) AS MA13,
  COUNT(close) OVER ( 
               ORDER BY time ASC
                 ROWS BETWEEN 
                   6 PRECEDING AND
                   6 FOLLOWING ) AS row_count
FROM gbpusd
WHERE time BETWEEN TIMESTAMP('2016-06-23') AND TIMESTAMP('2016-06-25');

GBPUSD-MA13.png

Summary

In this blog post, the following features of MariaDB ColumnStore were explained:

  • With cpimport command, users can easily perform fast bulk import to MariaDB ColumnStore tables.
  • MariaDB ColumnStore allows fast and easy data analytics using SQL 2003 compliant Window Functions, such as MAX, MIN, AVG and STDDEV.
  • No index management is needed for query performance tuning.

MariaDB ColumnStore can be downloaded here, detailed instructions to install and test MariaDB ColumnStore on Windows using Hyper-V can be found here.


by Satoru Goto at July 13, 2017 03:31 PM

Colin Charles

CFP for Percona Live Europe Dublin 2017 closes July 17 2017!

I’ve always enjoyed the Percona Live Europe events, because I consider them to be a lot more intimate than the event in Santa Clara. It started in London, had a smashing success last year in Amsterdam (conference sold out), and by design the travelling conference is now in Dublin from September 25-27 2017.

So what are you waiting for when it comes to submitting to Percona Live Europe Dublin 2017? Call for presentations close on July 17 2017, the conference has a pretty diverse topic structure (MySQL [and its diverse ecosystem including MariaDB Server naturally], MongoDB and other open source databases including PostgreSQL, time series stores, and more).

And I think we also have a pretty diverse conference committee in terms of expertise. You can also register now. Early bird registration ends August 8 2017.

I look forward to seeing you in Dublin, so we can share a pint of Guinness. Sláinte.

by Colin Charles at July 13, 2017 01:27 AM

July 12, 2017

MariaDB Foundation

MariaDB 10.2.7 now available

The MariaDB project is pleased to announce the immediate availability of MariaDB 10.2.7. This is a stable (GA) release. See the release notes and changelog for details. Download MariaDB 10.2.7 Release Notes Changelog What is MariaDB 10.2? MariaDB APT and YUM Repository Configuration Generator Thanks, and enjoy MariaDB!

The post MariaDB 10.2.7 now available appeared first on MariaDB.org.

by MariaDB Foundation at July 12, 2017 08:59 PM

Peter Zaitsev

Gh-ost benchmark against pt-online-schema-change performance

gh-ost-benchmark

In this blog post, I will run a gh-ost benchmark against the performance of pt-online-schema-change.

When gh-ost came out, I was very excited. As MySQL ROW replication became commonplace, you could use it to track changes instead of triggers. This practice is cleaner and safer compared to Percona Toolkit’s pt-online-schema-change. Since gh-ost doesn’t need triggers, I assumed it would generate lower overhead and work faster. I frequently called it “pt-online-schema-change on steroids” in my talks. Finally, I’ve found some time to check my theoretical claims with some benchmarks.

DISCLAIMER: These benchmarks correspond to one specific ALTER TABLE on the table of one specific structure and hardware configuration. I have not set up a broad set of tests. If you have other results – please comment!

Benchmark Setup Details

  • pt-online-schema-change from Percona Toolkit 3.0.3
  • gh-ost 1.0.36
  • Percona Server 5.7.18 on Ubuntu 16.04 LTS
  • Hardware: 28CPU cores/56 Threads.  128GB Memory.   Samsung 960 Pro 512GB
  • Sysbench 1.0.7

Prepare the table by running:

sysbench --threads=40 --rate=0 --report-interval=1 --percentile=99 --events=0 --time=0 --db-ps-mode=auto --mysql-user=sbtest --mysql-password=sbtest  /usr/share/sysbench/oltp_read_write.lua --table_size=10000000 prepare

The table size is about 3GB (it fits completely into the innodb_buffer_pool).

Run the benchmark in “full ACID” mode with:

  • sync_binlog=1
  • innodb_flush_log_at_trx_commit=1
  • innodb_doublewrite=1

This is important as this workload is heavily commit-bound, and extensively relies on group commit.
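As a quick sanity check before running the benchmarks, these settings can be confirmed from the client (a sketch; it assumes the mysql client is configured with credentials):

# Sketch: confirm the durability settings used for the "full ACID" runs.
mysql -e "SHOW GLOBAL VARIABLES WHERE Variable_name IN ('sync_binlog','innodb_flush_log_at_trx_commit','innodb_doublewrite');"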

This is the pt-online-schema-change command to alter table:

time pt-online-schema-change --execute --alter "ADD COLUMN c1 INT" D=sbtest,t=sbtest1

This the gh-ost command to alter table:

time ./gh-ost  --user="sbtest" --password="sbtest" --host=localhost --allow-on-master --database="sbtest" --table="sbtest1"  --alter="ADD COLUMN c1 INT" --execute

Tests Details

For each test the old sysbench table was dropped and a new one prepared. I tested alter table in three different cases:

  • When nothing else was running (“Idle Load”)   
  • When the system handled about 2% of the load it can handle at full capacity (“Light Background Load”)
  • When the system handled about 40% of the possible load, with sysbench injecting about 25% of the transactions/sec the system could handle at full load (“Heavy Background Load”)

I measured the alter table completion times for all cases, as well as the overhead generated by the alter (in other words, how much peak throughput is reduced by running alter table through the tools).

Idle Load

gh-ost benchmark 1

For the Idle Load test, pt-online-schema-change completed nearly twice as fast as gh-ost. This was a big surprise for me. I haven’t looked into the reasons or details yet, though I can see most of the CPU usage for gh-ost is on the MySQL server side. Perhaps the differences relate to the SQL used to perform non-blocking alter tables.

Light Background Load

I generated the Light Background Load by running the sysbench command below. It corresponds to a roughly 4% load, as the system can handle some 2500 transactions/sec at this concurrency under full load. Adjust the 

--rate
 value to scale it for your system.

time sysbench --threads=40 --rate=100 --report-interval=1 --percentile=99 --events=0 --time=0 --db-ps-mode=auto --mysql-user=sbtest --mysql-password=sbtest  /usr/share/sysbench/oltp_read_write.lua --table_size=10000000 run

gh-ost benchmark 2

The numbers changed (as expected), but pt-online-schema-change is still approximately twice as fast as gh-ost.

What is really interesting in this case is how a relatively light background load affects the process completion time. It took both pt-online-schema-change and gh-ost about 2.7 times longer to finish!

Heavy Background Load

I generated the Heavy Background Load running the sysbench command below. It corresponds to a roughly 40% load, as the system can handle some 2500 transactions/sec at this concurrency under full load. Adjust

--rate
 value to scale it for your system.

time sysbench --threads=40 --rate=1000 --report-interval=1 --percentile=99 --events=0 --time=0 --db-ps-mode=auto --mysql-user=sbtest --mysql-password=sbtest  /usr/share/sysbench/oltp_read_write.lua --table_size=10000000 run

gh-ost benchmark 3

What happened in this case? When the load gets higher, gh-ost can’t keep up with binary log processing, and just never finishes at all. While this may be surprising at first, it makes sense if you think more about how these tools work. pt-online-schema-change uses triggers, and while they have a lot of limitations and overhead they can execute in parallel. gh-ost, on the other hand, processes the binary log in a single thread and might not be able to keep up.   

MySQL 5.6 didn’t have parallel replication that could apply writes to the same table in parallel. For that version the gh-ost limitation probably isn’t as big a deal, as such a heavy load would also cause replication lag anyway. MySQL 5.7 has parallel replication, which makes it much easier to quickly replicate workloads that are too heavy for gh-ost to handle.

I should note that the workload being simulated in this benchmark is a rather extreme case. The table being altered by gh-ost here is at the same time handling a background load so high it can’t be replicated in a single thread.

Future versions of gh-ost could improve this issue by applying binlog events in parallel, similar to what MySQL replicas do.

An excerpt from the gh-ost log shows how it is totally backed up trying to apply the binary log:

root@rocky:/tmp# time ./gh-ost  --user="sbtest" --password="sbtest" --host=localhost --allow-on-master --database="sbtest" --table="sbtest1"  --alter="ADD COLUMN c1 INT" --execute
2017/06/25 19:16:05 binlogsyncer.go:75: [info] create BinlogSyncer with config &{99999 mysql localhost 3306 sbtest sbtest  false false <nil>}
2017/06/25 19:16:05 binlogsyncer.go:241: [info] begin to sync binlog from position (rocky-bin.000018, 640881773)
2017/06/25 19:16:05 binlogsyncer.go:134: [info] register slave for master server localhost:3306
2017/06/25 19:16:05 binlogsyncer.go:568: [info] rotate to (rocky-bin.000018, 640881773)
2017-06-25 19:16:05 ERROR parsing time "" as "2006-01-02T15:04:05.999999999Z07:00": cannot parse "" as "2006"
# Migrating `sbtest`.`sbtest1`; Ghost table is `sbtest`.`_sbtest1_gho`
# Migrating rocky:3306; inspecting rocky:3306; executing on rocky
# Migration started at Sun Jun 25 19:16:05 -0400 2017
# chunk-size: 1000; max-lag-millis: 1500ms; max-load: ; critical-load: ; nice-ratio: 0.000000
# throttle-additional-flag-file: /tmp/gh-ost.throttle
# Serving on unix socket: /tmp/gh-ost.sbtest.sbtest1.sock
Copy: 0/9872432 0.0%; Applied: 0; Backlog: 0/100; Time: 0s(total), 0s(copy); streamer: rocky-bin.000018:641578191; State: migrating; ETA: N/A
Copy: 0/9872432 0.0%; Applied: 0; Backlog: 100/100; Time: 1s(total), 1s(copy); streamer: rocky-bin.000018:641626699; State: migrating; ETA: N/A
Copy: 0/9872432 0.0%; Applied: 640; Backlog: 100/100; Time: 2s(total), 2s(copy); streamer: rocky-bin.000018:641896215; State: migrating; ETA: N/A
Copy: 0/9872432 0.0%; Applied: 1310; Backlog: 100/100; Time: 3s(total), 3s(copy); streamer: rocky-bin.000018:642178659; State: migrating; ETA: N/A
Copy: 0/9872432 0.0%; Applied: 1920; Backlog: 100/100; Time: 4s(total), 4s(copy); streamer: rocky-bin.000018:642436043; State: migrating; ETA: N/A
Copy: 0/9872432 0.0%; Applied: 2600; Backlog: 100/100; Time: 5s(total), 5s(copy); streamer: rocky-bin.000018:642722777; State:
...
Copy: 0/9872432 0.0%; Applied: 120240; Backlog: 100/100; Time: 3m0s(total), 3m0s(copy); streamer: rocky-bin.000018:694142377; State: migrating; ETA: N/A
Copy: 0/9872432 0.0%; Applied: 140330; Backlog: 100/100; Time: 3m30s(total), 3m30s(copy); streamer: rocky-bin.000018:702948219; State: migrating; ETA: N/A
Copy: 0/9872432 0.0%; Applied: 160450; Backlog: 100/100; Time: 4m0s(total), 4m0s(copy); streamer: rocky-bin.000018:711775662; State: migrating; ETA: N/A
Copy: 0/9872432 0.0%; Applied: 180600; Backlog: 100/100; Time: 4m30s(total), 4m30s(copy); streamer: rocky-bin.000018:720626338; State: migrating; ETA: N/A
Copy: 0/9872432 0.0%; Applied: 200770; Backlog: 100/100; Time: 5m0s(total), 5m0s(copy); streamer: rocky-bin.000018:729509960; State: migrating; ETA: N/A

Online Schema Change Performance Impact

For this test I started the alter table, waited 60 seconds and then ran sysbench at full speed for five minutes. Then I measured how much the performance was impacted by running the tool:

sysbench --threads=40 --rate=0 --report-interval=1 --percentile=99 --events=0 --time=300 --db-ps-mode=auto --mysql-user=sbtest --mysql-password=sbtest  /usr/share/sysbench/oltp_read_write.lua --table_size=10000000 run

gh-ost benchmark 4

As we can see, gh-ost has negligible overhead in this case. pt-online-schema-change, on the other hand, reduced performance by 12%. It is worth noting, though, that pt-online-schema-change still makes progress in this case (though slowly), while gh-ost would never complete.

If anything, I was surprised at how little impact the pt-online-schema-change run had on sysbench performance.

It’s important to note that in this case we only measured the overhead for the “copy” stage of the online schema change. Another thing you should worry about is the impact to performance during “table rotation” (which I have not measured).

Summary

While gh-ost introduces a number of design advantages, and gives better results in some situations, I wouldn’t call it always superior to the tried and true pt-online-schema-change. At least in some cases, pt-online-schema-change offers better performance than gh-ost and completes a schema change when gh-ost is unable to keep up. Consider trying out both tools to see what works best in your situation.

by Peter Zaitsev at July 12, 2017 06:31 PM

MariaDB AB

MariaDB 10.2.7 now available

MariaDB 10.2.7 now available dbart Wed, 07/12/2017 - 11:21

The MariaDB project is pleased to announce the immediate availability of MariaDB 10.2.7. See the release notes and changelog for details.

Download MariaDB 10.2.7

Release Notes Changelog What is MariaDB 10.2?


by dbart at July 12, 2017 03:21 PM

July 11, 2017

Peter Zaitsev

Thread_Statistics and High Memory Usage

thread_statistics

In this blog post, we’ll look at how using thread_statistics can cause high memory usage.

I was recently working on a high memory usage issue for one of our clients, and made some interesting discoveries: high memory usage with no bounds. It was really tricky to diagnose.

Below, I am going to show you how to identify that having thread_statistics enabled causes high memory usage on busy systems with many threads.

Part 1: Issue Background

I had a server with 55.0G of available memory. Percona Server for MySQL version:

Version | 5.6.35-80.0-log Percona Server (GPL), Release 80.0, Revision f113994f31
                 Built On | debian-linux-gnu x86_64

We have calculated approximately how much memory MySQL can use in a worst case scenario for max_connections=250:

mysql> select ((@@key_buffer_size+@@innodb_buffer_pool_size+@@innodb_log_buffer_size+@@innodb_additional_mem_pool_size+@@net_buffer_length+@@query_cache_size)/1024/1024/1024)+((@@sort_buffer_size+@@myisam_sort_buffer_size+@@read_buffer_size+@@join_buffer_size+@@read_rnd_buffer_size+@@thread_stack)/1024/1024/1024*250);
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ((@@key_buffer_size+@@innodb_buffer_pool_size+@@innodb_log_buffer_size+@@innodb_additional_mem_pool_size+@@net_buffer_length+@@query_cache_size)/1024/1024/1024)+((@@sort_buffer_size+@@myisam_sort_buffer_size+@@read_buffer_size+@@join_buffer_size+@@read_rnd |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 12.445816040039 |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

So in our case, this shouldn’t be more than ~12.5G.

After MySQL Server has been restarted, it allocated about 7G. After running for a week, it reached 44G:

# Memory #####################################################
       Total | 55.0G
        Free | 8.2G
        Used | physical = 46.8G, swap allocated = 0.0, swap used = 0.0, virtual = 46.8G
      Shared | 560.0k
     Buffers | 206.5M
      Caches | 1.0G
       Dirty | 7392 kB
     UsedRSS | 45.3G
  Swappiness | 60
 DirtyPolicy | 20, 10
 DirtyStatus | 0, 0

# Top Processes ##############################################
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 2074 mysql     20   0 53.091g 0.044t   8952 S 126.0 81.7  34919:49 mysqld

I checked everything that could be related to the high memory usage (for example, operating system settings such as Transparent Huge Pages (THP), etc.). But I still didn’t find the cause (THP was disabled on the server). I asked my teammates if they had any ideas.

Part 2: Team Is on Rescue

After brainstorming and reviewing the status, metrics and profiles again and again, my colleague (Yves Trudeau) pointed out that User Statistics is enabled on the server.

User Statistics adds several INFORMATION_SCHEMA tables, several commands, and the

userstat
 variable. The tables and commands can be used to better understand different server activity, and to identify the different load sources. Check out the documentation for more information.

mysql> show global variables like 'user%';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| userstat      | ON    |
+---------------+-------+
mysql> show global variables like 'thread_statistics%';
+-------------------------------+---------------------------+
| Variable_name                 | Value                     |
+-------------------------------+---------------------------+
| thread_statistics             | ON                        |
+-------------------------------+---------------------------+

Since we saw many threads running, it was a good option to verify this as the cause of the issue.

Part 3: Cause Verification – Did It Really Eat Our Memory?

I decided to apply some calculations, and the following test cases to verify the cause:

  1. Looking at the THREAD_STATISTICS table in the INFORMATION_SCHEMA, we can see that for each connection there is a row like the following:
    mysql> select * from THREAD_STATISTICS limit 1G
    *************************** 1. row ***************************
    THREAD_ID: 3566
    TOTAL_CONNECTIONS: 1
    CONCURRENT_CONNECTIONS: 0
    CONNECTED_TIME: 30
    BUSY_TIME: 0
    CPU_TIME: 0
    BYTES_RECEIVED: 495
    BYTES_SENT: 0
    BINLOG_BYTES_WRITTEN: 0
    ROWS_FETCHED: 27
    ROWS_UPDATED: 0
    TABLE_ROWS_READ: 0
    SELECT_COMMANDS: 11
    UPDATE_COMMANDS: 0
    OTHER_COMMANDS: 0
    COMMIT_TRANSACTIONS: 0
    ROLLBACK_TRANSACTIONS: 0
    DENIED_CONNECTIONS: 0
    LOST_CONNECTIONS: 0
    ACCESS_DENIED: 0
    EMPTY_QUERIES: 0
    TOTAL_SSL_CONNECTIONS: 1
    1 row in set (0,00 sec)
  2. We have 22 columns, each of them BIGINT, which gives us ~ 176 bytes per row.
  3. Let’s calculate how many rows we have in this table at this time, and check once again in an hour:
    mysql> select count(*) from information_schema.thread_statistics;
    +----------+
    | count(*) |
    +----------+
    | 7864343  |
    +----------+
    1 row in set (15.35 sec)

    In an hour:

    mysql> select count(*) from information_schema.thread_statistics;
    +----------+
    | count(*) |
    +----------+
    | 12190801 |
    +----------+
    1 row in set (24.46 sec)

  4. Now let’s check on how much memory is currently in use:

    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
    2096 mysql 20 0 12.164g 0.010t 16036 S 173.4 18.9 2274:51 mysqld

  5. We have 12,190,801 rows in the THREAD_STATISTICS table, which works out to roughly 2G of memory.
  6. Issuing the following statement cleans up the statistics:

    mysql> flush thread_statistics;
    mysql> select count(*) from information_schema.thread_statistics;
    +----------+
    | count(*) |
    +----------+
    | 0        |
    +----------+
    1 row in set (00.00 sec)

  7. Now, let’s check again on how much memory is in use:

    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
    2096 mysql 20 0 12.164g 7.855g 16036 S 99.7 14.3 2286:24 mysqld

As we can see, memory usage drops by approximately the 2G that we had calculated earlier!

That was the root cause of the high memory usage in this case.
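If you want to make a similar estimate on your own server, the same arithmetic (22 BIGINT columns at 8 bytes each, roughly 176 bytes per row) can be expressed as a quick query. This is only a rough sketch; the real in-memory overhead per row can be somewhat higher:

# Sketch: rough size estimate of the accumulated thread statistics (rows * 22 columns * 8 bytes).
mysql -BNe "SELECT CONCAT(ROUND(COUNT(*) * 22 * 8 / 1024 / 1024 / 1024, 2), ' GB') FROM information_schema.thread_statistics;"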

Conclusion

User Statistics (basically Thread_Statistics) is a great feature that allows us to identify load sources and better understand server activity. At the same time, though, it can be dangerous (from the memory usage point of view) to use as a permanent monitoring solution, because there is no limit on how much memory it can consume.

As a reminder, thread_statistics is NOT enabled by default when you enable User_Statistics. If you have enabled Thread_Statistics for monitoring purposes, please don’t forget to pay attention to it.
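If a server is already affected, a possible workaround (assuming Percona Server, where both variables are dynamic) is to flush the accumulated statistics and, if the per-thread detail is not needed, switch it off while keeping userstat enabled:

# Sketch: reclaim the accumulated memory and stop per-thread collection (no restart required).
mysql -e "FLUSH THREAD_STATISTICS;"
mysql -e "SET GLOBAL thread_statistics=OFF;"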

As a next step, we are considering submitting a feature request to implement some default limits that can prevent Out of Memory issues on busy systems.

by Alex Poritskiy at July 11, 2017 08:15 PM

MariaDB AB

New in Microsoft Azure: MariaDB TX 2.0

New in Microsoft Azure: MariaDB TX 2.0 kolbekegel Tue, 07/11/2017 - 12:56

Now in the Microsoft Azure Marketplace: MariaDB TX 2.0 with MariaDB Server 10.2, configured in a MariaDB Galera Cluster topology, with MariaDB MaxScale 2.1, and Azure Managed Disks.

What is MariaDB TX 2.0? This is the new name of MariaDB's flagship subscription offering for transactional (OLTP) workloads. MariaDB TX 2.0 includes MariaDB Server 10.2, featuring the Galera clustering technology we use in this solution, as well as MariaDB MaxScale 2.1.

The MariaDB TX 2.0 solution in the Azure Marketplace deploys a cluster with three MariaDB Server data nodes and two MariaDB MaxScale proxy nodes. This provides HA at the data store layer as well as the connectivity layer. MaxScale can automatically split reads and writes between the backend data nodes in the MariaDB Cluster, and it can automatically choose a new master if the topology changes. An Azure Load Balancer sits in front of the MaxScale nodes to seamlessly manage a switchover or failover between the MaxScale nodes.

MariaDB Server 10.2 includes lots of really interesting functionality, including Window Functions, Common Table Expressions (CTEs), CHECK constraints, new JSON functions, and lots more. Take a look at our recent MariaDB Server 10.2 new features for the full list and links to details.

New in the MariaDB TX 2.0 solution in Azure is a move to Ubuntu 16.04 LTS from CentOS 7. This is mostly transparent to users, as all the same packages and functionality are available as in previous iterations of this solution. This move makes maintenance of the solution a bit easier, and it seems that there's better support in Azure for their Ubuntu VMs than for CentOS, so this may even bring some performance and stability improvements.

Another notable change is a move to Azure Managed Disks. This finally brings Azure storage in line with what can be found from other cloud providers. Azure Managed Disks are a relatively new feature of Azure, and they'll be a best practice going forward, so I'm happy to have been able to add early support for them to our MariaDB TX 2.0 solution in the Azure Marketplace. Azure Managed Disks remove the complexity of having to directly manage Storage Accounts, which means a little bit less configuration to kick off the deployment process. It also gives access to Snapshots, Azure Disk Encryption, increased resiliency, and the ability to upgrade from Standard (hard disk) to Premium (SSD) storage without having to re-create the VM. Learn more about Azure Managed Disks.

Find MariaDB TX 2.0 in the Azure Marketplace. From there you can click "Get It Now" to deploy the solution into your existing Azure Subscription, or you can click "Test Drive" to start a 2-hour Test Drive to kick the tires.

The free-of-charge Test Drive deploys the same topology, including MariaDB TX 2.0, MariaDB Server 10.2, MariaDB MaxScale 2.1, and the move to Ubuntu 16.04 LTS. The Test Drive doesn't use Managed Disks, however. To get a look at that experience, you'll need to deploy the MariaDB TX 2.0 solution yourself into your own Azure Subscription.

Maybe you just want to start simple, with a single instance of MariaDB Server? If so, you can also start a standalone MariaDB Server instance from the Azure Marketplace.

Documentation and more information for all of these solutions can be found in the MariaDB Knowledge Base.

Have questions? If you’re a customer with a MariaDB TX or MariaDB Enterprise subscription, please contact our MariaDB support team. If you’re not a customer yet, get in touch with us; we’re looking forward to hearing from you.


by kolbekegel at July 11, 2017 04:56 PM

Peter Zaitsev

Webinar Wednesday July 12, 2017: MongoDB Index Types – How, When and Where Should They Be Used?

MongoDB Index Types

Join Percona’s Senior Technical Services Engineer, Adamo Tonete, as he presents MongoDB Index Types: How, When and Where Should They Be Used? on Wednesday, July 12, 2017 at 11:00 am PDT / 2:00 pm EDT (UTC-7).

MongoDB has 12 index types. Do you know how each works, or when you should use each of them? This talk will arm you with this knowledge, and help you understand how indexes impact performance, storage and even the sharding of your data. We will also discuss some solid index operational practices, as well as some settings for things like TTL you might not know exist. The contents of this webinar will make you a Rock Star!

Register for the webinar here.

Adamo Tonete, Senior Technical Services Engineer

Adamo joined Percona in 2015, after working as a MongoDB/MySQL database administrator for three years. As the main database admin of a startup, he was responsible for suggesting the best architecture and data flows for a worldwide company in a 24/7 environment. Before that, he worked as a Microsoft SQL Server DBA for a large e-commerce company, working mainly on performance tuning and automation. Adamo has almost eight years of experience working as a DBA, and in the past three years he has moved to NoSQL technologies (without giving up relational databases).

by Emily Ikuta at July 11, 2017 03:32 PM

July 10, 2017

Peter Zaitsev

Percona Server for MongoDB 3.4.5-1.5 is Now Available

Percona Server for MongoDB 3.4

Percona announces the release of Percona Server for MongoDB 3.4.5-1.5 on July 10, 2017. Download the latest version from the Percona web site or the Percona Software Repositories.

Percona Server for MongoDB is an enhanced, open source, fully compatible, highly-scalable, zero-maintenance downtime database supporting the MongoDB v3.4 protocol and drivers. It extends MongoDB with Percona Memory Engine and MongoRocks storage engine, as well as several enterprise-grade features:

Percona Server for MongoDB requires no changes to MongoDB applications or code.

This release is based on MongoDB 3.4.5 and includes the following additional changes:

New Features

Bugs Fixed

  • #PSMDB-67: Fixed mongod service status messages.

by Alexey Zhebel at July 10, 2017 06:37 PM