Planet MariaDB

May 25, 2016

Peter Zaitsev

MongoDB 3.2: elections just got better!


Introduction

In this blog, we’ll review MongoDB 3.2 elections and how they work, as well as what is really new and different in the election protocol.

MongoDB 3.2 revamped its election protocol for increased stability! Exciting times are here, with smarter and faster elections. With this latest release, you will find that replication (and the election protocol) have been improved. Some of the changes include:

  • The addition of electionTimeoutMS
  • writeConcern now implies j:true
    • The old j:true meant only the primary node had to acknowledge the journal
    • The new j:true means all involved nodes must ACK the journal
    • With j:true, the journal commit interval is thirded: synchronization occurs every 10 ms (MMAPv1) or 33 ms (WiredTiger) by default
  • optime in rs.status() is now an object, not a timestamp

You’ll need to enable the Election Protocol when upgrading MongoDB from an earlier version, while new replSets get it enabled by default.

Election Protocol: what is an election?

Mongo uses a consensus protocol. This means that all nodes must agree who is the most current when handling:

  • Hardware failure
  • Network split
  • Time shifts

New updates allow for faster elections, using a term (electionId) to separate voting rounds. This guarantees there aren't double (and conflicting) votes, while also reducing the time needed to know that a vote has completed.

How does it do it?

Elections now have "term" or "vote" identifiers (IDs). Terms are used to separate voting rounds. Every vote attempt increments the ID. Incrementing the ID prevents a node from double voting in the same term, and makes it easier for nodes to know if a re-vote is needed (where before this could take up to 5 minutes!).

The protocol timeouts have some new features and behaviors:

  • Now configurable
  • Randomness added to each node
  • Less chance that all nodes time out at the same time

Normal election process

Below I’m going to walk you through a typical replica set operation. The configuration looks like the following:

[Diagram: three-member replica set topology (MongoDB 3.2 elections)]

In this topology:

  • There are three members
  • All of them are heartbeating to each other
  • There is no arbiter, so you get full high availability (HA)

The following diagram provides a more detailed picture of the interactions:

[Diagram: replication pulls and heartbeats between the members (MongoDB 3.2 elections)]

Notice how each secondary pulls replication from the primary: the secondary does all the work. A heartbeat is still shared by all the nodes.

Now let’s see what happens when our primary crashes. It just did!

[Diagram: the primary crashes (MongoDB 3.2 elections)]

The remaining nodes will still try to heartbeat to it until two heartbeats have failed in a short period.

[Diagram: secondaries still heartbeating to the failed primary (MongoDB 3.2 elections)]

After the failure, things happen quickly.

  1. Secondaries give up on heartbeats
  2. They then vote with each other on who has the newest oplog
  3. If they have > 50% of the total voting population, they elect a new winner

A new Primary is selected, and the heartbeat system is cleaned up.

[Diagram: a new primary is elected and heartbeats are re-established (MongoDB 3.2 elections)]

Replication now gets restarted. If the failed node comes back online, it's treated as a secondary once it "catches up" via the oplog.

Stepdown Election Process

The stepdown election process is the same as above, with the following caveats:

  • It’s MUCH faster, as the existing primary “starts” an election
  • There is less chance of the old primary not having data replicated
  • It kills writes while the election is in progress
  • The election process doesn’t wait for heartbeat timeouts

Generally speaking, you should always try to use the stepdown election process. Timeouts are for crashes and failures, not general use.

 

by David Murphy at May 25, 2016 07:26 PM

Percona Server 5.6.30-76.3 is now available

Percona is glad to announce the release of Percona Server 5.6.30-76.3 on May 25, 2016. Download the latest version from the Percona web site or the Percona Software Repositories.

Based on MySQL 5.6.30, including all the bug fixes in it, Percona Server 5.6.30-76.3 is the current GA release in the Percona Server 5.6 series. Percona Server is open-source and free – this is the latest release of our enhanced, drop-in replacement for MySQL. Complete details of this release can be found in the 5.6.30-76.3 milestone on Launchpad.

Bugs Fixed:

  • When Read Free Replication was enabled for TokuDB and there was no explicit primary key for the replicated TokuDB table, there could be duplicated records in the table on update operations. The fix disables Read Free Replication for tables without an explicit primary key, performs row lookups for UPDATE and DELETE binary log events, and issues a warning. Bug fixed #1536663 (#950).
  • Attempting to execute a non-existing prepared statement with Response Time Distribution plugin enabled could lead to a server crash. Bug fixed #1538019.
  • TokuDB was using different memory allocators; this was causing safemalloc warnings in debug builds and crashes because memory accounting didn’t add up. Bug fixed #1546538 (#962).
  • Fixed heap allocator/deallocator mismatch in Metrics for scalability measurement. Bug fixed #1581051.
  • Percona Server is now built with system zlib library instead of the older bundled one. Bug fixed #1108016.
  • Reduced the memory overhead per page in the InnoDB buffer pool. The fix was based on Facebook patch #91e979e. Bug fixed #1536693 (upstream #72466).
  • CREATE TABLE ... LIKE ... could create a system table with an unsupported enforced engine. Bug fixed #1540338.
  • Change buffer merge could throttle to 5% of I/O capacity on an idle server. Bug fixed #1547525.
  • Slave_open_temp_tables would fail to decrement on the slave with a disabled binary log if the master was killed. Bug fixed #1567361.
  • The server will now show a more descriptive error message when Percona Server fails with errno == 22 "Invalid argument", if innodb_flush_method was set to ALL_O_DIRECT. Bug fixed #1578604.
  • Killed connection threads could get their sockets closed twice on shutdown. Bug fixed #1580227.
  • AddressSanitizer build with LeakSanitizer enabled was failing at gen_lex_hash invocation. Bug fixed #1580993 (upstream #80014).
  • apt-cache show command for percona-server-client was showing innotop included as part of the package. Bug fixed #1201074.
  • mysql-systemd would fail with PAM authentication and proxies due to a regression introduced when fixing #1534825 in Percona Server 5.6.29-76.2. Bug fixed #1558312.
  • Upgrade logic for figuring if TokuDB upgrade can be performed from the version on disk to the current version was broken due to a regression introduced when fixing bug #684 in Percona Server 5.6.27-75.0. Bug fixed #717.
  • If ALTER TABLE was run while tokudb_auto_analyze variable was enabled it would trigger auto-analysis, which could lead to a server crash if ALTER TABLE DROP KEY was used because it would be operating on the old table/key meta-data. Bug fixed #945.
  • The TokuDB storage engine with tokudb_pk_insert_mode set to 1 is safe to use in all conditions. On INSERT IGNORE or REPLACE INTO, it tests to see if triggers exist on the table, or replication is active with !BINLOG_FORMAT_STMT before it allows the optimization. If either of these conditions is met, then it falls back to the “safe” operation of looking up the target row first. Bug fixed #952.
  • Bug in TokuDB Index Condition Pushdown was causing ORDER BY DESC to reverse the scan outside of the WHERE bounds. This would cause a query to hang in a sending data state for several minutes in some environments with large amounts of data (3 billion records) if the ORDER BY DESC statement was used. Bugs fixed #988, #233, and #534.

Other bugs fixed: #1399562 (upstream #75112), #1510564 (upstream #78981), #1496282 (#964), #1496786 (#956), #1566790, #1552673, #1567247, #1567869, #718, #914, #970, #971, #972, #976, #977, #981, #637, and #982.

Release notes for Percona Server 5.6.30-76.3 are available in the online documentation. Please report any bugs on the launchpad bug tracker.

by Hrvoje Matijakovic at May 25, 2016 04:54 PM

Jean-Jerome Schmidt

Watch the replay: ClusterControl 1.3 webinar with new features for MySQL, MariaDB, Percona Server, PostgreSQL and more!

Thanks to everyone who joined us yesterday for our ClusterControl 1.3 release webinar!

Johan Andersson, CTO at Severalnines and creator of ClusterControl, walked us through the latest features of the 1.3 release and demonstrated them live as well. In addition to an overview of ClusterControl’s deployment, monitoring, management and scaling functionalities for MySQL, MariaDB, Percona Server, MongoDB and PostgreSQL, Johan focussed our attention on new features around key management, operational reports and more.

One feature-set that triggered particular interest in yesterday’s audience was the automated deployment of a production setup of NDB / MySQL Cluster: users can create a production setup of NDB/MySQL Cluster from ClusterControl and deploy management, SQL/API and data nodes - all via the ClusterControl interface.

The replay of this webinar and the slides are now available for viewing online:

Sign up for the replay or read the slides.

To get started with ClusterControl, download it today.

Webinar Agenda

  • ClusterControl overview
  • New features deep-dive
    • Key management and encryption
    • Additional operational reports
    • Improved security
    • Create / mirror repository
    • Create NDB / MySQL Cluster
  • Live Demo
  • Q&A

Speaker

Johan Andersson, CTO, Severalnines - Johan's technical background and interest are in high performance computing as demonstrated by the work he did on main-memory clustered databases at Ericsson as well as his research on parallel Java Virtual Machines at Trinity College Dublin in Ireland. Prior to co-founding Severalnines, Johan was Principal Consultant and lead of the MySQL Clustering & High Availability consulting group at MySQL / Sun Microsystems / Oracle, where he designed and implemented large-scale MySQL systems for key customers. Johan is a regular speaker at MySQL User Conferences as well as other high profile community gatherings with popular talks and tutorials around architecting and tuning MySQL Clusters.

For more information on ClusterControl 1.3:

To get started with ClusterControl, download it today.

by Severalnines at May 25, 2016 02:52 PM

Oli Sennhauser

Why you should take care of MySQL data types

A customer reported last month that MySQL does a full table scan (FTS) if a query on a VARCHAR column is filtered by an INT value. First I told him that this is not true any more, because MySQL fixed this behaviour a long time ago. He showed me that I was wrong:

CREATE TABLE `test` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `data` varchar(64) DEFAULT NULL,
  `ts` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`id`),
  KEY `data` (`data`)
) ENGINE=InnoDB;

EXPLAIN SELECT * FROM test WHERE data = 42\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: test
   partitions: NULL
         type: ALL
possible_keys: data
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 522500
     filtered: 10.00
        Extra: Using where

EXPLAIN SELECT * FROM test WHERE data = '42'\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: test
   partitions: NULL
         type: ref
possible_keys: data
          key: data
      key_len: 67
          ref: const
         rows: 1
     filtered: 100.00
        Extra: NULL

When I executed the query I got some more interesting information:

SELECT * FROM test WHERE data = '42';
Empty set (0.00 sec)

SELECT * FROM test WHERE data = 42;
+--------+----------------------------------+---------------------+
| id     | data                             | ts                  |
+--------+----------------------------------+---------------------+
|   1096 | 42a5cb4a3e76857a3efe7af44ba9f4dd | 2016-05-25 10:26:59 |
...
| 718989 | 42a1921fb2df42126d85f9586532eda4 | 2016-05-25 10:27:12 |
+--------+----------------------------------+---------------------+
767 rows in set, 65535 warnings (0.26 sec)

Looking at the warnings, we also find the reason: MySQL does the cast on the column and not on the value, which is a bit odd IMHO:

show warnings;
| Warning | 1292 | Truncated incorrect DOUBLE value: '80f52706c2f9de40472ec29a7f70c992' |

A bit suspicious, I looked at the warnings of the query execution plan again:

show warnings;
+---------+------+---------------------------------------------------------------------------------------------+
| Level   | Code | Message                                                                                     |
+---------+------+---------------------------------------------------------------------------------------------+
| Warning | 1739 | Cannot use ref access on index 'data' due to type or collation conversion on field 'data'   |
| Warning | 1739 | Cannot use range access on index 'data' due to type or collation conversion on field 'data' |
+---------+------+---------------------------------------------------------------------------------------------+

I thought this was fixed, but it seems not. The following releases behave like this: MySQL 5.0.96, 5.1.73, 5.5.38, 5.6.25, 5.7.12 and MariaDB 5.5.41, 10.0.21 and 10.1.9.
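If you cannot immediately change the application to quote the value, one workaround (a sketch, not from the original post) is to cast the search value explicitly, so the cast happens on the constant instead of on the indexed column:

-- Cast the value, not the column: the constant becomes the string '42',
-- and the index on `data` can be used again (verify with EXPLAIN).
EXPLAIN SELECT * FROM test WHERE data = CAST(42 AS CHAR)\G

-- The cleaner fix is of course to pass the value as a string in the first place:
SELECT * FROM test WHERE data = '42';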

The other way around, it seems to work in both cases:

SELECT * FROM test WHERE id = 42;
+----+----------------------------------+---------------------+
| id | data                             | ts                  |
+----+----------------------------------+---------------------+
| 42 | 81d74057d7be8f20563da404bb1b3ab0 | 2016-05-25 10:26:56 |
+----+----------------------------------+---------------------+

SELECT * FROM test WHERE id = '42';
+----+----------------------------------+---------------------+
| id | data                             | ts                  |
+----+----------------------------------+---------------------+
| 42 | 81d74057d7be8f20563da404bb1b3ab0 | 2016-05-25 10:26:56 |
+----+----------------------------------+---------------------+

EXPLAIN SELECT * FROM test WHERE id = 42\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: test
   partitions: NULL
         type: const
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: const
         rows: 1
     filtered: 100.00
        Extra: NULL


by Shinguz at May 25, 2016 09:42 AM

May 24, 2016

Peter Zaitsev

Looking inside the MySQL 5.7 document store

In this blog, we’ll look at the MySQL 5.7 document store feature, and how it is implemented.

Document Store

MySQL 5.7.12 is a major new release, as it contains quite a number of new features:

  1. Document store and “MongoDB” like NoSQL interface to JSON storage
  2. Protocol X / X Plugin, which can be used for asynchronous queries (I will write about it as well)
  3. New MySQL shell

Peter already wrote the document store overview; in this post, I will look deeper into the document store implementation. In my next post, I will demonstrate how to use document store for Internet of Things (IoT) and event logging.

Earlier MySQL 5.7 versions already have a JSON data type and the ability to create virtual columns that can be indexed. The new document store feature is based on the JSON datatype.

So what is the document store anyway? It is an add-on to a normal MySQL table with a JSON field. Let’s take a deep dive into it and see how it works.

First of all: one can interface with the document store’s collections using the X Plugin (default port: 33060). To do that:

  1. Enable X Plugin (see the sketch after this list) and install MySQL shell.
  2. Login to a shell:
    mysqlsh --uri root@localhost
  3. Run commands (JavaScript mode, can be switched to SQL or Python):
    mysqlsh --uri root@localhost
    Creating an X Session to root@localhost:33060
    Enter password:
    No default schema selected.
    Welcome to MySQL Shell 1.0.3 Development Preview
    Copyright (c) 2016, Oracle and/or its affiliates. All rights reserved.
    Oracle is a registered trademark of Oracle Corporation and/or its
    affiliates. Other names may be trademarks of their respective
    owners.
    Type 'help', 'h' or '?' for help.
    Currently in JavaScript mode. Use sql to switch to SQL mode and execute queries.
    mysql-js> db = session.getSchema('world_x')
    <Schema:world_x>
    mysql-js> db.getCollections()
    {
        "CountryInfo": <Collection:CountryInfo>
    }
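For step 1, if the X Plugin is not already loaded, it can be enabled from a regular SQL session; a minimal sketch (the shared library is named mysqlx.so on Linux builds):

-- Load the X Plugin so the server starts listening on port 33060
INSTALL PLUGIN mysqlx SONAME 'mysqlx.so';

-- Confirm that the plugin is active
SELECT PLUGIN_NAME, PLUGIN_STATUS FROM INFORMATION_SCHEMA.PLUGINS WHERE PLUGIN_NAME = 'mysqlx';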

Now, how is the document store’s collection different from a normal table? To find out, I’ve connected to a normal MySQL shell:

mysql world_x
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 2396
Server version: 5.7.12 MySQL Community Server (GPL)
Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> show create table CountryInfo\G
*************************** 1. row ***************************
       Table: CountryInfo
Create Table: CREATE TABLE `CountryInfo` (
  `doc` json DEFAULT NULL,
  `_id` varchar(32) GENERATED ALWAYS AS (json_unquote(json_extract(`doc`,'$._id'))) STORED NOT NULL,
  PRIMARY KEY (`_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
1 row in set (0.00 sec)
mysql> show tables;
+-------------------+
| Tables_in_world_x |
+-------------------+
| City              |
| Country           |
| CountryInfo       |
| CountryLanguage   |
+-------------------+
4 rows in set (0.00 sec)

So a document store collection is actually an InnoDB table with a single JSON field (doc) plus a primary key (_id), which is a generated column.

As we can also see, there are four tables in the world_x database, but db.getCollections() only shows one. So how does MySQL distinguish between a “normal” table and a “document store” table? To find out, we can enable the general query log and see which query is being executed:

$ mysql -e 'set global general_log=1'
$ tail /var/log/general.log
2016-05-17T20:53:12.772114Z  186 Query  SELECT table_name, COUNT(table_name) c FROM information_schema.columns WHERE ((column_name = 'doc' and data_type = 'json') OR (column_name = '_id' and generation_expression = 'json_unquote(json_extract(`doc`,''$._id''))')) AND table_schema = 'world_x' GROUP BY table_name HAVING c = 2
2016-05-17T20:53:12.773834Z  186 Query  SHOW FULL TABLES FROM `world_x`

As you can see, every table that has a specific structure (doc JSON or specific generation_expression) is considered to be a JSON store. Now, how does MySQL translate the .find or .add constructs to actual MySQL queries? Let’s run a sample query:

mysql-js> db.getCollection("CountryInfo").find('Name= "United States"').limit(1)
[
    {
        "GNP": 8510700,
        "IndepYear": 1776,
        "Name": "United States",
        "_id": "USA",
        "demographics": {
            "LifeExpectancy": 77.0999984741211,
            "Population": 278357000
        },
        "geography": {
            "Continent": "North America",
            "Region": "North America",
            "SurfaceArea": 9363520
        },
        "government": {
            "GovernmentForm": "Federal Republic",
            "HeadOfState": "George W. Bush",
            "HeadOfState_title": "President"
        }
    }
]
1 document in set (0.02 sec)

and now look at the general query log again:

2016-05-17T21:02:21.213899Z  186 Query  SELECT doc FROM `world_x`.`CountryInfo` WHERE (JSON_EXTRACT(doc,'$.Name') = 'United States') LIMIT 1

We can verify that MySQL translates all document store commands to SQL. That also means that it is 100% transparent to the existing MySQL storage level and will work with other storage engines. Let’s verify that, just for fun:

mysql> alter table CountryInfo engine=MyISAM;
Query OK, 239 rows affected (0.06 sec)
Records: 239  Duplicates: 0  Warnings: 0
mysql-js> db.getCollection("CountryInfo").find('Name= "United States"').limit(1)
[
    {
        "GNP": 8510700,
        "IndepYear": 1776,
        "Name": "United States",
        "_id": "USA",
        "demographics": {
            "LifeExpectancy": 77.0999984741211,
            "Population": 278357000
        },
        "geography": {
            "Continent": "North America",
            "Region": "North America",
            "SurfaceArea": 9363520
        },
        "government": {
            "GovernmentForm": "Federal Republic",
            "HeadOfState": "George W. Bush",
            "HeadOfState_title": "President"
        }
    }
]
1 document in set (0.00 sec)
2016-05-17T21:09:21.074726Z 2399 Query  alter table CountryInfo engine=MyISAM
2016-05-17T21:09:41.037575Z 2399 Quit
2016-05-17T21:09:43.014209Z  186 Query  SELECT doc FROM `world_x`.`CountryInfo` WHERE (JSON_EXTRACT(doc,'$.Name') = 'United States') LIMIT 1

Worked fine!

Now, how about the performance? We can simply take the SQL query and run EXPLAIN:

mysql> explain SELECT doc FROM `world_x`.`CountryInfo` WHERE (JSON_EXTRACT(doc,'$.Name') = 'United States') LIMIT 1\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: CountryInfo
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 239
     filtered: 100.00
        Extra: Using where
1 row in set, 1 warning (0.00 sec)

Hmm, it looks like it is not using an index. That’s because there is no index on Name. Can we add one? Sure, we can add a virtual column and then index it:

mysql> alter table CountryInfo add column Name varchar(255)
    -> GENERATED ALWAYS AS (json_unquote(json_extract(`doc`,'$.Name'))) VIRTUAL;
Query OK, 0 rows affected (0.12 sec)
Records: 0  Duplicates: 0  Warnings: 0
mysql> alter table CountryInfo add key (Name);
Query OK, 0 rows affected (0.02 sec)
Records: 0  Duplicates: 0  Warnings: 0
mysql> explain SELECT doc FROM `world_x`.`CountryInfo` WHERE (JSON_EXTRACT(doc,'$.Name') = 'United States') LIMIT 1\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: CountryInfo
   partitions: NULL
         type: ref
possible_keys: name
          key: name
      key_len: 768
          ref: const
         rows: 1
     filtered: 100.00
        Extra: NULL
1 row in set, 1 warning (0.00 sec)

That is really cool! We have added an index, and now the original query starts using it. Note that we do not have to reference the new field: the MySQL optimizer is smart enough to translate (JSON_EXTRACT(doc,'$.Name') = 'United States') to an index scan on the virtual column.

But please note: JSON attributes are case-sensitive. If you use (doc,'$.name') instead of (doc,'$.Name'), it will not generate an error, but will simply break the search: all queries looking for “Name” will return 0 rows.
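A quick way to see the effect (an illustrative sketch against the same world_x data) is to compare both paths directly in SQL:

-- Correct case: returns the country name
SELECT JSON_EXTRACT(doc, '$.Name') FROM world_x.CountryInfo WHERE _id = 'USA';

-- Wrong case: no error, just NULL, so any search on '$.name' matches nothing
SELECT JSON_EXTRACT(doc, '$.name') FROM world_x.CountryInfo WHERE _id = 'USA';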

Finally, if you looked closely at the output of db.getCollection("CountryInfo").find('Name= "United States"').limit(1), you may have noticed that the database has outdated info:

"government": {
            "GovernmentForm": "Federal Republic",
            "HeadOfState": "George W. Bush",
            "HeadOfState_title": "President"
        }

Let’s change “George W. Bush” to “Barack Obama” using the .modify clause:

mysql-js> db.CountryInfo.modify("Name = 'United States'").set("government.HeadOfState", "Barack Obama" );
Query OK, 1 item affected (0.02 sec)
mysql-js> db.CountryInfo.find('Name= "United States"')
[
    {
        "GNP": 8510700,
        "IndepYear": 1776,
        "Name": "United States",
        "_id": "USA",
        "demographics": {
            "LifeExpectancy": 77.0999984741211,
            "Population": 278357000
        },
        "geography": {
            "Continent": "North America",
            "Region": "North America",
            "SurfaceArea": 9363520
        },
        "government": {
            "GovernmentForm": "Federal Republic",
            "HeadOfState": "Barack Obama",
            "HeadOfState_title": "President"
        }
    }
]
1 document in set (0.00 sec)

Conclusion

Document store is an interesting concept and a good add-on on top of the existing MySQL JSON feature. Using the new .find/.add/.modify methods instead of the original SQL statements can be convenient in some cases.

Some might ask, “why do you want to use document store and store information in JSON inside the database if it is relational anyway?” Storing data in JSON can be quite useful in some cases, for example:

  • You already have JSON (e.g., from external feeds) and need to store it anyway. Using the JSON datatype will be more convenient and more efficient.
  • You have a flexible schema, typical for the Internet of Things, where some sensors might only send temperature data while others send temperature/humidity/light (with light information only recorded during the day), and so on. Storing it in the JSON format can be more convenient, so that you do not have to declare all possible fields in advance, and do not have to run “alter table” if a new sensor starts sending new types of data (see the sketch after this list).
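As a teaser for the IoT use case, here is a minimal sketch of such a flexible-schema table (the sensor_data table and its fields are hypothetical, not part of the world_x sample):

-- One JSON document per measurement; each sensor can send any set of fields
CREATE TABLE sensor_data (
  id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  doc JSON,
  -- materialize a frequently filtered attribute as an indexed virtual column
  sensor_id INT GENERATED ALWAYS AS (JSON_EXTRACT(doc, '$.sensor_id')) VIRTUAL,
  KEY (sensor_id)
) ENGINE=InnoDB;

INSERT INTO sensor_data (doc) VALUES
  ('{"sensor_id": 1, "temperature": 22.4}'),
  ('{"sensor_id": 2, "temperature": 21.1, "humidity": 60, "light": 540}');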

In the next two blog posts, I will show how to use document store for Internet of Things / event streaming, and how to use X Protocol for asynchronous queries in MySQL.

by Alexander Rubin at May 24, 2016 10:36 PM

pt-online-schema-change (if misused) can’t save the day

In this blog post we’ll discuss pt-online-schema-change, and how to correctly use it.

Always use pt-osc?

Altering large tables can still be a problematic DBA task, even now with the improved Online DDL features in MySQL 5.6 and 5.7. Some ALTER types are still not online, or are sometimes just too expensive to execute on a busy production master.

So in some cases, we may want to apply an ALTER first on slaves, taking them out of the traffic pool one by one and bringing them back after the ALTER is done. In the end, we can promote one of the already-altered slaves to be the new master, so that the downtime/maintenance time is greatly minimized. The ex-master can be altered later, without affecting production. Of course, this method works best when the schema change is backwards-compatible.

So far so good, but there is another problem. Let’s say the table is huge, and the ALTER takes a lot of time on the slave. When it is a DML-blocking type of ALTER (perhaps when using MySQL 5.5.x or older, etc.), there will be long slave lag (if the table is being written to by the replication SQL thread at the same time, for example). So what do we do to speed up the process and avoid the altered slave’s lag? One tempting idea is: why not use pt-online-schema-change on the slave, which can do the ALTER in a non-blocking fashion?

Let’s see how that would work. I need to rebuild a big table on a slave running MySQL 5.6.16 (the “null alter” became online only in 5.6.17) to reclaim disk space after some rows were deleted.

This example demonstrates the process (db1 is the master, db2 is the slave):

[root@db2 ~]# pt-online-schema-change --execute --alter "engine=innodb" D=db1,t=sbtest1
 No slaves found.  See --recursion-method if host db2 has slaves.
 Not checking slave lag because no slaves were found and --check-slave-lag was not specified.
 Operation, tries, wait:
 analyze_table, 10, 1
 copy_rows, 10, 0.25
 create_triggers, 10, 1
 drop_triggers, 10, 1
 swap_tables, 10, 1
 update_foreign_keys, 10, 1
 Altering `db1`.`sbtest1`...
 Creating new table...
 Created new table db1._sbtest1_new OK.
 Altering new table...
 Altered `db1`.`_sbtest1_new` OK.
 2016-05-16T10:50:50 Creating triggers...
 2016-05-16T10:50:50 Created triggers OK.
 2016-05-16T10:50:50 Copying approximately 591840 rows...
 Copying `db1`.`sbtest1`:  51% 00:28 remain
(...)

The tool is still working during the operation, and the table receives some writes on the master:

db1 {root} (db1) > update db1.sbtest1 set k=k+2 where id<100;
 Query OK, 99 rows affected (0.06 sec)
 Rows matched: 99  Changed: 99  Warnings: 0
db1 {root} (db1) > update db1.sbtest1 set k=k+2 where id<100;
 Query OK, 99 rows affected (0.05 sec)
 Rows matched: 99  Changed: 99  Warnings: 0

which are applied on the slave right away, as the table allows writes all the time.

(...)
 Copying `db1`.`sbtest1`:  97% 00:01 remain
 2016-05-16T10:51:53 Copied rows OK.
 2016-05-16T10:51:53 Analyzing new table...
 2016-05-16T10:51:53 Swapping tables...
 2016-05-16T10:51:53 Swapped original and new tables OK.
 2016-05-16T10:51:53 Dropping old table...
 2016-05-16T10:51:53 Dropped old table `db1`.`_sbtest1_old` OK.
 2016-05-16T10:51:53 Dropping triggers...
 2016-05-16T10:51:53 Dropped triggers OK.
 Successfully altered `db1`.`sbtest1`.

Done! No slave lag, and the table is rebuilt. But . . . let’s just make sure data is consistent between the master and slave (you can use pt-table-checksum):

db1 {root} (db1) > select max(k) from db1.sbtest1 where id<100;
 +--------+
 | max(k) |
 +--------+
 | 392590 |
 +--------+
 1 row in set (0.00 sec)
db2 {root} (test) > select max(k) from db1.sbtest1 where id<100;
 +--------+
 | max(k) |
 +--------+
 | 392586 |
 +--------+
 1 row in set (0.00 sec)

No, it is not! The slave is clearly missing the updates that happened during a pt-osc run. Why?

The explanation is simple: pt-online-schema-change relies on triggers. The triggers are used to make the writes happening to the original table also propagate to the temporary table copy, so that both tables are consistent when the final table switch happens at the end of the process. So what is the problem here? It’s the binary log format: in ROW-based replication, the triggers are not fired on the slave! And my master is running in ROW mode:

db1 {root} (db1) > show variables like 'binlog_format';
 +---------------+-------+
 | Variable_name | Value |
 +---------------+-------+
 | binlog_format | ROW |
 +---------------+-------+
 1 row in set (0.01 sec)

So, if I used pt-online-schema-change on the master, the data inconsistency problem doesn’t happen. But using it on the slave is just dangerous!

Conclusion

Whenever you use pt-online-schema-change, make sure you are not executing it on a slave instance. For that reason, I escalated this bug report: https://bugs.launchpad.net/percona-toolkit/+bug/1221372. Also, in many cases, using a normal ALTER will work well enough. As in my example, to rebuild the table separately on each slave in lockless mode, I would just need to upgrade to a more recent 5.6 version.
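For reference, on 5.6.17+ the “null alter” can be asked to run online explicitly, so the statement fails instead of silently blocking DML if the online path is not available (a sketch, not from the original post):

-- Rebuild the table in place; MySQL refuses to run it if it would block DML
ALTER TABLE db1.sbtest1 ENGINE=InnoDB, ALGORITHM=INPLACE, LOCK=NONE;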

BTW, if you’re wondering about Galera replication (used in Percona XtraDB Cluster, etc.): although it also uses a ROW-based format, it’s not a problem. The pt-osc triggers are created on all nodes thanks to the synchronous, write-anywhere nature of the replication. It does not matter which node you start pt-online-schema-change on, or which other nodes your applications write to at the same time. No slaves, no problem! 🙂

by Przemysław Malkowski at May 24, 2016 06:27 PM

Webinar Thursday May 26: Troubleshooting MySQL hardware resource usage

Join Sveta on Thursday, May 26, 2016, at 10 am PDT (UTC-7) for her webinar Troubleshooting MySQL hardware resource usage.

MySQL does not just run on its own. It stores data on disk, and stores data and temporary results in memory. It uses CPU resources to perform operations, and a network to communicate with its clients.

In this webinar, we’ll discuss common resource usage issues, how they affect MySQL Server performance, and methods to find out how resources are being used. We will employ both OS-level tools, and new features in Performance Schema that provide detailed information on what exactly is happening inside MySQL Server.

Register for the webinar here.

Sveta Smirnova, Principal Technical Services Engineer

Sveta joined Percona in 2015.

Her main professional interests are problem solving, working with tricky issues and bugs, finding patterns that can solve typical issues quicker, and teaching others how to deal with MySQL issues, bugs and gotchas effectively. Before joining Percona, Sveta worked as a Support Engineer in the MySQL Bugs Analysis Support Group at MySQL AB / Sun / Oracle.

She is the author of the book MySQL Troubleshooting and of the JSON UDF functions for MySQL.

by Dave Avery at May 24, 2016 03:46 PM

Jean-Jerome Schmidt

Join us for our European Polyglot Persistence Meetups Tour this summer!

We’ve been gearing up for this in the past months and we’re delighted to announce the first dates of our European Polyglot Persistence Meetups this summer. We’re starting off with Amsterdam, then moving on to Dublin, Paris, Berlin, Stockholm and London.

Sign up for a Polyglot Persistence Meetup

Some of you may ask what Polyglot Persistence is all about …

Polyglot Persistence means that when storing data, it is best to use multiple data storage technologies, chosen based upon the way data is being used by the application. In a world where developers reign, these developers would choose the optimal programming language for the job - and this trickles down to the data tier. The former is often referred to as polyglot languages and the latter as polyglot persistence. However, this freedom comes at a cost - more complexity.

In our world of 9s, we like to look at the database aspects of DevOps, and this is also where we’re focussing with this new meetup series.

Initially, the idea is to cover MySQL, PostgreSQL and MongoDB storage backends with the topics of deployment, scaling, configuration, management, backups and monitoring. And with time and as members suggest topics, the scope is likely to broaden.

Sign up for a Polyglot Persistence Meetup

Our meetups schedule so far:

  • 06 June - Amsterdam
  • 08 June - Dublin
  • 16 June - Paris
  • 17 June - Berlin
  • Stockholm tbc
  • London tbc

Please sign up for the meetup of your choice, as we’re announcing locations, speakers etc. via the Meetup platform.

For our kick-off in Amsterdam, we’re lucky to be sponsored by booking.com and we’ll be hosting the meetup in their offices.

We look forward to seeing you there and to the polyglot discussions ahead!

by Severalnines at May 24, 2016 02:30 PM

May 23, 2016

Peter Zaitsev

Take Percona’s one-click high availability poll

Wondering what high availability (HA) solutions are most popular? Take our high availability poll below!

HA is always a hot topic. The reality is that if your data is not available, your customers cannot do business with you. In fact, estimates show the average cost of downtime is about $5K per minute. With an average outage taking 40 minutes to correct, you could be looking at a potential cost of $200K if your MySQL instance goes down. Whether your database is on premise, or in public or private clouds, it is critical that your database deployment does not have a potentially devastating single point of failure.

Please take a few seconds and answer the following poll. It will help the community get an idea of how companies are approaching HA in their critical database environments.

If you’re using other solutions or have specific issues, feel free to comment below. We’ll post a follow-up blog with the results!

Note: There is a poll embedded within this post, please visit the site to participate in this post's poll.

by Dave Avery at May 23, 2016 08:47 PM

Percona disabling TLSv1.0 May 31st 2016

As of May 31st, 2016, we will be disabling TLSv1.0 support on www.percona.com, repo.percona.com, etc.

This is ahead of the PCI changes that deprecate the TLSv1.0 protocol as of June 30th, 2016 (PDF).

What does this mean for you the user?

Based on analysis of our IDS logs, this will affect around 6.32% of requests. As of May 31st, such requests will present an error when trying to negotiate a TLS connection.

Users are advised to update their clients accordingly. SSLabs provides a good test for browsers, though this does not support command line tools. Going forward, we will only support TLSv1.1 and TLSv1.2.

These changes come a little over a year from our previous SSL overhaul, and are part of our ongoing effort to ensure the security of our users.

Thank you for your time. Please leave any questions in the comments section, or email us at security(at)percona.com.

 

 

by David Busby at May 23, 2016 06:09 PM

Percona XtraBackup 2.4.3 is now available

Percona is glad to announce the GA release of Percona XtraBackup 2.4.3 on May 23rd, 2016. Downloads are available from our download site and from apt and yum repositories.

Percona XtraBackup enables MySQL backups without blocking user queries, making it ideal for companies with large data sets and mission-critical applications that cannot tolerate long periods of downtime. Offered free as an open source solution, Percona XtraBackup drives down backup costs while providing unique features for MySQL backups.

New Features:

  • Percona XtraBackup has implemented the new --reencrypt-for-server-id option. Using this option allows users to start the server instance with a different server_id from the one the encrypted backup was taken from, such as a replication slave or a Galera node. When this option is used, xtrabackup will, as a prepare step, generate a new master key with an ID based on the new server_id, store it in the keyring file, and re-encrypt the tablespace keys inside the tablespace headers.

Bugs Fixed:

  • Running DDL statements on Percona Server 5.7 during the backup process could in some cases lead to failure while preparing the backup. Bug fixed #1555626.
  • MySQL 5.7 can sometimes skip redo logging when creating an index. If such ALTER TABLE is being issued during the backup, the backup would be inconsistent. xtrabackup will now abort with an error message if such ALTER TABLE has been done during the backup. Bug fixed #1582345.
  • .ibd files for remote tablespaces were not copied back to the original location pointed by the .isl files. Bug fixed #1555423.
  • When called with insufficient parameters, like specifying the empty --defaults-file option, Percona XtraBackup could crash. Bug fixed #1566228.
  • The documentation states that the default value for --ftwrl-wait-query-type is all, however it was actually update. The default value has been changed to reflect the documentation. Bug fixed #1566315.
  • When the --keyring-file-data option was specified but no keyring file was found, xtrabackup would create an empty one instead of reporting an error. Bug fixed #1578607.
  • If ALTER INSTANCE ROTATE INNODB MASTER KEY was run at the same time that xtrabackup --backup was bootstrapping, it could catch a moment when the key was not yet written into the keyring file, and xtrabackup would overwrite the keyring with the old copy of the keyring, so the new key would be lost. Bug fixed #1582601.
  • The output of the --slave-info option was missing an apostrophe. Bug fixed #1573371.

Release notes with all the bugfixes for Percona XtraBackup 2.4.3 are available in our online documentation. Bugs can be reported on the launchpad bug tracker.

by Hrvoje Matijakovic at May 23, 2016 01:56 PM

Jean-Jerome Schmidt

MySQL slave promotion with and without GTID

MySQL Replication can be fragile: whenever it encounters a connectivity error, it will retry, and if it is a serious error, it will simply stop. Obviously, in the latter case, you will need to repair the broken replication yourself.

The most common problems with replication are when it stops due to either master failure or network problems. In case the master fails, the whole topology becomes read-only and this means the write queries cannot be applied anymore. This is where normally you would promote one of the replicas to become the new master. To illustrate the difference in promotion between GTID and non-GTID cases, we will go through the manual promotion process below.

Most advanced slave without GTID

The first step in this promotion is to find the most advanced slave. As the master is no longer available, not all replicas may have copied and applied the same amount of transactions. So it is key to find the most advanced slave first.

We first iterate through all replicas to see which one is the furthest in the last binary log and elect this host to become the new master.

SHOW SLAVE STATUS\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 10.10.12.11
                  Master_User: repluser
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: binlog.000003
          Read_Master_Log_Pos: 1447420
...
          Exec_Master_Log_Pos: 1447420

The next step is to advance the other replicas to the latest transactions on the candidate master. As the replicas are logging their slave updates in their own binary logs, they have a different numbering for their own transactions and thus it is very difficult to match this data. An automated tool like MySQL HA Master (MHA) is capable of doing this, so when you are failing over by hand, you generally scan through the binary logs or skip these transactions.

Once we have done this, we tell the replicas to start replicating from the designated points of the new master.

CHANGE MASTER TO 
  MASTER_HOST = 'new.master',
  MASTER_PORT = 3306,
  MASTER_USER = 'repl',
  MASTER_PASSWORD = 'repl',
  MASTER_LOG_FILE = 'binlog.000002',
  MASTER_LOG_POS = 1446089;

Most advanced slave with GTID

By far the greatest benefit of using GTIDs within replication is that within the replication topology, all we have to do is to find the most advanced slave, promote it to master, and point the others to this new master.

So the most advanced slave is the same as without GTID:

*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 10.10.12.14
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: binlog.000003
          Read_Master_Log_Pos: 1590
...
          Exec_Master_Log_Pos: 1590

Finding out which part of the binary logs the other hosts are missing is not necessary, as the new master’s binary logs already contain transactions with the GTIDs of the dead master and thus the slaves can realign with the new master automatically. This applies to both MariaDB and MySQL implementations of GTID.

CHANGE MASTER TO 
  MASTER_HOST = 'new.master',
  MASTER_PORT = 3306,
  MASTER_USER = 'repl',
  MASTER_PASSWORD = 'repl',
  MASTER_AUTO_POSITION = 1;

As you can see this is a far more reliable way of promoting a slave to a master without the chance of loss of transactions. Therefore GTID failover is the preferred way in ClusterControl.

To learn more about MySQL Replication and failover, check out our MySQL Replication Blueprint eBook  for a more comprehensive walkthrough. Apart from failover, it covers installation and configuration best practices, what you should monitor, how to make topology changes and repair broken setups, backup strategies, schema changes and load balancing with read-write splitting.

by Severalnines at May 23, 2016 11:56 AM

May 20, 2016

Peter Zaitsev

Introduction to Troubleshooting Performance – Troubleshooting Slow Queries webinar: Q & A

In this blog, I will provide answers to the Q & A for the Troubleshooting Slow Queries webinar.

First, I want to thank you for attending the April 28 webinar. The recording and slides for the webinar are available here. Below is the list of your questions that I wasn’t able to answer during the webinar, with responses:

Q: I’ve heard that it is a bad idea to use SELECT *; what do you recommend?

A: When I used SELECT * in my slides, I wanted to underline the idea that sometimes you need to select all columns from the table. There is nothing bad about it if you need them. SELECT * is bad when you need only a few columns from the table: in this case, you retrieve more data than needed, which affects performance. Another issue that SELECT * can cause is if you hard-code the statement into your application and then change the table definition: the application could start retrieving columns in the wrong order and output (e.g., email instead of billing address), or even worse, it will try to access a non-existent index in the result set array. The best practice is to explicitly enumerate all columns that your application needs.

Q: I heard that using index_field length will affect the indexing principle during query execution (e.g., one field is varchar and its length is not fixed, some values have short text, some values have long text). If we use this field as an index, what happens?

A: I assume you are asking about the ability to create an index with a length smaller than the column length? It works as follows:

Assume you have a TEXT field which contains these user questions:

  1. I’ve heard that is a bad idea to use select * what do you recommend?
  2. I heard that using index_field length will affect the indexing principle during query execution (e.g., one field is varchar and its length is not fixed, some values have short text, some values have long text. at this time). If we use this field as indexing, what happens?
  3. …

Since this is a TEXT field, you cannot create an index on it without specifying its length, so you need to make the index as small as possible while still uniquely identifying the questions. If you create an index with length 10 it will contain:

  1. I’ve heard
  2. I heard th

You would index only those parts of the questions that are not very distinct from each other, and that do not contain useful information about what the question is. You could instead create an index of length 255:

  1. I’ve heard that is a bad idea to use select * what do you recommend?
  2. I heard that using index_field length will affect the indexing principle during query execution (e.g., one field is varchar and its length is not fixed, some values have short text, some values have long text. at this time). If we use this field as index

In this case, the index includes the whole first question and almost all of the second question. This makes the index too large and requires more disk space (which causes more IO). Also, the indexed part of the second question is probably more information than we need.

If we make an index of length 75, we will have:

  1. I’ve heard that is a bad idea to use select * what do you recommend?
  2. I heard that using index_field length will affect the indexing principle du

This is more than enough for the first question and gives a good idea of what is in the second question. It will also potentially have enough unique entries to make its cardinality look more like the cardinality of the real data distribution.

To conclude: choosing the correct index length is something that requires practice and analysis of your actual data. Try to make prefixes as short as possible, but long enough that the number of unique entries in the index will be similar to the number of unique entries in the table.
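For completeness, a prefix index like the one discussed above is created by giving the length in parentheses, and you can compare prefix cardinality against full-column cardinality to judge the length (the questions table and question_text column here are hypothetical):

-- Index only the first 75 characters of the TEXT column
ALTER TABLE questions ADD INDEX idx_question (question_text(75));

-- Compare prefix cardinality with full-column cardinality
SELECT COUNT(DISTINCT LEFT(question_text, 75)) AS prefix_75,
       COUNT(DISTINCT question_text) AS full_column
FROM questions;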

Q: Which view can we query to see stats?

A: Do you mean index statistics? SHOW INDEX FROM table_name will do it.

Q: We have an InnoDB table with 47 fields (mostly text); some are ft-indexed. I tried to do an alter table, and it ran for 24 hours. What is the best way to run an alter table to add one extra field? The table has 1.9 M rows and 47 columns with many indexes.

A: Adding a column requires a table copy. Therefore, the speed of this operation depends on the table size and the speed of your disk. If you are using version 5.6 or later, adding a column does not block parallel queries (and is therefore not a big deal). If you are using an older version, you can always use the pt-online-schema-change utility from Percona Toolkit. However, it will run even more slowly than a regular ALTER TABLE. Unfortunately, you cannot speed up the execution of ALTER TABLE much. The only thing that you can do is use a faster disk (with MySQL options tuned to exploit the speed of the disk).

However, if you do not want this increased IO to affect the production server, you can alter the table on a separate instance, then copy the tablespace to production, and then apply all changes to the original table from the binary logs. The steps will be something like:

  1. Ensure you use the innodb_file_per_table option and the big table has an individual tablespace
  2. Ensure that the binary log is enabled
  3. Start a new server (you can also use an existing stand-by slave)
  4. Disable writes to the table
  5. Record the binary log position
  6. Copy the tablespace to the new server as described here
  7. Enable writes on the production server
  8. Run ALTER TABLE on the new server you created in step 3 (it will still take 24 hours)
  9. Stop writes to the table on the production server
  10. Copy back the tablespace altered in step 8
  11. Apply all writes to this table that are in the binary logs after the position recorded in step 5
  12. Enable writes to the table

This scenario will take even more time overall, but will have minimal impact on the production server.
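Steps 6 and 10 rely on InnoDB transportable tablespaces; a minimal sketch of the copy itself (the table name big_table is hypothetical, and the file copy between servers happens outside of SQL):

-- On the source server: flush changes and keep the table locked while copying
FLUSH TABLES big_table FOR EXPORT;
-- ... copy big_table.ibd and big_table.cfg from the datadir to the other server ...
UNLOCK TABLES;

-- On the destination server: swap in the copied tablespace
ALTER TABLE big_table DISCARD TABLESPACE;
-- ... move the copied .ibd and .cfg files into the destination datadir ...
ALTER TABLE big_table IMPORT TABLESPACE;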

Q: If there is a compound index like index1(emp_id, date), will the following query be able to use the index? “select * from table1 where emp_id = 10”

A: Yes, it should: emp_id is the leftmost column of the composite index, so the optimizer can use index1 for this query.
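You can double-check on your own data with EXPLAIN; the composite index is usable for the leftmost column, but not for the second column alone (table and index names are taken from the question):

-- index1(emp_id, date) covers a filter on emp_id alone
EXPLAIN SELECT * FROM table1 WHERE emp_id = 10;

-- It cannot be used efficiently for a filter on `date` alone
EXPLAIN SELECT * FROM table1 WHERE `date` = '2016-05-20';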

Q: Are filesort and temporary in the extended info for EXPLAIN not good?

A: Regarding filesort: it depends. For example, you will always see “filesort” for tables which perform an ORDER BY and cannot use an index for the ORDER BY. This is not always bad. For example, in this query:

mysql> explain select emp_no, first_name from employees where emp_no <20000 order by first_name\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: employees
   partitions: NULL
         type: range
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: NULL
         rows: 18722
     filtered: 100.00
        Extra: Using where; Using filesort
1 row in set, 1 warning (0.01 sec)

the primary key was used to resolve rows, and the filesort was necessary and not avoidable. You can read about the different filesort algorithms here.

Regarding Using temporary: this means that a temporary table will be created during query execution. This can be bad, especially if the temporary table is large and cannot fit into memory: in that case, it will be written to disk and slow down operations. But, again, sometimes creating temporary tables is not avoidable, for example, if you have both GROUP BY and ORDER BY clauses which list columns differently, as stated in the user manual.

Q: Is key_len length more of a good thing for query execution?

A: The key_len field is not NULL for all queries that use an index, and just shows the length of the key part used. It is not good or bad; it is just for information. You can use this information, for example, to identify which part of a combined index is used to resolve the query.

Q: Does an alter query go for an optimizer check?

A: No. You can check this either by enabling the optimizer trace, running the ALTER, and finding that the trace is empty, or by enabling the debug option and searching the resulting trace for optimize.
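A minimal sketch of that optimizer-trace check (the ALTER here is just an example statement):

SET optimizer_trace='enabled=on';
ALTER TABLE employees ADD INDEX (first_name);

-- The trace stays empty for the ALTER, unlike for SELECT statements
SELECT QUERY, TRACE FROM INFORMATION_SCHEMA.OPTIMIZER_TRACE;

SET optimizer_trace='enabled=off';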

Q: A query involves four columns that are all individually covered by an index. The optimizer didn’t merge the indexes because of cost, and didn’t even choose the composite index I created.

A: This depends on the table definition and query you used. I cannot provide a more detailed answer based only on this information.

Q cont.: Finally, only certain composite indexes were suitable; the column order in the composite index mattered a lot. Why couldn’t the optimizer merge the four individual single-column indexes, and why did the order of the columns in the composite index matter?

A: Regarding why the optimizer could not merge the four indexes, I would need to see how the table is defined and which data is in these indexed columns. Regarding why the order of the columns in the composite index matters, it depends on the query. While the optimizer can use an index on, say, (col1, col2) for both the conditions col1=X AND col2=Y and col2=Y AND col1=X, in the case when you use OR the order is important. For example, for the condition col1=X OR col2=Y, the part col1=X is always evaluated, and the part col2=Y is evaluated only when col1=X is false. The same logic applies to queries like SELECT col1 WHERE col2=Y ORDER BY col3. See the user manual for details.

Q: When I try to obtain the optimizer trace on the console, the result is cut off, even if I redirect the output to a file. How can I overcome that?

A: Which version of MySQL Server do you use? The TRACE column is defined as longtext NOT NULL, and should not cause such issues. If it does with a newer version, report a bug at http://bugs.mysql.com/.

Q: Are there any free graphical visualizers for either EXPLAIN or the optimizer TRACE output?

A: There is a graphical visualizer for EXPLAIN in MySQL Workbench, but it works with online data only: you cannot run it on EXPLAIN output saved into a file. I don’t know of any visualizer for the optimizer TRACE output. However, since it is JSON, you can simply save it to a file and open it in a web browser. This gives a better view than opening it in a simple text editor.

Q: When do you use FORCE INDEX instead of USE INDEX hints?

A: According to the user manual, the “USE INDEX (index_list) hint tells MySQL to use only one of the named indexes to find rows in the table,” and the “FORCE INDEX hint acts like USE INDEX (index_list), with the addition that a table scan is assumed to be very expensive . . . a table scan is used only if there is no way to use one of the named indexes to find rows in the table.” This means that when you use USE INDEX, you are giving the optimizer a hint to prefer a particular index over others, but not enforcing index usage if the optimizer prefers a table scan, while FORCE INDEX requires using the index. I myself use only the FORCE and IGNORE index hints.
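For illustration, using the employees table from the earlier example (both hints name the same index; only how strictly the optimizer has to follow it differs):

-- Prefer the primary key, but still allow a table scan if it looks cheaper
SELECT emp_no, first_name FROM employees USE INDEX (PRIMARY) WHERE emp_no < 20000;

-- Use the primary key unless it is impossible; a table scan is a last resort
SELECT emp_no, first_name FROM employees FORCE INDEX (PRIMARY) WHERE emp_no < 20000;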

Q: Very informative session. I missed the beginning part. Are you going to distribute the recorded session later?

A: Yes. As usual, the slides and recording are available here.

by Sveta Smirnova at May 20, 2016 08:50 PM

Percona XtraDB Cluster 5.6.29-25.15 is now available

Percona is glad to announce the new release of Percona XtraDB Cluster 5.6 on May 20, 2016. Binaries are available from the downloads area or our software repositories.

Percona XtraDB Cluster 5.6.29-25.15 is now the current release, based on the following:

All of Percona’s software is open source and free, and all the details of the release can be found in the 5.6.29-25.15 milestone at Launchpad.

For more information about relevant Codership releases, see this announcement.

Bugs Fixed:

  • Node eviction in the middle of SST now causes the node to shut down properly.
  • After an error during node startup, the state is now marked unsafe only if SST is required.
  • Fixed data inconsistency during a multi-insert auto-increment workload on an async master with binlog-format=STATEMENT when a node begins as an async slave with wsrep_auto_increment_control=ON.
  • Fixed a crash when a prepared statement is aborted (due to a conflict with an applier) and then replayed.
  • Removed a special case condition in wsrep_recover() that would not happen under normal conditions.
  • Percona XtraDB Cluster no longer fails during SST, if a node reserves a very large amount of memory for InnoDB buffer pool.
  • If the value of wsrep_cluster_address is not valid, trying to create a slave thread will now generate a warning instead of an error, and the thread will not be created.
  • Fixed error with loading data infile (LDI) into a multi-partitioned table.
  • The wsrep_node_name variable now defaults to host name.
  • Starting mysqld with unknown option now fails with a clear error message, instead of randomly crashing.
  • Optimized the operation of SST and IST when a node fails during startup.
  • The wsrep_desync variable can now be enabled only after a node is synced with the cluster. That is, it cannot be set during node bootup configuration.
  • Fixed crash when setting a high flow control limit (fc_limit) and the recv queue fills up.
  • Only the default 16 KB page size (innodb_page_size=16384) is accepted until the relevant upstream bug is fixed by Codership (see https://github.com/codership/galera/issues/398). All other sizes will report Invalid page size and shut down (the server will not start up).
  • If a node is executing RSU/FTWRL, explicit desync of the node will not happen until the implicit desync action is complete.
  • Fixed multiple bugs in the test suite to improve quality assurance.

Help us improve our software quality by reporting any bugs you encounter using our bug tracking system. As always, thanks for your continued support of Percona!

by Alexey Zhebel at May 20, 2016 02:06 PM

Jean-Jerome Schmidt

Become a ClusterControl DBA: Operational Reports for MySQL and MariaDB

The majority of DBAs perform health checks every now and then. Usually, it would happen on a daily or weekly basis. We previously discussed why such checks are important and what they should include.

To make sure your systems are in a good shape, you’d need to go through quite a lot of information - host statistics, MySQL statistics, state of backups, logs and so forth. Such data should be available in every properly monitored environment, although sometimes it is scattered across multiple locations - you may have one tool to monitor MySQL state, another tool to collect system statistics, maybe a set of scripts, e.g., to check the state of your backups. This makes health checks much more time-consuming than they should be - the DBA has to put together the different pieces to understand the state of the system.

Integrated tools like ClusterControl have an advantage that all of the bits are located in the same place (or in the same application). It still does not mean they are located next to each other - they may be located in different sections of the UI and a DBA may have to spend some time clicking through the UI to reach all the interesting data. This is why we introduced operational reports in ClusterControl 1.3 - which you can discover live next Tuesday during our release webinar.

The whole idea behind creating Operational Reports is to put all of the most important data into a single document, which can be quickly reviewed to get an understanding of the state of the databases.

See the Operational Reports live in action during our release webinar

Operational Reports are available from the menu Settings -> Operational Reports.

Once you go there, you’ll be presented with a list of reports created manually or automatically, based on a pre-defined schedule.

If you want to create a new report manually, you’ll use the ‘Create’ option. Pick the cluster, type of report, email recipients, and you’re pretty much done.

The reports can also be scheduled to be created on a regular basis.

At this time, three types of reports are available, and we’ll show examples of these below.

Availability report

The availability report focuses on, well, availability. It includes three sections. First, the availability summary.

You can see information about availability statistics of your databases, the cluster type, total uptime and downtime, current state of the cluster and when that state last changed.

Another section gives more details on availability.

We can see when a node switched state and what the transition was. It’s a nice place to check if there were any recent problems with the cluster.

Similar data is shown in the third section of this report, where you can go through the history of changes in cluster state.

Backup report

The second type of the report is one covering backups.

It contains two sections and basically gives you a short summary of when the last backup was created and whether it completed successfully or failed. You can also check the list of backups executed on the cluster, with their state, type and size. This is as close as you can get to being certain that backups work correctly without running a full recovery test. We definitely recommend that such tests are performed every now and then.

Default cluster report

This type of report contains detailed information about a particular cluster. It starts with a summary of different alerts which are related to the cluster.

The next section is about the state of the nodes that are part of the cluster.

You have a list of the nodes in the cluster, their type, role (master or slave), status of the node, uptime and the OS.

Another section of the report is the backup summary, the same as discussed above. The next one presents a summary of the top queries in the cluster.

Finally, we see a “Node status overview” in which you’ll be presented with graphs related to OS and MySQL metrics for each node.

As you can see, we have graphs covering all aspects of the load on the host - CPU, memory, network, disk, CPU load and free disk space. This is enough to get an idea of whether anything weird happened recently. You can also see some details about the MySQL workload - how many queries were executed, of which type, and how the data was accessed (via which handler). This, in turn, should be enough to spot most issues on the MySQL side. What you want to look for are spikes and dips that you haven’t seen in the past. Maybe a new query was added to the mix and, as a result, handler_read_rnd_next skyrocketed? Maybe an increase in CPU load and a high number of connections point to increased load on MySQL, or to some kind of contention. An unexpected pattern is worth investigating, so that you know what is going on.

See the Operational Reports live in action during our release webinar

This is the first release of this feature, we’ll be working on it to make it more flexible and even more useful. We’d love to hear your feedback on what you’d like to have included in the report, what’s missing and what is not needed.

by Severalnines at May 20, 2016 09:59 AM

May 19, 2016

Peter Zaitsev

Fixing MySQL scalability problems with ProxySQL or thread pool

Fixing MySQL scalability problems

In this blog post, we’ll discuss fixing MySQL scalability problems using either ProxySQL or thread pool.

In the previous post I showed that even MySQL 5.7 in read-write workloads is not able to maintain throughput. Oracle’s recommendation to play black magic with innodb_thread_concurrency and innodb_spin_wait_delay doesn’t always help. We need a different solution to deal with this scaling problem.

All the conditions are the same as in my previous run, but I will use:

  • ProxySQL limited to 200 connections to MySQL. ProxySQL has the capability to multiplex incoming connections; with this setting, even with 1000 connections to the proxy it will maintain only 200 connections to MySQL.
  • Percona Server with the thread pool enabled and a thread pool size of 64 (a configuration sketch for both options follows this list)
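Here is a rough sketch of how these two setups might be configured; the backend address and the admin credentials are placeholders rather than the exact values used in this benchmark:

# ProxySQL: cap backend connections to MySQL at 200 via the admin interface.
mysql -u admin -padmin -h 127.0.0.1 -P 6032 -e "
  UPDATE mysql_servers SET max_connections=200 WHERE hostname='10.0.0.10' AND port=3306;
  LOAD MYSQL SERVERS TO RUNTIME; SAVE MYSQL SERVERS TO DISK;"

# Percona Server: enable the thread pool in my.cnf (requires a server restart).
#   [mysqld]
#   thread_handling = pool-of-threads
#   thread_pool_size = 64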

You can see final results here:

Fixing MySQL scalability problems

There are good and bad sides for both solutions. With ProxySQL, there is a visible overhead on lower numbers of threads, but it keeps very stable throughput after 200 threads.

With Percona Server thread pool, there is little-to-no overhead if the number of connections is less than thread pool size, but after 200 threads it falls behind ProxySQL.

There is also a chart with response times.

I would say the correct solution depends on your setup:

  • If you already use or plan to use ProxySQL, you may use it to prevent MySQL from saturation
  • If you use Percona Server, you might consider trying to adjust the thread pool

A summary of the results is available at https://github.com/Percona-Lab-results/201605-OLTP-RW-proxy-threadpool/blob/master/summary-OLTP-RW-proxy.md.

by Vadim Tkachenko at May 19, 2016 08:58 PM

Webinar Tuesday, May 24: Understanding how your MongoDB schema affects scaling, and when to consider sharding for help


Please join David Murphy on Tuesday, May 24 at 10 am PDT (UTC-7) as he presents “Understanding how your MongoDB schema affects scaling, and when to consider sharding for help.”

David will discuss the pros and cons of a few MongoDB schema design patterns on a stand-alone machine, and then look at how sharding affects them. He’ll examine the assumptions you may have made that could cause havoc on your CPU, memory and network during a scatter-gather. This webinar will help answer the following questions:

  • Would you still use the same schema if you knew you were going to shard?
  • Are your fetches using the same shard, or employing parallelism to boost performance?
  • Are you following the golden rules of schema design?

Register for this webinar here.

David Murphy, MongoDB Practice Manager

David joined Percona in October 2015 as Practice Manager for MongoDB. Before that, David joined the ObjectRocket by Rackspace team as the Lead DBA in September 2013. With the growth involved with any recently acquired startup, David’s role covered a wide range of evangelism, research, run book development, knowledge base design, consulting, technical account management, mentoring and much more. Before the world of MongoDB, David was a MySQL and NoSQL architect at Electronic Arts, working with some of the largest titles in the world, like FIFA, SimCity and Battlefield, with responsibilities covering tuning, design and technology choices. David maintains an active interest in speaking about databases and exploring new technologies.

by Dave Avery at May 19, 2016 08:02 PM

Percona Server for MongoDB 3.0.11-1.6 is now available


Percona is pleased to announce the release of Percona Server for MongoDB 3.0.11-1.6 on May 19, 2016. Download the latest version from the Percona web site or the Percona Software Repositories.

Percona Server for MongoDB 3.0.11-1.6 is an enhanced, open source, fully compatible, highly scalable, zero-maintenance downtime database supporting the MongoDB v3.0 protocol and drivers. Based on MongoDB 3.0.11, it extends MongoDB with MongoRocks and PerconaFT storage engines, as well as features like external authentication and audit logging. Percona Server for MongoDB requires no changes to MongoDB applications or code.

NOTE: The MongoRocks storage engine is still under development. There is currently no officially released version of MongoRocks that can be recommended for production.


This release includes all changes from MongoDB 3.0.11. Additionally, the following fixes were made:

  • Fixed memory over-allocation
  • PSMDB-56: Additional fixes related to this previously fixed bug.

The release notes are available in the official documentation.

 

by Alexey Zhebel at May 19, 2016 04:10 PM

Percona Server 5.5.49-37.9 is now available



Percona is glad to announce the release of Percona Server 5.5.49-37.9 on May 19, 2016. Based on MySQL 5.5.49, including all the bug fixes in it, Percona Server 5.5.49-37.9 is now the current stable release in the 5.5 series.

Percona Server is open-source and free. Details of the release can be found in the 5.5.49-37.9 milestone on Launchpad. Downloads are available here and from the Percona Software Repositories.

Bugs Fixed:

  • Percona Server is now built with system zlib library instead of the older bundled one. Bug fixed #1108016.
  • CREATE TABLE ... LIKE ... could create a system table with an unsupported enforced engine. Bug fixed #1540338.
  • The server will now show a more descriptive error message when Percona Server fails with errno == 22 "Invalid argument", if innodb_flush_method was set to ALL_O_DIRECT. Bug fixed #1578604.
  • apt-cache show command for percona-server-client was showing innotop included as part of the package. Bug fixed #1201074.
  • mysql-systemd would fail with PAM authentication and proxies due to regression introduced when fixing bug #1534825 in Percona Server 5.5.48-37.8. Bug fixed #1558312.

Other bugs fixed: #1578625 (upstream #81295), #1553166, and #1578303 (upstream #81324).

The release notes for Percona Server 5.5.49-37.9 are available in our online documentation. Bugs can be reported on the launchpad bug tracker.

by Hrvoje Matijakovic at May 19, 2016 03:39 PM

Jean-Jerome Schmidt

Planets9s - Download the new ClusterControl 1.3 for MySQL, MongoDB & PostgreSQL

Welcome to this week’s Planets9s, covering all the latest resources and technologies we create around automation and management of open source database infrastructures.

Download the new ClusterControl 1.3 for MySQL, MongoDB & PostgreSQL

This week we were excited to announce the release of ClusterControl 1.3. This release contains key new features, such as Key Management for MySQL, MariaDB, Percona Server and PostgreSQL, improved security, additional Operational Reports, along with performance improvements and bug fixes. Do check it out if you haven’t downloaded it yet, and let us know your feedback.

Download the new ClusterControl

Sign up for the ClusterControl 1.3 new features webinar

Join us for our webinar next Tuesday, May 24th, on ClusterControl 1.3, the one-stop console for your entire database infrastructure. We’ll be introducing the new features of this release as well as demonstrating them during a live demo.

Sign up for the webinar

Learn the difference between Multi-Master and Multi-Source replication

This new blog post discusses the lesser-known Multi-Master and Multi-Source replication topologies. Even though they sound similar, they are actually quite different. Here we illustrate their differences and provide insight into when to apply them and how best to configure them.

Read the blog

That’s it for this week! Feel free to share these resources with your colleagues and follow us in our social media channels.

Have a good end of the week,

Jean-Jérôme Schmidt
Planets9s Editor
Severalnines AB

by Severalnines at May 19, 2016 11:44 AM

May 18, 2016

Peter Zaitsev

Where is the MySQL 5.7 root password?

MySQL 5.7 root password

In this blog, we’ll discuss how to find the MySQL 5.7 root password.

While new MySQL software security features are always welcome, they can impact use and performance. Now by default, MySQL 5.7 creates a password for the root user (among other changes) so the installation itself can be considered secure. It’s a necessary change, but it has confused some customers and users. I see a lot of people on social networks (like Twitter) asking about this change.

Where is my root password?

The answer depends on the way you have installed MySQL 5.7 or Percona Server 5.7. I am going to show where to find the password depending on the installation method and the distribution used. For all these examples, I assume this is a new installation and you are using the default my.cnf.

CentOS/Red Hat – RPM packages

The password is not shown on screen during the installation. It is in the error log. The autogenerated my.cnf contains this line:

log-error=/var/log/mysqld.log

So, there is our password:

# cat /var/log/mysqld.log | grep "temporary password"
2016-05-16T07:09:49.796912Z 1 [Note] A temporary password is generated for root@localhost: 8)13ftQG5OYl

Debian/Ubuntu

During the package installation, you get a prompt asking for the root password. If you don’t set it, MySQL’s root user is created without a password. We can read the following line in the package installation output:

2016-05-16T07:27:21.532619Z 1 [Warning] root@localhost is created with an empty password ! Please consider switching off the --initialize-insecure option.

In that case, the root user is configured with the auth_socket plugin. You will only be able to connect using the UNIX socket; therefore, any attempt to connect using your local IP or the network fails. Later on, you can change the password to allow connections from the network (as explained in this blog post).

All distributions – Binary tarball

mysql_install_db has been deprecated since MySQL 5.7.6. You need to use mysqld to initialize the system databases (such as mysql, which contains the users and passwords). You have two ways of doing it:

--initialize: this is the default and recommended option. It will create the mysql system database, including a root user with a random password that is written to the error log.

# tail -n1 /var/log/mysql/error.log
2016-05-16T07:47:58.199154Z 1 [Note] A temporary password is generated for root@localhost: wzgds/:Kf2,g

If you don’t have the log-error directive configured, or any my.cnf at all, then the error log will be in the datadir under the name host_name.err.

--initialize-insecure: the datadir will be initialized without setting a random password for the root user.

# tail -n1 /var/log/mysql/error.log
2016-05-16T07:51:28.506142Z 1 [Warning] root@localhost is created with an empty password ! Please consider switching off the --initialize-insecure option.

Conclusion

Unfortunately, more security can also add more confusion. Depending on the installation method and distribution, the MySQL 5.7 root password process varies a lot, so keep an eye on the error log after every installation and also watch the installation process output shown on screen. In case you are really lost (or you have removed the error log for some reason), you can still start mysqld with --skip-grant-tables to access the database and change the password.
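As a quick illustration, here is a minimal sketch of logging in with the temporary password from the error log and replacing it; the new password is a placeholder:

# Pull the temporary password out of the error log, then change it. The ALTER USER
# statement must be run first, because the temporary password is marked as expired.
grep "temporary password" /var/log/mysqld.log
mysql -u root -p --connect-expired-password \
      -e "ALTER USER 'root'@'localhost' IDENTIFIED BY 'MyNewPass#2016';"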

by Miguel Angel Nieto at May 18, 2016 05:26 PM

Webinar Thursday May 19, 2016: MongoDB administration for MySQL DBA

MongoDB administration

Please join Alexander Rubin, Percona Principal Consultant, for his webinar MongoDB administration for MySQL DBA on Thursday, May 19 at 10 am PDT (UTC-7).

If you are a MySQL DBA and want to learn MongoDB quickly – this webinar is for you. MySQL and MongoDB share similar concepts so it will not be hard to get up to speed with MongoDB.

In this talk I will explain the following MongoDB administration concepts:
  • Day to day operations for MongoDB
  • Storage engines and differences with MySQL storage engines
  • Databases, collections and documents
  • Replication in MongoDB and the difference with MySQL replication
  • Sharding in MongoDB
  • Backups in MongoDB

In the webinar, each slide will show a MySQL concept or operation (on the left) and the corresponding MongoDB one (on the right).

Register here.

Alexander Rubin, Principal Consultant

Alexander joined Percona in 2013. Alexander has worked with MySQL since 2000 as a DBA and Application Developer. Before joining Percona, he was a MySQL principal consultant for over seven years (started with MySQL AB in 2006, then Sun Microsystems and then Oracle). He helped many customers design large, scalable and highly available MySQL systems and optimize MySQL performance. Alexander also helped customers design Big Data stores with Apache Hadoop and related technologies.

by Dave Avery at May 18, 2016 04:21 PM

MariaDB AB

Planning for default changes in MariaDB Server 10.2

Colin Charles

MariaDB Server 10.2 has been a long time coming, as planning goes. We met in Amsterdam in October 2015 to start fleshing things out (and also managed a 10.1 GA then), and made a first alpha release in April 2016. If all goes well, 2016 will definitely see the GA release of MariaDB Server 10.2.

But this also means that it may be time to make changes in the server, and there is lively discussion on the maria-discuss/maria-developers mailing lists on some of these topics. In this post I'd like to bring your attention to removing reserved keywords, looking at syntax for aggregate stored functions as well as looking forward to default configuration changes in MariaDB Server 10.2.

One of the first discussions was started by developer Alexander Barkov, as to why UTC_TIME, UTC_DATE and UTC_DATETIME are reserved keywords. The idea was to make them non-reserved keywords in MariaDB Server 10.2. Peter Laursen and I started off with the idea that things should remain compatible with MySQL, and that maybe a bug should be filed to have them removed as reserved keywords there too. Sergei offered a good explanation as to why they were made reserved in the first place (i.e. a mistake), and in principle this made me OK with removing their reserved nature. Jean-François Gagné from Booking chimed in, suggesting that wider consensus would be a good idea -- hence this post! Will Fong, a support engineer at MariaDB Corporation, cites this as a migration issue (and if this is the case, I'd like to see examples, so please feel free to drop a comment here). While Sergei believes this is a bikeshed colour issue, and the change won't happen in MariaDB Server 10.2, I obviously think it deserves more attention.

For MariaDB Server 10.2 it's also worth noting that, if we're focused on syntax, aggregate stored functions will require a decision soon, since the work is part of Google Summer of Code 2016.

And while we're on planning MariaDB Server 10.2, Daniel Black has kicked off discussions on configuration changes to consider in MariaDB Server 10.2, with the attached Jira ticket, MDEV-7635 update defaults and simplify mysqld config parameters.

Looking forward to comments here, on the mailing lists, or even Jira tickets.

About the Author


Colin Charles has been the Chief Evangelist for MariaDB since 2009, with work ranging from speaking engagements to consultancy and engineering work around MariaDB. He lives in Kuala Lumpur, Malaysia, has worked on MySQL since 2005, and has been a MySQL user since 2000. Before joining MySQL, he worked actively on the Fedora and OpenOffice.org projects. He's well known on the conference track, having spoken at many events over the course of his career.

by Colin Charles at May 18, 2016 02:13 PM

Jean-Jerome Schmidt

We’re hiring - again! Looking for our new QA Automation Engineer! Apply within!

We’ve just announced the latest version of our flagship product, ClusterControl, this week and as our product grows, so does our team.

As a result, we’re now looking for a QA automation wizard, someone who’ll help us take our product development to where we want it to be: short, well-tested release cycles.

This role involves extensive programming, test execution, reporting, and setting technical direction for testing Severalnines software.

And it involves being part of a great (if we say so), distributed team of highly skilled and motivated colleagues building the next generation database cluster management applications.

These are some of the technologies that we currently use and that you’d be involved with:

  • JavaScript, PHP, C++, Python, HTML, CSS, NodeJS, Grunt, NPM, CMake, Jenkins, Git
  • CakePHP, jQuery, ExtJS4, Highcharts, AWS SDK
  • Bootstrap and custom CSS
  • PHP based REST API (db) and RPC based API (backend process)
  • Classic LAMP stack

View the full job description here.

To apply, please email jobs@severalnines.com with your CV and links to your Github and/or LinkedIn profiles.

We look forward to hearing from you!

by Severalnines at May 18, 2016 11:02 AM

May 17, 2016

Peter Zaitsev

MySQL 5.7 read-write benchmarks

MySQL 5.7 read-write benchmarks

In this post, we’ll look at the results from some MySQL 5.7 read-write benchmarks.

In my past blogs, I’ve posted benchmarks of MySQL 5.5 / 5.6 / 5.7 in OLTP read-only workloads.

Now, it is time to test some read-write transactional workloads. I will again use sysbench, and my scripts and configs are available here: https://github.com/Percona-Lab-results/201605-OLTP-RW.

A short description of the setup:

  • The client (sysbench) and server are on different servers, connected via a 10Gb network
  • CPU: 56 logical CPU threads, Intel(R) Xeon(R) CPU E5-2683 v3 @ 2.00GHz
  • sysbench: ten tables x 10 mln rows, Pareto distribution
  • OS: Ubuntu 15.10 (Wily Werewolf)
  • Kernel 4.2.0-30-generic
  • The storage device is a Samsung SM863 SATA SSD, single device, with an ext4 filesystem
  • MySQL versions: 5.7.12, 5.6.30 and 5.5.48

InnoDB holds all data in memory, and the InnoDB log files are big enough, so there are only background IO writes and no pressure from InnoDB on the IO subsystem.
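For reference, a run of this workload looks roughly like the sketch below; the Lua script path, host, credentials and thread count are placeholders, and the exact scripts and configs used for the published results are in the repository linked above:

# An illustrative sysbench 0.5 OLTP read-write invocation.
sysbench --test=/usr/share/doc/sysbench/tests/db/oltp.lua \
         --oltp-tables-count=10 --oltp-table-size=10000000 \
         --rand-type=pareto \
         --mysql-host=172.16.0.1 --mysql-user=sbtest --mysql-password=sbtest \
         --num-threads=64 --max-time=300 --max-requests=0 \
         --report-interval=10 run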

The results looked like the following:

MySQL 5.7 read-write benchmarks

The vertical line shows the variability of the throughput (standard deviation).

To show the difference at lower numbers of threads, here is a chart with relative performance normalized to MySQL 5.7 (MySQL 5.7 = 1 in the following chart):

MySQL 5.7 read-write benchmarks

So we can finally see significant improvements in MySQL 5.7: it scales much better in read-write workloads than previous versions.

At lower numbers of threads, however, MySQL 5.7 throughput is still behind MySQL 5.5 and MySQL 5.6; this is where MySQL 5.7’s slower single-thread performance and longer execution paths show themselves. Low-thread read-write performance matters especially for replication. I wonder how a 5.7 slave performs compared to a 5.6 slave – I am going to run this benchmark soon.

Another point to keep in mind is that we still see a “bell shape,” even for MySQL 5.7. After 430 threads, the throughput drops off a cliff. Despite Oracle’s claims that there is no need for a thread pool anymore, this is not the case – I am not able to prevent a throughput drop using magic tuning with innodb_thread_concurrency and innodb_spin_wait_delay. No matter what, MySQL 5.7 is not able to maintain throughput at a high number of threads (1000+) for this workload.

What can be done in this case? I have two solutions: Percona Server with thread pool functionality, or ProxySQL with connection multiplexing. I will show these results in the next post.

by Vadim Tkachenko at May 17, 2016 05:22 PM

Jean-Jerome Schmidt

ClusterControl 1.3 Released with New Features for MySQL, Percona Server, MariaDB, MongoDB & PostgreSQL

The Severalnines team is pleased to announce the release of ClusterControl 1.3.

This release contains key new features, such as Key Management for MySQL, MariaDB, Percona Server and PostgreSQL, improved security, additional Operational Reports, along with performance improvements and bug fixes.

Join us next week on Tuesday, May 24th, for a live demo!

Sign up for the webinar

Highlights

  • New for MySQL, MariaDB, Percona Server & PostgreSQL
    • Key Management
    • Additional Operational Reports
    • Improved Security
    • Create/Mirror Repository
  • New for MySQL
    • Deploy Production Setup of NDB/MySQL Cluster
  • New for MongoDB
    • Deploy Percona for MongoDB ReplicaSet Node

For additional details about the release:

Key Management: This new feature allows you to manage a set of SSL certificates and keys that can be provisioned on your clusters. Users can now create certificate authority certificates or self-signed certificates and keys, as well as easily enable and disable SSL encrypted client-server connections for MySQL and Postgres based clusters.

Additional Operational Reports for MySQL, MariaDB, Percona, MongoDB & PostgreSQL: In addition to the Operational Reports introduced in 1.2.12, users can now generate an Availability Summary of uptime/downtime for their managed clusters and see node availability and cluster state history during the reported period. It is also possible to generate a backup summary of backup success/failure rates for your managed clusters.

Improved security: We are now enforcing a unique Controller RPC API Token, which enables token authentication for your managed clusters. No user intervention is needed when upgrading older ClusterControl versions. A unique token will be automatically generated, set and enabled for existing clusters. See the ChangeLog for more details.

Create/Mirror Repository: Mirror your database vendor’s software repository without having to actually deploy a cluster. A mirrored local repository is used in scenarios where you cannot upgrade a cluster and must lock the db versions to use.

Deploy Production Setup of NDB/MySQL Cluster: Users can now create a production setup of NDB/MySQL Cluster from ClusterControl and deploy management, SQL/API and data nodes.

Deploy MongoDB ReplicaSet Node: Support for Percona MongoDB 3.x. Please also note that as of this version of ClusterControl, MongoDB 2.x is no longer supported.

There are a bunch of other features and improvements that we have not mentioned here. You can find all the details in the ChangeLog.

We encourage you to test this latest release and provide us with your feedback. If you’d like a demo, feel free to request one.

With over 8,000 users to date, ClusterControl is the leading, platform independent automation and management solution for MySQL, MariaDB, Percona, MongoDB and PostgreSQL.

Thank you for your ongoing support, and happy clustering!

For additional tips & tricks, follow our blog: http://www.severalnines.com/blog/

by Severalnines at May 17, 2016 11:11 AM

May 16, 2016

Peter Zaitsev

MySQL “Got an error reading communication packet” errors

MySQL Got an error reading communication packet errors

In this blog post, we’ll discuss the possible reasons for MySQL “Got an error reading communication packet” errors, and how to address them.

In Percona’s managed services, we often receive customer questions on communication failure errors – where customers are faced with intermittent “Got an error reading communication packets” messages. I thought this topic deserved a blog post so we could discuss the possible reasons for this error and how to remedy it. I hope this will help readers investigate and resolve this problem.

First of all, whenever a communication error occurs it increments the status counter for either Aborted_clients or Aborted_connects, which describe the number of connections that were aborted because the client died without closing the connection properly and the number of failed attempts to connect to MySQL server (respectively). The possible reasons for both errors are numerous (see the Aborted_clients increments or Aborted_connects increments sections in the MySQL manual).
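A quick way to keep an eye on these counters is shown below (a minimal sketch; adjust credentials to your environment):

# A steadily climbing Aborted_clients or Aborted_connects value points to an ongoing problem.
mysql -e "SHOW GLOBAL STATUS LIKE 'Aborted%';"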

In the case of log_warnings>1, MySQL also writes this information to the error log (shown below):

[Warning] Aborted connection 305628 to db: 'db' user: 'dbuser' host: 'hostname' (Got an error reading communication packets)
[Warning] Aborted connection 305627 to db: 'db' user: 'dbuser' host: 'hostname' (Got an error reading communication packets)

In this case, MySQL increments the status counter for Aborted_clients, which could mean:

  • The client connected successfully but terminated improperly (and may relate to not closing the connection properly)
  • The client slept for longer than the defined wait_timeout or interactive_timeout seconds (which ends up causing the connection to sleep for wait_timeout seconds and then the connection gets forcibly closed by the MySQL server)
  • The client terminated abnormally or exceeded the max_allowed_packet for queries

The above is not an all-inclusive list.

How do we identify what caused this problem, and how do we fix it?

To be honest, aborted connection errors are not easy to diagnose. But in my experience, it’s related to network/firewall issues most of the time. We usually investigate those issues with the help of Percona toolkit scripts, i.e. pt-summary / pt-mysql-summary / pt-stalk. The outputs from those scripts can be very helpful.

Some of the reasons can be:

  • A high rate of connections sleeping inside MySQL for hundreds of seconds is one of the symptoms that applications aren’t closing connections after doing work, and are instead relying on wait_timeout to close them. I strongly recommend changing the application logic to properly close connections at the end of an operation.
  • Check to make sure the value of max_allowed_packet is high enough, and that your clients are not receiving a “packet too large” message. This situation aborts the connection without properly closing it.
  • Another possibility is TIME_WAIT. I’ve noticed many TIME_WAIT notifications from netstat, so I would recommend confirming that connections are managed to close properly on the application side.
  • Make sure the transactions are committed (begin and commit) properly, so that once the application is “done” with the connection it is left in a clean state.
  • You should ensure that client applications do not abort connections. For example, if PHP has the option max_execution_time set to 5 seconds, increasing connect_timeout would not help because PHP will kill the script. Other programming languages and environments have similar safety options.
  • Another cause of delays in connections is DNS problems. Check if you have skip-name-resolve enabled, and whether hosts are authenticated against their IP address instead of their hostname.
  • One way to find out where your application is misbehaving is to add some logging to your code that saves the application actions along with the MySQL connection ID. With that, you can correlate it to the connection number from the error lines. You can also enable the Audit Log plugin, which logs connections and query activity, and check the Percona Audit Log Plugin output as soon as you hit a connection abort error to identify which query is the culprit. If you can’t use the Audit plugin for some reason, you can consider using the MySQL general log – however, this can be risky on a loaded server. You should enable the general log for at least a few minutes. While it puts a heavy burden on the server, the errors tend to happen fairly often, so you should be able to collect the needed data before the log grows too large. I recommend enabling the general log with an -f tail, then disabling the general log when you see the next warning in the log. Once you find the query from the aborted connection, identify which piece of your application issues that query and correlate the queries with portions of your application.
  • Try increasing the net_read_timeout and net_write_timeout values for MySQL and see if that reduces the number of errors (a quick sketch follows this list). net_read_timeout is rarely the problem unless you have an extremely poor network. Still, try tweaking those values, because in most cases a query is generated and sent as a single packet to the server, and applications can’t switch to doing something else while leaving the server with a partially received query. There is a very detailed blog post on this topic from our CEO, Peter Zaitsev.
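Here is a minimal sketch of raising those timeouts at runtime to test the theory; the values are illustrative, and if they help you should make them permanent in my.cnf:

# Raise the network read/write timeouts dynamically and watch whether the aborts subside.
mysql -e "SET GLOBAL net_read_timeout = 120; SET GLOBAL net_write_timeout = 120;"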

Aborted connections happen because a connection was not closed properly. The server can’t cause aborted connections unless there is a networking problem between the server and the client (like the server is half duplex, and the client is full duplex) – but that is the network causing the problem, not the server. In any case, such problems should show up as errors on the networking interface. To be extra sure, check the ifconfig -a output on the MySQL server to check if there are errors.

Another way to troubleshoot this problem is via tcpdump. You can refer to this blog post on how to track down the source of aborted connections. Look for potential network issues, timeouts and resource issues with MySQL.

I found this blog post useful in explaining how to use tcpdump on busy hosts. It provides help for tracking down the TCP exchange sequence that led to the aborted connection, which can help you figure out why the connection broke.
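A rough capture command might look like the following sketch; the interface, client address and output file are placeholders:

# Capture full MySQL packets exchanged with one suspect client for later analysis.
tcpdump -i eth0 -s 0 -w /tmp/aborted-connections.pcap 'port 3306 and host 10.0.0.25'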

For network issues, use ping to calculate the round-trip time (RTT) between the machine where mysqld is located and the machine from which the application makes requests. Send a large file (1GB or more) to and from the client and server machines, watch the process using tcpdump, then check if an error occurred during the transfer. Repeat this test a few times. I also found this post from my colleague Marco Tusa useful: Effective way to check network connection.

One other idea I can think of is to capture the netstat -s output along with a timestamp every N seconds (e.g., every 10 seconds), so you can compare the netstat -s output from BEFORE and AFTER an aborted connection error in the MySQL error log. Using the timestamp of the aborted connection error, you can correlate it with the corresponding netstat -s samples and watch which error counters increased under the TcpExt section of netstat -s.
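A simple way to collect those samples is a small loop like this sketch (the interval and file name are arbitrary):

# Append a timestamped netstat -s sample every 10 seconds; stop with Ctrl-C once
# you have captured a few aborted-connection events.
while true; do
    date >> /tmp/netstat-s.log
    netstat -s >> /tmp/netstat-s.log
    sleep 10
done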

Along with that, you should also check the network infrastructure sitting between the client and the server for proxies, load balancers, and firewalls that could be causing a problem.

Conclusion:
I’ve tried to cover communication failure errors and how to identify and fix possible aborted connections. Take into account that faulty Ethernet adapters, hubs, switches, cables, and so forth can cause this issue as well; you may need to replace the faulty hardware itself to properly resolve these issues.

by Muhammad Irfan at May 16, 2016 04:32 PM

Jean-Jerome Schmidt

Learn the difference between Multi-Master and Multi-Source replication

MySQL replication can have various topologies, and the most well-known one is the Master/Slave topology, where one host is the master and the slave replicates all transactions from the master. Lesser known are Multi-Master and Multi-Source replication. Even though they sound similar, they are actually quite different. This blog post will illustrate their differences.

Multi Master

Multi-Master is similar to the Master/Slave topology, with the difference that both nodes are master and replica at the same time. This means there will be circular replication between the nodes. It is advisable to configure both servers to log the transactions from the replication thread (log-slave-updates), but to ignore their own already-replicated transactions (set replicate-same-server-id to 0), to prevent infinite loops in the replication. This needs to be configured even with GTID enabled.
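A minimal sketch of the relevant my.cnf settings on each node might look like this; the server-id values are illustrative and must differ between the two nodes:

[mysqld]
server-id                = 1   # use a different value, e.g. 2, on the other node
log-bin                  = mysql-bin
log-slave-updates        = 1
replicate-same-server-id = 0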

Multi-master topologies can be configured as either a so-called active/passive setup, where only one node is writable and the other node is a hot standby, or as an active/active setup, where both nodes are writable.

Caution is needed with active/active, as both masters are writing at the same time and this could lead to conflicts if the same dataset is written on both nodes simultaneously. Generally this is handled at the application level, where the application connects to different schemas on the two hosts to prevent conflicts. Also, as both nodes are writing and replicating data at the same time, they are limited in write capacity and the replication stream could become a bottleneck.

Multi source Replication

Multi-source replication is supported as of MariaDB 10.0 and MySQL 5.7. Basically this means that a replica is allowed to replicate from multiple masters. When using this, the replica should not have multiple masters writing to the same schema, as this would lead to conflicts in the write set.
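For illustration, here is a rough sketch of pointing a MySQL 5.7 replica at two masters using named channels; the hostnames and credentials are placeholders (MariaDB 10.0 uses a similar CHANGE MASTER 'connection_name' TO ... syntax):

# Each master gets its own replication channel on the replica (MySQL 5.7 syntax).
mysql -e "
  CHANGE MASTER TO MASTER_HOST='master1.example.com', MASTER_USER='repl',
    MASTER_PASSWORD='secret', MASTER_AUTO_POSITION=1 FOR CHANNEL 'master1';
  CHANGE MASTER TO MASTER_HOST='master2.example.com', MASTER_USER='repl',
    MASTER_PASSWORD='secret', MASTER_AUTO_POSITION=1 FOR CHANNEL 'master2';
  START SLAVE FOR CHANNEL 'master1';
  START SLAVE FOR CHANNEL 'master2';"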

Multi source replication is currently not widely supported by replication tools. In general these tools use the output from SHOW SLAVE STATUS to determine the replication state of the replicas. With multi source replication, there are several replication channels and thereby multiple replication streams. Therefore it is not advised to use multi source replication in combination with a replication manager unless you have tested this thoroughly. The alternative is to make the slave an unmanaged replica.

Why would you need multi source replication? Multi source replication may be suitable for data warehousing needs, delayed slaves or data locality in some cases. Especially if multiple database clusters are not utilizing their full write capacity, it may save a few hosts by consolidating multiple slaves into one node.

This blog post is a condensed version of the replication topology chapter in our MySQL Replication Blueprint whitepaper. Download the whitepaper and learn more about MySQL replication end-to-end.

by Severalnines at May 16, 2016 12:31 PM

May 14, 2016

Colin Charles

London roadshow wrap-up, see you in Paris next week

Just a few days ago, I presented at the MariaDB Roadshow in London, and I had a lot of fun. While I had canned slides, I know the topic intimately well, so it was good to go further in-depth. In addition, we had the MasterMind sessions, basically the place to get one-on-one time with Anders, Luisa or me, and I noticed that pretty much everyone said they were buying services afterwards (which more or less must mean the event was rather successful from that standpoint!).

In addition to that, I was happy to see that from attendee feedback, I did have the highest averages – thank you!

So here’s to repeating this in Paris next week — Gestion des données pour les applications vitales – MariaDB Roadshow Paris. I look forward to seeing you there, and I know we are repeating the MasterMind sessions. To fine-tune it, try to bring as much information as you possibly can so our time can be extremely effective.

by Colin Charles at May 14, 2016 02:05 PM

May 13, 2016

Peter Zaitsev

Benchmark MongoDB with sysbench

Benchmark MongoDB with sysbench

Benchmark MongoDB with sysbenchIn this blog post, we’ll discuss how to benchmark MongoDB with sysbench.

In an earlier post, I mentioned our use of sysbench-mongodb (via this fork) to run benchmarks of MongoDB servers. I now want to share our work extending sysbench to make it work with MongoDB.

If you’re not familiar with sysbench, it’s a great project developed by Alexey Kopytov that lets you run different types of benchmarks (referred to as “tests” by the tool), including database benchmarks. The database tests are implemented in Lua scripts, which means you can customize them as needed (or even write new ones from scratch) – something useful for simulating specific workloads.

All of the database tests in sysbench assume an SQL-based database, so instead of trying to shoehorn MongoDB tests into this framework I modified the connect/disconnect functions to handle MongoDB, and then implemented new functions specific for this database.

You can find the work (which is still in progress but usable, and in fact currently used by us in benchmarks) on the dev-mongodb-support-1.0 branch of our sysbench fork.

To use it, you just need to specify the –mongo-url argument (others too, as needed, but this is the one that must be present for sysbench to detect a MongoDB test is requested), and then provide the path to the Lua script you want to run. The following is an example:

sysbench
--mongo-write-concern=1
--mongo-url="mongodb://localhost"
--mongo-database-name=sbtest
--test=sysbench/sysbench/tests/mongodb/oltp.lua
--oltp_table_size=60000000
--oltp_tables_count=16
--num-threads=512
--rand-type=pareto
--report-interval=10
--max-requests=0
--max-time=600
--oltp-point-selects=10
--oltp-simple-ranges=1
--oltp-sum-ranges=1
--oltp-order-ranges=1
--oltp-distinct-ranges=1
--oltp-index-updates=1
--oltp-non-index-updates=1
--oltp-inserts=1 run

To build this branch, you’ll first need to build and install (or otherwise obtain) the mongo-c-driver project, as that is what we use to connect to MongoDB. Once that’s done, building is just a matter of running the following commands from the repo’s root:

./autogen.sh
./configure
make
sudo make install #optionally

The changes should not affect the other database tests in sysbench, though I have only verified that the MySQL ones continue to work.

Right now, the workload from sysbench-mongodb is implemented in Lua scripts (oltp.lua), and work is in progress to allow freeform operations to be created with new Lua scripts (by providing functions that take JSON as the argument). As an alternative, you may want to check out this much-less-tested (and currently unstable) branch based on luamongo. It already supports the creation of arbitrary workloads in Lua. In this case, you also need to build luamongo, which is included.

With either branch, you can add new tests by implementing new Lua scripts (though the dev-mongodb-support-1.0 branch still needs a few functions implemented on the C side to support arbitrary operations from the Lua side).

We think there are still some types of operations needed to improve sysbench’s usefulness for MongoDB, such as queries involving arrays, union, the $in operator, geospatial operators, and in place updates.

We hope you find this useful, and we welcome suggestions and bug reports to improve it.

Happy benchmarking!

by Fernando Ipar at May 13, 2016 04:17 PM

May 12, 2016

Peter Zaitsev

ProxySQL versus MaxScale for OLTP RO workloads

ProxySQL versus MaxScale for OLTP RO workloads

In this blog post, we’ll discuss ProxySQL versus MaxScale for OLTP RO workloads.

Continuing my series of READ-ONLY benchmarks (you can find the other posts here: https://www.percona.com/blog/2016/04/07/mysql-5-7-sysbench-oltp-read-results-really-faster/ and https://www.percona.com/blog/2016/03/28/mysql-5-7-primary-key-lookup-results-is-it-really-faster), in this post I want to see how much overhead a proxy adds.

In my opinion, there are only two solid proxy software options for MySQL at the moment: ProxySQL and MaxScale. In the past, there was also MySQL Proxy, but it is pretty much dead for now. Its replacement, MySQL Router, is still in the very early stages and seriously lacks the features that would let it compete with ProxySQL and MaxScale. This will most likely change in the future – when MySQL Router adds more features, I will reevaluate it then!

To test the proxies, I will start with a very simple setup to gauge basic performance characteristics. I will use a sysbench client and proxy running on the same box. Sysbench connects to the proxy via local socket (for minimal network and TCP overhead), and the proxy is connected to a remote MySQL via a 10Gb network. This way, the proxy and sysbench share the same server resources.

Other parameters:

  • CPU: 56 logical CPU threads, Intel(R) Xeon(R) CPU E5-2683 v3 @ 2.00GHz
  • sysbench: ten tables x 10 mln rows, Pareto distribution
  • OS: Ubuntu 15.10 (Wily Werewolf)
  • MySQL 5.7
  • MaxScale version 1.4.1
  • ProxySQL version 1.2.0b

You can find more details about benchmarks, scripts and configs here: https://github.com/Percona-Lab/benchmark-results/tree/201603-mysql55-56-57-RO/remote-OLTP-proxy-may.

An important parameter to consider is how much of the CPU resources you allocate to the proxy. Both ProxySQL and MaxScale allow you to configure how many threads they can use to process user requests and route queries. I’ve found that 16 threads for ProxySQL and 8 threads for MaxScale are optimal (I will also show 16 threads for MaxScale in this post). Both proxies also allow you to set up simple load-balancing configurations or to work in read-write splitting mode. In this case, I will use simple load balancing, since there are no read-write splitting requirements in a read-only workload.
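For reference, here is a rough sketch of where those thread counts live; the admin credentials shown are ProxySQL's defaults and the MaxScale file path is a common location, not necessarily what was used in this benchmark:

# ProxySQL: worker threads are controlled by the mysql-threads variable
# (changing it requires a ProxySQL restart to take effect).
mysql -u admin -padmin -h 127.0.0.1 -P 6032 -e "
  UPDATE global_variables SET variable_value='16' WHERE variable_name='mysql-threads';
  SAVE MYSQL VARIABLES TO DISK;"

# MaxScale: the thread count is set in the [maxscale] section of /etc/maxscale.cnf:
#   [maxscale]
#   threads=8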

ProxySQL

First result: How does ProxySQL perform compared to vanilla MySQL 5.7?

As we can see, there is a noticeable drop in performance with ProxySQL. This is expected, as ProxySQL does extra work to process queries. What is good though is that ProxySQL scales with increasing user connections.

One of the tricks that ProxySQL has is a “fast-forward” mode, which minimizes overhead from processing (but as a drawback, you can’t use many of the other features). Out of curiosity, let’s see how the “fast-forward” mode performs:

MaxScale

Now let’s see what happens with MaxScale. Before showing the next chart, let me note that it contains “error bars,” which are presented as vertical bars. Basically, an “error bar” shows the standard deviation: the longer the bar, the more variation was observed during the experiment. We want to see less variance, as it implies more stable performance.

Here are results for MaxScale versus ProxySQL:

We can see that with lower numbers of threads both proxies are nearly similar, but MaxScale has a harder time scaling over 100 threads. On average, MaxScale’s throughput is worse, and there is a lot of variation. In general, we can see that MaxScale demands more CPU resources and uses more of the CPU per request (compared to ProxySQL). This holds true if we run MaxScale with 16 threads (instead of 8):

MaxScale with 16 threads does not handle the workload well, and there is a lot of variation along with some visible scalability issues.

To summarize, here is a chart with relative performance (vanilla MySQL 5.7 is shown as 1):

While this chart does show that MaxScale has less overhead from 1-6 threads, it doesn’t scale as user load increases.

by Vadim Tkachenko at May 12, 2016 05:52 PM

Henrik Ingo

Upgrading to more impressive presentations: impress.js

In terms of using an open source desktop, Sun releasing OpenOffice some 15 (?) years ago was an important milestone, comparable to Mozilla finally managing to produce a working browser in Firefox. It provided essentially feature parity with Microsoft office, and most importantly, decent compatibility with Microsoft's own proprietary file formats.

I've used OpenOffice, and now LibreOffice, for lots of non-trivial tasks, including writing a complete book. Sure, the UI toolkit was stuck in the 90's, and Sun wasn't really a good steward in pushing the code base into this century, but it did work.


by hingo at May 12, 2016 02:42 PM

Jean-Jerome Schmidt

Planets9s - Download our MySQL Replication Blueprint Whitepaper

Welcome to this week’s Planets9s, covering all the latest resources and technologies we create around automation and management of open source database infrastructures.

Download our MySQL Replication Blueprint Whitepaper

Our new whitepaper, The MySQL Replication Blueprint, discusses all aspects of a MySQL Replication topology: the ins and outs of deployment, setting up replication, monitoring, upgrades, performing backups and managing high availability using proxies such as ProxySQL, MaxScale and HAProxy. This is a great resource for anyone wanting to build or optimise a MySQL replication setup.

Download the whitepaper

ClusterControl New Features Webinar on May 24th

Join us for this new webinar on Tuesday, May 24th, where we’ll be discussing and demonstrating the latest version of ClusterControl, the one-stop console for your entire database infrastructure. We’ll be introducing some cool new features for MySQL and MongoDB users in particular, as well as walk you through the work we’ve recently done for improved security.

Sign up for the webinar

ClusterControl Tips & Tricks: MySQL Query Performance Tuning

Bad query performance is the most common problem DBAs have to deal with, and this new blog post looks into how ClusterControl can help you solve these problems by using the data available in it. We answer questions such as ‘which of my queries takes the most time to execute?’ and provide a few handy tricks that will help you improve query performance.

Read the blog

That’s it for this week! Feel free to share these resources with your colleagues and follow us in our social media channels.

Have a good end of the week,

Jean-Jérôme Schmidt
Planets9s Editor
Severalnines AB

by Severalnines at May 12, 2016 12:00 PM

May 11, 2016

Federico Razzoli

MySQL/MariaDB cursors and temp tables

In MariaDB and MySQL, cursors create a temporary table.

Does this statement deserve a whole blog post? Apparently not. However, in some cases one does not expect a temporary table to be created:

  • SELECT ... FOR UPDATE: An exclusive lock is created, yes, but you still read data from a temporary table.
  • SELECT FROM a temporary table: you are reading from a temporary table, yes, but an internal temporary table is created anyway.
  • Impossible WHERE and LIMIT 0.

A quick example:

CREATE TEMPORARY TABLE t
(
        id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY
);

CREATE PROCEDURE p()
BEGIN
        DECLARE c CURSOR FOR
                SELECT id FROM t WHERE 0 LIMIT 0 FOR UPDATE;
        OPEN c;
        CLOSE c;
END;

MySQL [test]> SHOW STATUS LIKE 'Created_tmp_tables';
+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| Created_tmp_tables | 31    |
+--------------------+-------+
1 row in set (0.00 sec)

MySQL [test]> CALL p();
Query OK, 0 rows affected (0.00 sec)

MySQL [test]> SHOW STATUS LIKE 'Created_tmp_tables';
+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| Created_tmp_tables | 32    |
+--------------------+-------+
1 row in set (0.00 sec)

I am not complaining, and I don’t even know if this behavior can be changed. But one should certainly be aware of it. For example, one could think that creating a temporary table once and then looping over it with cursors several times is an optimization – but that’s not the case.

Federico


by Federico at May 11, 2016 10:51 PM

Jean-Jerome Schmidt

Join Us On May 24th: ClusterControl New Features Webinar

We’re excited to be presenting the latest version of ClusterControl, the one-stop console for your entire database infrastructure, during our next live webinar on May 24th. We’ll be introducing some cool new features for MySQL and MongoDB users in particular as well as walk you through the work we’ve recently done for improved security.

Date, Time & Registration

Europe/MEA/APAC

Tuesday, May 24th at 09:00 BST / 10:00 CEST (Germany, France, Sweden)
Register Now

North America/LatAm

Tuesday, May 24th at 09:00 Pacific Time (US) / 12:00 Eastern Time (US)
Register Now

Our colleagues Johan Andersson (CTO), Vinay Joosery (CEO) and Ashraf Sharif (System Support Engineer) will be demonstrating how you can deploy, monitor, manage and scale your databases on the technology stack of your choice with ClusterControl.

Agenda

  • ClusterControl overview
  • New features deep-dive
    • For MySQL-based systems
    • For MongoDB-based systems
    • Improved security
    • And more…
  • Live Demo
  • Q&A

Speakers

Johan Andersson is CTO at Severalnines, a company that enables developers to easily deploy, manage, monitor and scale highly-available MySQL clusters in the data center, in hosted environments and on the cloud. Prior to Severalnines, Johan worked at MySQL/Sun/Oracle and was the Principal Consultant and lead of the MySQL Clustering and High Availability consulting group, where he designed and implemented large-scale MySQL systems at key customers.

Vinay Joosery is a passionate advocate and builder of concepts and businesses around Big Data computing infrastructures. Prior to co-founding Severalnines, Vinay held the post of Vice-President EMEA at Pentaho Corporation - the Open Source BI leader. He has also held senior management roles at MySQL / Sun Microsystems / Oracle, where he headed the Global MySQL Telecoms Unit, and built the business around MySQL's High Availability and Clustering product lines. Prior to that, Vinay served as Director of Sales & Marketing at Ericsson Alzato, an Ericsson-owned venture focused on large scale real-time databases.

Ashraf Sharif is a System Support Engineer at Severalnines. He previously worked as a principal consultant and head of the support team, delivering clustering solutions for big websites in the South East Asia region. His professional interests focus on system scalability and high availability.

We look forward to “seeing” you there and some good discussions!

by Severalnines at May 11, 2016 05:00 PM

Peter Zaitsev

Quick start MySQL testing using Docker (on a Mac!)


In this post, we’ll discuss how you can quick start MySQL testing using Docker, specifically in a Mac environment.

Like a lot of people, I’m hearing a lot about Docker and it’s got me curious. The Docker ecosystem seems to be moving quickly, however, and simple “getting started” or “how-to” type articles that are easy to find for well-established technologies seem to be out-of-date or non-existent for Docker. I’ve been playing with Docker on Mac for a bit, but it is definitely a second-class citizen in the Docker world. However, I saw Giuseppe’s blog on the new Docker beta for Mac and decided to try it for myself. These steps work for the beta version on a Mac (and probably Windows), but they should work with Linux as well (using the GA release, currently Docker 1.11.1).

The new Docker beta for Mac requires that you register for the beta program, and receive a download code from Docker. I got mine in about a day, but I would assume it won’t be long before the full version is released.

Once installed, I needed to set up some Docker containers for common MySQL versions so that I can easily have some sandboxes. The method I used is below:

jayj@~ [510]$ docker network create test
90005b3ffa9fef1f817ee4965e794a567404c9a8d5bf07320514e7d848d59ff9
jayj@~ [511]$ docker run --name=mysql57 --net=test -e MYSQL_ALLOW_EMPTY_PASSWORD=yes -d mysql/mysql-server:5.7
6c80fa89610dbd5418ba474ad7d5451cd061f80a8a72ff2e718341827a08144b
jayj@~ [512]$ docker run -it --rm --net=test -e MYSQL_HOST=mysql57 mysql/shell init
Creating a Classic Session to root@mysql57:3306
Enter password:
No default schema selected.
enableXProtocol: Installing plugin mysqlx...
enableXProtocol: done

A quick summary of what I did above:

  1. I created a network called “test” for my containers to share; essentially, this is a dedicated private network between containers. I like this because multiple containers can listen on the same port and I don’t have to fight with ports on my host OS.
  2. I started a MySQL 5.7 image from Oracle’s official MySQL Docker container bound to that test network.
  3. I used the MySQL/shell image (also from Oracle) to initialize the mysqlx plugin on my 5.7 server. Notice I didn’t enter a password because I created the server without one (insecure, but it’s a sandbox).

The shell init uses a temporary container that is removed (--rm) after the run, so you don’t pollute your docker ps -a output.

So, now I want to be able to use the standard MySQL command line and/or the new MySQL shell to access this container.  To  make this really clean, I added some bash aliases:

alias mysqlsh='docker run -it --rm --net=test mysql/shell'
alias mysql='docker run -it --rm -e MYSQL_ALLOW_EMPTY_PASSWORD=yes --net=test --entrypoint="mysql" mysql/mysql-server:5.7'

With these in effect, I can call them directly and pass normal command line options to connect to my mysql57 image just as if I was using a native MySQL CLI binary.

Using the MySQL CLI from the 5.7 image:

jayj@~ [524]$ mysql -h mysql57
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 4
Server version: 5.7.12 MySQL Community Server (GPL)
Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> show schemas;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| sys                |
+--------------------+
4 rows in set (0.01 sec)

Using the MySQL shell:

jayj@~ [527]$ mysqlsh -h mysql57 -u root --session-type=node
Creating a Node Session to root@mysql57:33060
Enter password:
No default schema selected.
Welcome to MySQL Shell 1.0.3 Development Preview
Copyright (c) 2016, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type '\help', '\h' or '\?' for help.
Currently in JavaScript mode. Use \sql to switch to SQL mode and execute queries.
mysql-js> \sql
Switching to SQL mode... Commands end with ;
mysql-sql> show schemas;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| sys                |
+--------------------+
4 rows in set (0.00 sec)
mysql-sql>

Now if I want to check something on MySQL 5.5, I can just do this:

jayj@~ [530]$ docker run --name=mysql55 --net=test -e MYSQL_ALLOW_EMPTY_PASSWORD=yes -d mysql/mysql-server:5.5
Unable to find image 'mysql/mysql-server:5.5' locally
5.5: Pulling from mysql/mysql-server
a3ed95caeb02: Already exists
ffe36b360c6d: Already exists
646f220a8b5d: Pull complete
ed65e4fea7ed: Pull complete
d34b408b18dd: Pull complete
Digest: sha256:12f0b7025d1dc0e7b40fc6c2172106cdf73b8832f2f910ad36d65228d9e4c433
Status: Downloaded newer image for mysql/mysql-server:5.5
6691dd9d42c73f53baf2968bcca92b7f4d26f54bb01d967be475193305affd4f
jayj@~ [531]$ mysql -h mysql55
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.5.49 MySQL Community Server (GPL)
Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> show schemas;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
+--------------------+
3 rows in set (0.00 sec)

or, Percona Server:

jayj@~ [534]$ docker run --name=ps57 --net=test -e MYSQL_ALLOW_EMPTY_PASSWORD=yes -d percona/percona-server:5.7
Unable to find image 'percona/percona-server:5.7' locally
5.7: Pulling from percona/percona-server
a3ed95caeb02: Pull complete
a07226856d92: Pull complete
eee62d87a612: Pull complete
4c6755120a98: Pull complete
10eab0da5972: Pull complete
d5159a6502a4: Pull complete
e595a1a01d00: Pull complete
Digest: sha256:d57f0ce736f5403b1714ff8d1d6b91d5a7ee7271f30222c2bc2c5cad4b4e6950
Status: Downloaded newer image for percona/percona-server:5.7
9db503852747bc1603ab59455124663e8cedf708ac6d992cff9b43e2fbebd167
jayj@~ [537]$ mysql -h ps57
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.7.10-3 Percona Server (GPL), Release 3, Revision 63dafaf
Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql>

So all this is nice – once the images are cached locally, spinning new containers up and down is painless and fast. All this sandbox work is cleanly separated from my workstation OS. There are probably other things I’d want to be able to do with this setup that I haven’t figured out yet (e.g., loading data files, running code to connect to these containers, etc.) – but I’ll figure those out in the future.
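
One possible way to handle the data loading part (a sketch, not taken from the post: it assumes a hypothetical dump.sql in the current directory and reuses the same “test” network and --entrypoint trick shown above) is to run a throwaway client container and pipe the file in over stdin:

# create a target schema, then stream the dump into the mysql57 container
docker run -i --rm --net=test --entrypoint="mysql" mysql/mysql-server:5.7 \
    -h mysql57 -e "CREATE DATABASE IF NOT EXISTS sandbox"
docker run -i --rm --net=test --entrypoint="mysql" mysql/mysql-server:5.7 \
    -h mysql57 sandbox < dump.sql

Note the -i without -t: that is what lets the shell redirection reach the client running inside the container.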

by Jay Janssen at May 11, 2016 04:38 PM

May 10, 2016

Peter Zaitsev

Query Rewrite plugin can harm performance

Query Rewrite plugin can harm performance

In this blog post, we’ll discuss how the Query Rewrite plugin can harm performance.

MySQL 5.7 comes with the Query Rewrite plugin, which allows you to modify queries as they arrive at the server. (You can view the details here: https://dev.mysql.com/doc/refman/5.7/en/rewriter-query-rewrite-plugin.html.)

It is based on the audit plugin API, and unfortunately it suffers from serious scalability issues (which seems to be the case for all API-based audit plugins).

I want to share the results for sysbench OLTP RO with and without the query rewrite plugin — but with one very simple rewrite rule, which doesn’t affect any queries. This is the rule from the documentation:

INSERT INTO query_rewrite.rewrite_rules (pattern, replacement)
    -> VALUES('SELECT ?', 'SELECT ? + 1');
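
For completeness, here is roughly how the plugin and this rule can be put in place (a sketch; the install script path is typical for Linux packages and may differ on your system, and this is not necessarily the exact session used for the benchmark):

# load the Rewriter plugin and its query_rewrite schema
mysql -u root -p < /usr/share/mysql/install_rewriter.sql
# add the no-op rule, make the plugin reload its rule table, and confirm it is enabled
mysql -u root -p -e "INSERT INTO query_rewrite.rewrite_rules (pattern, replacement)
                     VALUES ('SELECT ?', 'SELECT ? + 1');
                     CALL query_rewrite.flush_rewrite_rules();
                     SHOW GLOBAL VARIABLES LIKE 'rewriter_enabled';"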

Here are the results for both cases:

Query Rewrite plugin can harm performance

As you can see, the server with the Query Rewrite plugin can’t scale after 100 threads.

When we look at the PMP profile, it shows the following:

170 __lll_lock_wait,__GI___pthread_mutex_lock,native_mutex_lock,my_mutex_lock,inline_mysql_mutex_lock,plugin_unlock_list,mysql_audit_release,handle_connection,pfs_spawn_thread,start_thread,clone
164 __lll_lock_wait,__GI___pthread_mutex_lock,native_mutex_lock,my_mutex_lock,inline_mysql_mutex_lock,plugin_foreach_with_mask,mysql_audit_acquire_plugins,mysql_audit_notify,invoke_pre_parse_rewrite_plugins,mysql_parse,dispatch_command,do_command,handle_connection,pfs_spawn_thread,start_thread,clone
 77 __lll_lock_wait,__GI___pthread_mutex_lock,native_mutex_lock,my_mutex_lock,inline_mysql_mutex_lock,plugin_lock,acquire_plugins,plugin_foreach_with_mask,mysql_audit_acquire_plugins,mysql_audit_notify,invoke_pre_parse_rewrite_plugins,mysql_parse,dispatch_command,do_command,handle_connection,pfs_spawn_thread,start_thread,clone
 12 __lll_unlock_wake,__pthread_mutex_unlock_usercnt,__GI___pthread_mutex_unlock,native_mutex_unlock,my_mutex_unlock,inline_mysql_mutex_unlock,plugin_unlock_list,mysql_audit_release,handle_connection,pfs_spawn_thread,start_thread,clone
 10 __lll_unlock_wake,__pthread_mutex_unlock_usercnt,__GI___pthread_mutex_unlock,native_mutex_unlock,my_mutex_unlock,inline_mysql_mutex_unlock,plugin_lock,acquire_plugins,plugin_foreach_with_mask,mysql_audit_acquire_plugins,mysql_audit_notify,invoke_pre_parse_rewrite_plugins,mysql_parse,dispatch_command,do_command,handle_connection,pfs_spawn_thread,start_thread,clone
 10 __lll_unlock_wake,__pthread_mutex_unlock_usercnt,__GI___pthread_mutex_unlock,native_mutex_unlock,my_mutex_unlock,inline_mysql_mutex_unlock,plugin_foreach_with_mask,mysql_audit_acquire_plugins,mysql_audit_notify,invoke_pre_parse_rewrite_plugins,mysql_parse,dispatch_command,do_command,handle_connection,pfs_spawn_thread,start_thread,clone
  7 __lll_lock_wait,__GI___pthread_mutex_lock,native_mutex_lock,my_mutex_lock,inline_mysql_mutex_lock,Table_cache::lock,open_table,open_and_process_table,open_tables,open_tables_for_query,execute_sqlcom_select,mysql_execute_command,mysql_parse,dispatch_command,do_command,handle_connection,pfs_spawn_thread,start_thread,clone
  6 __GI___pthread_mutex_lock,native_mutex_lock,my_mutex_lock,inline_mysql_mutex_lock,plugin_unlock_list,mysql_audit_release,handle_connection,pfs_spawn_thread,start_thread,clone
  6 __GI___pthread_mutex_lock,native_mutex_lock,my_mutex_lock,inline_mysql_mutex_lock,plugin_foreach_with_mask,mysql_audit_acquire_plugins,mysql_audit_notify,invoke_pre_parse_rewrite_plugins,mysql_parse,dispatch_command,do_command,handle_connection,pfs_spawn_thread,start_thread,clone

So clearly it’s related to a mutex acquired in the audit plugin API code. I filed a bug (https://bugs.mysql.com/bug.php?id=81298), but it’s discouraging to see that while the InnoDB code is constantly being improved for better scaling, other parts of the server can still suffer from global mutexes.

by Vadim Tkachenko at May 10, 2016 05:53 PM

MariaDB Foundation

MariaDB 10.1.14 and Connector/J 1.4.4 now available

The MariaDB project is pleased to announce the immediate availability of MariaDB 10.1.14, and MariaDB Connector/J 1.4.4. See the release notes and changelogs for details on these releases. Download MariaDB 10.1.14 Release Notes Changelog What is MariaDB 10.1? MariaDB APT and YUM Repository Configuration Generator Download MariaDB Connector/J 1.4.4 Release Notes Changelog About MariaDB Connector/J […]

The post MariaDB 10.1.14 and Connector/J 1.4.4 now available appeared first on MariaDB.org.

by Daniel Bartholomew at May 10, 2016 02:07 PM

Jean-Jerome Schmidt

ClusterControl Tips & Tricks: MySQL Query Performance Tuning

Bad query performance is the most common problem DBAs have to deal with. There are numerous ways to collect, process and analyze the data related to query performance - we’ve covered one of the most popular tools, pt-query-digest, in some of our previous blog posts:

Become a MySQL DBA blog series
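
Outside of ClusterControl, a typical pt-query-digest run against the slow query log looks something like this (a sketch; the log path and limits depend on your configuration):

# summarize the slow log and keep the 10 most expensive query fingerprints
pt-query-digest --limit 10 /var/lib/mysql/$(hostname)-slow.log > slow_report.txt
less slow_report.txt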

When you use ClusterControl though, this is not always needed. You can use the data available in ClusterControl to solve your problem. In this blog post, we’ll look into how ClusterControl can help you solve problems related to query performance.

It may happen that a query cannot complete in a timely manner. The query may be stuck due to some locking issues, it may not be optimal or properly indexed, or it may be too heavy to complete in a reasonable amount of time. Keep in mind that a couple of non-indexed joins can easily scan billions of rows if you have a large production database. Whatever happened, the query is probably using some of the resources - be it CPU or I/O for a non-optimized query, or even just row locks. Those resources are also required by other queries, and this may seriously slow things down. One very simple yet important task is to pinpoint the offending query and stop it.

It is pretty easily done from the ClusterControl interface. Go to the Query Monitor tab -> Running Queries section - you should see an output similar to the screenshot below.

As you can see, we have a pile of queries stuck. Usually the offending query is the one which takes a long time; you might want to kill it. You may also want to investigate it further to make sure you pick the correct one. In our case, we clearly see a SELECT … FOR UPDATE which joins a couple of tables and which has been in the ‘Sending data’ state (meaning it is processing data) for the last 90 seconds.

Another type of question a DBA may need to answer is: which queries take the most time to execute? This is a common question, as such queries may be low hanging fruit - they may be optimizable, and the more execution time a given query is responsible for in the whole query mix, the larger the gain from optimizing it. It is a simple equation - if a query is responsible for 50% of total execution time, making it 10x faster will give a much better result than optimizing a query which is responsible for just 1% of the total execution time.

ClusterControl can help you answer such questions, but first we need to ensure the Query Monitor is enabled. You can check it in Settings -> Query Monitor Settings, making sure that the Query Sample Time is set to something else than -1.

The Query Monitor in ClusterControl works in two modes, depending on whether you have the Performance Schema available with the required data on the running queries or not. If it is available (and this is true by default in MySQL 5.6 and newer), Performance Schema will be used to collect query data, minimizing the impact on the system. Otherwise, the slow query log will be used and all of the settings visible in the above screenshot are used. Those are pretty well explained in the UI, so there’s no need to do it here. When the Query Monitor uses Performance Schema, those settings are not used (except for Query Sample Time which can be set to -1 to disable data collection).
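
If you want to look at the same raw material yourself, the checks below show roughly what this boils down to (a sketch of the underlying MySQL features, not ClusterControl’s exact queries):

# is Performance Schema available? (ON by default in MySQL 5.6 and newer)
mysql -e "SHOW GLOBAL VARIABLES LIKE 'performance_schema';"
# top statement digests by total execution time, similar to what a query monitor aggregates
mysql -e "SELECT DIGEST_TEXT, COUNT_STAR, SUM_TIMER_WAIT/1000000000000 AS total_latency_s
          FROM performance_schema.events_statements_summary_by_digest
          ORDER BY SUM_TIMER_WAIT DESC LIMIT 5;"
# slow query log fallback, if Performance Schema cannot be used
mysql -e "SET GLOBAL slow_query_log = 1; SET GLOBAL long_query_time = 0.5;"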

When you confirmed that the Query Monitor is enabled in ClusterControl, you can go to Query Monitor -> Top Queries, where you’ll be presented with a screen similar to the below:

What you can see here is a list of the most expensive queries (in terms of execution time) that hit our cluster. Each of them has some further details - how many times it was executed, how many rows were examined or sent to the client, how execution time varied, how much time the cluster spent on executing a given type of query. Queries are grouped by query type and schema.

You may be surprised to find out that the main place where execution time is spent is a ‘COMMIT’ query. Actually, this is fairly typical for quick OLTP queries executed on Galera cluster. Committing a transaction is an expensive process because certification has to happen. This leads to COMMIT being one of the most time-consuming queries in the query mix.

When you click on a query, you can see its EXPLAIN output - pretty useful for identifying if something’s wrong with it. In our example we’ve checked a SELECT … FOR UPDATE with a high number of rows examined. As expected, this query is an example of terrible SQL - a JOIN which does not use any index. You can see in the EXPLAIN output that no index is used; not a single one was even considered usable. No wonder this query seriously impacted the performance of our cluster.

Another way to get some insight into query performance is to look at Query Monitor -> Query Histogram. This is basically a list of queries whose performance significantly differs from their average.

As you can see in the above screenshot, the first query took 0.0246s (time is shown in microseconds), where the average execution time for that query is much lower (0.0008s). We also have some additional statistical info on standard deviation and maximum query execution time. Such a list of queries may not seem very useful - but that’s not really true. When you see a query on this list, it means that something was different from the usual - the query did not complete in its regular time. It may be an indication of performance issues on your system, and a signal that you should investigate other metrics and check if anything else happened at that time.

People tend to focus on achieving max performance, forgetting that it is not enough to have high throughput - it also has to be consistent. Users like performance to be stable - you may be able to squeeze more transactions per second from your system but if it means that some transactions will start to stall for seconds, that’s not worth it. Looking at the Query Histogram in ClusterControl helps you identify such consistency issues in your query mix.

Happy query monitoring!

PS.: To get started with ClusterControl, click here!

by Severalnines at May 10, 2016 11:01 AM

May 09, 2016

Peter Zaitsev

Percona Server 5.7 parallel doublewrite

Percona Server 5.7 parallel doublewrite

In this blog post, we’ll discuss the ins and outs of Percona Server 5.7 parallel doublewrite.

After implementing parallel LRU flushing as described in the previous post, we went back to benchmarking. At first, we tested with the doublewrite buffer turned off. We wanted to isolate the effect of the parallel LRU flusher, and the results validated the design. Then we turned the doublewrite buffer back on and saw very little, if any, gain from the parallel LRU flusher. What happened? Let’s take a look at the data:

5710.3.pfs.all

We see that the doublewrite buffer mutex is gone as expected and that the top waiters are the rseg mutexes and the index lock (shouldn’t this be fixed in 5.7?). Then we checked PMP:

2678 nanosleep(libpthread.so.0),...,buf_LRU_get_free_block(buf0lru.cc:1435),...
867 pthread_cond_wait,...,log_write_up_to(log0log.cc:1293),...
396 pthread_cond_wait,...,mtr_t::s_lock(sync0rw.ic:433),btr_cur_search_to_nth_level(btr0cur.cc:1022),...
337 libaio::??(libaio.so.1),LinuxAIOHandler::collect(os0file.cc:2325),...
240 poll(libc.so.6),...,Protocol_classic::read_packet(protocol_classic.cc:810),...

Again we see that PFS is not telling the whole story, this time due to a missing annotation in XtraDB. Whereas the PFS results might lead us to leave the flushing analysis and focus on the rseg/undo/purge or check the index lock, PMP clearly shows that a lack of free pages is the biggest source of waits. Turning on the doublewrite buffer makes LRU flushing inadequate again. This data, however, doesn’t tell us why that is.

To see how enabling the doublewrite buffer makes LRU flushing perform worse, we collect PFS and PMP data only for the server flusher (cleaner coordinator, cleaner worker, and LRU flusher) threads and I/O completion threads:

5710.3.flushers.only

If we zoom in from the whole server to the flushers only, the doublewrite mutex is back. Since we removed its contention for the single page flushes, it must be the batch doublewrite buffer usage by the flusher threads that causes it to reappear. The doublewrite buffer has a single area for 120 pages that is shared and filled by flusher threads. Adding a page to the batch is protected by the doublewrite mutex, which serialises the adds and results in the following picture:

dblw_mysql_1

By now we should be wary of reviewing PFS data without checking its results against PMP. Here it is:

139 libaio::??(libaio.so.1),LinuxAIOHandler::collect(os0file.cc:2448),LinuxAIOHandler::poll(os0file.cc:2594),...
56 pthread_cond_wait,...,os_event_wait_low(os0event.cc:534),buf_dblwr_add_to_batch(buf0dblwr.cc:1111),...,buf_flush_LRU_list_batch(buf0flu.cc:1555),...,buf_lru_manager(buf0flu.cc:2334),...
25 pthread_cond_wait,...,os_event_wait_low(os0event.cc:534),buf_flush_page_cleaner_worker(buf0flu.cc:3482),...
21 pthread_cond_wait,...,PolicyMutex<TTASEventMutex<GenericPolicy>(ut0mutex.ic:89),buf_page_io_complete(buf0buf.cc:5966),fil_aio_wait(fil0fil.cc:5754),io_handler_thread(srv0start.cc:330),...
8 pthread_cond_timedwait,...,buf_flush_page_cleaner_coordinator(buf0flu.cc:2726),...

As with the single-page flush doublewrite contention and the wait to get a free page in the previous posts, here we have an unannotated-for-Performance Schema doublewrite OS event wait (same bug 80979):

if (buf_dblwr->batch_running) {
		/* This not nearly as bad as it looks. There is only
		page_cleaner thread which does background flushing
		in batches therefore it is unlikely to be a contention
		point. The only exception is when a user thread is
		forced to do a flush batch because of a sync
		checkpoint. */
		int64_t	sig_count = os_event_reset(buf_dblwr->b_event);
		mutex_exit(&buf_dblwr->mutex);
		os_event_wait_low(buf_dblwr->b_event, sig_count);
		goto try_again;
	}

This is as bad as it looks (the comment is outdated). A running doublewrite flush blocks any doublewrite page add attempts from all the other flusher threads for the duration of the flush (up to 120 data pages written twice to storage):

dblw_ms_2 (2)

The issue also occurs with MySQL 5.7 multi-threaded flusher but becomes more acute with the PS 5.7 multi-threaded LRU flusher. There is no inherent reason why all the parallel flusher threads must share the single doublewrite buffer. Each thread can have its own private buffer, and doing so allows us to add to the buffers and flush them independently. This means a lot of synchronisation simply disappears. Adding pages to parallel buffers is fully asynchronous:

dblw_ps_1

And so is flushing them:

dblw_ps_2

This behavior is what we shipped in the 5.7.11-4 release, and the performance results were shown in a previous post. To see how the private doublewrite buffer affects flusher threads, let’s look at isolated data for those threads again.

Performance Schema:

5711.flusher.only

It shows the redo log mutex as the current top contention source from the PFS point of view, which is not caused directly by flushing.

PMP data looks better too:

112 libaio::??(libaio.so.1),LinuxAIOHandler::collect(os0file.cc:2455),...,io_handler_thread(srv0start.cc:330),...
54 pthread_cond_wait,...,buf_dblwr_flush_buffered_writes(buf0dblwr.cc:1287),...,buf_flush_LRU_list(buf0flu.cc:2341),buf_lru_manager(buf0flu.cc:2341),...
35 pthread_cond_wait,...,PolicyMutex<TTASEventMutex<GenericPolicy>(ut0mutex.ic:89),buf_page_io_complete(buf0buf.cc:5986),...,io_handler_thread(srv0start.cc:330),...
27 pthread_cond_wait,...,buf_flush_page_cleaner_worker(buf0flu.cc:3489),...
10 pthread_cond_wait,...,enter(ib0mutex.h:845),buf_LRU_block_free_non_file_page(ib0mutex.h:845),buf_LRU_block_free_hashed_page(buf0lru.cc:2567),...,buf_page_io_complete(buf0buf.cc:6070),...,io_handler_thread(srv0start.cc:330),...

The buf_dblwr_flush_buffered_writes now waits for its own thread I/O to complete and doesn’t block other threads from proceeding. The other top mutex waits belong to the LRU list mutex, which is again not caused directly by flushing.

This concludes the description of the current flushing implementation in Percona Server. To sum up, in this post series we took you through the road to the current XtraDB 5.7 flushing implementation:

  • Under high concurrency I/O-bound workloads, the server has a high demand for free buffer pages. This demand can be satisfied by either LRU batch flushing or single page flushing.
  • Single page flushes cause a lot of doublewrite buffer contention and are bad even without the doublewrite.
  • Same as in XtraDB 5.6, we removed the single page flushing altogether.
  • Existing cleaner LRU flushing could not satisfy free page demand.
  • Multi-threaded LRU flushing design addresses this issue – if the doublewrite buffer is disabled.
  • If the doublewrite buffer is enabled, MT LRU flushing contends on it, negating its improvements.
  • Parallel doublewrite buffers address this bottleneck (a quick way to check the related settings on your own server is sketched below).
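
A quick way to check the related settings on your own server (a probe rather than an authoritative list, since variable names differ between upstream MySQL and Percona Server):

# doublewrite-related settings; Percona Server 5.7.11-4+ layers parallel doublewrite on top of these
mysql -e "SHOW GLOBAL VARIABLES LIKE 'innodb%doublewrite%';"
# the LRU flushing depth target discussed earlier in the series
mysql -e "SHOW GLOBAL VARIABLES LIKE 'innodb_lru_scan_depth';"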

by Laurynas Biveinis at May 09, 2016 08:35 PM

Jean-Jerome Schmidt

Sign up for our free whitepaper: MySQL Replication Blueprint

We recently introduced the Severalnines Blueprint for MySQL Replication, a new whitepaper which discusses all aspects of a MySQL Replication topology with the ins and outs of deployment, setting up replication, monitoring, upgrades, performing backups and managing high availability using proxies such as ProxySQL, MaxScale and HAProxy.

If you haven’t got your copy yet, do sign up for it here:
Get your free copy of our new whitepaper

The MySQL Replication Blueprint is about having a complete ops-ready solution from end to end including:

  • Installation and configuration of master/slave MySQL servers, load balancers, Virtual IP and failover rules
  • Management of the topology, including failure detection, failover, repair and subsequent reconfiguration of components
  • Managing topology changes when adding, removing or maintaining servers
  • Managing configuration changes
  • Backups
  • Monitoring of all components from one single point

This whitepaper discusses the following core topics in depth:

  • Why the need for a Replication Blueprint
  • Introducing the MySQL Replication Blueprint
  • Monitoring
  • Management
  • Load Balancing

Get your free copy of our new whitepaper

by Severalnines at May 09, 2016 12:02 PM

MariaDB AB

MariaDB and Docker use cases, Part 1

Guillaume Lefranc

Some of the most common questions asked by our users are regarding MariaDB support in Docker, and in particular how it can be used in specific development or production deployments. This series of articles will try to cover a few Docker and MariaDB use cases.

Why choose Docker for MariaDB?

  • Docker containers can be used to test, deploy and release applications within any environment.
  • Docker deployments can be automated easily, creating deployment environments and reproducing them easily in staging and production.
  • Docker is lightweight virtualization. Hypervisors are not needed, and a MariaDB Docker container should perform just as well as a normal MariaDB installation without any noticeable overhead.
  • Docker is agnostic - once you’ve installed Docker on your OS, the instructions for running containers are exactly the same, whether you’re running CentOS, Debian or Ubuntu, or even Mac OS X and Windows.

A few important points about Docker containers

  • Docker containers are immutable. They can't be easily modified after they start (unless you attach to them and break everything).
  • By default, and because of the above, data is not persistent. Docker uses data volumes to remedy this. The MariaDB container uses a volume to preserve data (more about this later).

State of MariaDB in Docker

MariaDB has been very well supported in Docker for a couple of years now, thanks to many efforts by the Docker team and community. To this day, Docker supports all three MariaDB releases: 5.5, 10.0 and 10.1. The MariaDB Docker container has the following particularities (they are combined in the example after this list):

  • The MariaDB root password can be set or generated through environment variables.
  • A new user and an empty database can be created through the same process as above.
  • The instance has a default /var/lib/mysql persistent data volume, which you can let docker manage internally or mount to a directory of your choice.
  • The container instance can be mounted on an existing data volume (for example a backup).
  • Network ports can be bound to arbitrary ports on the host side.
  • The MariaDB knowledge base has an extensive documentation article about docker. Read it!
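
Putting those particularities together, a minimal sketch (the container name, host port, data directory and credentials below are just placeholders):

# generated root password, an extra user and database, a host directory as the data volume,
# and container port 3306 bound to port 3307 on the host
docker run -d --name mdbtest -p 3307:3306 \
    -v ~/mdbdata/mdbtest:/var/lib/mysql \
    -e MYSQL_RANDOM_ROOT_PASSWORD=yes \
    -e MYSQL_DATABASE=app -e MYSQL_USER=app -e MYSQL_PASSWORD=app_secret \
    mariadb:10.1
# the generated root password is printed in the entrypoint log output
docker logs mdbtest 2>&1 | grep -i 'root password'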

Docker use case #1: Multi Tenancy

A common use case for MariaDB and Docker is running several instances of MariaDB on the same physical host. There are existing solutions such as MySQL Sandbox and others; however, none of them provides the flexibility, ease of use and power that Docker offers.

To illustrate our point let's start three different instances of MariaDB, each of those running a different major version:

docker run -d -p 3301:3306 -v ~/mdbdata/mdb55:/var/lib/mysql -e MYSQL_ROOT_PASSWORD=admin --name mdb55 mariadb:5.5
docker run -d -p 3302:3306 -v ~/mdbdata/mdb10:/var/lib/mysql -e MYSQL_ROOT_PASSWORD=admin --name mdb10 mariadb:10.0
docker run -d -p 3303:3306 -v ~/mdbdata/mdb11:/var/lib/mysql -e MYSQL_ROOT_PASSWORD=admin --name mdb11 mariadb:10.1

Docker will automatically pull the official mariadb images from the repository and launch them. Now we can simply connect to any of those instances using the provided port and password:

$ mysql -u root -padmin -h 127.0.0.1 -P3302
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 2
Server version: 10.0.22-MariaDB-1~jessie mariadb.org binary distribution

Copyright (c) 2000, 2016, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

Note that each of our instances will use a persistent data volume located under the ~/mdbdata directory - Docker will automatically create this directory tree for us.

Now that we've done that, let's delve into advanced features of Docker. Docker supports Linux control groups (cgroups), which can be used to limit, account or isolate resource usage. Let's say that we want our MariaDB 10.1 instance (named mdb11) to have a higher CPU priority than the two other instances. In this case we can lower the CPU shares of mdb10 and mdb55. Each instance has 1024 max CPU shares by default, so let's recreate our mdb55 and mdb10 containers with 512 CPU shares each.

In the preamble we said that Docker containers are immutable. If we want to change our containers' parameters we need to remove them. This is not an issue because we've defined persistent data volumes in ~/mdbdata, so the actual contents of our database will persist when we recreate the containers.

docker rm -f mdb55 mdb10
docker run -d -p 3301:3306 --cpu-shares=512 -v ~/mdbdata/mdb55:/var/lib/mysql --name mdb55 mariadb:5.5
docker run -d -p 3302:3306 --cpu-shares=512 -v ~/mdbdata/mdb10:/var/lib/mysql --name mdb10 mariadb:10.0

We have recreated the two MariaDB instances with 512 CPU shares each. This is a soft limit though, and is only enforced when processes compete for CPU cycles. If the other instances are idle, any instance is able to use up to 100% of all CPUs. In practice, this means that if all three instances use the CPU concurrently, each of the first two containers (mdb55 and mdb10), which have 512 shares each, will be able to use up to 25% of all CPUs, whereas the third container, which has 1024 shares, will be able to use up to 50% of all CPUs.

Another option is to bind the instance to a specific CPU core, so let's recreate the containers and do that:

docker rm -f mdb55 mdb10 mdb11
docker run -d -p 3301:3306 --cpuset-cpus=0 -v ~/mdbdata/mdb55:/var/lib/mysql --name mdb55 mariadb:5.5
docker run -d -p 3302:3306 --cpuset-cpus=1 -v ~/mdbdata/mdb10:/var/lib/mysql --name mdb10 mariadb:10.0
docker run -d -p 3303:3306 --cpuset-cpus=2-3 -v ~/mdbdata/mdb11:/var/lib/mysql --name mdb11 mariadb:10.1

In the example above, given a 4 CPU core system, my containers mdb55 and mdb10 will each run on a separate, single CPU core, whereas mdb11 will use the two remaining cores.

We can also control the way our containers access disk and memory resources, which is definitely useful on a busy system - you don't want a runaway development query using all the disk I/O of your load testing instances, for example. Whereas memory limits are straightforward, block IO shares work in a similar fashion to CPU shares, except that the default block IO weight is 500, in a range from 10 to 1000.

Let's limit our two first containers to 512M of memory and 250 block IO shares:

docker rm -f mdb55 mdb10
docker run -d -p 3301:3306 --blkio-weight=250 --memory=512M -v ~/mdbdata/mdb55:/var/lib/mysql --name mdb55 mariadb:5.5
docker run -d -p 3302:3306 --blkio-weight=250 --memory=512M  -v ~/mdbdata/mdb10:/var/lib/mysql --name mdb10 mariadb:10.0

Similarly to what we have seen in the CPU shares example, if the three instances compete for IO, each of the first two containers will be limited to 25% of the available IO capacity, with the third container limited to the remaining capacity, i.e. 50%.

There is much more to Docker runtime constraints than what we have covered in this article. Please read the extensive Docker run reference to learn about all the possible options.

About the Author

Guillaume Lefranc's picture

Guillaume Lefranc is managing the MariaDB Remote DBA Services Team, delivering performance tuning and high availability services worldwide. He's a believer in DevOps culture, Agile software development, and Craft Brewing.

by Guillaume Lefranc at May 09, 2016 08:26 AM

May 06, 2016

Peter Zaitsev

CPU governor performance

CPU governor performance

In this blog, we’ll examine how CPU governor performance affects MySQL.

It’s been a while since we looked into CPU governors and with the new Intel CPUs and new Linux distros, I wanted to check how CPU governors affect MySQL performance.

Before jumping to results, let’s review what drivers manage CPU frequency. Traditionally, the default driver was “acpi-cpufreq”, but for the recent Intel CPUs and new Linux kernel it was changed to “intel_pstate”.

To check what driver is being used, run the command cpupower frequency-info:

cpupower frequency-info
analyzing CPU 0:
driver: acpi-cpufreq
CPUs which run at the same hardware frequency: 0
CPUs which need to have their frequency coordinated by software: 0
maximum transition latency: 10.0 us.
hardware limits: 1.20 GHz - 2.00 GHz
available frequency steps: 2.00 GHz, 2.00 GHz, 1.90 GHz, 1.80 GHz, 1.70 GHz, 1.60 GHz, 1.50 GHz, 1.40 GHz, 1.30 GHz, 1.20 GHz
available cpufreq governors: conservative, ondemand, userspace, powersave, performance
current policy: frequency should be within 1.20 GHz and 2.00 GHz.
The governor "ondemand" may decide which speed to use
within this range.
current CPU frequency is 1.20 GHz (asserted by call to hardware).
cpufreq stats: 2.00 GHz:29.48%, 2.00 GHz:0.00%, 1.90 GHz:0.00%, 1.80 GHz:0.00%, 1.70 GHz:0.00%, 1.60 GHz:0.00%, 1.50 GHz:0.00%, 1.40 GHz:0.00%, 1.30 GHz:0.37%, 1.20 GHz:70.15% (7)
boost state support:
Supported: yes
Active: yes

In this case, we can see that the driver is “acpi-cpufreq”, and the governor is “ondemand”.

On my server (running Ubuntu 16.04, with “Intel(R) Xeon(R) CPU E5-2683 v3 @ 2.00GHz” CPUs), I get the following output with default settings:

analyzing CPU 0:
driver: intel_pstate
CPUs which run at the same hardware frequency: 0
CPUs which need to have their frequency coordinated by software: 0
maximum transition latency: 0.97 ms.
hardware limits: 1.20 GHz - 3.00 GHz
available cpufreq governors: performance, powersave
current policy: frequency should be within 1.20 GHz and 3.00 GHz.
The governor "performance" may decide which speed to use
within this range.
current CPU frequency is 1.50 GHz (asserted by call to hardware).
boost state support:
Supported: yes
Active: yes

So, it’s interesting to see that “intel_pstate” with the “performance” governor is chosen by default, and the CPU frequency range is 1.20GHz to 3.00GHz (even though the CPU specification is 2.00GHz). If we check the CPU specification page, it says that 2.00GHz is the “base frequency” and “3.00GHz” is the “Max Turbo” frequency.

In contrast to “intel_pstate”, “acpi-cpufreq” says “frequency should be within 1.20 GHz and 2.00 GHz.”

Also, “intel_pstate” only supports “performance” and “powersave” governors, while “acpi-cpufreq” has a wider range. For this blog, I only tested “ondemand” and “performance”.

Switching between CPU drivers is not easy, as it requires a server reboot — you need to pass a parameter to the kernel startup line. In Ubuntu, you can do this in /etc/default/grub by changing GRUB_CMDLINE_LINUX_DEFAULT to GRUB_CMDLINE_LINUX_DEFAULT="intel_pstate=disable", which will disable intel_pstate and will load acpi-cpufreq.
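
In practice, on Ubuntu that boils down to something like the following (a sketch: back up /etc/default/grub first, and merge the parameter by hand if that line already carries other options):

sudo sed -i.bak 's/^GRUB_CMDLINE_LINUX_DEFAULT=.*/GRUB_CMDLINE_LINUX_DEFAULT="intel_pstate=disable"/' /etc/default/grub
sudo update-grub    # regenerate the grub configuration with the new kernel parameter
sudo reboot         # the driver change only takes effect after a reboot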

Is there a real difference in performance between different CPU drivers and CPU governors? To check, I ran a sysbench OLTP read-only workload over a 10Gb network, where the data fits into memory (so it is a CPU-burning workload).

The results are as follows. This is a chart for absolute throughput:

CPU governor performance

And to better understand relative performance, here is a chart on how other governors perform compared to “intel-pstate” with the performance governor. In this case, I showed relative performance to “PSTATE performance”, which equals “1”. In the chart, the orange bar is “PSTATE powersave” and shows the relative difference between “PSTATE powersave” and “PSTATE performance” (=1):

CPU governor performance

Here are the takeaways I see:

  • The combination of CPU driver and CPU governor still affects performance
  • ACPI ondemand might not be the best choice for achieving the best throughput
  • Intel_pstate “powersave” is slower at lower thread counts (I guess the Linux scheduler assigns execution to “sleeping” CPU cores)
  • Both the ACPI and Intel_pstate “performance” governors show the best (and practically identical) performance
  • My Ubuntu 16.04 starts with the “intel_pstate” + “performance” governor by default, but you still may want to check what the settings are in your case, and change to “performance” if it is not set (a quick way to do both is sketched below)
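
A quick sketch of that check (and of forcing the governor if needed), using the same cpupower tool shown earlier:

# show the active driver and governor
cpupower frequency-info | grep -E 'driver|governor'
# switch all cores to the "performance" governor (needs the matching linux-tools package installed)
sudo cpupower frequency-set -g performance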

by Vadim Tkachenko at May 06, 2016 10:53 PM

Jean-Jerome Schmidt

ClusterControl new features webinar on May 24th

Join us for this new webinar on Tuesday, May 24th, where we’ll be discussing and demonstrating the latest version of ClusterControl, the one-stop console for your entire database infrastructure. We’ll be introducing some cool new features for MySQL and MongoDB users in particular, as well as walking you through the work we’ve recently done for improved security.

Our colleagues Johan Andersson (CTO), Vinay Joosery (CEO) and Ashraf Sharif (System Support Engineer) will be demonstrating how you can deploy, monitor, manage and scale your databases on the technology stack of your choice with ClusterControl.

Date, Time & Registration

Europe/MEA/APAC

Tuesday, May 24th at 09:00 BST / 10:00 CEST (Germany, France, Sweden)
Register Now

North America/LatAm

Tuesday, May 24th at 09:00 Pacific Time (US) / 12:00 Eastern Time (US)
Register Now

Agenda

  • ClusterControl overview
  • New features deep-dive
    • For MySQL-based systems
    • For MongoDB-based systems
    • Improved security
    • And more…
  • Live Demo
  • Q&A

Speakers

Johan Andersson is CTO at Severalnines, a company that enables developers to easily deploy, manage, monitor and scale highly-available MySQL clusters in the data center, in hosted environments and on the cloud. Prior to Severalnines, Johan worked at MySQL/Sun/Oracle and was the Principal Consultant and lead of the MySQL Clustering and High Availability consulting group, where he designed and implemented large-scale MySQL systems at key customers.

Vinay Joosery is a passionate advocate and builder of concepts and businesses around Big Data computing infrastructures. Prior to co-founding Severalnines, Vinay held the post of Vice-President EMEA at Pentaho Corporation - the Open Source BI leader. He has also held senior management roles at MySQL / Sun Microsystems / Oracle, where he headed the Global MySQL Telecoms Unit, and built the business around MySQL's High Availability and Clustering product lines. Prior to that, Vinay served as Director of Sales & Marketing at Ericsson Alzato, an Ericsson-owned venture focused on large scale real-time databases.

Ashraf Sharif is a System Support Engineer at Severalnines. He previously worked as a principal consultant and head of a support team, delivering clustering solutions for big websites in the South East Asia region. His professional interests focus on system scalability and high availability.

 

We look forward to “seeing” you there and some good discussions!

by Severalnines at May 06, 2016 11:36 AM

Colin Charles

London in May 2016

I’m happy to be back in London in May 2016, to talk at two events:

  1. The London MySQL Meetup Group, Monday May 9, 6.30 PM – 9.00 PM – options for High Availability in the ecosystem that one would consider today. This is cut down from my Percona Live tutorial, which had about 88 registered attendees; about that many showed up and asked questions even through the break and after. I had a lot of fun, and I expect I will have similar fun in London talking about this area, which has changed a lot in recent times.
  2. Data for the Enterprise – MariaDB Roadshow in London, Wednesday, May 11, 2016 from 9:30 AM to 2:00 PM – this should be a fun gathering, and you would think it’s just for MariaDB Corporation customers, but it isn’t – all are welcome and you should register!

Looking forward to seeing you at these events. And if you want to chat about MariaDB Server, MySQL or anything related in the opensource world, don’t hesitate to drop me a line via email or Twitter @bytebot.

by Colin Charles at May 06, 2016 05:11 AM

May 05, 2016

MariaDB AB

OpenCPS: Vietnam's Public Sector goes Open Source

Colin Charles

I'm now in Hanoi, Vietnam, for the launch of OpenCPS. What, might you ask, is OpenCPS? OpenCPS translates to Open Core Public Services, as Vietnam is providing online public services and OpenCPS should sit at its core. Naturally, all of this will be open source, and AGPL licensed. OpenCPS is the first open source project to realize the development of e-government services in Vietnam.

Why does this matter to us? Because at the core of its infrastructure is of course, MariaDB Server as the database of choice, with Red Hat being the Linux provider of choice.

I met the interim project lead, Truong Anh Tuan, quite some time ago, but the tipping point was a keynote presentation on MariaDB Server at FOSSASIA 2015, in Singapore (so thanks again to the organisers for ensuring I keynoted about MariaDB Server there). So when they approached me at FOSSASIA 2016, I was extremely happy to learn that MariaDB Server was a key part of their infrastructure.

MariaDB Server will of course provide the crucial data storage layer, as one would think that data in databases should last beyond a lifetime. We are in good company on the infrastructure layer, with Red Hat Enterprise Linux/CentOS providing the base OS, and the fact that they are also going to be using Docker containers and OpenLDAP. The front-end is the NukeViet CMS.

Some of the features of MariaDB Server that they will be using and benefitting from include encryption at rest, the PAM plugin and GIS, as well as, further down the line, window functions from an analytics standpoint.

The project is on Github, you can find them on Freenode IRC at #opencps, and they also have a mailing list.

I look forward to closely collaborating with OpenCPS going forward. It's always a good day when the government of a nation chooses open source software.

If you're looking for some pictures, there are some tweets with them -- when something official comes, we'll update the post.

About the Author

Colin Charles's picture

Colin Charles has been the Chief Evangelist for MariaDB since 2009, with work ranging from speaking engagements to consultancy and engineering work around MariaDB. He lives in Kuala Lumpur, Malaysia, had worked at MySQL since 2005, and has been a MySQL user since 2000. Before joining MySQL, he worked actively on the Fedora and OpenOffice.org projects. He is well known on the conference track, having spoken at many events over the course of his career.

by Colin Charles at May 05, 2016 03:33 PM

Peter Zaitsev

Percona Server 5.7: multi-threaded LRU flushing

multi-threaded LRU flushing

In this blog post, we’ll discuss how to use multi-threaded LRU flushing to prevent bottlenecks in MySQL.

In the previous post, we saw that InnoDB 5.7 performs a lot of single-page LRU flushes, which in turn are serialized by the shared doublewrite buffer. Based on our 5.6 experience we have decided to attack the single-page flush issue first.

Let’s start with describing a single-page flush. If the working set of a database instance is bigger than the available buffer pool, existing data pages will have to be evicted or flushed (and then evicted) to make room for queries reading in new pages. InnoDB tries to anticipate this by maintaining a list of free pages per buffer pool instance; these are the pages that can be immediately used for placing the newly-read data pages. The target length of the free page list is governed by the innodb_lru_scan_depth parameter, and the cleaner threads are tasked with refilling this list by performing LRU batch flushing. If for some reason the free page demand exceeds the cleaner thread flushing capability, the server might find itself with an empty free list. In an attempt to not stall the query thread asking for a free page, it will then execute a single-page LRU flush (buf_LRU_get_free_block calling buf_flush_single_page_from_LRU in the source code), which is performed in the context of the query thread itself.

The problem with this flushing mode is that it will iterate over the LRU list of a buffer pool instance, while holding the buffer pool mutex in InnoDB (or the finer-grained LRU list mutex in XtraDB). Thus, a server whose cleaner threads are not able to keep up with the LRU flushing demand will have further increased mutex pressure – which can further contribute to the cleaner thread troubles. Finally, once the single-page flusher finds a page to flush it might have trouble in getting a free doublewrite buffer slot (as shown previously). That suggested to us that single-page LRU flushes are never a good idea.  The flame graph below demonstrates this:

multi-threaded LRU flushing

Note how a big part of the server run time is attributed to a flame rooted at JOIN::optimize, whose run time in turn is almost fully taken by buf_dblwr_write_single_page in two branches.

The easiest way to avoid a single-page flush is, well, simply not to do it! Wait until a cleaner thread finally provides some free pages for the query thread to use. This is what we did in XtraDB 5.6 with the innodb_empty_free_list_algorithm server option (which has a “backoff” default). This is also present in XtraDB 5.7, and resolves the issues of increased contention for the buffer pool (LRU list) mutex and doublewrite buffer single-page flush slots. This approach handles the empty free list situation better.
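
To check what your own build exposes here (a sketch; innodb_empty_free_list_algorithm only exists in Percona Server/XtraDB builds):

mysql -e "SHOW GLOBAL VARIABLES LIKE 'innodb_empty_free_list_algorithm';"   # expect "backoff" on XtraDB 5.6/5.7
mysql -e "SHOW GLOBAL VARIABLES LIKE 'innodb_lru_scan_depth';"              # free list length target per buffer pool instance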

Even with this strategy it’s still a bad situation to be in, as it causes query stalls when page cleaner threads aren’t able to keep up with the free page demand. To understand why this happens, let’s look into a simplified scheme of InnoDB 5.7 multi-threaded LRU flushing:

multi-threaded LRU flushing

The key takeaway from the picture is that LRU batch flushing does not necessarily happen when it’s needed the most. All buffer pool instances have their LRU lists flushed first (for free pages), and flush lists flushed second (for checkpoint age and buffer pool dirty page percentage targets). If the flush list flush is in progress, LRU flushing will have to wait until the next iteration. Further, all flushing is synchronized once per second-long iteration by the coordinator thread waiting for everything to complete. This one-second mark may well become a thirty-or-more-second mark if one of the workers is stalled (with the telltale sign “InnoDB: page_cleaner: 1000ms intended loop took 49812ms” in the server error log). So if we have a very hot buffer pool instance, everything else will have to wait for it. And it’s long been known that buffer pool instances are not used uniformly (some are hotter and some are colder).

A fix should:

  • Decouple the “LRU list flushing” from “flush list flushing” so that the two can happen in parallel if needed.
  • Recognize that different buffer pool instances require different amounts of flushing, and remove the synchronization between the instances.

We developed a design based on the above criteria, where each buffer pool instance has its own private LRU flusher thread. That thread monitors the free list length of its instance, flushes, and sleeps until the next free list length check. The sleep time is adjusted depending on the free list length: thus a hot instance LRU flusher may not sleep at all in order to keep up with the demand, while a cold instance flusher might only wake up once per second for a short check.

The LRU flushing scheme now looks as follows:

multi-threaded LRU flushing

This has been implemented in the Percona Server 5.7.10-3 RC release, and this design simplified the code as well. LRU flushing heuristics are simple, and any LRU flushing is now removed from the legacy cleaner coordinator/worker threads – enabling more efficient flush list flushing as well. LRU flusher threads are the only threads that can flush a given buffer pool instance, enabling further simplification: for example, InnoDB recovery writer threads simply disappear.

Are we done then? No. With the single-page flushes and single-page flush doublewrite bottleneck gone, we hit the doublewrite buffer again. We’ll cover that in the next post.

by Laurynas Biveinis at May 05, 2016 01:34 PM

Jean-Jerome Schmidt

Planets9s - Sign up for our ClusterControl New Features Webinar

Welcome to this week’s Planets9s, covering all the latest resources and technologies we create around automation and management of open source database infrastructures.

Sign up for our ClusterControl New Features Webinar

Join us for this new webinar on Tuesday, May 24th, where we’ll be discussing and demonstrating the latest version of ClusterControl, the one-stop console for your entire database infrastructure. We’ll be introducing some cool new features for MySQL and MongoDB users in particular, as well as walking you through the work we’ve recently done for improved security.

Sign up for the webinar

Download our new whitepaper: The MySQL Replication Blueprint

We’re excited to introduce the Severalnines Blueprint for MySQL Replication, a new whitepaper which discusses all aspects of a MySQL Replication topology with the ins and outs of deployment, setting up replication, monitoring, upgrades, performing backups and managing high availability using proxies such as ProxySQL, MaxScale and HAProxy.

Download the whitepaper

Become a MongoDB DBA: provisioning and deployment

If you are a MySQL DBA, you may ask yourself why you would install MongoDB. That is actually a very good question, as MongoDB and MySQL were locked in a flame war a couple of years ago. But there are many cases where you simply have to. And if you’re in that situation, this new blog series gives you an excellent starting point to get yourself prepared for MongoDB.

Read the blog

That’s it for this week! Feel free to share these resources with your colleagues and follow us in our social media channels.

Have a good end of the week,

Jean-Jérôme Schmidt
Planets9s Editor
Severalnines AB

by Severalnines at May 05, 2016 10:49 AM

May 04, 2016

Peter Zaitsev

MySQL High Availability: The Road Ahead for Percona and XtraDB Cluster

MySQL High Availability

This blog post discusses what is going on in the MySQL high availability market, and what Percona’s plans are for helping customers with high availability solutions.

One thing I like to tell people is that you shouldn’t view Percona as a “software” company, but as a “solution” company. Our goal has always been to provide the best solution that meets each customer’s situation, rather than push our own software, regardless of whether it is the best fit or not. As a result, we have customers running all kinds of MySQL “flavors”: MySQL, MariaDB, Percona Server, Amazon RDS and Google Cloud SQL. We’re happy to help customers be successful with the technology of their choice, and advise them on alternatives when we see a better fit.

One area where I have been increasingly uneasy is our advanced high availability support with Percona XtraDB Cluster and other Galera-based technologies. In 2011, when we started working on Percona XtraDB Cluster together with Codership, we needed to find a way to arrange investment into the development of Galera technology to bring it to market. So we made a deal, which, while providing needed development resources, also required us to price Percona XtraDB Cluster support as a very expensive add-on option. While this made sense at the time, it also meant few companies could afford XtraDB Cluster support from Percona, especially at large scale.

As a few years passed, the Galera technology became the mainstream high-end high availability option. In addition to being available in Percona XtraDB Cluster, it has been included in MariaDB, as well as Galera Cluster for MySQL. Additionally, the alternative technology to solve the same problem – MySQL Group Replication – started to be developed by the MySQL Team at Oracle. With all these changes, it was impossible for us to provide affordable support for Percona XtraDB Cluster due to our previous commercial agreement with Codership, which reflected a very different market situation than the one we now find ourselves facing.

As a result, over a year ago we exited our support partnership agreement with Codership and moved the support and development function in-house. These changes have proven to be positive for our customers, allowing us to better focus on their priorities and provide better response time for issues, as these no longer require partner escalation.

Today we’re taking the next natural step – we will no longer require customers to purchase Percona XtraDB Cluster as a separate add-on. Percona will include support for XtraDB Cluster and other Galera-based replication technologies in our Enterprise and Premier support levels, as well as our Percona Care and Managed Services subscriptions. Furthermore, we are going to support Oracle’s MySQL Group Replication technology at no additional cost too, once it becomes generally available, so our customers have access to the best high availability technology for their deployment.

As part of this change, you will also see us focusing on hardening XtraDB Cluster and Galera technology, making it better suited for demanding business workloads, as well as more secure and easier to use. All of our changes will be available as 100% open source solutions and will also be contributed back to the Galera development team to incorporate into their code base if they wish.

I believe making the Galera code better is the most appropriate action for us at this point!

by Peter Zaitsev at May 04, 2016 02:07 PM

MariaDB AB

MariaDB MaxScale 1.4.3 GA is available for download

Johan Wikman

We are pleased to announce that MariaDB MaxScale 1.4.3 GA is now available for download!

If MariaDB MaxScale is new to you, we recommend reading this page first.

1.4.3 is a bugfix release, not bringing any new features but fixing certain issues found in 1.4.2.

MariaDB MaxScale 1.4 brings:

  1. The Firewall Filter has been extended and can now be used for either black-listing or white-listing queries. In addition it is capable of logging both queries that match and queries that do not match.
  2. Client-side SSL has been available in MariaDB MaxScale for a while, but it has been somewhat unreliable. We now believe that client side SSL is fully functional and usable.

Additional improvements:

  • POSIX Extended Regular Expression Syntax can now be used in conjunction with qlafilter, topfilter and namedserverfilter.
  • Improved user grant detection.
  • Improved password encryption.

The release notes can be found here and the binaries can be downloaded here.

In case you want to build the binaries yourself, the source can be found at GitHub, tagged with 1.4.3.

We hope you will download and use this release, and we encourage you to create a bug report in Jira for any bugs you might encounter.


About the Author

Johan Wikman's picture

Johan Wikman is a senior developer working on MaxScale at MariaDB Corporation. 

by Johan Wikman at May 04, 2016 07:17 AM

May 03, 2016

MariaDB Foundation

MariaDB Galera Cluster 10.0.25 now available

The MariaDB project is pleased to announce the immediate availability of MariaDB Galera Cluster 10.0.25. See the release notes and changelog for details on this release. Download MariaDB Galera Cluster 10.0.25 Release Notes Changelog What is MariaDB Galera Cluster? MariaDB APT and YUM Repository Configuration Generator Thanks, and enjoy MariaDB!

The post MariaDB Galera Cluster 10.0.25 now available appeared first on MariaDB.org.

by Daniel Bartholomew at May 03, 2016 09:02 PM

Peter Zaitsev

MySQL 5.7: initial flushing analysis and why Performance Schema data is incomplete

In this post, we’ll examine why in an initial flushing analysis we find that Performance Schema data is incomplete.

Having shown the performance impact of Percona Server 5.7 patches, we can now discuss their technical reasoning and details. Let’s revisit the MySQL 5.7.11 performance schema synch wait graph from the previous post, for the case of unlimited InnoDB concurrency:

[Graph: MySQL 5.7.11 Performance Schema synch wait profile, unlimited InnoDB concurrency]

First of all, this graph is a little “nicer” than reality, which limits its diagnostic value. There are two reasons for this. The first one is that page cleaner worker threads are invisible to Performance Schema (see bug 79894). This alone limits PFS value in 5.7 if, for example, one tries to select only the events in the page cleaner threads, or monitors low concurrency where the cleaner thread count is a non-negligible part of the total thread count.
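
For a rough illustration (this query is mine, not the original author's), on a 5.7 build affected by that bug the page cleaner worker threads simply do not show up when you go looking for them:

-- Hypothetical check, not from the original post: on an affected 5.7 build the page
-- cleaner worker threads are missing from this result (at most the coordinator appears),
-- so their waits can never show up in the Performance Schema event tables.
SELECT thread_id, name, type
FROM performance_schema.threads
WHERE name LIKE '%page_cleaner%';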

To understand the second reason, let’s look into PMP for the same setting. Note that selected intermediate stack frames were removed for clarity, especially in the InnoDB mutex implementation.

660 pthread_cond_wait,enter(ib0mutex.h:850),buf_dblwr_write_single_page(ib0mutex.h:850),buf_flush_write_block_low(buf0flu.cc:1096),buf_flush_page(buf0flu.cc:1096),buf_flush_single_page_from_LRU(buf0flu.cc:2217),buf_LRU_get_free_block(buf0lru.cc:1401),...
631 pthread_cond_wait,buf_dblwr_write_single_page(buf0dblwr.cc:1213),buf_flush_write_block_low(buf0flu.cc:1096),buf_flush_page(buf0flu.cc:1096),buf_flush_single_page_from_LRU(buf0flu.cc:2217),buf_LRU_get_free_block(buf0lru.cc:1401),...
337 pthread_cond_wait,PolicyMutex<TTASEventMutex<GenericPolicy>(ut0mutex.ic:89),get_next_redo_rseg(trx0trx.cc:1185),trx_assign_rseg_low(trx0trx.cc:1278),trx_set_rw_mode(trx0trx.cc:1278),lock_table(lock0lock.cc:4076),...
324 libaio::??(libaio.so.1),LinuxAIOHandler::collect(os0file.cc:2448),LinuxAIOHandler::poll(os0file.cc:2594),...
241 pthread_cond_wait,PolicyMutex<TTASEventMutex<GenericPolicy>(ut0mutex.ic:89),trx_write_serialisation_history(trx0trx.cc:1578),trx_commit_low(trx0trx.cc:2135),...
147 pthread_cond_wait,enter(ib0mutex.h:850),trx_undo_assign_undo(ib0mutex.h:850),trx_undo_report_row_operation(trx0rec.cc:1918),...
112 pthread_cond_wait,mtr_t::s_lock(sync0rw.ic:433),btr_cur_search_to_nth_level(btr0cur.cc:1008),...
83 poll(libc.so.6),Protocol_classic::get_command(protocol_classic.cc:965),do_command(sql_parse.cc:935),handle_connection(connection_handler_per_thread.cc:301),...
64 pthread_cond_wait,Per_thread_connection_handler::block_until_new_connection(thr_cond.h:136),...

The top wait in both PMP and the graph is the 660 samples of enter mutex in buf_dblwr_write_single_page, which is the doublewrite mutex. Now try to find the nearly as hot 631 samples of event wait in buf_dblwr_write_single_page in the PFS output. You won’t find it because InnoDB OS event waits are not annotated in Performance Schema. In most cases this is correct, as OS event waits tend to be used when there is no work to do. The thread waits for work to appear, or for time to pass. But in the report above, the waiting thread is blocked from proceeding with useful work (see bug 80979).

Now that we’ve shown the two reasons why PFS data is not telling the whole server story, let’s take PMP data instead and consider how to proceed. Those top two PMP waits suggest 1) the server is performing a lot of single page flushes, and 2) those single page flushes have their concurrency limited by the eight doublewrite single-page flush slots available, and that the wait for a free slot to appear is significant.

Two options become apparent at this point: either make the single-page flush doublewrite more parallel or reduce the single-page flushing in the first place. We’re big fans of the latter option since version 5.6 performance work, where we configured Percona Server to not perform single-page flushes at all by introducing the innodb_empty_free_list_algorithm option, with the “backoff” default.

The next post in the series will describe how we removed single-page flushing in 5.7.

by Laurynas Biveinis at May 03, 2016 07:22 PM

Best Practices for Configuring Optimal MySQL Memory Usage

In this blog post, we’ll discuss some of the best practices for configuring optimal MySQL memory usage.

Correctly configuring the use of available memory resources is one of the most important things you have to get right with MySQL for optimal performance and stability. As of MySQL 5.7, the default configuration uses a very limited amount of memory – leaving defaults is one of the worst things you can do. But configuring it incorrectly can result in even worse performance (or even crashes).

The first rule of configuring MySQL memory usage is you never want your MySQL to cause the operating system to swap. Even minor swapping activity can dramatically reduce MySQL performance. Note the keyword “activity” here. It is fine to have some used space in your swap file, as there are probably parts of the operating system that are unused when MySQL is running, and it’s a good idea to swap them out. What you don’t want is constant swapping going on during the operation, which is easily seen in the “si” and “so” columns in the vmstat output.

[vmstat output – Example: No Significant Swapping]

[vmstat output – Example: Heavy Swapping Going On]

If you’re running Percona Monitoring and Management, you can also look into the Swap Activity graph in System Overview Dashboard.

[PMM System Overview Dashboard – Swap Activity graph]

If you have spikes of more than 1MB/sec, or constant swap activity, you might need to revisit your memory configuration.

MySQL memory allocation is complicated. There are global buffers, per-connection buffers (which depend on the workload), and some uncontrolled memory allocations (for example, inside stored procedures), all contributing to difficulties in computing how much memory MySQL will really use for your workload. It is better to check it by looking at the virtual memory size (VSZ) that MySQL uses. You can get it from “top”, or by running ps aux | grep mysqld:

mysql     3939 30.3 53.4 11635184 8748364 ?    Sl   Apr08 9106:41 /usr/sbin/mysqld

The 5th column here shows VSZ usage (about 11GB).

Note that the VSZ is likely to change over time. It is often a good idea to plot it in your monitoring system and set an alert to ping you when it hits a specified threshold. Don’t allow the mysqld process VSZ to exceed 90% of the system memory (and less if you’re running more than just MySQL on the system).

It’s a good idea to start on the safe side by conservatively setting your global and per-connection buffers, and then increase them as you go. Many can be set online, including innodb_buffer_pool_size in MySQL 5.7.

So how do you decide how much memory to allocate to MySQL versus everything else? In most cases you shouldn’t commit more than 90% of your physical memory to MySQL, as you need to have some reserved for the operating system and things like caching binary log files, temporary sort files, etc.

There are cases when MySQL should use significantly less than 90% of memory:

  • If there are other important processes running on the same server, either all the time or periodically. If you have heavy batch jobs run from cron, which require a lot of memory, you’ll need to account for that.
  • If you want to use OS caching for some storage engines. With InnoDB, we recommend innodb_flush_method=O_DIRECT in most cases, which won’t use the operating system file cache. However, there have been cases when using buffered IO with InnoDB made sense. If you’re still running MyISAM, you will need the OS cache for the “data” part of your tables. With TokuDB, using the OS cache is also a good idea for some workloads.
  • If your workload places significant demands on the operating system cache – MyISAM on-disk temporary tables, sort files, and some other temporary files which MySQL creates need to be well-cached for optimal performance.

Once you know how much memory you want the MySQL process to have as a whole, you’ll need to think about what that memory should be used for inside MySQL. The first part of memory usage in MySQL is workload related – if you have many connections active at the same time that run heavy selects using a lot of memory for sorting or temporary tables, you might need a lot of memory (especially if Performance Schema is enabled). In other cases this amount of memory is minimal. You’ll generally need somewhere between 1 and 10GB for this purpose.

Another thing you need to account for is memory fragmentation. Depending on the memory allocation library you’re using (glibc, TCMalloc, jemalloc, etc.), the operating system settings such as Transparent Huge Pages (THP), and the workload, memory usage may grow over time (until it reaches some steady state). Memory fragmentation can also account for 10% or more of additional memory usage.

Finally, let’s think about various global buffers and caching. In typical cases, you mainly only have innodb_buffer_pool_size to worry about. But you might also need to consider key_buffer_size, tokudb_cache_size and query_cache_size, as well as table_cache and table_open_cache. These are also responsible for global memory allocation, even though they are not counted in bytes. Performance Schema may also take a lot of memory, especially if you have a large number of connections or tables in the system.

When you specify the size of the buffers and caches, you should determine what you’re specifying. For innodb_buffer_pool_size, remember there is another 5-10% of memory that is allocated for additional data structures – and this number is larger if you’re using compression or set innodb_page_size smaller than 16K. For tokudb_cache_size, it’s important to remember that the setting specified is a guide, not a “hard” limit: the cache size can actually grow slightly larger than the specified limit.

For systems with large amounts of memory, the database cache is going to be by far the largest memory consumer, and you’re going to allocate most of your memory to it. When you add extra memory to the system, it is typically to increase the database cache size.

Let’s do some math for a specific example. Assume you have a system (physical or virtual) with 16GB of memory. We are only running MySQL on this system, with the InnoDB storage engine and innodb_flush_method=O_DIRECT, so we can allocate 90% (or 14.4GB) of memory to MySQL. For our workload, we assume connection handling and other MySQL connection-based overhead will take up 1GB (leaving 13.4GB). 0.4GB is likely to be consumed by various other global buffers (innodb_log_buffer_size, table caches, other miscellaneous needs, etc.), which now leaves 13GB. Considering the 5-7% overhead that the InnoDB Buffer Pool has, a sensible setting is innodb_buffer_pool_size=12G – which is what we very commonly see working well for systems with 16GB of memory.
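
If you are on MySQL 5.7, this value can also be applied and verified online – a minimal sketch, with the 12GB figure taken from the example above rather than being a universal recommendation:

-- Minimal sketch (MySQL 5.7+): apply and check the buffer pool size from the 16GB example above.
-- The 12GB figure is illustrative, not a one-size-fits-all value.
SET GLOBAL innodb_buffer_pool_size = 12 * 1024 * 1024 * 1024;
SELECT @@innodb_buffer_pool_size / 1024 / 1024 / 1024 AS buffer_pool_gb;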

Now that we have configured MySQL memory usage, we also should look at the OS configuration. The first question to ask is: if we don’t want MySQL to swap, should we even have the swap file enabled? In most cases, the answer is yes – you want to have the swap file enabled (strive for 4GB minimum, and no less than 25% of the memory installed) for two reasons:

  • The operating system is quite likely to have some portions that are unused when it is running as a database server. It is better to let it swap those out instead of forcing it to keep them in memory.
  • If you’ve made a mistake in the MySQL configuration, or you have some rogue process taking much more memory than expected, it is usually much better to lose some performance due to swapping than to have MySQL killed with an out of memory (OOM) error – potentially causing downtime.

As we only want the swap file used in emergencies, such as when there is no memory available or to swap out idle processes, we want to reduce the operating system’s tendency to swap (echo 1 > /proc/sys/vm/swappiness). Without this configuration setting you might find the OS swapping out portions of MySQL just because it feels it needs to increase the amount of available file cache (which is almost always a wrong choice for MySQL).

The next thing when it comes to OS configuration is setting the Out Of Memory (OOM) killer. You may have seen a message like this in your kernel log file:

Apr 24 02:43:18 db01 kernel: Out of memory: Kill process 22211 (mysqld) score 986 or sacrifice child

When MySQL itself is at fault, it’s a pretty rational thing to do. However, it’s also possible the real problem was some of the batch activities you’re running: scripts, backups, etc. In this case, you probably want those processes to be terminated if the system does not have enough memory, rather than MySQL.

To make MySQL a less likely candidate to be killed by the OOM killer, you can adjust the behavior to make MySQL less preferable with the following:

echo '-800' > /proc/$(pidof mysqld)/oom_score_adj

This will make the Linux kernel prefer killing other heavy memory consumers first.

Finally, on a system with more than one CPU socket, you should care about NUMA when it comes to MySQL memory allocation. In newer MySQL versions, you want to enable innodb_numa_interleave=1. In older versions you can either manually run numactl --interleave=all before you start the MySQL server, or use the numa_interleave configuration option in Percona Server.

 

by Peter Zaitsev at May 03, 2016 02:26 PM

MariaDB AB

MariaDB Security at the Open Source Conference in Rome

M. Luisa Raviol

At the 2016 Open Source Conference in Rome, held a few weeks ago, MariaDB was present along with many of the most relevant players in the Open Source community. This included the Italian Public Administration. They are required by law to give preference to free software and open source software. The goal of the Italian Open Source Observatory is to promote the use of open source software in Italian public administrations. They monitor and collect information on cases of open source adoption and best practices related to open technology usage within institutions.

Our focus at the conference this year was in particular on the important new MariaDB security features. They're part of our open source offering starting with the release of MariaDB 10.1. It's extremely important that these features are accessible to all MariaDB users, but they are especially relevant for the Italian Public Administration, since it collects, manages and shares sensitive Italian citizen data. This data needs to be protected and kept secure from any kind of unauthorized access.

With MariaDB 10.1 Security Capabilities, the Italian Public Administration can access these database security features (e.g., firewalls, encryption and database activity logging) while still relying on an open source solution in compliance with Italian law.

Every organization is concerned about threats to its data and the consequences that such threats represent. Every citizen wants to be sure that their privacy is protected and that private data (e.g., health, insurance, family and financial data) are safely managed, transferred and accessed. But it's also important that all of the services provided by the Public Administration through their web portals are always accessible, and that those web sites and their databases are protected against attacks that aim to deny those services.

Protecting data through encryption is essential, but it's also important to make every effort possible to prevent data from being accessed. MariaDB provides an excellent set of features to support all of the new requirements in terms of database authentication, like Single Sign On which is preferred in most enterprises.

Private and public companies—even though they may have different objectives—also need to secure their services whenever they are provided through the Internet. This can be handled by MariaDB through the firewall features provided with MariaDB MaxScale. It can prevent attacks which are derived from unauthorized access, as well as denial of service attacks and SQL Injections.

See also our Blog on MariaDB Database Security.

About the Author


Maria-Luisa Raviol is a Senior Sales Engineer with over 20 years of industry experience.

by M. Luisa Raviol at May 03, 2016 12:35 PM

Shlomi Noach

Solving the Facebook-OSC non-atomic table swap problem

We present a way to use an atomic, blocking table swap in the Facebook Online-Schema-Change solution, as well as in a rumored, other Online-Schema-rumored-Change solution. Update: also a caveat.

Quick recap (really quick)

pt-online-schema-change and facebook-osc are two popular online-schema-change solutions for MySQL. They both use triggers, but in different ways. While the Percona tool uses synchronous table updates, such that any INSERT|UPDATE|DELETE on the modified table causes an INSERT|UPDATE|DELETE on a ghost table, in the Facebook tool all of them cause an INSERT on a changelog table, which is then iterated and read, with its entries applied onto the ghost table.

The TL;DR is that DMLs on the table propagate synchronously, within the same transaction, in the Percona tool, and asynchronously, with lag, in the Facebook tool.
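
For illustration only – the table, trigger and column names here are hypothetical, and this is not the actual DDL either tool generates – the changelog approach boils down to something like:

-- Hypothetical sketch of the changelog mechanism described above (not facebook-osc's real DDL).
CREATE TABLE tbl_changelog (
  id BIGINT AUTO_INCREMENT PRIMARY KEY,
  change_type ENUM('INSERT', 'UPDATE', 'DELETE') NOT NULL,
  pk_value BIGINT NOT NULL
) ENGINE=InnoDB;

CREATE TRIGGER tbl_after_insert AFTER INSERT ON tbl FOR EACH ROW
  INSERT INTO tbl_changelog (change_type, pk_value) VALUES ('INSERT', NEW.id);

-- Similar AFTER UPDATE / AFTER DELETE triggers record the other DMLs; a separate
-- process later reads the changelog and replays the entries onto the ghost table.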

What's the problem with the table swap?

In the Percona tool, once the logic is satisfied that the copy is complete, we issue this query:

RENAME TABLE tbl TO tbl_old, tbl_new TO tbl;

This is an atomic, two table RENAME operation.

However with the asynchronous nature of the Facebook tool, such a RENAME would be a mistake. We must first block writes to the modified table, then make sure we have iterated the changelog table to the point of lock, apply those changes onto the ghost table, and only then do the swap.

The problem is: you cannot RENAME TABLES while one of them is LOCKed.

This is silly, and inconsistent, because:

> LOCK TABLES tbl WRITE;
Query OK, 0 rows affected (0.00 sec)

> RENAME TABLE tbl TO tbl_old, tbl_new TO tbl;
ERROR 1192 (HY000): Can't execute the given command because you have active locked tables or an active transaction

> ALTER TABLE tbl RENAME TO tbl_old;
Query OK, 0 rows affected (0.00 sec)

Why would the RENAME fail where the ALTER works?

Small thing, but critical to the operation of the online-schema-change. From the Facebook OSC documentation:

Since alter table causes an implicit commit in innodb, innodb locks get released after the first alter table. So any transaction that sneaks in after the first alter table and before the second alter table gets a 'table not found' error. The second alter table is expected to be very fast though because copytable is not visible to other transactions and so there is no need to wait.
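
In other words – a sketch of the quoted behavior, not code taken from the Facebook tool itself – the swap is two separate statements:

ALTER TABLE tbl RENAME TO tbl_old;
-- between these two statements `tbl` does not exist;
-- any query arriving here gets a "table not found" error
ALTER TABLE tbl_new RENAME TO tbl;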

What the FB solution means

It means for a very brief duration, the table is not there. Your app will get errors.

Of course, we should be able to handle errors anytime, anywhere. But the honest truth is: we (as in the world) do not. Many apps will fail ungracefully should they get a table not found error.

An atomic swap, by comparison, would make for briefly blocking operations, keeping the app ignorant of the swap.

Rumor

Rumor has it that we at GitHub are developing a new, triggerless, Online Schema Change tool. It is rumored to be based off binary logs and is rumored to have lots of interesting rumored implications.

Such rumored implementation would have to be asynchronous by nature, or so rumors say. And as such, it would fall for the same non-atomic table swap problem.

Solution

Once we heard it was rumored we were working on a triggerless online schema change tool, we realized we would have to solve the non-atomic swap problem. What we did was to gossip about it between ourselves, which led to three different rumors of a solution, eventually manifested as three different working solutions. All three solutions make for blocking queries on the app's side. I will present one of these solutions here, based on voluntary locks.

The idea is to make a table locked without actually issuing a LOCK TABLE statement, such that we would be able to run a RENAME TABLE operation, that would wait until we say it's good to complete.

Let's assume:

  • Our table is tbl
  • Ghost table (table onto which we've actually made the changes) is tbl_new
  • Our app continuously writes to tbl
  • We have 3 connections on our tables, aptly named #1, #2, #3

We issue the following, in this order:

  1. #1:
    SELECT GET_LOCK('ding', 0);

    Lock acquired, no problems

  2. #2:
    SELECT RELEASE_LOCK('ding') FROM tbl WHERE GET_LOCK('ding', 999999)>=0 LIMIT 1;

    Ignore the RELEASE_LOCK for now, this is merely cleanup. The query attempts to read one row from tbl where GET_LOCK('ding')>=0. But 'ding' is locked, hence the entire query blocks.
    Otherwise, other queries on tbl (both reads and writes) are running fine.

  3. #3:
    RENAME TABLE tbl TO tbl_old, tbl_new TO tbl;

    Now the magic begins. The RENAME operation cannot proceed while queries are executing on tbl. In particular, it waits on #2 to complete. But #2 is blocked on #1, so it does not complete. Our RENAME is also blocked!
    There are further two consequences that work to our advantage:

    • Any further incoming INSERT|UPDATE|DELETE on tbl is now unable to proceed; such queries will now wait for the RENAME to complete. So no further updates on tbl are being applied. The app is blocked.
    • tbl_new is not blocked! And this is because of how RENAME works internally. Since it couldn't satisfy the first clause, it doesn't even check the second, and does not place a LOCK on tbl_new.
  4. OSC:
    Now that no further writes are possible on tbl, we satisfy ourselves that we've iterated to the last of the changelog entries and have applied changes to tbl_new. Exactly how we satisfy ourselves is a matter of implementation. Rumor is that we use a rumor that the last entry was handled in our rumored solution. That last part is actually not a pun.
    We are now content that all changes have been applied to tbl_new.
  5. #1:
    SELECT RELEASE_LOCK('ding');

    Ding! Connection #2 gets released, reads some row from tbl (but no one is actually interested in the result set) and completes. The #3 RENAME is not blocking on anything anymore. It executes. The tables are swapped. Once they are swapped, any INSERT|UPDATE|DELETEs that were pending on tbl are released and App is unblocked.
    The atomic swap is complete.

Implementation

Agony. This workaround is agonizing. Is agonization a word? Rumored to be written in Go, our OSC has this implemented via goroutines, and the code is one of those atrocities you are ashamed to look at. Well, it's OK under the circumstances. But really, implementing this is painful, and actually more complicated than the above description. Why is that?

In the above we make for two blocking operations: #2 and #3. We must not proceed to #3 before #2 is applied, and we must not proceed to OSC completion before #3 is applied. But how does our code know that it's being blocked? If it's being blocked, it can't tell me anything, because it's blocked. If it's not blocked yet, it can tell me it's not blocked yet, but I'm really interested in knowing the time it gets blocked.

But preferably the exact time, or near exact, because once we start blocking, the app suffers. Connections accumulate. We really want to make the swap as quick as possible (and by rumor we have a rollback & retry mechanism for this operation if it exceeds X seconds).

Unfortunately the solution involves polling. That is, once we issue #2 (asynchronously, right? It's blocking), we aggressively poll SHOW PROCESSLIST and look for that blocked query. And the same for #3. Polling is a form of necessary ugliness in this flow.
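
A rough sketch of what that polling can look for – the thread states are the ones MySQL reports for GET_LOCK and metadata-lock waits, but the exact filters here are my own illustration, not the rumored tool's actual code:

-- Poll until connection #2 is waiting on the user lock and the RENAME (#3) is
-- waiting on the metadata lock. Illustrative only; not the actual implementation.
SELECT id, state, info
FROM information_schema.processlist
WHERE (state = 'User lock' AND info LIKE '%GET_LOCK(''ding''%')
   OR (state = 'Waiting for table metadata lock' AND info LIKE 'RENAME TABLE%');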

Other solutions

The other two solutions do not use a voluntary lock. Instead:

  1. Use a LOCK on some yet another table and a query involving that table JOINed with tbl
  2. A SELECT ... FOR UPDATE on yet another table followed by a SELECT on the locked row on that table JOINed with tbl.

We leave the implementation as an exercise for the reader.

Can't we just make the RENAME work under LOCK?

Yeah. That's what the Facebook people said. "Hey, we can just fix this".

Update: caveat

Should connection #1 or connection #2 die unexpectedly before we are satisfied the events have all been applied, the `RENAME` gets unblocked due to the collapse of locks, and we end up with a premature swap of the tables, potentially before we have applied the latest entries from the changelog table. This was noted by my colleague Gillian Gunson, and we keep looking into this.

by shlomi at May 03, 2016 06:42 AM

May 02, 2016

MariaDB AB

Modelling databases in a web browser

Colin Charles

When folk in the MariaDB world think about a GUI front-end, they tend to think of HeidiSQL, which has been shipping as part of MariaDB Server since 5.2.7. If one has a MariaDB Enterprise subscription, you get the Visual Query Editor which is SQLyog by Webyog. In fact the Knowledge Base lists a fairly lengthy list of Graphical and Enhanced Clients.

This past week I got a quick demo of Vertabelo. It is a SaaS-based database design tool that allows you to model within a web browser. They do have an on-site option too that costs a bit more. Now we don't have a direct offering for this, but I've always thought that this is the kind of complementary product today's developers will ask for: data modelling with a friendly UI, chatting with people about a query (so if you, the DBA, find that a query looks odd, you can immediately start a chat with the developer to get it fixed, and all of this gets logged), and live validation, one of the cool features (I saw this and enjoyed the hints it provides you). And naturally there is also the idea of "reverse engineering" between PostgreSQL, SQL Server, Oracle, DB2, and others. The feature list is wide and varied.

I'm hoping to see more support of MariaDB Server within their offering, after all they did talk about the history of MySQL as well as Problems With Naming a Product: SOLVED in the past.


About the Author


Colin Charles has been the Chief Evangelist for MariaDB since 2009, with work ranging from speaking engagements to consultancy and engineering work around MariaDB. He lives in Kuala Lumpur, Malaysia, worked at MySQL since 2005, and has been a MySQL user since 2000. Before joining MySQL, he worked actively on the Fedora and OpenOffice.org projects. He's well known on the conference track, having spoken at many conferences over the course of his career.

by Colin Charles at May 02, 2016 01:50 PM

MariaDB Foundation

MariaDB in Google Summer of Code 2016

And for the fourth year in a row, the MariaDB Foundation participates in the Google Summer of Code! The MariaDB Organization in GSoC is an umbrella organization for all projects that belong to the MariaDB ecosystem, be it MariaDB Server, MariaDB Connectors, or MariaDB MaxScale. The complete list of our suggested project ideas is in MariaDB […]

The post MariaDB in Google Summer of Code 2016 appeared first on MariaDB.org.

by Sergei at May 02, 2016 01:43 PM

MariaDB AB

ColumnStore Architecture & Use-case

Nishant Vyas

In the previous blog, I've talked about, Why is ColumnStore important?. Let's look at MariaDB ColumnStore architecture and use-cases.

ColumnStore Architecture

MariaDB ColumnStore is built on a three-tier scalable architecture that supports the kind of growth that MariaDB users have grown accustomed to. Queries are processed by user modules, which assign tasks to parallel performance modules that access the columnar distributed storage layer below. Performance modules scale almost infinitely, providing both performance and capacity growth as you add servers. These modules don’t process queries; they just take instructions from the user modules, which organize and deliver the results.

At a very high level, when a query arrives at the MariaDB database server, it gets the data from the storage engine and applies projection, filtering, sorting, aggregation etc. before returning results to the client. This approach is monolithic and serial, and thus slow when processing large amounts of data.

In MariaDB ColumnStore, by contrast, the query interface hands the query over to the UM. The UM runs on the MariaDB server as ExeMgr, which from the MariaDB Server perspective is the storage engine. ExeMgr is multi-threaded and pushes query operations down to the PM nodes. Because ExeMgr on the UM node is multi-threaded, it can handle results coming back from the PMs in any order, without blocking one PM while waiting for intermediate data from another PM.

Because the UM can push query operations down to distributed, multi-threaded PM nodes, the data is accessed in parallel, with filtering and projection applied before it is sent back to the query interface. Also, if more than one PM is involved in accessing the data, ExeMgr performs the final aggregation before sending the results back to the client. This greatly improves the performance of large and complex analytical queries. Those familiar with Hadoop can easily relate this MPP approach to Map-Reduce.

Much of our work integrating InfiniDB with MariaDB was adding the robustness that our customers have come to expect. This includes SSL support, audit and authentication plug-ins and role-based access. Our product roadmap includes additional feature integration, such as support for Apache Spark, a regression window function and full integration with MaxScale, our dynamic data routing platform that provides for minimum downtime, security, scalability and interoperability beyond MariaDB and MySQL.

ColumnStore doesn’t support unstructured data natively. This was a conscious decision. There has been a lot of active development in the Hadoop and Spark ecosystems that leverage analytics and data discovery on unstructured data. We would like to integrate Apache Spark’s in-memory compute ability with its machine learning libraries like MLLib to allow our users to take full advantage of it.

We believe ColumnStore is more advanced in traditional OLAP use cases than other NoSQL alternatives. For example, Cassandra is a popular open-source database for handling write-intensive operations that involve many data types. It’s a good decision-support database, but it isn’t fully SQL compliant, it lacks in-database analytics and it does not have the ability to perform joins. Also, Cassandra is billed as a “column index” database, but it doesn’t have a true columnar structure. It requires manual partitioning, which restricts performance benefits to one single partition column. Basically, it’s much slower than ColumnStore for analytics operations.

A good example of ColumnStore’s speed and scalability is advertising serving and analytics use-case, an application with huge data volumes and the need for split-second speed. A hosted advertising serving platform can deliver hundreds of billions of ad impressions per month, a task that involves inserting nearly 100 million rows per day on tables with many columns and an uncompressed database size of hundreds of terabytes with just 7-10 nodes.

With MariaDB ColumnStore’s data ingestion rate of up to one million rows per second, the ad impression data can be made available to end users in near-real-time. Using the familiar SQL interface, end users can perform ad hoc and analytics reporting, as well as query the stock of available inventory with three-second response times. That translates into better-targeted ads, higher conversion rates and more sales.

We are thrilled to bring MariaDB ColumnStore to you and can’t wait to see what our customers do with it. Please share your own experiences so we can continue to evolve according to the needs of our community.

ColumnStore will be open for beta testing by the end of May. You can Sign up for notification of Beta availability. You can also talk to our sales team or set up a call with our solution architects to discuss more about ColumnStore and how it can fit your use-case.

About the Author

Nishant joins MariaDB as Head of Product and Strategy from LinkedIn, where he was one of the early employees. During his almost nine-year tenure at LinkedIn, he contributed to building, scaling and operating production data stores using technologies like Oracle, MySQL, NoSQL and more. Nishant has extensive experience as a database engineer, database architect and DBA, and has held various leadership roles. He holds a bachelor's degree in engineering from Gujarat University and a master's degree in computer science from the University of Bridgeport. Based in the San Francisco bay area, Nishant is co-author of a patent in waterwheel sharding.

by Nishant Vyas at May 02, 2016 12:08 PM

April 30, 2016

Valeriy Kravchuk

Fun with Bugs #42 - Bugs Fixed in MySQL 5.7.12

MySQL 5.7.12 was released more than 2 weeks ago. New features introduced there in the form of "rapid plugins" are widely discussed, but I am more interested in bugs reported by MySQL Community users that are fixed there. Unfortunately I have not seen MySQL Community Release Notes by Morgan (like this) for quite some time, so I have to continue describing key bug fixes and naming the people who reported and verified bugs in my "Fun with Bugs" series.

As usual, let's start with InnoDB bugs fixed:
  • Bug #80070 - "allocated_size and file_size differ if create general tablespace outside datadir". It was reported by my former colleague from Percona Shahriyar Rzayev and verified by Bogdan Kecman. Nice to see more people from Oracle involved in processing community bug reports!
  • Bug #79185 - "Innodb freeze running REPLACE statements". This bug (that affected many users, also on versions 5.5.x and 5.6.x, and was a kind of a regression) was reported by Will Bryant and verified (probably) and fixed by Shaohua Wang. The fix is also included into versions 5.5.49 and 5.6.30.
  • Bug #73816 - ''MySQL instance stalling “doing SYNC index”". It was reported by Denis Jedig and a lot of additional evidence was provided by my former colleague Aurimas Mikalauskas. This bug was fixed (and probably verified) by Shaohua Wang.
  • Bug #79200 - "InnoDB: "data directory" option of create table fails with pwrite() OS error 22", is a widely noted regression (I've seen customer issue with a potentially related MariaDB problem this week). This bug was reported by Frank Ullrich and verified by Bogdan Kecman. It is also fixed in MySQL 5.6.30.
  • Bug #79725 - "Check algorithm=innodb on crc32 checksum mismatch before crc32(big endian)". This bug was created to track the patch contributed by Daniel Black at GitHub. It was verified by Umesh.
Next, let's review replication bugs fixed in 5.7.12:
  • Bug #79504 - "STOP SLAVE IO THREAD prints wrong LOST CONNECTION message in error log file". It was reported by Venkatesh Duggirala.
  • Bug #78722 - "Relay log info currently_executing_gtid is not properly initialized or protected". This bug was reported by Pedro Gomes. It contains a nice simple test case and fix suggested.
  • Bug #78445 is private. So, I can only quote the release notes:
    "RESET SLAVE ALL could delete a channel even when master_pos_wait and wait_until_sql_thread_after_gtid were still waiting for binlog to be applied. This could cause a MySQL server exit when the functions tried to access the channel that was deleted. Now, a channel reference counter was added that is increased if the channel should not be deleted when the functions are running. RESET SLAVE ALL will wait for no reference, and then it will delete the channel."
    I am not sure this crash is a "security" bug of any kind, but what do I know...
  • Bug #78352 - "Slow startup of 5.7.x slave with relay_log_recovery = ON and many relay logs". I reported it based on regression comparing to 5.6.x reported by a customer of Percona, and verified by Umesh. Nice to see it fixed, as it was really annoying for almost anyone who upgraded production replication setup to 5.7.
  • Bug #78133 - "Slave_worker::write_info() incorrect DBUG_ENTER (contribution)". This bug was created to track the patch contributed by Stewart Smith at GitHub. It was verified by Umesh.
  • Bug #77740 - "silent failure to start if mysql.gtids_executed gets HA_ERR_LOCK_WAIT_TIMEOUT ". It was reported and verified by Shane Bester.
  • Bug #77237 - "Multi-threaded slave log spamming on failure". This bug was reported by Davi Arnaut and verified by Umesh. Fix is also included in MySQL 5.6.30.
  • Bug #78963 - "super_read_only aborts STOP SLAVE if relay_log_info_repository=TABLE, dbg crash". It was reported by my former colleague in Percona Laurynas Biveinis and verified by Umesh. Check also related Bug #79328 - "super_read_only broken as a server option".
  • Bug #77684 - "DROP TABLE IF EXISTS may brake replication if slave has replication filters". This bug was reported by my former colleague in Percona Fernando Laudares Camargos for MySQL 5.6.x and verified by Umesh. MySQL 5.6.30 also got this fixed.
We all remember that Performance Schema is perfect and the next greatest thing after sliced bread, but sometimes bugs are noted even there. Check Bug #79934 - "i_perfschema.table_leak random result failure" reported and verified by Magnus Blåudd. Another example is Bug #79784 - "update setup_instruments do not affect the global mutex/rwlock/cond" reported by Zhang Yingqiang and verified by Umesh. The latter, IMHO, is related to, or a superset of, my good old report, Bug #68097 - "Manual does not explain that some P_S instruments must be enabled at startup" that remains open as a feature request (after some changes in the manual) for more than 3 years already. I truly hope 5.7.12 fixed this for a general case - it's truly important to be able to enable instruments dynamically if we expect Performance Schema to be used as a main tool for troubleshooting.
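
For context (a generic Performance Schema statement of the kind these reports are about, not a quote from any of the bugs), enabling instruments dynamically looks like this:

-- Enable synchronization instruments at runtime; bugs like #79784 are about cases
-- where doing this has no effect on already-created global mutexes/rwlocks/conds.
UPDATE performance_schema.setup_instruments
SET ENABLED = 'YES', TIMED = 'YES'
WHERE NAME LIKE 'wait/synch/%';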

I'd also want to highlight a couple of fixes related to optimizer:
  • Bug #77209 - "Update may use index merge without any reason (increasing chances for deadlock)". It was reported and verified by my former colleague from Oracle, Andrii Nikitin. MySQL 5.6.30 also includes the fix.
  • Bug #72858 - "EXPLAIN .. SELECT .. FOR UPDATE takes locks". This bug was reported by my former colleague in Percona (and, I hope, my colleague again soon) Justin Swanhart, who has a birthday today. Happy Birthday to you, Justin! The bug was verified by Umesh and is also fixed in MySQL 5.6.30. Justin had reported another bug fixed in 5.7.12, Bug #69375 - "LOAD DATA INFILE claims to be holding 'System Lock' in processlist".
Several more bugs reported by community were also fixed, but they were in the areas (or for platforms) I am not particularly interested in.

To summarize, MySQL 5.7.12 contains important bug fixes in replication and InnoDB and it makes sense to consider upgrade even if you do not care about any "rapid plugins", X protocol, encryption of data at rest, MySQL Keyring and other "cool" new shiny features.

by Valeriy Kravchuk (noreply@blogger.com) at April 30, 2016 04:17 PM

MariaDB Foundation

MariaDB 10.0.25 now available

The MariaDB project is pleased to announce the immediate availability of MariaDB 10.0.25. See the release notes and changelog for details on this release.

  • Download MariaDB 10.0.25
  • Release Notes
  • Changelog
  • What is MariaDB 10.0?
  • MariaDB APT and YUM Repository Configuration Generator

Thanks, and enjoy MariaDB!

The post MariaDB 10.0.25 now available appeared first on MariaDB.org.

by Daniel Bartholomew at April 30, 2016 04:10 PM

Valeriy Kravchuk

Building MaxScale 1.4.2 from GitHub on Fedora 23

MariaDB MaxScale is mentioned in many blog posts recently. It's Application of the Year 2016 after all! I'd like to test it, follow posts like this etc, all that on my favorite and readily available testing platforms that are now Ubuntu of all kinds and, surely, Fedora 23 (on my wife's workstation, the most powerful hardware at hand).

My old habits force me to build open source software I test from source, and I do not want to even discuss the topic of "MaxScale binaries availability" that was quite "popular" some time ago. So, after building MaxScale 1.4.1 on CentOS 6.7 back on March 31, 2016 (mostly just following MariaDB KB article on the topic) using libmysqld.a from MariaDB 10.0.23, this morning I decided to check new branch, 1.4.2, and build it on Fedora 23, following that same KB article (that unfortunately does not even mention Fedora after the fix to MXS-248). Thing is, Fedora is not officially supported as a platform for MaxScale 1.4.x, but why should we, those who can build things from source for testing purposes, care about this?

I started with cloning MaxScale:

git clone https://github.com/mariadb-corporation/MaxScale.git
cd MaxScale
and then:

[openxs@fc23 MaxScale]$ git branch -r
...
  origin/HEAD -> origin/develop
...
  origin/release-1.4.2
...
I remember spending enough time fighting with develop branch while building on CentOS 6.7, mostly with sqlite-related things it contained, so this time I proceed immediately to the branch I want to build:

[openxs@fc23 MaxScale]$ git checkout release-1.4.2
Branch release-1.4.2 set up to track remote branch release-1.4.2 from origin.
Switched to a new branch 'release-1.4.2'
[openxs@fc23 MaxScale]$ git branch
  develop
* release-1.4.2

[openxs@fc23 MaxScale]$ mkdir build
[openxs@fc23 MaxScale]$ cd build

Last two steps originate from the KB article. We are almost ready for building, but what about the prerequisites? I've collected all the packages required for CentOS in that article and tried to install them all:
[openxs@fc23 build]$ sudo yum install mariadb-devel mariadb-embedded-devel libedit-devel gcc gcc-c++ ncurses-devel bison flex glibc-devel cmake libgcc perl make libtool openssl-devel libaio libaio-devel librabbitmq-devel libcurl-devel pcre-devel rpm-build
[sudo] password for openxs:
Yum command has been deprecated, redirecting to '/usr/bin/dnf install mariadb-devel mariadb-embedded-devel libedit-devel gcc gcc-c++ ncurses-devel bison flex glibc-devel cmake libgcc perl make libtool openssl-devel libaio libaio-devel librabbitmq-devel libcurl-devel pcre-devel rpm-build'.
See 'man dnf' and 'man yum2dnf' for more information.
To transfer transaction metadata from yum to DNF, run:
'dnf install python-dnf-plugins-extras-migrate && dnf-2 migrate'

Last metadata expiration check: 0:26:04 ago on Wed Apr 27 10:43:24 2016.
Package gcc-5.3.1-6.fc23.x86_64 is already installed, skipping.
...
Package pcre-devel-8.38-7.fc23.x86_64 is already installed, skipping.
Dependencies resolved.
================================================================================
 Package                  Arch     Version                      Repository
                                                                           Size
================================================================================
Installing:
 autoconf                 noarch   2.69-21.fc23                 fedora    709 k
 automake                 noarch   1.15-4.fc23                  fedora    695 k
 dwz                      x86_64   0.12-1.fc23                  fedora    106 k
 flex                     x86_64   2.5.39-2.fc23                fedora    328 k
 ghc-srpm-macros          noarch   1.4.2-2.fc23                 fedora    8.2 k
 gnat-srpm-macros         noarch   2-1.fc23                     fedora    8.4 k
 go-srpm-macros           noarch   2-3.fc23                     fedora    8.0 k
 libcurl-devel            x86_64   7.43.0-6.fc23                updates   590 k
 libedit-devel            x86_64   3.1-13.20150325cvs.fc23      fedora     34 k
 librabbitmq              x86_64   0.8.0-1.fc23                 updates    43 k
 librabbitmq-devel        x86_64   0.8.0-1.fc23                 updates    52 k
 libtool                  x86_64   2.4.6-8.fc23                 updates   707 k
 mariadb-common           x86_64   1:10.0.23-1.fc23             updates    74 k
 mariadb-config           x86_64   1:10.0.23-1.fc23             updates    25 k
 mariadb-devel            x86_64   1:10.0.23-1.fc23             updates   869 k
 mariadb-embedded         x86_64   1:10.0.23-1.fc23             updates   4.0 M
 mariadb-embedded-devel   x86_64   1:10.0.23-1.fc23             updates   8.3 M
 mariadb-errmsg           x86_64   1:10.0.23-1.fc23             updates   199 k
 mariadb-libs             x86_64   1:10.0.23-1.fc23             updates   637 k
 ocaml-srpm-macros        noarch   2-3.fc23                     fedora    8.1 k
 patch                    x86_64   2.7.5-2.fc23                 fedora    123 k
 perl-Thread-Queue        noarch   3.07-1.fc23                  updates    22 k
 perl-generators          noarch   1.06-1.fc23                  updates    15 k
 perl-srpm-macros         noarch   1-17.fc23                    fedora    9.7 k
 python-srpm-macros       noarch   3-7.fc23                     updates   8.1 k
 redhat-rpm-config        noarch   36-1.fc23.1                  updates    59 k
 rpm-build                x86_64   4.13.0-0.rc1.13.fc23         updates   137 k

Transaction Summary
================================================================================
Install  27 Packages

Total download size: 18 M
Installed size: 64 M
Is this ok [y/N]: Y

...

Complete!
Now, let's try a simple approach:

[openxs@fc23 build]$ cmake ..
...
-- MySQL version: 10.0.23
-- MySQL provider: MariaDB
-- Looking for pcre_stack_guard in MYSQL_EMBEDDED_LIBRARIES_STATIC-NOTFOUND
CMake Error: The following variables are used in this project, but they are set to NOTFOUND.
Please set them or make sure they are set and tested correctly in the CMake files:
MYSQL_EMBEDDED_LIBRARIES_STATIC
    linked by target "cmTC_2494a" in directory /home/openxs/git/MaxScale/build/CMakeFiles/CMakeTmp

CMake Error: Internal CMake error, TryCompile configure of cmake failed
-- Looking for pcre_stack_guard in MYSQL_EMBEDDED_LIBRARIES_STATIC-NOTFOUND - not found
-- PCRE libs: /usr/lib64/libpcre.so
-- PCRE include directory: /usr/include
-- Embedded mysqld does not have pcre_stack_guard, linking with system pcre.
CMake Error at cmake/FindMySQL.cmake:115 (message):
  Library not found: libmysqld.  If your install of MySQL is in a non-default
  location, please provide the location with -DMYSQL_EMBEDDED_LIBRARIES=<path
  to library>
Call Stack (most recent call first):
  CMakeLists.txt:37 (find_package)


-- Configuring incomplete, errors occurred!
See also "/home/openxs/git/MaxScale/build/CMakeFiles/CMakeOutput.log".
See also "/home/openxs/git/MaxScale/build/CMakeFiles/CMakeError.log".

Failure: cmake cannot find libmysqld.a, it seems. Let me try to find it:

[openxs@fc23 build]$ sudo find / -name libmysqld.a 2>/dev/null
/home/openxs/git/percona-xtrabackup/libmysqld/libmysqld.a
/home/openxs/dbs/5.7/lib/libmysqld.a
/home/openxs/dbs/p5.6/lib/libmysqld.a
/home/openxs/dbs/fb56/lib/libmysqld.a
/home/openxs/10.1.12/lib/libmysqld.a
That's all, even though I installed all the packages that looked to be required based on the article! I have the library in many places (in my own builds and even in a sandbox with MariaDB 10.1.12), but it's not installed where expected. Some more desperate attempts (installing the MariaDB server with sudo yum install mariadb-server, searching for a package that provides libmysqld.a, etc.) and a chat with MariaDB engineers ended with the conclusion that my packages are from Fedora (not MariaDB) and they just do not include the static library. Looks like a bug in Fedora packaging, if you ask me.

I was not ready to add MariaDB's repository at the moment (to get MariaDB-devel etc., something the KB article also suggests for supported platforms), so I decided that it would be fair just to build the current MariaDB 10.1.13 from source and use everything needed from there. Last time I built the 10.2 branch, so I had to check out 10.1 first:
[openxs@fc23 server]$ git checkout 10.1
Switched to branch '10.1'
Your branch is behind 'origin/10.1' by 2 commits, and can be fast-forwarded.
  (use "git pull" to update your local branch)
[openxs@fc23 server]$ git pull
Updating 1cf852d..071ae30
Fast-forward
 client/mysqlbinlog.cc                    | 523 ++++++++++++++++++++++---------
 mysql-test/r/mysqlbinlog_raw_mode.result | 274 ++++++++++++++++
 mysql-test/t/mysqlbinlog_raw_mode.test   | 387 +++++++++++++++++++++++
 sql/sql_priv.h                           |   3 +-
 storage/innobase/dict/dict0boot.cc       |  20 +-
 storage/xtradb/dict/dict0boot.cc         |  20 +-
 6 files changed, 1062 insertions(+), 165 deletions(-)
 create mode 100644 mysql-test/r/mysqlbinlog_raw_mode.result
 create mode 100644 mysql-test/t/mysqlbinlog_raw_mode.test
Then I executed the following while in the server directory:

make clean
rm CMakeCache.txt
cmake . -DCMAKE_BUILD_TYPE=RelWithDebInfo -DWITH_SSL=system -DWITH_ZLIB=bundled -DMYSQL_MAINTAINER_MODE=0 -DENABLED_LOCAL_INFILE=1 -DWITH_JEMALLOC=system -DWITH_WSREP=ON -DWITH_INNODB_DISALLOW_WRITES=ON -DWITH_EMBEDDED_SERVER=ON -DCMAKE_INSTALL_PREFIX=/home/openxs/dbs/maria10.1

make

make install && make clean
Note that I've explicitly asked to build the embedded server. I checked that the library is in the location I need:

[openxs@fc23 server]$ sudo find / -name libmysqld.a 2>/dev/null
/home/openxs/git/percona-xtrabackup/libmysqld/libmysqld.a
/home/openxs/dbs/maria10.1/lib/libmysqld.a
/home/openxs/dbs/5.7/lib/libmysqld.a
/home/openxs/dbs/p5.6/lib/libmysqld.a
/home/openxs/dbs/fb56/lib/libmysqld.a
/home/openxs/10.1.12/lib/libmysqld.a
Then I moved back to the MaxScale/build directory and explicitly pointed out the locations of the headers, library and error messages that I want to use with MaxScale:

[openxs@fc23 build]$ cmake .. -DMYSQL_EMBEDDED_INCLUDE_DIR=/home/openxs/dbs/maria10.1/include/mysql -DMYSQL_EMBEDDED_LIBRARIES=/home/openxs/dbs/maria10.1/lib/libmysqld.a -DERRMSG=/home/openxs/dbs/maria10.1/share/english/errmsg.sys -DCMAKE_INSTALL_PREFIX=/home/openxs/maxscale -DWITH_MAXSCALE_CNF=N
...
-- Build files have been written to: /home/openxs/git/MaxScale/build

[openxs@fc23 build]$ make
...
[ 95%] [BISON][ruleparser] Building parser with bison 3.0.4
ruleparser.y:34.1-13: warning: deprecated directive, use ‘%name-prefix’ [-Wdeprecated]
 %name-prefix="dbfw_yy"
 ^^^^^^^^^^^^^
[ 96%] Building C object server/modules/filter/dbfwfilter/CMakeFiles/dbfwfilter.dir/ruleparser.c.o
[ 96%] Building C object server/modules/filter/dbfwfilter/CMakeFiles/dbfwfilter.dir/token.c.o
[ 97%] Linking C shared library libdbfwfilter.so
[ 97%] Built target dbfwfilter
Scanning dependencies of target maxadmin
[ 98%] Building C object client/CMakeFiles/maxadmin.dir/maxadmin.c.o
[ 98%] Linking C executable maxadmin
[100%] Built target maxadmin

It seems the build completed without problems this time. We can try to test it (some tests do fail):

[openxs@fc23 build]$ make testcore
...
 1/22 Test  #1: Internal-TestQueryClassifier .....***Exception: Other  0.35 sec
      Start  2: Internal-CanonicalQuery
 2/22 Test  #2: Internal-CanonicalQuery ..........***Failed    0.25 sec
      Start  3: Internal-CanonicalQuerySelect
 3/22 Test  #3: Internal-CanonicalQuerySelect ....***Failed    0.04 sec
      Start  4: Internal-CanonicalQueryAlter
 4/22 Test  #4: Internal-CanonicalQueryAlter .....***Failed    0.04 sec
      Start  5: Internal-CanonicalQueryComment
 5/22 Test  #5: Internal-CanonicalQueryComment ...***Failed    0.04 sec
      Start  6: Internal-TestAdminUsers
 6/22 Test  #6: Internal-TestAdminUsers ..........   Passed    0.44 sec
      Start  7: Internal-TestBuffer
 7/22 Test  #7: Internal-TestBuffer ..............   Passed    0.01 sec
      Start  8: Internal-TestDCB
 8/22 Test  #8: Internal-TestDCB .................   Passed    0.01 sec
      Start  9: Internal-TestFilter
 9/22 Test  #9: Internal-TestFilter ..............   Passed    0.03 sec
...
(As a side note, make install in my case had NOT installed anything to /home/openxs/maxscale, something to deal with later, as on CentOS 6.7 it worked...)


In any case, I now have binaries to work with, of version 1.4.2:

[openxs@fc23 build]$ ls bin/
maxadmin  maxbinlogcheck  maxkeys  maxpasswd  maxscale
[openxs@fc23 build]$ bin/maxscale --version


MariaDB Corporation MaxScale 1.4.2      Wed Apr 27 13:24:01 2016
------------------------------------------------------
MaxScale 1.4.2

[openxs@fc23 build]$ bin/maxadmin --version
bin/maxadmin Version 1.4.2
To be continued one day... Stay tuned!




by Valeriy Kravchuk (noreply@blogger.com) at April 30, 2016 01:08 PM

April 29, 2016

Jean-Jerome Schmidt

New Whitepaper: MySQL Replication Blueprint

MySQL Replication has become an essential component of scale-out architectures in LAMP environments. When there is a need to scale out, MySQL offers a multitude of solutions, the most common being to add read replicas. The major bottleneck is generally not so much writing data as reading it back. Therefore the easiest way to scale MySQL is to add replicas for reading.

And with today’s cloud environments, where resources are dynamically allocated and deallocated, systems need the ability to automatically adapt to sudden changes. For MySQL Replication, this includes tasks like detecting failures, promoting a slave to master, failing over slaves, and so on. A load balancer with Virtual IP can also help mask topology changes from the application, and dispatches read and write traffic appropriately.

With that in mind, we’d like to introduce the Severalnines Blueprint for MySQL Replication, a new whitepaper which discusses all aspects of a MySQL Replication topology: the ins and outs of deployment, setting up replication, monitoring, upgrades, performing backups and managing high availability using proxies such as ProxySQL, MaxScale and HAProxy.

The MySQL Replication Blueprint is about having a complete ops-ready solution from end to end including:

  • Installation and configuration of master/slave MySQL servers, load balancers, Virtual IP and failover rules
  • Management of the topology, including failure detection, failover, repair and subsequent reconfiguration of components
  • Managing topology changes when adding, removing or maintaining servers
  • Managing configuration changes
  • Backups
  • Monitoring of all components from one single point

This whitepaper discusses the following core topics in depth:

  • Why the need for a Replication Blueprint
  • Introducing the MySQL Replication Blueprint
  • Monitoring
  • Management
  • Load Balancing

Get your free copy of our new whitepaper

by Severalnines at April 29, 2016 12:14 PM

MariaDB AB

Why is ColumnStore important?

Nishant Vyas

Relational databases store data in rows because a typical SQL query looks for multiple fields within a record. For example, if you ask for name, zip code and email address of all your customers in New York, the result is presented in rows, with each row containing several fields from a single record. Row structures are also well optimized to handle a lot of inserts and updates.

Row-Orientated vs Column-Orientated

But analytic queries are better handled with a column structure because they are more likely to go deep on a column of data, most queries relate only to a tiny subset of all the available columns and they’re also mostly read-only. For example, retrieving daily sales data for all your stores in California for the past two years is a columnar operation because it cuts across many records to retrieve data from a specific field. A typical ad-hoc aggregation query doesn’t care about most fields, just the trends in one field.

MariaDB ColumnStore is not only optimized for columnar operations, but also simplifies management. There is no need for indexing; metadata is stored in memory. That eliminates a cumbersome tuning process. ColumnStore, when paired with MariaDB, supports just about any query you want to throw at it. You can even join a MariaDB ColumnStore table and an InnoDB or remote MySQL table, a feature for unified simplicity. But there is much more.
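
As a hypothetical sketch of that cross-engine capability – the table names, columns and exact engine clause here are illustrative, so check the ColumnStore documentation for the syntax your version accepts:

-- Illustrative only: a columnar fact table joined with an InnoDB dimension table.
CREATE TABLE fact_sales (
  store_id INT,
  sale_date DATE,
  amount DECIMAL(10,2)
) ENGINE=ColumnStore;

CREATE TABLE dim_store (
  store_id INT PRIMARY KEY,
  state CHAR(2)
) ENGINE=InnoDB;

SELECT d.state, SUM(f.amount) AS total_sales
FROM fact_sales f
JOIN dim_store d USING (store_id)
WHERE f.sale_date >= '2014-01-01'
GROUP BY d.state;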

In the last decade, we kept hearing that SQL is not needed for data processing or analytics, yet in the last few years every single OLTP and analytics solution is building a SQL layer. SQL is the most proven way of processing data. So, MariaDB ColumnStore is compatible with standard SQL using the MariaDB interface. Full SQL compliance means MariaDB ColumnStore works out-of-the-box with your existing business intelligence tools and SQL queries. In fact, it would work with most popular business intelligence tools, like Tableau and Business Objects, as well as anything that supports ODBC/JDBC. For data scientists, it works with R for advanced statistical analysis.

At the same time, we realize that SQL is not the best choice for machine learning and data discovery use cases. We want to integrate Apache Spark libraries like MLlib into ColumnStore to complete the picture.

Most importantly, MariaDB ColumnStore is based on an open-source GPLv2 fork of the InfiniDB community project. We believe community-driven software development is the new mandate of our time. We want to leverage our community strength in building MariaDB ColumnStore.

Read the next part on ColumnStore, ColumnStore Architecture & Use-cases.

About the Author

Nishant Vyas's picture
Nishant joins MariaDB as Head of Product and Strategy from LinkedIn, where he was one of the early employees. During his almost nine-year tenure at LinkedIn, he contributed to building, scaling and operating production data stores using technologies like Oracle, MySQL, NoSQL and more. Nishant has extensive experience as a database engineer, database architect and DBA, and has held various leadership roles. He holds a bachelor's degree in engineering from Gujarat University and a master's degree in computer science from the University of Bridgeport. Based in the San Francisco bay area, Nishant is co-author of a patent in waterwheel sharding.

by Nishant Vyas at April 29, 2016 08:22 AM

April 28, 2016

MariaDB AB

Getting Microsoft SQL server data into MariaDB with the CONNECT storage engine

Michaël DeGroot

MariaDB works with many clients to migrate Microsoft SQL and Oracle to MariaDB. With the CONNECT storage engine we can access any ODBC data source in MariaDB. Here's a small HOWTO for those who want to give it a quick try. In this example we use MSSQL, though the same principle should apply to Oracle ODBC servers.

We start with a clean MariaDB installation, no ODBC drivers installed yet. In this example we used CentOS7. It's important to start at a point where unixODBC is not yet installed, because the Microsoft installation package wants to install its own unixODBC.

Step 1

Microsoft is kind enough to supply us with an ODBC driver for Linux. We download it and unpack it.

Step 2

We now install unixODBC using the build_dm.sh script provided by the package. It needs bash to run:

bash build_dm.sh
....
The script is provided as a convenience to you as-is, without any express
or implied warranties of any kind.  Microsoft is not liable for any issues
arising out of your use of the script.

Enter 'YES' to have this script continue: YES

Verifying processor and operating system ................................... OK
Verifying wget is installed ................................................ OK
Verifying tar is installed ................................................. OK
Verifying make is installed ................................................ OK
Downloading unixODBC 2.3.0 DriverManager ................................... OK
Unpacking unixODBC 2.3.0 DriverManager ..................................... OK
Configuring unixODBC 2.3.0 DriverManager ................................... OK
Building unixODBC 2.3.0 DriverManager ...................................... OK
Build of the unixODBC 2.3.0 DriverManager complete.

Run the command 'cd /tmp/unixODBC.32235.28222.16428/unixODBC-2.3.0; make install' to install the driver manager.

Afterwards we install unixODBC with cd /tmp/unixODBC.32235.28222.16428/unixODBC-2.3.0; make install, then go back with cd $OLDPWD. As you can see, the script checks for some dependencies; if these fail you can install them easily with yum install yum-utils && yum-builddep unixODBC.

Warning: Installing the MSSQL driver will fail if we first install the newest unixODBC package from the yum repository.

Step 3

We now install the MSSQL package; on CentOS/RHEL the installation can be executed with checks:

bash install.sh install

We accept the license agreement by typing capitalized YES and see it installed:

Enter YES to accept the license or anything else to terminate the installation: YES

Checking for 64 bit Linux compatible OS ..................................... OK
Checking required libs are installed ................................. NOT FOUND
unixODBC utilities (odbc_config and odbcinst) installed ............ NOT CHECKED
unixODBC Driver Manager version 2.3.0 installed .................... NOT CHECKED
unixODBC Driver Manager configuration correct ...................... NOT CHECKED
Microsoft SQL Server ODBC Driver V1.0 for Linux already installed .. NOT CHECKED
Microsoft SQL Server ODBC Driver V1.0 for Linux files copied ................ OK
Symbolic links for bcp and sqlcmd created ................................... OK
Microsoft SQL Server ODBC Driver V1.0 for Linux registered ........... INSTALLED

We can now also install/upgrade the ODBC packages from our own repository so they will automatically update with our other packages: yum install unixODBC. In /etc/odbcinst.ini we verify that the MSSQL driver entry still exists.

Step 4

We install the CONNECT engine and load it into the MariaDB server: yum install MariaDB-connect-engine. In the MySQL console: INSTALL SONAME 'ha_connect.so';
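
As a quick sanity check (not part of the original HOWTO), you can confirm from the same console that the engine is now available; this is just an illustrative verification step:

SELECT ENGINE, SUPPORT, COMMENT
FROM information_schema.ENGINES
WHERE ENGINE = 'CONNECT';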

Step 5

We create our table using the same ODBC identifier as we read in /etc/odbcinst.ini:

CREATE TABLE mssql_table ENGINE=CONNECT DEFAULT CHARSET=latin1 CONNECTION='Driver=SQL Server Native Client 11.0;Server=ms-sql01;Database=old_version_migration_data;UID=mariadb;PWD=password' `TABLE_TYPE`='ODBC'
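
Once created, the table can be queried like any local table, with CONNECT fetching the rows from the remote MSSQL server over ODBC. A couple of illustrative (and deliberately simple) checks:

SELECT COUNT(*) FROM mssql_table;
SELECT * FROM mssql_table LIMIT 10;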

For more information

See the Knowledge Base for more information about the CONNECT storage engine, or if you are considering a migration to MariaDB, you might want to consider asking about MariaDB Enterprise or our migration consulting services.

by Michaël DeGroot at April 28, 2016 10:03 AM

MariaDB MaxScale 1.4.2 GA is available for download

Johan Wikman

URGENT UPDATE: Do not take 1.4.2 into use. We have found a problem and will provide a 1.4.3 release shortly.

We are pleased to announce that MariaDB MaxScale 1.4.2 GA is now available for download!

If MariaDB MaxScale is new to you, we recommend reading this page first.

1.4.2 is a bugfix release, not bringing any new features but fixing certain issues found in 1.4.1.

MariaDB MaxScale 1.4 brings:

  1. The Firewall Filter has been extended and can now be used for either black-listing or white-listing queries. In addition it is capable of logging both queries that match and queries that do not match.
  2. Client-side SSL has been available in MariaDB MaxScale for a while, but it has been somewhat unreliable. We now believe that client side SSL is fully functional and usable.

Additional improvements:

  • POSIX Extended Regular Expression Syntax can now be used in conjunction with qlafilter, topfilter and namedserverfilter.
  • Improved user grant detection.
  • Improved password encryption.

The release notes can be found here and the binaries can be downloaded here.

In case you want to build the binaries yourself, the source can be found at GitHub, tagged with 1.4.2.

We hope you will download and use this release, and we encourage you to create a bug report in Jira for any bugs you might encounter.

On behalf of the MariaDB MaxScale team.

About the Author

Johan Wikman's picture

Johan Wikman is a senior developer working on MaxScale at MariaDB Corporation. 

by Johan Wikman at April 28, 2016 08:23 AM

MariaDB ColumnStore, a new beginning

Nishant Vyas

MariaDB’s new analytics engine, MariaDB ColumnStore, has been in the works for some time. What is it and how did it come about? This post outlines our thinking in choosing the engine and the features we implemented.

Databases are expanding from their roots as systems of record into new analytics applications that some people call “systems of intelligence.” That means that instead of just storing and querying transactional data, databases are increasingly being used to yield insights, predict the future and prescribe actions users should take. Led by the open-source Hadoop ecosystem, online analytic processing (OLAP) is making its way out of the corporate data center and into the hands of everyone who needs it.

In the last decade, as data analytics became more important, solving these problems became more challenging. Everyone was led to believe, through hype and skewed opinions, that scale-out, big clusters and data processing without SQL were probably the only way to do data analytics. We were made to believe that solving analytics would require either scale-out complexity or spending big money on proprietary solutions.

On one end, there are traditional OLAP data warehouses, which are powerful and SQL-rich but costly, proprietary and often black-box appliances. On the other end, we saw the rise of the Hadoop ecosystem, which challenged traditional OLAP providers and paved the way to machine learning and data discovery that was not easy with traditional solutions, but it came with the complexity of scale-out and lacked SQL interfaces.

We know our users choose MariaDB because they value performance, scalability, reliability, security and extensibility through open source, with 100% SQL compatibility. We wanted our OLAP choice to reflect the same values.

One commercial product that caught our eye is SAP’s HANA. It’s a database appliance that supports both transactional and analytical applications on the same platform. HANA has gotten rave reviews, not only for functionality but also for simplicity. But HANA appliances are expensive and they’re really intended for organizations that are using SAP (price: don’t ask). We knew there was an open-source, scalable, software-only columnar analytic DBMS alternative: InfiniDB.

We thought InfiniDB would be a terrific fit with our OLAP and big data strategy. It provides a columnar, massively parallel storage engine with performance that scales with the addition of database servers, called Performance Modules. Redundancy and high availability are built in, and InfiniDB supports full SQL syntax including joins, window functions and cross-engine joins. It also works with Hadoop’s HDFS file system to provide incredible scalability.

MariaDB ColumnStore

We saw a unique opportunity to fill the need for a high-performance analytic data engine in the open-source market by binding InfiniDB’s OLAP engine to MariaDB’s OLTP engine, enabling users to run analytic queries on production data in near real-time. Much of our development work has been enabling that tight integration.

Read the next part on ColumnStore, Why is ColumnStore important?.

About the Author

Nishant Vyas's picture
Nishant joins MariaDB as Head of Product and Strategy from LinkedIn, where he was one of the early employees. During his almost nine-year tenure at LinkedIn, he contributed to building, scaling and operating production data stores using technologies like Oracle, MySQL, NoSQL and more. Nishant has extensive experience as a database engineer, database architect and DBA, and has held various leadership roles. He holds a bachelor's degree in engineering from Gujarat University and a master's degree in computer science from the University of Bridgeport. Based in the San Francisco bay area, Nishant is co-author of a patent in waterwheel sharding.

by Nishant Vyas at April 28, 2016 07:33 AM

Jean-Jerome Schmidt

Planets9s - Watch the replay: Become a MongoDB DBA (if you’re really a MySQL user)

Welcome to this week’s Planets9s, covering all the latest resources and technologies we create around automation and management of open source database infrastructures.

Watch the replay: Become a MongoDB DBA (for MySQL users)

Thanks to everyone who participated in this week’s webinar on ‘Become a MongoDB DBA’! Our colleague Art van Scheppingen presented from the perspective of a MySQL DBA who might be called to manage a MongoDB database. Art also did a live demo on how to carry out the relevant DBA tasks using ClusterControl. The replay is now available to watch in case you missed it or simply would like to see it again in your own time.

Watch the replay

Severalnines expands the reach of European scientific discovery for CNRS

We’re excited to announce our latest customer, the National Center for Scientific Research (CNRS), which is a subsidiary of the French Ministry of Higher Education and Research. As the largest fundamental research organization in Europe, CNRS carries out research in all fields of knowledge. Find out how we help CNRS keep costs down whilst increasing the potential of their open source systems, and how ClusterControl helps them both manage and use their LAMP applications as well as cloud services.

Read the press release

Infrastructure Automation - Ansible Role for ClusterControl

If you are automating your server infrastructure with Ansible, then this blog is for you. We recently announced the availability of an Ansible Role for ClusterControl. It is available at Ansible Galaxy. And as a reminder, for those of you who are automating with Puppet or Chef, we already published a Puppet Module and Chef Cookbook for ClusterControl.

Read the blog

That’s it for this week! Feel free to share these resources with your colleagues and follow us on our social media channels.

Have a good end of the week,

Jean-Jérôme Schmidt
Planets9s Editor
Severalnines AB

by Severalnines at April 28, 2016 07:32 AM

April 27, 2016

Peter Zaitsev

Upgrading to MySQL 5.7, focusing on temporal types

temporal types

In this post, we’ll discuss how MySQL 5.7 handles the old temporal types during an upgrade.

MySQL changed the temporal types in MySQL 5.6.4 and introduced a new feature: microsecond resolution in the TIME, TIMESTAMP and DATETIME types. These types can now store fractional seconds down to microsecond granularity. Obviously, this means format changes, but why is this important?

Are they converted automatically to the new format?

If we had tables in MySQL 5.5 that used TIME, TIMESTAMP or DATETIME, are these fields going to be converted to the new format when upgrading to 5.6? The answer is “NO.” Even if we run mysql_upgrade, it does not warn us about the old format, and if we check the MySQL error log, we cannot find anything regarding this. Newly created tables, however, are going to use the new format, so we end up with two different types of temporal fields.

How can we find these tables?

The following query gives us a summary of the different temporal formats in use:

SELECT CASE isc.mtype
 WHEN '6' THEN 'OLD'
 WHEN '3' THEN 'NEW'
 END FORMAT,
 count(*) TOTAL
FROM information_schema.tables AS t
INNER JOIN information_schema.columns AS c ON c.table_schema = t.table_schema
AND c.table_name = t.table_name
LEFT OUTER JOIN information_schema.innodb_sys_tables AS ist ON ist.name = concat(t.table_schema,'/',t.table_name)
LEFT OUTER JOIN information_schema.innodb_sys_columns AS isc ON isc.table_id = ist.table_id
AND isc.name = c.column_name
WHERE c.column_type IN ('time','timestamp','datetime')
 AND t.table_schema NOT IN ('mysql','information_schema','performance_schema')
 AND t.table_type = 'base table'
 AND (t.engine = 'innodb')
GROUP BY isc.mtype;

+--------+-------+
| FORMAT | TOTAL |
+--------+-------+
| NEW    | 1     |
| OLD    | 9     |
+--------+-------+

Or we can use show_old_temporals, which will highlight the old formats during a show create table.

CREATE TABLE `mytbl` (
  `ts` timestamp /* 5.5 binary format */ NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `dt` datetime /* 5.5 binary format */ DEFAULT NULL,
  `t` time /* 5.5 binary format */ DEFAULT NULL
) DEFAULT CHARSET=latin1

MySQL can handle both types, but with the old format you cannot use microseconds, and the old DATETIME format takes more space on disk.
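
As a reminder of what the new format buys you, a column created on 5.6.4 or later can carry a fractional-seconds precision. A minimal sketch (the table name and columns are hypothetical):

CREATE TABLE event_log (
  id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  created_at DATETIME(6) NOT NULL,                                 -- microsecond precision
  updated_at TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3)    -- millisecond precision
) ENGINE=InnoDB;

INSERT INTO event_log (created_at) VALUES (NOW(6));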

Can I upgrade to MySQL 5.7?

Of course you can! But when mysql_upgrade is running, it is going to convert the old fields into the new format by default. This basically means an alter table on every single table that contains one of the three types.

Depending on the number of tables, or the size of the tables, this could take hours – so you may need to do some planning.

....
test.t1
error : Table rebuild required. Please do "ALTER TABLE `t1` FORCE" or dump/reload to fix it!
test.t2
error : Table rebuild required. Please do "ALTER TABLE `t2` FORCE" or dump/reload to fix it!
test.t3
error : Table rebuild required. Please do "ALTER TABLE `t3` FORCE" or dump/reload to fix it!
Repairing tables
mysql.proxies_priv OK
`test`.`t1`
Running : ALTER TABLE `test`.`t1` FORCE
status : OK
`test`.`t2`
Running : ALTER TABLE `test`.`t2` FORCE
status : OK
`test`.`t3`
Running : ALTER TABLE `test`.`t3` FORCE
status : OK
Upgrade process completed successfully.
Checking if update is needed.

Can we avoid this at upgrade?

We can run alter table or use pt-online-schema-change (to avoid locking) before an upgrade, but even without these preparations we can still avoid incompatibility issues.
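
Converting a table ahead of time is just a plain rebuild; for example, for one of the tables from the mysql_upgrade output above (this mirrors what mysql_upgrade runs, shown here only as a sketch):

ALTER TABLE test.t1 FORCE;   -- rebuilds the table and converts old temporal columns to the new format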

My colleague Daniel Guzman Burgos pointed out that mysql_upgrade has an option called upgrade-system-tables. This will only upgrade the system tables, and nothing else.

Can we still write these fields?

The following query returns the schema and the table names that still use the old formats.

SELECT CASE isc.mtype
           WHEN '6' THEN 'OLD'
           WHEN '3' THEN 'NEW'
       END FORMAT,
       t.table_schema,
       t.table_name
FROM information_schema.tables AS t
INNER JOIN information_schema.columns AS c ON c.table_schema = t.table_schema
AND c.table_name = t.table_name
LEFT OUTER JOIN information_schema.innodb_sys_tables AS ist ON ist.name = concat(t.table_schema,'/',t.table_name)
LEFT OUTER JOIN information_schema.innodb_sys_columns AS isc ON isc.table_id = ist.table_id
AND isc.name = c.column_name
WHERE c.column_type IN ('time','timestamp','datetime')
    AND t.table_schema NOT IN ('mysql','information_schema','performance_schema')
    AND t.table_type = 'base table'
    AND (t.engine = 'innodb');

+--------+--------------+------------+
| FORMAT | table_schema | table_name |
+--------+--------------+------------+
| OLD    | test         | t          |
| OLD    | test         | t          |
| OLD    | test         | t          |
| NEW    | sys          | sys_config |
+--------+--------------+------------+
4 rows in set (0.03 sec)
mysql> select version();
+-----------+
| version() |
+-----------+
| 5.7.11-4  |
+-----------+
1 row in set (0.00 sec)

As we can see, we’re using 5.7 and table “test.t” still has the old format.

The schema:

CREATE TABLE `t` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`t1` time DEFAULT NULL,
`t2` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`t3` datetime DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=6 DEFAULT CHARSET=latin1

mysql> select * from t;
+----+----------+---------------------+---------------------+
| id |    t1    |       t2            |        t3           |
+----+----------+---------------------+---------------------+
| 1  | 20:28:00 | 2016-04-09 01:41:58 | 2016-04-23 22:22:01 |
| 2  | 20:28:00 | 2016-04-09 01:41:59 | 2016-04-23 22:22:02 |
| 3  | 20:28:00 | 2016-04-09 01:42:01 | 2016-04-23 22:22:03 |
| 4  | 20:28:00 | 2016-04-09 01:42:03 | 2016-04-23 22:22:04 |
| 5  | 20:28:00 | 2016-04-09 01:42:08 | 2016-04-23 22:22:05 |
+----+----------+---------------------+---------------------+

Let’s try to insert a new row:

mysql> insert into `t` (t1,t3) values ('20:28','2016:04:23 22:22:06');
Query OK, 1 row affected (0.01 sec)
mysql> select * from t;
+----+----------+---------------------+---------------------+
| id |    t1    |         t2          |          t3         |
+----+----------+---------------------+---------------------+
| 1  | 20:28:00 | 2016-04-09 01:41:58 | 2016-04-23 22:22:01 |
| 2  | 20:28:00 | 2016-04-09 01:41:59 | 2016-04-23 22:22:02 |
| 3  | 20:28:00 | 2016-04-09 01:42:01 | 2016-04-23 22:22:03 |
| 4  | 20:28:00 | 2016-04-09 01:42:03 | 2016-04-23 22:22:04 |
| 5  | 20:28:00 | 2016-04-09 01:42:08 | 2016-04-23 22:22:05 |
| 6  | 20:28:00 | 2016-04-09 01:56:38 | 2016-04-23 22:22:06 |
+----+----------+---------------------+---------------------+
6 rows in set (0.00 sec)

It was inserted without a problem, and we can’t see any related info/warnings in the error log.

Does the Replication work?

In many scenarios, when you are upgrading a replica set, the slaves are upgraded first. But will replication work? The short answer is “yes.” I configured row-based replication between MySQL 5.6 and 5.7. The 5.6 instance was the master, and it had all the temporal types in the old format. On 5.7, I had both new and old formats.

I replicated from old format to old format, and from old format to new format, and both are working.

Conclusion

Before upgrading to MySQL 5.7, tables should be altered to use the new format. If that isn’t done, the upgrade is still possible without altering all the tables; the drawbacks are that you cannot use microseconds, and the old format takes more space on disk. If you had to upgrade to 5.7 anyway, you could change the format later using alter table or pt-online-schema-change.

 

by Tibor Korocz at April 27, 2016 08:30 PM

MariaDB AB

MariaDB Server growth bolstered by OpenStack User Survey

colin

While many are at the ongoing OpenStack Summit in Austin, it came to my attention that the OpenStack project has recently published its latest user survey. From it, it is clear that there is growth in the usage of MariaDB Server -- up 6 points from the previous survey. MariaDB Galera Cluster dropped by 1 point, but it's worth noting that in MariaDB Server 10.1 you're getting it all as an integrated download, so it is quite likely that people are just referring to it as "MariaDB" now.

Which databases are used for OpenStack components

(Screenshot taken from the user survey)

While MySQL still dominates at 35 percentage points, the largest share of production OpenStack deployments is based on either MariaDB Server or MariaDB Galera Cluster, which are developed by both the MariaDB Corporation and the MariaDB Foundation.

Some may say this is because "defaults matter" (as MariaDB Server is the default in many Linux distributions), but the majority of OpenStack deployments today are using Ubuntu Server, which ships MySQL as the default provider when one requests MySQL. So this is a conscious choice people are making to go with MariaDB Server or MariaDB Galera Cluster.

Thank you OpenStack deployers! If you ever want to talk about your deployment, feel free to drop me a line or even tweet @bytebot.

About the Author

colin's picture

Colin Charles is the Chief Evangelist for MariaDB since 2009, work ranging from speaking engagements to consultancy and engineering works around MariaDB. He lives in Kuala Lumpur, Malaysia and had worked at MySQL since 2005, and been a MySQL user since 2000. Before joining MySQL, he worked actively on the Fedora and OpenOffice.org projects. He's well known on the conference track having spoken at many of them over the course of his career.

by colin at April 27, 2016 02:20 PM

Jean-Jerome Schmidt

Watch the replay: Become a MongoDB DBA (if you’re really a MySQL user)

Thanks to everyone who participated in this week’s webinar on ‘Become a MongoDB DBA’! Our colleague Art van Scheppingen presented from the perspective of a MySQL DBA who might be called to manage a MongoDB database, which included a live demo on how to carry out the relevant DBA tasks using ClusterControl.

The replay and the slides are now available online in case you missed Tuesday’s live session or simply would like to see it again in your own time.

Watch the replay Read the slides

This was the first session of our new webinar series, ‘How to Become a MongoDB DBA’, which answers the question: ‘what does a MongoDB DBA do?’

In this initial webinar, we went beyond the deployment phase and demonstrated how you can automate tasks, monitor a cluster and manage MongoDB; whilst also automating and managing your MySQL and/or PostgreSQL installations. Watch out for invitations for the next session in this series!

This Session's Agenda

  • Introduction to becoming a MongoDB DBA
  • Installing & configuring MongoDB
  • What to monitor and how
  • How to perform backups
  • Live Demo

Speaker

Art van Scheppingen is a Senior Support Engineer at Severalnines. He’s a pragmatic MySQL and Database expert with over 15 years experience in web development. He previously worked at Spil Games as Head of Database Engineering, where he kept a broad vision upon the whole database environment: from MySQL to Couchbase, Vertica to Hadoop and from Sphinx Search to SOLR. He regularly presents his work and projects at various conferences (Percona Live, FOSDEM) and related meetups.

This series is based upon the experience we have using MongoDB and implementing it for our database infrastructure management solution, ClusterControl. For more details, read through our ‘Become a ClusterControl DBA’ blog series.

by Severalnines at April 27, 2016 12:49 PM

MariaDB AB

Automated server management with opensource tools

colin

A recent presentation by Balazs Pocze from Gawker Media LLC made clear that there are many automation strategies for server management that would be worth packaging up into an extended solution like MariaDB Enterprise.

Automation is good because there are far fewer "ops people" than "dev people" at many companies. Man-hours are expensive, so you don't want to waste them on routine tasks. Automated tasks also leave a lot less room for mistakes. "Everybody needs automation," says Pocze.

Gawker still uses physical infrastructure, powered by Linux, though there is some migration to the cloud. Cobbler is their Linux installation server of choice, where the hosts are defined, and there is no manual DHCP or DNS management; this tool not only manages but also defines their environment.

For provisioning the hosts, Gawker's tool of choice is Puppet. Puppet both defines and manages all the hosts, and Pocze said, "If it doesn't exist in Puppet, it doesn't exist at all". From a developer enablement standpoint, MariaDB Enterprise provides the MariaDB Enterprise Chef Cookbook, which is about as close an equivalent as we offer.

In a bit of a twist, they also make use of Ansible for running commands on hosts; all commands are organised into playbooks, and it is all about agentless installation and management. They do create modules and contribute them, and all nodes are managed over SSH.

To wrap it all up, they use Jenkins for continuous integration. The demo of how Gawker's Kinja is maintained by Jenkins, with Ansible playbooks for slave management, was also a highlight.

It would be great if more companies shared what their backend infrastructure and ops look like, because it's clear that there's plenty of opportunity for tooling finesse.

Don't forget to check out banyek on GitHub as well as gawkermedia on GitHub. During the talk, an audience member said it might also be a great idea to check out Rundeck, a job scheduler and runbook automation tool, something I had not heard of before but will definitely look at in the near future.

About the Author

colin's picture

Colin Charles is the Chief Evangelist for MariaDB since 2009, work ranging from speaking engagements to consultancy and engineering works around MariaDB. He lives in Kuala Lumpur, Malaysia and had worked at MySQL since 2005, and been a MySQL user since 2000. Before joining MySQL, he worked actively on the Fedora and OpenOffice.org projects. He's well known on the conference track having spoken at many of them over the course of his career.

by colin at April 27, 2016 09:27 AM

April 26, 2016

Peter Zaitsev

How We Made MySQL Great Again, or Upgrading MySQL with Orchestrator

Upgrading MySQL with Orchestrator

In this blog post, we’ll discuss upgrading MySQL with Orchestrator.

I recently had a client, Life360, that wanted to upgrade from Percona Server 5.5 to Percona Server 5.6, and implement GTID in their high transaction environment. They had co-masters and multiple read slaves.

Orchestrator made this job much easier for us. My colleague, Tibi, recently posted about Orchestrator here and here.

Daniel from Life360 saw Orchestrator and was very interested. So here is how he set up Orchestrator, in his own words:

I did some initial testing with the vagrant boxes provided in the Orchestrator repo, to see how to configure the agents and get the Orchestrator server to do what we want.

I then moved to install the Orchestrator server, Orchestrator backend on RDS, and deploy the clients on the slaves and masters in our Amazon VPC MySQL instances.

Once the server setup was done, the clients were auto-detected through CNAME discovery of the masters, and the agents talked to the server (it took a while as CNAME discovery wasn’t working as expected, but that’s fixed in the new server version).

We were pretty amazed at the number of actions you can do through orchestrator itself, such as: moving slaves to a different master through drag and drop, enabling GTID on a node with the push of a button, setting up GTID based failover, taking LVM snapshots using Orchestrator Agent, etc.

We went ahead and tested the master change on drag and drop, and after a few successful attempts, we even brought it back to where it was initially. After those tests, we were pretty confident that we could leverage Orchestrator as one of our main tools to assist in the coming upgrade.

Here is a screenshot of the initial setup:

image02

Manjot: Once Daniel had Orchestrator setup, he wanted to leverage it to help with the MySQL upgrade. We set out to create a plan that worked within his constraints and still kept best practices in mind.

First, we installed Percona Server 5.6 fresh on our dedicated backup slave. That first 5.6 slave was created with MyDumper to achieve forward compatibility and not have any legacy tablespaces. Since MyDumper was already installed with the Percona Backup Service that Life360 has, this was fairly easy to accomplish.

The MyDumper slave rebuild works in the following way:

To take a mydumper backup:

  1. Go to your desired backups directory
  2. Install mydumper (sudo apt-get install mydumper)
  3. mydumper -t 8 -L mydumper.log --compress

To restore:

  1. Make sure MyDumper is installed: sudo apt-get install mydumper
  2. Copy the MyDumper backups over to a backups dir
  3. Export your BACKUP_DIR as env var
  4. Run this to restore with MyLoader (from https://gist.github.com/Dnile/4658b338d4a101cbe2eeb5080ebddf8e):
    #!/usr/bin/env sh
    # Target host and backup location (set these before anything else)
    export DESTHOST=127.0.0.1
    export BACKUP_DIR=/vol_mysql/backups
    cd $BACKUP_DIR
    # Start mysqld without grant checks so the restore can run unauthenticated
    mysqld --skip-grant-tables &
    # Recreate each database and load its schema (and post-schema) dumps
    for i in `ls -1 *-schema.dump.gz | cut -d'-' -f1`; do mysql -h $DESTHOST -e "CREATE DATABASE IF NOT EXISTS $i"; zcat $i-schema.dump.gz | mysql -h $DESTHOST $i; zcat $i-schema-post.dump.gz | mysql -h $DESTHOST $i; done
    # Load the data with myloader, then fix datadir ownership
    /usr/bin/myloader --host=$DESTHOST --directory=$BACKUP_DIR --enable-binlog --threads=10 --queries-per-transaction=20 -v 3
    chown -R mysql:mysql /var/lib/mysql/

Once the first 5.6 slave was caught up, we used Xtrabackup to backup 5.6 and then restored to each slave, cycling them out of the read slave pool one at a time.

Once all the slaves were upgraded, we created a new 5.6 master and had it replicate off our primary 5.5 master.

Then we moved all of the slaves to replicate off the new 5.6 master.

Life360 had long cron jobs that ran on the second 5.5 master. We moved the cron applications to write to the primary 5.5 master, and locked all tables. We then stopped replication on the second co-master. Daniel stopped MySQL and decommissioned it.

We then moved all application writes to the new 5.6 master. While Orchestrator can use external scripts to move IPs, we used a manual process here to change application DSNs and HAProxy configuration.

On the 5.5 master that remained, we used Orchestrator to set it to read only.
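
Behind the button, that Orchestrator action boils down to flipping the read_only flag, which you could equally do by hand from a client session (shown here only as a sketch of the equivalent SQL):

SET GLOBAL read_only = ON;
SHOW GLOBAL VARIABLES LIKE 'read_only';

Note that read_only does not terminate connections that are already established, which matches the observation below about lingering connections.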

image01

Daniel says this didn’t do a whole lot to get rid of connections that were still open on this server.

On the new master, we used the stop slave and reset slave buttons in the Orchestrator panel so it would no longer slave from the old master.

Once some of the thousands of connections had moved to the new master, we stopped MySQL on the 5.5 master, which took care of the rest and the application “gracefully” reconnected to the new 5.6 master.

There was some write downtime, as some connections did not drop off until they were forced to because php-fpm refused to let go. There is also always a high volume of transactions in this environment.

At this point our topology looks like this (ignore the globe icons for now):

image00

But as always Daniel wanted MOAR. It was time for GTID. While we could have done this during the upgrade, Life360 wanted to manage risk and not make too many production changes at one time.

We followed Percona’s guide, Online GTID Deployment, but used Orchestrator to shuffle the old and new masters and toggle read_only on and off. This made our job a lot easier and faster, and saved us from any downtime.

The globes in the topology screenshot above show that the slaves are now using GTID replication.

Orchestrator makes upgrades and changes much easier than before, just use caution and understand what it is doing in the background.

 

by Manjot Singh at April 26, 2016 07:56 PM

Jean-Jerome Schmidt

Top Nine Stand Out Facts About Percona Live in Santa Clara

We like our 9s at Severalnines, so here is our list of Top 9 Facts (in no particular order) that made the trip to Santa Clara worthwhile once again for this year’s Percona Live Conference. Thanks to the team at Percona and everyone else involved in keeping the Santa Clara tradition going by organising another great conference this year.

1. The conference opened its arms not only to the “traditional” MySQL community, but also put a focus on MongoDB users.

This year in particular, there was a real intent to provide a place to learn and mingle for those database users who deal with not only one datastore, but manage a mixed database environment. More and more users are called to operate in polyglot database landscapes and thus it was good to see this year’s conference provide a platform for that. Percona themselves of course now serve both the MySQL and MongoDB user community with their tools and offerings, and companies like ourselves strive to help organisations with polyglot IT environments with our solutions and resources as well. So it was great to see a good mix of tutorial and talk topics as well as of sponsors at the conference.

2. Galera Cluster continues to be a popular topic in the MySQL universe

With plenty of talks and tutorials discussing Galera Cluster and how to make best use of it, the technology developed by the Codership team confirmed its growing popularity with MySQL users. We ourselves see a lot of interest in it and we blog / talk about Galera Cluster on a regular basis, so it was good to hear about the latest and greatest on it and talk to users directly about their clustering needs.

3. The Oracle MySQL Team is very present in the community

As per last year, it was good to see the Oracle MySQL Team so present at the conference, perhaps even more so than previously. Some great work has been done with MySQL 5.7 and it was good to hear about the MySQL Document Store. The MySQL community has been evolving over the years with its particular history, and it seems that there is a good balance now. We’re looking forward of course to see what the team have in store next for MySQL!

4. Facebook’s RocksDB is gaining traction

Facebook developed RocksDB by forking and improving LevelDB. The RocksDB storage engine is available for both MongoDB and MySQL. Many of the talks at the conference were about the ins and outs of MongoRocks and MyRocks. We foresee a great future and adoption for this innovation by Facebook.

5. MySQL In Containers

A number of talks addressed the issues users can be faced with when deploying MySQL in containerized environments. Containers allow for rapid creation and deletion, and a lot of care has to be taken when deploying MySQL, especially in production. Going forward it will most likely be common to see MySQL servers running in container environments.

6. Bill Nye’s keynote

After Steve Wozniak last year, this year's invited keynote speaker was Bill Nye. Bill is better known as The Science Guy and with his vision he gives a twist on tech and science. His keynote gave a lot of food for thought, but just in case you have missed it you can watch the recording here: https://www.youtube.com/watch?v=iQDrK3rEtWg

7. Did we mention that MongoDB was a big topic this year?

For everyone who’s been following Percona’s news, you’ll know that Percona is investing in MongoDB following their acquisition of Tokutek last year, with the result being the new Percona Server for MongoDB. In addition to that, quite a few MongoDB-related talks had been submitted (and accepted) for this year’s conference, which was good to see. One of them was our talk on how to manage MongoDB as a MySQL DBA. So while the prevalence of MongoDB as a topic might still seem unusual to those used to the MySQL user conference, it’s an interesting development that we’re keeping an eye on.

8. On a self-interested note: we hosted a ‘How to become a MySQL DBA’ tutorial this year!

Our colleagues Art van Scheppingen and Ashraf Sharif hosted a tutorial on ‘How to become a MySQL DBA’. It was great to spend the day with database enthusiasts who were keen to learn how to manage MySQL and share experiences with them. The tutorial was based on the same-titled blog series, which has proven quite popular with our blog followers. Feel free to check it out.

9. Meet the friends!

We kind of have to part on that note: of course the conference is always a week to look forward to in the year, as it’s really the place where we get to see a lot of friends from past and present in the MySQL (and now MongoDB) community. It really is a grand tradition, and it’s good to see that so many members of the community are so keen to keep it going.

So here’s to seeing you all next year again!

by Severalnines at April 26, 2016 07:34 PM

MariaDB Foundation

MariaDB Galera Cluster 5.5.49 and updated connectors now available

The MariaDB project is pleased to announce the immediate availability of MariaDB Galera Cluster 5.5.49, MariaDB Connector/J 1.4.3, and MariaDB Connector/C 2.2.3. See the release notes and changelogs for details on these releases.

  • Download MariaDB Galera Cluster 5.5.49
  • Release Notes
  • Changelog
  • What is MariaDB Galera Cluster?
  • MariaDB APT and YUM Repository Configuration Generator
  • Download MariaDB […]

The post MariaDB Galera Cluster 5.5.49 and updated connectors now available appeared first on MariaDB.org.

by Daniel Bartholomew at April 26, 2016 05:53 PM

April 25, 2016

MariaDB AB

Top database challenges for MariaDB

janlindstrom

Recent top-level database research identifies a number of big trends and challenges for databases. Jan Lindström, developer at MariaDB, summarises the key issues and challenges and how MariaDB approaches them.

First of all, the opinions expressed in this article are the author's own and do not necessarily reflect the view of the MariaDB Corporation. These views are based on the review article by Abadi et al.: The Beckman Report on Database Research, Communications of the ACM, Vol. 59, No. 2, February 2016. While the meeting of thirty leaders from the database research community took place in October 2013, my view is that the issues raised there are still more than valid.

The review article identifies big data as a defining challenge of our time. This is because it has become cheaper to generate data due to inexpensive storage, sensors, smart devices, social software, multiplayer games, and the Internet of Things. Additionally, it has become cheaper to process large amounts of data, due to advances in multicore CPUs, solid state storage, cheap cloud computing, and open source software.

By 2020, the International Data Corporation (IDC) predicts that the amount of digital information created and replicated in the world will grow to almost 40 zettabytes (ZB)—more than 50 times what existed in 2010 and amounting to 5,247 gigabytes for every person on the planet (see http://www.datacenterjournal.com/birth-death-big-data/).

This means that organizations have more and more unstructured and unused data that could contain valuable information for predicting business trends and making business decisions. Forbes predicted in 2015 that buying and selling data will become the new business bread and butter.

In recent years, the database research and development community has strengthened core research and development in relational DBMSs and branched out into new directions: security, privacy, data pricing, data attribution, social and mobile data, spatiotemporal data, personalization and contextualization, energy-constrained processing, and scientific data management.

These lofty research challenges must be broken down to a functionality, if not feature, level. Here are some of the features we are working on, in a chewing-the-elephant-one-bite-at-a-time fashion.

The review article identifies five big data challenges: scalable big/fast data infrastructures; coping with diversity in data management; end-to-end processing of data; cloud services; and the roles of the people in the data life cycle. Three of the challenges deal with the volume, velocity, and variety aspects of big data. The last two challenges deal with extending big data applications in the cloud and managing the involvement of people in these applications.

How can MariaDB address these challenges? By developing new storage engines for Big Data like MariaDB ColumnStore (formerly InfiniDB). MariaDB ColumnStore is a scalable columnar database management system built for big data analytics, business intelligence, data warehousing and other read-intensive applications. The column-store architecture enables very quick load and query times. Its massively parallel processing (MPP) technology scales with any type of storage hardware.

Furthermore, MariaDB can support new datatypes and SQL-functions like:

Forbes predicted in 2015 that security will become the killer app for big data analytics. Now that the MariaDB 10.1 server provides tools for data-at-rest encryption, other storage engines can easily provide security features for their data.

However, as these research challenges show, there is a lot of room for additional vision and development in relational database management systems like MariaDB.


by janlindstrom at April 25, 2016 11:54 AM

Jean-Jerome Schmidt

Webinar tomorrow: Become a MongoDB DBA (if you’re really a MySQL user)

Join us tomorrow, as we introduce a new webinar series: ‘How to Become a MongoDB DBA’ to answer the question: ‘what does a MongoDB DBA do?’.

This is a series of three webinars during which we will walk you through the most important tasks a MongoDB DBA routinely goes through and provide you with options on how to best complete these tasks.

It is not uncommon for MySQL DBAs, developers, network/system administrators or DevOps folks with general background to find themselves in a situation where they’ve been working with MySQL for a while and are now being asked to also properly maintain one or more MongoDB instances. In fact, with more organisations operating polyglot environments, it’s starting to become commonplace.

In this first webinar, we will show you how you can automate tasks, monitor a cluster and manage MongoDB; whilst also automating and managing your MySQL and/or PostgreSQL installations.

Date, Time & Registration

Europe/MEA/APAC

Tuesday, April 26th at 09:00 BST / 10:00 CEST (Germany, France, Sweden)
Register Now

North America/LatAm

Tuesday, April 26th at 09:00 Pacific Time (US) / 12:00 Eastern Time (US)
Register Now

Agenda

  • Introduction to becoming a MongoDB DBA
  • Installing & configuring MongoDB
  • What to monitor and how
  • How to perform backups
  • Live Demo

Speaker

Art van Scheppingen is a Senior Support Engineer at Severalnines. He’s a pragmatic MySQL and Database expert with over 15 years experience in web development. He previously worked at Spil Games as Head of Database Engineering, where he kept a broad vision upon the whole database environment: from MySQL to Couchbase, Vertica to Hadoop and from Sphinx Search to SOLR. He regularly presents his work and projects at various conferences (Percona Live, FOSDEM) and related meetups.

We look forward to “seeing” you there!

This session is based upon the experience we have using MongoDB and implementing it for our database infrastructure management solution, ClusterControl. For more details, read through our ‘Become a ClusterControl DBA’ blog series.

by Severalnines at April 25, 2016 11:42 AM

MariaDB AB

AliSQL and some features that have made it into MariaDB Server

colin

AliSQL is the fork of MySQL (maintained by Alibaba) that powers all of the group, from Tmall, Taobao, Alibaba.com, 1688.com, AliExpress and Cainiao to the payment network Alipay.

AliSQL isn't new: they've had this fork since 2011 (AliSQL 5.1: bugfixes for DDL, eliminating race conditions), followed by 2012 (AliSQL 5.5: parallel replication, optimising hot SKUs) and 2014 (AliSQL 5.6: enhanced threadpool, SQL firewall). The team has found, reported and fixed more than 40 bugs, added 41 new features, and optimised 27 bottlenecks. All recent releases of MySQL have had AliSQL contributions (5.6.25 onwards, 5.7.8 onwards, and even the 5.8 branch).

AliSQL benefitting MariaDB Server

While not mentioned in the talk, it's clear that MariaDB Server 10.0+ has benefited from their tree as well -- our log of contributions lists Lixun Peng/Taobao as having contributed to multi-source replication (something MariaDB Server has had in a shipping GA release since March 2014; a similar feature appears in MySQL 5.7 GA, released October 2015). But that is not all: there is also per-thread memory counting and usage, implemented in MDEV-4011. Don't forget that they've constantly released many of their features as open source.

So what is per-thread memory counting? For one thing, it is exposed via SHOW STATUS:

show status like 'memory_used'\G
*************************** 1. row ***************************
Variable_name: Memory_used
        Value: 73808
1 row in set (0.00 sec)

But that is obviously not all. Have you seen INFORMATION_SCHEMA.PROCESSLIST? Compare the output between MySQL 5.7.12 and MariaDB Server 10.1.13:

MySQL 5.7.12

select * from INFORMATION_SCHEMA.processlist\G
*************************** 1. row ***************************
     ID: 8
   USER: msandbox
   HOST: localhost
     DB: NULL
COMMAND: Query
   TIME: 0
  STATE: executing
   INFO: select * from INFORMATION_SCHEMA.processlist
1 row in set (0.00 sec)

MariaDB Server 10.1.13

select * from INFORMATION_SCHEMA.processlist\G
*************************** 1. row ***************************
           ID: 4
         USER: msandbox
         HOST: localhost
           DB: NULL
      COMMAND: Query
         TIME: 0
        STATE: Filling schema table
         INFO: select * from INFORMATION_SCHEMA.processlist
      TIME_MS: 0.464
        STAGE: 0
    MAX_STAGE: 0
     PROGRESS: 0.000
  MEMORY_USED: 84552
EXAMINED_ROWS: 0
     QUERY_ID: 24
  INFO_BINARY: select * from INFORMATION_SCHEMA.processlist
          TID: 22005
1 row in set (0.00 sec)

What can we look forward to in the future from AliSQL?

It was brought up in the talk that AliSQL is not currently open source, but there are some very interesting features around it that would likely benefit many MariaDB Server users:

  • Optimising hot rows, a common issue at Alibaba referred to as an "optimisation for hot SKUs", typical during Singles' Day. This has been powering AliSQL for several versions, and they are working on ensuring there is a row cache, a new InnoDB row lock type, as well as group updates of associated transactions.
  • They have split the redo log buffers into two, one each for reading and writing.
  • InnoDB column compression using ZLIB, which also supports online DDL.
  • The idea of flashback and a recycle bin, all related to the "time machine" concept that we can expect to see soon.
  • Another interesting feature is the idea of thinking in terms of a glass of water. There is low water mark protection, where they have an enhanced threadpool that buffers queries while the thread is running. If for some reason a query exceeds the low water level, it is sent to the high water mark, where they have a kind of "SQL firewall" that denies queries via a blacklist strategy. Their rule syntax makes use of an Abstract Syntax Tree (AST) between the parser and optimiser of their server.
  • They also do binlog speed throttling, and have enhanced information around deadlocks.

All in all, here's hoping that the AliSQL tree is opened up, so that we can look at its features and see what ends up in a future release of MariaDB Server.

About the Author

colin's picture

Colin Charles is the Chief Evangelist for MariaDB since 2009, work ranging from speaking engagements to consultancy and engineering works around MariaDB. He lives in Kuala Lumpur, Malaysia and had worked at MySQL since 2005, and been a MySQL user since 2000. Before joining MySQL, he worked actively on the Fedora and OpenOffice.org projects. He's well known on the conference track having spoken at many of them over the course of his career.

by colin at April 25, 2016 06:42 AM

April 24, 2016

Valeriy Kravchuk

Building MariaDB 10.1.x and Galera from Source for Multiple Node Cluster Testing Setup

My Facebook followers probably noted that I quit Percona some time ago and have worked for MariaDB since March 1, 2016. I changed the company, but neither the job role (I am still a Support Engineer) nor the approach to doing my job. I still prefer to test everything I suggest to customers, and I usually use software I build from source myself for these tests.

While I have tried to avoid all kinds of clusters as much as possible for 15 years or so (it does not matter if it's Oracle RAC, MySQL Cluster or Percona XtraDB Cluster, all of them), it's really hard to avoid Galera clusters while working for MariaDB. One of the reasons for this is that Galera, starting from MariaDB 10.1, can easily be "enabled"/used with any MariaDB 10.1.x instance, any time (at least when we speak about official binaries or those properly built; they are all "Galera ready"). Most MariaDB customers do use Galera or can try to use it any time, so I have to be ready to test something Galera-specific at any moment.

For simple cases I decided to use a setup with several (2 to begin with) cluster nodes on one box. This approach is described in the manual for Percona XtraDB Cluster and was also used by my former colleague Fernando Laudares for his blog post and many real life related tests.

So, I decided to proceed with the mix of ideas from the sources above and MariaDB's KB article on building Galera from source. As I decided to do this on my wife's Fedora 23 workstation, I checked this KB article for some details as well. It lists prerequisites (boost-devel check-devel glibc-devel openssl-devel scons), and some of these packages (like scons in one of my cases) could be missing even on a system previously used for builds of all kinds of MySQL-related software. You can find something missing and fix the problem at a later stage, but reading and following the manual or KB articles may help to save some time otherwise spent on trial and error.

I started by making directories in my home directory (/home/openxs) for this Galera-related testing setup, like these:
[openxs@fc23 ~]$ mkdir galera
[openxs@fc23 ~]$ cd galera
[openxs@fc23 galera]$ mkdir node1
[openxs@fc23 galera]$ mkdir node2
[openxs@fc23 galera]$ mkdir node3
[openxs@fc23 galera]$ ls
node1  node2  node3
I plan to use 3 nodes one day, but for this blog post I'll set up only 2, to have the smallest possible and simplest cluster as a proof of concept.

Then I proceeded with cloning Galera from Codership's GitHub (this is supposed to be the latest and greatest). I changed the current directory to my usual git repository and executed git clone https://github.com/codership/galera.git. When this command completed I had a subdirectory named galera.

In that directory, assuming that all prerequisites are installed, it's enough to execute a simple script, ./scripts/build.sh, to build the current Galera library version. I ended up with the following:
[openxs@fc23 galera]$ ls -l libgalera_smm.so
-rwxrwxr-x. 1 openxs openxs 40204824 Mar 31 12:21 libgalera_smm.so
[openxs@fc23 galera]$ file libgalera_smm.so
libgalera_smm.so: ELF 64-bit LSB shared object, x86-64, version 1 (GNU/Linux), dynamically linked, BuildID[sha1]=11457fa9fd69dabe617708c0dd288b218255a886, not stripped

[openxs@fc23 galera]$ pwd
/home/openxs/git/galera
[openxs@fc23 galera]$ cp libgalera_smm.so ~/galera/
and copied the library to the target directory for my testing setup (that should NOT conflict with whatever software I may have installed later from packages).

Now, time to build MariaDB properly to let it use Galera if needed. I already had the (then recent) 10.1.13 in the server subdirectory of my git repository. I then executed the following commands:

[openxs@fc23 server]$ cmake . -DCMAKE_BUILD_TYPE=RelWithDebInfo -DWITH_SSL=system -DWITH_ZLIB=bundled -DMYSQL_MAINTAINER_MODE=0 -DENABLED_LOCAL_INFILE=1 -DWITH_JEMALLOC=system -DWITH_WSREP=ON -DWITH_INNODB_DISALLOW_WRITES=ON -DCMAKE_INSTALL_PREFIX=/home/openxs/dbs/maria10.1
-- Running cmake version 3.4.1
-- MariaDB 10.1.13
...

[openxs@fc23 server]$ time make -j 4
...
real    9m28.164s
user    32m43.960s
sys     2m45.637s
This was my usual command line to build MariaDB 10.x with only 2 extra options added: -DWITH_WSREP=ON -DWITH_INNODB_DISALLOW_WRITES=ON. After make completed, I executed make install && make clean and was ready to use my shiny new Galera-ready MariaDB 10.1.13.

To take into account the directories I am going to use for my cluster nodes and make sure they can start and communicate as separate mysqld instances, I have to create configuration files for them. I changed the working directory to /home/openxs/dbs/maria10.1 and started with this configuration file for the first node:

[openxs@fc23 maria10.1]$ cat /home/openxs/galera/mynode1.cnf                    
[mysqld]
datadir=/home/openxs/galera/node1
port=3306
socket=/tmp/mysql-node1.sock
pid-file=/tmp/mysql-node1.pid
log-error=/tmp/mysql-node1.err
binlog_format=ROW
innodb_autoinc_lock_mode=2

wsrep_on=ON # this is important for 10.1!
wsrep_provider=/home/openxs/galera/libgalera_smm.so
wsrep_cluster_name = singlebox
wsrep_node_name = node1
# wsrep_cluster_address=gcomm://
wsrep_cluster_address=gcomm://127.0.0.1:4567,127.0.0.1:5020?pc.wait_prim=no
It's one of the shortest configuration files possible. I had to specify a unique datadir, error log location, pid file, port and socket for the instance, set the binlog format, point out the Galera library location, and set the cluster name and node name. With proper planning I was able to specify wsrep_cluster_address referring to all other nodes properly, but for the initial setup of the first node I can have it "empty", as commented out above, so that we start as a new cluster node. There is one essential setting for MariaDB 10.1.x that is not needed for "cluster-specific" instances like Percona XtraDB Cluster or the older 10.0.x Galera packages from MariaDB (where it's ON by default). This is wsrep_on=ON. Without it MariaDB works as a normal, non-cluster instance and ignores anything cluster-related. You can save a lot of time in case of an upgrade to 10.1.x if you put it in your configuration file explicitly right now, no matter what version is currently used.

Then I copied and modified configuration file for the second node:
[openxs@fc23 maria10.1]$ cp /home/openxs/galera/mynode1.cnf /home/openxs/galera/mynode2.cnf
[openxs@fc23 maria10.1]$ vi /home/openxs/galera/mynode2.cnf                     

[openxs@fc23 maria10.1]$ cat /home/openxs/galera/mynode2.cnf                   
[mysqld]
datadir=/home/openxs/galera/node2
port=3307
socket=/tmp/mysql-node2.sock
pid-file=/tmp/mysql-node2.pid
log-error=/tmp/mysql-node2.err
binlog_format=ROW
innodb_autoinc_lock_mode=2

wsrep_on=ON # this is important for 10.1!
wsrep_provider=/home/openxs/galera/libgalera_smm.so
wsrep_cluster_name = singlebox
wsrep_node_name = node2
wsrep_cluster_address=gcomm://127.0.0.1:4567,127.0.0.1:5020?pc.wait_prim=no
wsrep_provider_options = "base_port=5020;"
Note that while a Galera node uses 4 ports, I specified only 2 unique ones explicitly: the port for MySQL clients, and the base port for all Galera-related communication like IST and SST, via the base_port setting. Note also how I referred to all cluster nodes with wsrep_cluster_address; this same value can actually be used in the configuration file of the first node. We can just start it as the first node of a new cluster (see below).
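
Once both nodes are started (the steps follow below), the quickest sanity check that they really joined the same cluster is to look at the wsrep status variables from either node; a minimal sketch:

SHOW GLOBAL STATUS LIKE 'wsrep_cluster_size';        -- expect 2 with both nodes up
SHOW GLOBAL STATUS LIKE 'wsrep_local_state_comment'; -- expect 'Synced'
SHOW GLOBAL STATUS LIKE 'wsrep_incoming_addresses';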

Now we have configuration files for 2 nodes ready (we can always add node3 later in the same way). But before starting the new cluster we have to install the system databases. For node1 this was done as follows:
[openxs@fc23 maria10.1]$ scripts/mysql_install_db --defaults-file=/home/openxs/galera/mynode1.cnf
Installing MariaDB/MySQL system tables in '/home/openxs/galera/node1' ...
2016-03-31 12:51:34 139766046820480 [Note] ./bin/mysqld (mysqld 10.1.13-MariaDB) starting as process 28297 ...
...

[openxs@fc23 maria10.1]$ ls -l /home/openxs/galera/node1
-rw-rw----. 1 openxs openxs    16384 Mar 31 12:51 aria_log.00000001
-rw-rw----. 1 openxs openxs       52 Mar 31 12:51 aria_log_control
-rw-rw----. 1 openxs openxs 12582912 Mar 31 12:51 ibdata1
-rw-rw----. 1 openxs openxs 50331648 Mar 31 12:51 ib_logfile0
-rw-rw----. 1 openxs openxs 50331648 Mar 31 12:51 ib_logfile1
drwx------. 2 openxs openxs     4096 Mar 31 12:51 mysql
drwx------. 2 openxs openxs     4096 Mar 31 12:51 performance_schema
drwx------. 2 openxs openxs     4096 Mar 31 12:51 test
Then I started node1 as a new cluster:
[openxs@fc23 maria10.1]$ bin/mysqld_safe --defaults-file=/home/openxs/galera/mynode1.cnf --wsrep-new-cluster &
and created a table, t1, with some data in it. After that I repeated the installation of system tables etc. for node2, just referencing the proper configuration file, and started node2, which was supposed to join the cluster:
[openxs@fc23 maria10.1]$ bin/mysqld_safe --defaults-file=/home/openxs/galera/mynode2.cnf &
Let's check that we do have both instances running and communicating in the Galera cluster:
[openxs@fc23 maria10.1]$ tail /tmp/mysql-node2.err                             
2016-03-31 13:40:29 139627414767744 [Note] WSREP: Signalling provider to continue.
2016-03-31 13:40:29 139627414767744 [Note] WSREP: SST received: c91d17b6-f72b-11e5-95de-96e95167f593:0
2016-03-31 13:40:29 139627117668096 [Note] WSREP: 1.0 (node2): State transfer from 0.0 (node1) complete.
2016-03-31 13:40:29 139627117668096 [Note] WSREP: Shifting JOINER -> JOINED (TO: 0)
2016-03-31 13:40:29 139627117668096 [Note] WSREP: Member 1.0 (node2) synced with group.
2016-03-31 13:40:29 139627117668096 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 0)
2016-03-31 13:40:29 139627414452992 [Note] WSREP: Synchronized with group, ready for connections
2016-03-31 13:40:29 139627414452992 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2016-03-31 13:40:29 139627414767744 [Note] /home/openxs/dbs/maria10.1/bin/mysqld: ready for connections.
Version: '10.1.13-MariaDB'  socket: '/tmp/mysql-node2.sock'  port: 3307  Source distribution

[openxs@fc23 maria10.1]$ tail /tmp/mysql-node1.err
2016-03-31 13:40:27 140071390934784 [Note] WSREP: Provider resumed.
2016-03-31 13:40:27 140072133322496 [Note] WSREP: 0.0 (node1): State transfer to 1.0 (node2) complete.
2016-03-31 13:40:27 140072133322496 [Note] WSREP: Shifting DONOR/DESYNCED -> JOINED (TO: 0)
2016-03-31 13:40:27 140072133322496 [Note] WSREP: Member 0.0 (node1) synced with group.
2016-03-31 13:40:27 140072133322496 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 0)
2016-03-31 13:40:27 140072429247232 [Note] WSREP: Synchronized with group, ready for connections
2016-03-31 13:40:27 140072429247232 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2016-03-31 13:40:27 140072141715200 [Note] WSREP: (c91c99ec, 'tcp://0.0.0.0:4567') turning message relay requesting off
2016-03-31 13:40:29 140072133322496 [Note] WSREP: 1.0 (node2): State transfer from 0.0 (node1) complete.
2016-03-31 13:40:29 140072133322496 [Note] WSREP: Member 1.0 (node2) synced with group.
Familiar messages (unfortunately...) that prove the second node joined and performed a state transfer from the first one. Now it's time to connect and check how the cluster works. This is what I had after node1 was started and the table with some data was created there, but before node2 was started:
[openxs@fc23 maria10.1]$ bin/mysql -uroot --socket=/tmp/mysql-node1.sock
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 5
Server version: 10.1.13-MariaDB Source distribution

Copyright (c) 2000, 2016, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> show variables like 'wsrep_cluster%';
+-----------------------+-------------------------------------------------------+
| Variable_name         | Value                                                 |
+-----------------------+-------------------------------------------------------+
| wsrep_cluster_address | gcomm://127.0.0.1:4567,127.0.0.1:5020?pc.wait_prim=no |
| wsrep_cluster_name    | singlebox                                             |
+-----------------------+-------------------------------------------------------+
2 rows in set (0.00 sec)

MariaDB [(none)]> show status like 'wsrep_cluster%';
+--------------------------+--------------------------------------+
| Variable_name            | Value                                |
+--------------------------+--------------------------------------+
| wsrep_cluster_conf_id    | 1                                    |
| wsrep_cluster_size       | 1                                    |
| wsrep_cluster_state_uuid | c91d17b6-f72b-11e5-95de-96e95167f593 |
| wsrep_cluster_status     | Primary                              |
+--------------------------+--------------------------------------+
4 rows in set (0.00 sec)

MariaDB [(none)]> use test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
MariaDB [test]> select * from t1;
+----+------+
| id | c1   |
+----+------+
|  1 |    1 |
|  2 |    2 |
+----+------+
2 rows in set (0.00 sec)
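For reference, the original post does not show the statements used to create t1. Judging from the output above, a hypothetical minimal version could look like this (Galera replicates InnoDB tables, and having a primary key is recommended):
bin/mysql -uroot --socket=/tmp/mysql-node1.sock test -e "
  CREATE TABLE t1 (id INT PRIMARY KEY, c1 INT) ENGINE=InnoDB;
  INSERT INTO t1 VALUES (1,1), (2,2);"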
Then, when node2 joined the cluster, I checked that the data we had added on node1 were there:

[openxs@fc23 maria10.1]$ bin/mysql -uroot --socket=/tmp/mysql-node2.sock
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 4
Server version: 10.1.13-MariaDB Source distribution

Copyright (c) 2000, 2016, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> use test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
MariaDB [test]> select * from t1;
+----+------+
| id | c1   |
+----+------+
|  1 |    1 |
|  2 |    2 |
+----+------+
2 rows in set (0.00 sec)

MariaDB [test]> show status like 'wsrep_cluster%';
+--------------------------+--------------------------------------+
| Variable_name            | Value                                |
+--------------------------+--------------------------------------+
| wsrep_cluster_conf_id    | 2                                    |
| wsrep_cluster_size       | 2                                    |
| wsrep_cluster_state_uuid | c91d17b6-f72b-11e5-95de-96e95167f593 |
| wsrep_cluster_status     | Primary                              |
+--------------------------+--------------------------------------+
4 rows in set (0.01 sec)
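As an extra sanity check (not in the original transcript), one could also write a row on node2 and read it back on node1, to confirm that replication works in both directions:
# insert on node2, then select on node1 - the new row should be visible there
bin/mysql -uroot --socket=/tmp/mysql-node2.sock test -e "INSERT INTO t1 VALUES (3,3)"
bin/mysql -uroot --socket=/tmp/mysql-node1.sock test -e "SELECT * FROM t1"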
So, the first basic test of a Galera cluster of 2 nodes (both running on the same box), built from the current sources of Galera and MariaDB 10.1.x on Fedora 23, completed successfully. I plan to play with it more in the future, use a current xtrabackup built from source for SST, and so on, and to create blog posts about these steps and any interesting tests in this setup. Stay tuned.

From the dates above you can conclude that it took me 3 weeks to publish this post. That's because I was busy with the company meeting in Berlin and the usual Support work, and was not sure whether it was really a good idea for me to write any post with the words "Galera" or "MariaDB" in it even once...



by Valeriy Kravchuk (noreply@blogger.com) at April 24, 2016 06:33 PM

April 22, 2016

Shlomi Noach

MySQL Community Awards 2016: the Winners

The MySQL Community Awards initiative is an effort to acknowledge and thank individuals and corporates for their contributions to the MySQL ecosystem. It is a from-the-community, by-the-community and for-the-community effort. The committee is composed of an independent group of community members of different orientation and opinion, themselves past winners or known contributors to the community.

The 2016 community awards were presented on April 21st, 2016, during the keynotes at the Percona Live conference. The winners are:

MySQL Community Awards: Community Contributor of the year 2016

  • Bill Karwin
    Bill has been working with the community for years, helping people understand SQL. Bill is the author of the great book "SQL Antipatterns". He has given a large amount of help on sites such as StackOverflow and Quora, and of course in many conference talks. Bill has provided a huge amount of direct help to the community.
  • Domas Mituzas
    Domas Mituzas started in the MySQL ecosystem as a MySQL Support Engineer at MySQL AB. Since he had some spare time, he did a lot of work to scale MySQL at Wikipedia. He is now a small data engineer at Facebook, mostly working with user-facing data systems. He continues to write very interesting blog posts and bug reports. Domas is responsible for giving us MyDumper, PoorMansProfiler, and the infamous Query Cache tuner!
  • Yoshinori Matsunobu
    Yoshinori Matsunobu is currently leading the MyRocks effort to get the RocksDB storage engine for MySQL into production at Facebook. Previously (amongst his other accomplishments) he created HandlerSocket, and implemented MHA to support failover automation for MySQL – both of which have been used at many companies. He is a frequent speaker at community events, and his tutorials and slide decks do a lot to increase expertise in the community. He is a frequent bug reporter with a focus on replication (RBR, semi-sync).

MySQL Community Awards: Application of the year 2016

  • MaxScale
    MariaDB MaxScale is an Open Source dynamic routing gateway. It is widely used as a database load balancer for Galera Cluster deployments, for standard replication setups, and as a replication relay. It has a modular architecture which includes plugins for read-write splitting and query logging. It serves a variety of tasks, from load balancing to database firewall filtering to acting as a binlog server, and runs in production in large topologies.

MySQL Community Awards: Corporate Contributor of the year 2016

  • Booking.com
    Booking.com has been a massive contributor to the MySQL ecosystem, sending many of their excellent DBAs to various conferences to talk. They have provided an innovative test bed for new releases, trying them out and giving a wealth of invaluable feedback (across a wide variety of MySQL and related software projects). Booking.com contributes to Open Source foundations, projects and communities by donating, sponsoring, making code contributions and hosting events. The quality of MySQL is undoubtedly much better today because of their help and input.

Congrats to all winners!

Committee members

  • Baron Schwartz
  • Colin Charles
  • Daniël van Eeden
  • Davi Arnaut
  • Frederic Descamps
  • Geoffrey Anderson
  • Giuseppe Maxia
  • Justin Swanhart
  • Mark Leith
  • Morgan Tocker
  • Philip Stoev
  • Ronald Bradford
  • Santiago Lertora

Co-secretaries:

  • Jeremy Cole
  • Shlomi Noach

Special thanks

Thank you to this year's anonymous sponsor for donating the goblets!

Thank you to Colin Charles for acquiring and transporting the goblets!

 

by shlomi at April 22, 2016 05:03 PM

MariaDB Foundation

MariaDB 5.5.49 and updated connectors now available

The MariaDB project is pleased to announce the immediate availability of MariaDB 5.5.49, MariaDB Connector/J 1.4.2, and MariaDB Connector/ODBC 2.0.10. See the release notes and changelogs for details on these releases.

  • Download MariaDB 5.5.49
  • Release Notes
  • Changelog
  • What is MariaDB 5.5?
  • MariaDB APT and YUM Repository Configuration Generator
  • Download MariaDB Connector/J 1.4.2
  • Release Notes
  • Changelog

[…]

The post MariaDB 5.5.49 and updated connectors now available appeared first on MariaDB.org.

by Daniel Bartholomew at April 22, 2016 02:11 PM