Planet MariaDB

July 28, 2016

Peter Zaitsev

Percona Monitoring and Management 1.0.2 Beta

Percona announces the release of Percona Monitoring and Management 1.0.2 Beta on 28 July 2016.

Like prior versions, PMM is distributed through Docker Hub and is free to download. Full instructions for download and installation of the server and client are available in the documentation.
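For those who want to try the beta, the overall install pattern is a persistent data container plus the PMM server container. The image tag, volume list and port below are assumptions based on the PMM 1.0.x documentation pattern, so treat this as a sketch and follow the linked documentation for the authoritative steps:

# Hedged sketch of a PMM 1.0.x server install; image tag, volumes and port
# are assumptions -- consult the official documentation before running.
docker pull percona/pmm-server:1.0.2

# Persistent data container (volumes-only pattern used by the PMM 1.x docs)
docker create \
  -v /opt/prometheus/data \
  -v /opt/consul-data \
  -v /var/lib/mysql \
  --name pmm-data percona/pmm-server:1.0.2 /bin/true

# The server now uses a single configurable port, 80 by default
docker run -d \
  -p 80:80 \
  --volumes-from pmm-data \
  --name pmm-server \
  percona/pmm-server:1.0.2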

Notable changes to the tool include:

  • Upgraded to Grafana 3.1.0.
  • Upgraded to Prometheus 1.0.1.
  • Set default metrics retention to 30 days.
  • Eliminated port 9001. Now the container uses only one configurable port, 80 by default.
  • Eliminated the need to specify ADDRESS variable when creating Docker container.
  • Completely re-wrote pmm-admin with more functions.
  • Added ability to stop all services using the new pmm-admin.
  • Added support to name instances using the new pmm-admin.
  • Query Analytics Application updates:
    • Redesigned queries profile table
    • Redesigned metrics table
    • Redesigned instance settings page
    • Added sparkline charts
    • Added ability to show more than ten queries
  • Various updates for MongoDB dashboards.

The full release notes are available in the documentation. The documentation also includes details on installation and architecture.

A demonstration of the tool has been set up at pmmdemo.percona.com.

We have also implemented forums for the discussion of PMM.

Help us improve our software quality by reporting any bugs you encounter using our bug tracking system. As always, thanks for your continued support of Percona!

Some screen shots of the updates:

Note the new sparkline that shows the current load in context (so you know whether the number is higher than, lower than, or at its normal level), and the option to “Load next 10 queries” at the bottom of the listing.

Sparklines in QAN

Our admin tool was completely re-written with new functions:

pmm-admin --help output

 

pmm-admin list command output

 

pmm-admin check-network output, which provides information on the status of the client’s network connection to the server.
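Taken together, the screenshots above correspond to a short command-line session along the following lines; the sub-commands come from the captions and the 1.0.2-era tool, so exact flags and output may differ in later releases:

# Commands matching the screenshots above (1.0.2-era pmm-admin)
pmm-admin --help          # show the functions of the rewritten admin tool
pmm-admin list            # list the services and instances being monitored
pmm-admin check-network   # report the client's network connectivity to the server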

by Bob Davis at July 28, 2016 07:39 PM

Upcoming Webinar August 2 10:00 am PDT: MySQL and Ceph

Join Brent Compton, Kyle Bader and Yves Trudeau on August 2, 2016 at 10 am PDT (UTC-7) for a MySQL and Ceph webinar.

Many operators select OpenStack as their control plane of choice for providing both internal and external IT services. The OpenStack user survey repeatedly shows Ceph as the dominant backend for providing persistent storage volumes through OpenStack Cinder. When building applications and repatriating old workloads, developers are discovering the need to provide database services on OpenStack infrastructure. Given MySQL’s ubiquity, and its reliance on persistent storage, it is of utmost importance to understand how to achieve the performance demanded by today’s applications. Databases like MySQL can be incredibly IO intensive, and Ceph offers a great opportunity to go beyond the limitations presented by a single scale-up system. Since Ceph provides a mutable object store with atomic operations, could MySQL store InnoDB pages directly in Ceph?

This talk reviews the general architecture of Ceph, and then discusses benchmark results from small to mid-size Ceph clusters. These benchmarks lead to the development of prescriptive guidance around tuning Ceph storage nodes (OSDs), the impact of the amount of physical memory, and the presence of SSDs, high-speed networks or RAID controllers.

Speakers:
Brent Compton
Director Storage Solution Architectures, Red Hat
Brent Compton is Director Storage Solution Architectures at Red Hat. He leads the team responsible for building Ceph and Gluster storage reference architectures with Red Hat Storage partners. Before Red Hat, Brent was responsible for emerging non-volatile memory software technologies at Fusion-io. Previous enterprise software leadership roles include VP Product Management at Micromuse (now IBM Tivoli Netcool) and Product Marketing Director within HP’s OpenView software division. Brent also served as Director Middleware Development Platforms at the LDS Church and as CIO at Joint Commission International. Brent has a tight-knit family, and can be found on skis or a mountain bike whenever possible.
Kyle Bader
Sr Solution Architect, Red Hat
Kyle Bader, a Red Hat senior architect, provides expertise in the design and operation of petabyte-scale storage systems using Ceph. He joined Red Hat as part of the 2014 Inktank acquisition. As a senior systems engineer at DreamHost, he helped implement, operate, and design Ceph and OpenStack-based systems for DreamCompute and DreamObjects cloud products.
Yves Trudeau
Principal Architect
Yves is a Principal Consultant at Percona, specializing in MySQL High-Availability and scaling solutions. Before joining Percona in 2009, he worked as a senior consultant for MySQL AB and Sun Microsystems, assisting customers across North America with NDB Cluster and Heartbeat/DRBD technologies. Yves holds a Ph.D. in Experimental Physics from Université de Sherbrooke. He lives in Québec, Canada with his wife and three daughters.

by Dave Avery at July 28, 2016 06:27 PM

Jean-Jerome Schmidt

Planets9s - Sign up for our webinar trilogy on MySQL Query Tuning

Welcome to this week’s Planets9s, covering all the latest resources and technologies we create around automation and management of open source database infrastructures.

Sign up for our webinar trilogy on MySQL Query Tuning

This is a new webinar trilogy on MySQL Query Tuning, which follows the popular webinar on MySQL database performance tuning. In this trilogy, we will look at the query tuning process and the tools that help with it. We’ll cover topics such as SQL tuning, indexing, the optimizer and how to leverage EXPLAIN to gain insight into execution plans. This is a proper deep-dive into optimising MySQL queries, which we’re covering in three parts.

Sign up for the webinars

ClusterControl Developer Studio: MongoDB Replication Lag Advisor

This blog post explains, step by step, how we implemented our MongoDB replication lag advisor in our Developer Studio. We have included this advisor in ClusterControl 1.3.2, and enabled it by default on any MongoDB cluster or replica set. ClusterControl Developer Studio allows you to write your own scripts, advisors and alerts. With just a few lines of code, you can already automate your clusters. Happy clustering!

Read the blog

MySQL on Docker: Single Host Networking for MySQL Containers

Having covered the basics of running MySQL in a container and how to build a custom MySQL image in our previous MySQL on Docker posts, we are now going to cover the basics of how Docker handles single-host networking and how MySQL containers can leverage that. We’d love to hear your feedback, so feel free to comment on our blogs as well.

Read the blog

That’s it for this week! Feel free to share these resources with your colleagues and follow us in our social media channels.

Have a good end of the week,

Jean-Jérôme Schmidt
Planets9s Editor
Severalnines AB

by Severalnines at July 28, 2016 07:24 AM

July 27, 2016

Peter Zaitsev

Percona Live Europe Amsterdam 2016 Tutorial Schedule is Up!

This blog post lists the Percona Live Europe Amsterdam 2016 tutorial schedule.

We are excited to announce that the tutorial schedule for the Percona Live Europe Amsterdam Open Source Database Conference 2016 is up!

The Percona Live Europe Amsterdam Open Source Database Conference is the premier event for the diverse and active open source community, as well as businesses that develop and use open source software. The conferences have a technical focus with an emphasis on the core topics of MySQL, MongoDB, and other open source databases.

Tackling subjects such as analytics, architecture and design, security, operations, scalability and performance, Percona Live Europe provides in-depth discussions for your high-availability, IoT, cloud, big data and other changing business needs. This conference is an opportunity to network with peers and technology professionals by bringing together accomplished DBAs, system architects and developers from around the world to share their knowledge and experience – all to help you learn how to tackle your open source database challenges in a whole new way. These tutorials are a must for any data performance professional!

The Percona Live Europe Open Source Database Conference is October 3-5 at the Mövenpick Hotel Amsterdam City Centre.

Click through to the tutorial link right now, look them over, and pick which sessions you want to attend. Discounted passes available below!

Tutorial List:
Early Bird Discounts

Just a reminder to everyone out there: our Early Bird discount rate for the Percona Live Europe Amsterdam Open Source Database Conference is only available ‘til August 8, 2016, 11:30 pm PST! This rate gets you all the excellent and amazing opportunities that Percona Live offers, at a very reasonable price!

Sponsor Percona Live

Become a conference sponsor! We have sponsorship opportunities available for this annual MySQL, MongoDB and open source database event. Sponsors become a part of a dynamic and growing ecosystem and interact with hundreds of DBAs, sysadmins, developers, CTOs, CEOs, business managers, technology evangelists, solutions vendors, and entrepreneurs who attend the event.

by Kortney Runyan at July 27, 2016 10:09 PM

Monitoring MongoDB with Nagios

In this blog, we’ll discuss monitoring MongoDB with Nagios.

There is a significant amount of talk around graphing MongoDB metrics using things like Prometheus, Data Dog, New Relic, and Ops Manager from MongoDB Inc. However, I haven’t noticed a lot of talk around “What MongoDB alerts should I be setting up?”

While building out Percona’s remote DBA service for MongoDB, I looked at Prometheus’s AlertManager. After reviewing it, I’m not sure it’s quite ready to be used exclusively. We needed to decide quickly whether there were better Nagios checks on the market, or whether I needed to write my own.

In the end, we settled on a hybrid approach. There are some good frameworks, but we needed to create or tweak some of the things required for an “SEV 1” or “SEV 2” type issue (which are most important to me). One of the most common problems for operations, Ops, DevOps, DBA and most engineering teams is alert spam. As such, I wanted to be very careful to only alert on things pointing to immediate dangers or current outages. As a result, we have now added pmp-check-mongo.py to the GitHub repository for Percona Monitoring Plugins. Since we use Grafana and Prometheus for metrics and graphing, there are no accompanying Cacti templates. In the future, we’ll need to decide how this will change PMP over time. In the meantime, we wanted to make the tool available now and worry about some of these issues later on.

As part of this push, I want to give you some real world examples of how you might use this tool. There are many options available to you, and Nagios is still a bit green in regards to making those options as user-friendly as our tools are.

Usage: pmp-check-mongo.py [options]
Options:
  -h, --help                         show this help message and exit
  -H HOST, --host=HOST               The hostname you want to connect to
  -P PORT, --port=PORT               The port mongodb is running on
  -u USER, --user=USER               The username you want to login as
  -p PASSWD, --password=PASSWD       The password you want to use for that user
  -W WARNING, --warning=WARNING      The warning threshold you want to set
  -C CRITICAL, --critical=CRITICAL   The critical threshold you want to set
  -A ACTION, --action=ACTION         The action you want to take. Valid choices are
                                     (check_connections, check_election, check_lock_pct,
                                     check_repl_lag, check_flushing, check_total_indexes,
                                     check_balance, check_queues, check_cannary_test,
                                     check_have_primary, check_oplog, check_index_ratio,
                                     check_connect) Default: check_connect
  -s SSL, --ssl=SSL                  Connect using SSL
  -r REPLICASET, --replicaset=REPLICASET    Connect to replicaset
  -c COLLECTION, --collection=COLLECTION    Specify the collection in check_cannary_test
  -d DATABASE, --database=DATABASE          Specify the database in check_cannary_test
  -q QUERY, --query=QUERY                   Specify the query, only used in check_cannary_test
  --statusfile=STATUS_FILENAME      File to current store state data in for delta checks
  --backup-statusfile=STATUS_FILENAME_BACKUP    File to previous store state data in for delta checks
  --max-stale=MAX_STALE             Age of status file to make new checks (seconds)

There seems to be a huge amount going on here, but let’s break it down into a few categories:

  • Connection options
  • Actions
  • Action options
  • Status options

Hopefully, this takes some of the scariness out of the script above.
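Before we break the options down, here is the simplest possible invocation, using only flags from the help output above (host, port and credentials are placeholders):

# Basic connectivity check -- all flags are from the --help output above;
# hostname, port and credentials are placeholders.
./pmp-check-mongo.py -H db01.example.com -P 27017 \
    -u nagios -p secret \
    -A check_connect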

Connection options
  • Host / Port Number
    • Pretty simple, this is just the host you want to connect to and what TCP port it is listening on.
  • Username and Password
    • Like with Host/Port, these are your normal and typical Mongo connection options. If you do not set both the username and password, the system will assume auth was disabled.
  • SSL
    • This is mostly around the old SSL support in Mongo clients (which was a boolean). This tool needs updating to support the more modern SSL connection options. Use this as a “deprecated” feature that might not work on newer versions.
  • ReplicaSet
    • Very particular option that is only used for a few checks and verifies that the connection uses a replicaset connection. Using this option lets the tool automatically find a primary node for you, and is helpful to some checks specifically around replication and high availability (HA):
      • check_election
      • check_repl_lag
      • check_cannary_test
      • check_have_primary
      • check_oplog
Actions and what they mean
  • check_connections
    • Connection counts matter for memory usage, but beyond that you need to know if your typical number of connections suddenly doubles. This indicates something unexpected happened in the application or database and caused everything to reconnect. It often takes up to 10 minutes for those old connections to go away.
  • check_election
    • This uses the status file options we will cover in a minute, but it checks to see if the primary from the last check differs from the current found primary. If so, it alerts. This check should only have a threshold of one before it alarms (as an alert means an HA event occurred).
  • check_lock_pct
    • MMAP only: this engine takes a write lock on the whole database or collection, depending on the version. This is a crucial metric to determine if MMAP writes are blocking reads, meaning you need to scale the DB layer in some way.
  • check_repl_lag
    • Checks the replication stream to understand how far a given node lags behind the primary (see the example invocation after this list). To accomplish this, it uses a fake record in the test DB to cause a write. Without this, a read-only system would artificially look lagged, as no new oplog entries get created.
  • check_flushing
    • A common issue with MongoDB is very long flush times, causing a system halt. This is caused by your disk subsystem not keeping up, and the DB then having to wait on flushing to make sure writes get correctly journaled.
  • check_total_indexes
    • The more indexes you have, the more the planner has to work to determine which index is a good fit. This increases the risk that recovery from a failure will take a long time. This is due to the way a restore builds indexes and how MongoDB can only build one index at a time.
  • check_balance
    • While MongoDB should keep things in balance across a cluster, many things can happen: jumbo chunks, the balancer being disabled, the balancer constantly attempting to move the same chunk but failing, and even adding or removing shards. This alert is for these cases, as an imbalance means some records might get served faster than others. It is purely based on the chunk count that the MongoDB balancer itself uses, which is not necessarily the same as disk usage.
  • check_queues
    • No matter what engine you have selected, a backlog of sustained reads or writes indicates your DB layer is unable to keep up with demand. It is important in these cases to send an alert if the rate is sustained. You might notice this metric is also in our Prometheus exporter for graphing, as both trending and alerting are necessary to watch in a MongoDB system.
  • check_cannary_test
    • This runs a typical query against the database and sets critical/warning levels based on the latency of the returned query. While not as accurate as full synthetic transactions, queries through the application are good to measure response time expectations and SLAs.
  • check_have_primary
    • If we had an HA event but failed to get back up quickly, it’s important to know whether the lack of a new primary is causing writes to error on the system. This check simply determines if the replica set has a primary, which means it can handle reads and writes.
  • check_oplog
    • This check is all about how much oplog history you have. This is much like measuring how much history you have in MySQL binlogs. The reason this is important is that when recovering from a backup and performing a point-in-time recovery, you can use the current oplog if the oldest timestamp in the oplog is newer than the backup timestamp. As a rule of thumb, this should normally be three times the backup interval you use, to guarantee that you have plenty of time to find the newest usable backup and then perform the recovery.
  • check_index_ratio
    • This is an older metric that modern MongoDB versions will not find useful, but in the past, it was a good way to understand the percentage of queries not handled by an index.
  • check_connect
    • A very basic check to ensure it can connect (and optionally log in) to MongoDB and verify the server is working.
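To make the replication-related actions above more concrete, here are two hedged example invocations; the warning/critical thresholds are illustrative placeholders, and their exact units should be confirmed against the plugin’s defaults:

# Replication lag check against a replica set -- the -W/-C thresholds are
# illustrative placeholders; confirm their units against the plugin defaults.
./pmp-check-mongo.py -H db01.example.com -P 27017 -u nagios -p secret \
    -r rs0 -A check_repl_lag -W 300 -C 600

# Alert if the replica set has no primary (an immediate-danger condition)
./pmp-check-mongo.py -H db01.example.com -P 27017 -u nagios -p secret \
    -r rs0 -A check_have_primary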
Status File options

These options rarely need to be changed but are present in case you want to store the status on an SHM mount point to avoid actual disk writes.

  • statusfile
    • This is where a copy of the current rs.status, serverStatus and other command data is stored
  • backup-statusfile
    • Like statusfile, but the statusfile is moved here when a new check is done. These two files can then be compared to find the delta between two checkpoints.
  • max-stale
    • This is the maximum age for which an old status file is still valid. Deltas older than this aren’t allowed; this exists to protect the system from making bad assumptions when a statusfile is hours or days old.

If you have any questions on how to use these parameters, feel free to let us know. In the code, there is also a defaults dictionary for most of these options, so that in many cases setting warning and critical levels is not needed.

by David Murphy at July 27, 2016 02:08 PM

Jean-Jerome Schmidt

New Webinar Trilogy: The MySQL Query Tuning Deep-Dive

Following our popular webinar on MySQL database performance tuning, we’re excited to introduce a new webinar trilogy dedicated to MySQL query tuning.

This is an in-depth look into the ins and outs of optimising MySQL queries conducted by Krzysztof Książek, Senior Support Engineer at Severalnines.

When done right, tuning MySQL queries and indexes can significantly increase the performance of your application as well as decrease response times. This is why we’ll be covering this complex topic over the course of three webinars of 60 minutes each.

Dates

Part 1: Query tuning process and tools

Tuesday, August 30th
Register

Part 2: Indexing and EXPLAIN - deep dive

Tuesday, September 27th
Register

Part 3: Working with the optimizer and SQL tuning

Tuesday, October 25th
Register

Agenda

Part 1: Query tuning process and tools

  • Query tuning process
    • Build
    • Collect
    • Analyze
    • Tune
    • Test
  • Tools
    • tcpdump
    • pt-query-digest

Part 2: Indexing and EXPLAIN - deep dive

  • How are B-Tree indexes built?
  • Indexes - MyISAM vs. InnoDB
  • Different index types
    • B-Tree
    • Fulltext
    • Hash
  • Indexing gotchas
  • EXPLAIN walkthrough - query execution plan

Part 3: Working with the optimizer and SQL tuning

  • Optimizer
    • How execution plans are calculated
    • InnoDB statistics
  • Hinting the optimizer
    • Index hints
    • JOIN order modifications
    • Tweakable optimizations
  • Optimizing SQL

Speaker

Krzysztof Książek, Senior Support Engineer at Severalnines, is a MySQL DBA with experience in managing complex database environments for companies like Zendesk, Chegg, Pinterest and Flipboard. He’s the main author of the Severalnines blog and webinar series: Become a MySQL DBA.

by Severalnines at July 27, 2016 01:12 PM

July 26, 2016

Peter Zaitsev

Testing Samsung storage in tpcc-mysql benchmark of Percona Server

This blog post will detail the results of Samsung storage in the tpcc-mysql benchmark using Percona Server.

I had an opportunity to test different Samsung storage devices under tpcc-mysql benchmark powered by Percona Server 5.7. You can find a summary with details here https://github.com/Percona-Lab-results/201607-tpcc-samsung-storage/blob/master/summary-tpcc-samsung.md

I have in my possession:

  • Samsung 850 Pro, 2TB: This is a SATA device and is positioned as consumer-oriented, something that you would use in a high-end user desktop. As of this post, I estimate the price of this device as around $430/TB.
  • Samsung SM863, 1.92TB: this device is also SATA, and is positioned for server usage. The current price is about $600/TB.
  • Samsung PM1725, 800GB: This is an NVMe device, in a 2.5″ form factor, but it requires a connection to a PCIe slot, which I had to allocate in my server. The device is high-end, oriented for server-side and demanding workloads. The current price is about $1300/TB.

I am going to use 1000 warehouses in the tpcc-mysql benchmarks, which corresponds roughly to a data size of 100GB.

This benchmark varies the innodb_buffer_pool_size from 5GB to 115GB. With a 5GB buffer pool size, only a very small portion of data fits into memory, which results in intensive foreground IO reads and intensive background IO writes. With 115GB, almost all data fits into memory, which results in very small (or almost zero) IO reads and moderate background IO writes.
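For readers who want to reproduce a similar run, the general shape of such a benchmark script is sketched below. The tpcc-mysql option syntax has changed between versions, and the connection count, warmup and duration values are assumptions rather than the exact settings used here:

# Hedged sketch of a tpcc-mysql run at 1000 warehouses (~100GB of data).
# Flags and values are assumptions; check your tpcc-mysql version's README.
mysql -e "CREATE DATABASE tpcc1000"
mysql tpcc1000 < create_table.sql
mysql tpcc1000 < add_fkey_idx.sql
./tpcc_load -h 127.0.0.1 -d tpcc1000 -u root -p '' -w 1000

# For each measurement point: set innodb_buffer_pool_size in my.cnf,
# restart mysqld, then run the benchmark and save the log.
for bp in 5G 15G 25G 35G 45G 55G 65G 75G 85G 95G 105G 115G; do
    # (assumes my.cnf has been updated to innodb_buffer_pool_size=$bp)
    systemctl restart mysqld
    ./tpcc_start -h 127.0.0.1 -d tpcc1000 -u root -p '' \
        -w 1000 -c 32 -r 300 -l 3600 > tpcc_${bp}.log
done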

Buffer pool sizes in the middle of the interval result in a corresponding mix of IO reads and writes. For example, we can see the read to write ratio on the chart below (taken on the PM1725 device) with different buffer pool sizes:

tpcc-mysql benchmarks

We can see that for the 5GB buffer pool size we have 56,000 read IOPS and 32,000 write IOPS. For 115GB, reads are minimal at about 300 IOPS, and background writes are at the 20,000 IOPS level. Reads gradually decline with increasing buffer pool size.

The charts are generated with the Percona Monitoring and Management tools.

Results

Let’s review the results. The first chart shows measurements taken every one second, allowing us to see the trends and stalls.

tpcc-mysql benchmarks

If we take averages, the results are:

tpcc-mysql benchmarks

In table form (the results are in new order transactions per minute (NOTPM)):

bp, GB    pm1725      sam850     sam863      pm1725/sam863  pm1725/sam850
5          42427.57    1931.54    14709.69            2.88          21.97
15         78991.67    2750.85    31655.18            2.50          28.72
25        108077.56    5156.72    56777.82            1.90          20.96
35        122582.17    8986.15    93828.48            1.31          13.64
45        127828.82   12136.51   123979.99            1.03          10.53
55        130724.59   19547.81   127971.30            1.02           6.69
65        131901.38   27653.94   131020.07            1.01           4.77
75        133184.70   38210.94   131410.40            1.01           3.49
85        133058.50   39669.90   131657.16            1.01           3.35
95        133553.49   39519.18   132882.29            1.01           3.38
105       134021.26   39631.03   132126.29            1.01           3.38
115       134037.09   39469.34   132683.55            1.01           3.40

Conclusion

The Samsung 850 obviously can’t keep up with the more advanced SM863 and PM1725. The PM1725 shows the greatest benefit with smaller buffer pool sizes. With large amounts of memory, there is practically no difference from the SM863. The reason is that with big buffer pool sizes, MySQL does not push the IO subsystem hard enough to use all of the PM1725’s performance.

For reference, the my.cnf file is:

[mysqld]
datadir=/var/lib/mysql
socket=/tmp/mysql.sock
ssl=0
symbolic-links=0
sql_mode=NO_ENGINE_SUBSTITUTION,STRICT_TRANS_TABLES
# general
thread_cache_size=2000
table_open_cache = 200000
table_open_cache_instances=64
back_log=1500
query_cache_type=0
max_connections=4000
# files
innodb_file_per_table
innodb_log_file_size=15G
innodb_log_files_in_group=2
innodb_open_files=4000
innodb_io_capacity=10000
loose-innodb_io_capacity_max=12000
innodb_lru_scan_depth=1024
innodb_page_cleaners=32
# buffers
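# note: innodb_buffer_pool_size was varied from 5G to 115G for the benchmark runs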
innodb_buffer_pool_size= 200G
innodb_buffer_pool_instances=8
innodb_log_buffer_size=64M
# tune
innodb_doublewrite= 1
innodb_support_xa=0
innodb_thread_concurrency=0
innodb_flush_log_at_trx_commit= 1
innodb_flush_method=O_DIRECT_NO_FSYNC
innodb_max_dirty_pages_pct=90
join_buffer_size=32K
sort_buffer_size=32K
innodb_use_native_aio=0
innodb_stats_persistent = 1
# perf special
innodb_adaptive_flushing = 1
innodb_flush_neighbors = 0
innodb_read_io_threads = 16
innodb_write_io_threads = 8
innodb_purge_threads=4
innodb_adaptive_hash_index=0
innodb_change_buffering=none
loose-innodb-log_checksum-algorithm=crc32
loose-innodb-checksum-algorithm=strict_crc32
loose-innodb_sched_priority_cleaner=39
loose-metadata_locks_hash_instances=256

by Vadim Tkachenko at July 26, 2016 06:00 PM

July 25, 2016

Peter Zaitsev

Percona XtraBackup 2.4.4 is now available

Percona announces the GA release of Percona XtraBackup 2.4.4 on July 25th, 2016. You can download it from our download site and from apt and yum repositories.

Percona XtraBackup enables MySQL backups without blocking user queries, making it ideal for companies with large data sets and mission-critical applications that cannot tolerate long periods of downtime. Offered free as an open source solution, Percona XtraBackup drives down backup costs while providing unique features for MySQL backups.
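As a quick orientation for anyone new to the tool, a basic hot backup and prepare cycle with the 2.4 series looks roughly like this; the target directory and credentials are placeholders, and options such as compression and incremental backups are covered in the documentation:

# Minimal backup-and-prepare cycle with XtraBackup 2.4 -- the target
# directory and credentials are placeholders.
xtrabackup --backup --user=backup --password=secret \
    --target-dir=/data/backups/full/

# Apply the redo log so the backup is consistent and ready to restore.
xtrabackup --prepare --target-dir=/data/backups/full/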

New Features:

  • Percona XtraBackup has been rebased on MySQL 5.7.13.

Bugs Fixed:

  • Percona XtraBackup reported the difference in the actual size of the system tablespace and the size which was stored in the tablespace header. This check is now skipped for tablespaces with autoextend support. Bug fixed #1550322.
  • Because Percona Server 5.5 and MySQL 5.6 store the LSN offset for large log files at different places inside the redo log header, Percona XtraBackup was trying to guess which offset is better to use by trying to read from each one and compare the log block numbers and assert lsn_chosen == 1 when both LSNs looked correct, but they were different. Fixed by improving the server detection. Bug fixed #1568009.
  • Percona XtraBackup didn’t correctly detect when tables were both compressed and encrypted. Bug fixed #1582130.
  • Percona XtraBackup would crash if the keyring file was empty. Bug fixed #1590351.
  • Backup couldn’t be prepared when the size in cache didn’t match the physical size. Bug fixed #1604299.
  • Free Software Foundation address in copyright notices was outdated. Bug fixed #1222777.
  • Backup process would fail if the datadir specified on the command line was not the same as the one reported by the server. Percona XtraBackup now allows the datadir from my.cnf to override the one from SHOW VARIABLES. xtrabackup prints a warning that they don’t match, but continues. Bug fixed #1526467.
  • With the upstream change of maximum page size from 16K to 64K, the size of the incremental buffer became 1G, which increased the RAM requirement to 1G in order to prepare the backup, while in fact there is no need to allocate such a large buffer for smaller pages. Bug fixed #1582456.
  • Backup process would fail on MariaDB Galera cluster operating in GTID mode if binary logs were in non-standard directory. Bug fixed #1517629.

Other bugs fixed: #1583717, #1583954, and #1599397.

Release notes with all the bugfixes for Percona XtraBackup 2.4.4 are available in our online documentation. Please report any bugs to the launchpad bug tracker.

by Hrvoje Matijakovic at July 25, 2016 06:05 PM

MongoDB Consistent Backups

In this post, I’m going to discuss MongoDB consistent backups, and how to achieve them.

You might have read before that MongoDB backups are not consistent. But what if I told you there is a tool that could make them consistent? What if this tool would also make them cluster-wide consistent, automatically compress the backup, become the first step toward continual incremental recording, notify your monitoring system and upload the backup to cloud storage for you?

It’s all TRUE!

Recently Percona-Labs created a new repository aimed at exactly these issues. We hope it will eventually grow into something that becomes part of the officially supported tools (like Percona Toolkit and Percona’s XtraBackup utility). Before we get into how it works, let’s talk about why we need it and its key highlights. Then (for all the engineering types reading this) we can discuss what it does and why.

Why do we need a consistent backup tool?

The first thing to note is you absolutely can’t have a consistent backup on a working system unless your node is in a replicaset. (You could even have a single node replicaset for this to be accurate.) Why? Consistency requires an operations log to say what changes occurred from the first point in the backup to the last point. This lets us ensure we are consistent to the end timestamp of the backup. We are unable to verify consistency when the MongoDB backup started without the ability to take a “snapshot” of data and then save the data while other changes occur. MongoDB does not have ACID-like isolation in this way. However, it can be consistent to the backup endpoint by applying any deltas at the end of the backup restore process.

You might say, “but mongodump already provides --oplog for this feature.” You are right: it does, and it works great if you only have a single replicaset to backup.
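For a single replica set, that standard approach looks something like this (the replica set name, host and output path are placeholders):

# mongodump's built-in oplog capture for one replica set; names and paths
# below are placeholders.
mongodump --host rs0/db01.example.com:27017 \
    --oplog --out /backups/rs0-$(date +%F)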
When we bring sharding into the mix, however, things get vastly more complicated. It ignores that flag and hits your primaries:

(Diagram: per-shard backups and oplog recording ending at different times, leaving no common consistency point)

In the diagram above you can see the backup and oplog recording for the first shard ended long before the second shard. As such, the consistency point needed is nowhere close to being covered by the red line. Even if all your shards are the same size, there would be some level of variance due to network, disk, CPU and memory speeds. The new tool helps you here by keeping track of the dumps, but also by having a thread recording the oplog for all shards until the last shard finishes. This ensures that all shards can be synced to the point in time where the last shard finished. At that moment in time, we have a consistent backup across all the shards. As you can see below, the oplog finished watching both shards after the last shard finished. On recovery, they remain in sync.

(Diagram: oplog tailing continuing on all shards until the last shard’s dump finishes, giving a common consistency point)

You might ask, “well, what about the metadata stored in the config servers?” This is a great question, as the behavior of our tool differs depending on whether you’re using MongoDB 3.2’s new config servers as a replica set feature, or the legacy config server approach.

In the legacy mode, we fsyncAndLock the config servers just long enough to record a server config data dump. Then we stop the oplog tailer threads for all the shards. After that, and after the oplog tailers finish, we unlock the config server. This ensures we remove the race conditions that could occur if it took longer than expected to close an oplog cursor. However, if we run in 3.2 mode, the config servers act just like another shard. They get dumped at the same time, and the oplog just gets tailed until we complete the data shard dumps. The newest features available to MongoDB Community, MongoDB Enterprise, and Percona Server for MongoDB 3.2 make the process much simpler.

Key Takeaways from new tool

  1. Not yet an official Percona tool, but it is already being used by people, as it’s just a wrapper that runs multiple mongodumps for you.
  2. If you execute the make setup, it outputs a single binary file that needs only Python 2.7 installed on your database system, even though under the hood it’s running many Python modules in a virtualenv.
  3. Dumps all shards in parallel and keeps tailing the oplog until all dumps are complete.
  4. Handles backing up metadata for both old and new config server topologies.
  5. Can currently upload to S3, but more cloud storage options are coming.
  6. Backups are compressed by default.
  7. Uses the cluster_name, time, and shard_name to make backup paths look like /cluster1/<timestamp>/shard1.tgz, helping you keep things organized and letting you remove old backups by timestamp and cluster name.

Desired Roadmap

  • Mature into an officially supported Percona product like XtraBackup
  • Fully open source and welcoming community improvements
  • Extending uploading to CloudFiles by Rackspace, Azure ZRS, Google Cloud Storage and more
  • Complementary documentation on restores (though you can also just use the mongorestore tool natively)
  • Modular backup methods to extend to mongodump, LVM snapshots, ISCSI, EBS snapshots, MongoDB commands and more
  • Encryption before saving to disk
  • Partial backups and restores limited to specific databases and collections
  • Offline backup querying

Please be sure to check out the GitHub @mongodb_consistent_backup and log any issues or feature requests.

Feel free to reach out to me on Twitter @dbmurphy_data or @percona with any questions or suggestions as well.

by David Murphy at July 25, 2016 05:35 PM

Jean-Jerome Schmidt

MySQL on Docker: Single Host Networking for MySQL Containers

Networking is critical in MySQL: it is a fundamental resource for managing access to the server from client applications and other replication peers. The behaviour of a containerized MySQL service is determined by how the MySQL image is spawned with the “docker run” command. With Docker single-host networking, a MySQL container can be run in an isolated environment (only reachable by containers in the same network), or an open environment (where the MySQL service is totally exposed to the outside world), or the instance can simply run with no network at all.

In the previous two blog posts, we covered the basics of running MySQL in a container and how to build a custom MySQL image. In today’s post, we are going to cover the basics of how Docker handles single-host networking and how MySQL containers can leverage that.

3 Types of Networks

By default, Docker creates 3 networks on the machine host upon installation:

$ docker network ls
NETWORK ID          NAME                DRIVER
1a54de857c50        host                host
1421a175401a        bridge              bridge
62bf0f8a1267        none                null

Each network driver has its own characteristic, explained in the next sections.

Host Network

The host network adds a container on the machine host’s network stack. You may imagine containers running in this network are connecting to the same network interface as the machine host. It has the following characteristics:

  • Container’s network interfaces will be identical with the machine host.
  • Only one host network per machine host. You can’t create more.
  • You have to explicitly specify “--net=host” in the “docker run” command line to assign a container to this network.
  • Container linking, “--link mysql-container:mysql” is not supported.
  • Port mapping, “-p 3307:3306” is not supported.

Let’s create a container on the host network with “--net=host”:

$ docker run \
--name=mysql-host \
--net=host \
-e MYSQL_ROOT_PASSWORD=mypassword \
-v /storage/mysql-host/datadir:/var/lib/mysql \
-d mysql

When we look into the container’s network interface, the network configuration inside the container is identical to the machine host:

[machine-host]$ docker exec -it mysql-host /bin/bash
[container-host]$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 00:0c:29:fa:f6:30 brd ff:ff:ff:ff:ff:ff
    inet 192.168.55.166/24 brd 192.168.55.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::20c:29ff:fefa:f630/64 scope link
       valid_lft forever preferred_lft forever
3: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether 02:42:93:50:ee:c8 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 scope global docker0
       valid_lft forever preferred_lft forever
    inet6 fe80::42:93ff:fe50:eec8/64 scope link

In this setup, the container does not need any forwarding rules in iptables since it’s already attached to the same network as the host. Hence, port mapping using option “-p” is not supported and Docker will not manage the firewall rules of containers that run in this type of network.

If you look at the listening ports on the host machine, port 3306 is listening as it should:

[machine-host]$ netstat -tulpn | grep 3306
tcp6       0      0 :::3306                 :::*                    LISTEN      25336/mysqld

Having a MySQL container running on the Docker host network is similar to having a standard MySQL server installed on the host machine. This is only helpful if you want to dedicate the host machine as a MySQL server, albeit one managed by Docker.
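Since the container shares the host’s network stack, connecting from the machine host is just a normal local connection (using the root password set in the run command above):

# Connect from the machine host; the container listens directly on port 3306
[machine-host]$ mysql -uroot -pmypassword -h127.0.0.1 -P3306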

Now, our container architecture can be illustrated like this:

Containers created on the host network are reachable by containers created inside the default docker0 bridge and user-defined bridges.

Bridge network

Bridging allows multiple networks to communicate independently while keeping them separated on the same physical host. You may imagine this is similar to another internal network inside the host machine. Only containers in the same network can reach each other, including the host machine. If the host machine can reach the outside world, so can the containers.

There are two types of bridge networks:

  1. Default bridge (docker0)
  2. User-defined bridge

Default bridge (docker0)

The default bridge network, docker0 will be automatically created by Docker upon installation. You can verify this by using the “ifconfig” or “ip a” command. The default IP range is 172.17.0.1/16 and you can change this inside /etc/default/docker (Debian) or /etc/sysconfig/docker (RedHat). Refer to Docker documentation if you would like to change this.

Let’s jump into an example. Basically, if you don’t explicitly specify “--net” parameter in the “docker run” command, Docker will create the container under the default docker0 network:

$ docker run \
--name=mysql-bridge \
-p 3307:3306 \
-e MYSQL_ROOT_PASSWORD=mypassword \
-v /storage/mysql-bridge/datadir:/var/lib/mysql \
-d mysql

And when we look at the container’s network interface, Docker creates one network interface, eth0 (excluding localhost):

[machine-host]$ docker exec -it mysql-bridge /bin/bash
[container-host]$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
4: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether 02:42:ac:11:00:02 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.2/16 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::42:acff:fe11:2/64 scope link
       valid_lft forever preferred_lft forever

By default, Docker utilises iptables to manage packet forwarding to the bridge network. Each outgoing connection will appear to originate from one of the host machine’s own IP addresses. The following is the machine’s NAT chains after the above container was started:

[machine-host]$ iptables -L -n -t nat
Chain POSTROUTING (policy ACCEPT)
target     prot opt source               destination
MASQUERADE  all  --  172.17.0.0/16        0.0.0.0/0
MASQUERADE  tcp  --  172.17.0.2           172.17.0.2           tcp dpt:3306

Chain DOCKER (2 references)
target     prot opt source               destination
DNAT       tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:3307 to:172.17.0.2:3306

The above rules allow port 3307 to be exposed on the machine host based on the port mapping option “-p 3307:3306” in the “docker run” command line. If we look at the netstat output on the host, we can see MySQL is listening on port 3307, owned by the docker-proxy process:

[machine-host]$ netstat -tulpn | grep 3307
tcp6       0      0 :::3307                 :::*                    LISTEN      4150/docker-proxy

At this point, our container setup can be illustrated below:

The default bridge network supports the use of port mapping and container linking to allow communication between containers in the docker0 network. If you would like to link another container, you can use the “--link” option in the “docker run” command line. Docker documentation provides extensive details on how the container linking works by exposing environment variables and auto-configured host mapping through /etc/hosts file.
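To illustrate both access paths to the container started above, you can connect through the mapped port from the machine host, or link a second container and use the link alias; the following is a sketch that simply reuses the mysql image as a client:

# From the machine host, through the port mapping (-p 3307:3306):
[machine-host]$ mysql -uroot -pmypassword -h127.0.0.1 -P3307

# From another container, via container linking -- the "mysql" alias is made
# resolvable inside the client container through /etc/hosts:
[machine-host]$ docker run -it --rm \
    --link mysql-bridge:mysql \
    mysql mysql -hmysql -uroot -pmypassword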

User-defined bridge

Docker allows us to create a custom bridge network, a.k.a. a user-defined bridge network (you can also create a user-defined overlay network, but we are going to cover that in the next blog post). It behaves exactly like the docker0 network, where each container in the network can immediately communicate with the other containers in the network. The network itself isolates the containers from external networks.

The big advantage of having this network is that all containers have the ability to resolve each other’s container names. Consider the following network:

[machine-host]$ docker network create mysql-network

Then, create 5 mysql containers under the user-defined network:

[machine-host]$ for i in {1..5}; do docker run --name=mysql$i --net=mysql-network -e MYSQL_ROOT_PASSWORD=mypassword -d mysql; done

Now, login into one of the containers (mysql3):

[machine-host]$ docker exec -it mysql3 /bin/bash

We can then ping all containers in the network without ever linking them:

[mysql3-container]$ for i in {1..5}; do ping -c 1 mysql$i ; done
PING mysql1 (172.18.0.2): 56 data bytes
64 bytes from 172.18.0.2: icmp_seq=0 ttl=64 time=0.151 ms
--- mysql1 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max/stddev = 0.151/0.151/0.151/0.000 ms
PING mysql2 (172.18.0.3): 56 data bytes
64 bytes from 172.18.0.3: icmp_seq=0 ttl=64 time=0.138 ms
--- mysql2 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max/stddev = 0.138/0.138/0.138/0.000 ms
PING mysql3 (172.18.0.4): 56 data bytes
64 bytes from 172.18.0.4: icmp_seq=0 ttl=64 time=0.087 ms
--- mysql3 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max/stddev = 0.087/0.087/0.087/0.000 ms
PING mysql4 (172.18.0.5): 56 data bytes
64 bytes from 172.18.0.5: icmp_seq=0 ttl=64 time=0.353 ms
--- mysql4 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max/stddev = 0.353/0.353/0.353/0.000 ms
PING mysql5 (172.18.0.6): 56 data bytes
64 bytes from 172.18.0.6: icmp_seq=0 ttl=64 time=0.135 ms
--- mysql5 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max/stddev = 0.135/0.135/0.135/0.000 ms

If we look into the resolver setting, we can see Docker configures an embedded DNS server:

[mysql3-container]$ cat /etc/resolv.conf
search localdomain
nameserver 127.0.0.11
options ndots:0

The embedded DNS server maintains the mapping between the container name and its IP address on the network the container is connected to (in this case, mysql-network). This feature facilitates node discovery in the network and is extremely useful in building a cluster of MySQL containers using MySQL clustering technologies like MySQL replication, Galera Cluster or MySQL Cluster.
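This name resolution is what lets you point clients and replication at container names instead of IP addresses, for example (reusing the mysql image as a throwaway client):

# A client container on the same user-defined network reaches mysql1 by name,
# courtesy of the embedded DNS server:
[machine-host]$ docker run -it --rm --net=mysql-network \
    mysql mysql -hmysql1 -uroot -pmypassword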

At this point, our container setup can be illustrated as the following:

Default vs User-defined Bridge

The following summarizes the major differences between the two networks:

  • Network deployment: the default bridge (docker0) is created by Docker upon installation; a user-defined bridge is created by the user.
  • Container deployment: containers are attached to docker0 by default; for a user-defined bridge you must explicitly specify “--net=[network-name]” in the “docker run” command.
  • Container linking: docker0 allows you to link multiple containers together and send connection information from one to another by using “--link [container-name]:[service-name]” (when containers are linked, information about a source container can be sent to a recipient container); linking is not supported on a user-defined bridge.
  • Port mapping: supported on both, e.g. by using “-p 3307:3306”.
  • Name resolver: not supported on docker0 (unless you link the containers); on a user-defined bridge, all containers can resolve each other’s container name to an IP address (versions <1.10 use /etc/hosts, >=1.10 use the embedded DNS server).
  • Packet forwarding: yes on both, via iptables.
  • Example usage for MySQL: docker0 for standalone MySQL; a user-defined bridge for MySQL replication, Galera Cluster or MySQL Cluster (any setup involving more than one MySQL container).

No network

We can also create a container without any network attached to it by specifying “--net=none” in the “docker run” command. The container is only accessible through an interactive shell. No additional network interface will be configured on the node.

Consider the following:

[machine-host]$ docker run --name=mysql0 --net=none -e MYSQL_ROOT_PASSWORD=mypassword -d mysql

By looking at the container’s network interface, only localhost interface is available:

[machine-host]$ docker exec -it mysql0 /bin/bash
[mysql0-container]$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever

A container in the none network can’t join any other network. Nevertheless, the MySQL container is still running, and you can access it directly from the shell using the mysql client command line through localhost or the socket:

[mysql0-container]$ mysql -uroot -pmypassword -h127.0.0.1 -P3306
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 6
Server version: 5.7.13 MySQL Community Server (GPL)

Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql>

Example use cases for running a MySQL container in this network are verifying a MySQL backup by testing the restoration process, preparing a backup created using, e.g., Percona XtraBackup, or testing queries on different versions of MySQL servers.

At this point, our container setup can be illustrated as the following:

This concludes today’s blog. In the next blog post, we are going to look into multiple host networking (using overlay networks) together with Docker Swarm, an orchestration tool to manage containers on multiple machine hosts.

by Severalnines at July 25, 2016 11:25 AM

July 21, 2016

Jean-Jerome Schmidt

Planets9s - Watch our webinar replays for the MySQL, MongoDB and PostgreSQL DBA

Welcome to this week’s Planets9s, covering all the latest resources and technologies we create around automation and management of open source database infrastructures.

Watch our webinar replays for the MySQL, MongoDB and PostgreSQL DBA

Whether you’re interested in open source datastores such as MySQL, MariaDB, Percona, MongoDB or PostgreSQL; load balancers such as HAProxy, MaxScale or ProxySQL; whether you’re in DB Ops or DevOps; looking to automate and manage your databases… Chances are that we have a relevant webinar replay for you. And we have just introduced a new search feature for our webinar replays, which makes it easier and quicker to find the webinar replay you’re looking for.

Search for a webinar replay

Severalnines boosts US health care provider’s IT operations

This week we were delighted to announce that US health care provider Accountable Health Inc. uses our flagship product ClusterControl to outcompete its larger rivals. To quote Greg Sarrica, Director of IT development at AHI: “Using ClusterControl was an absolute no-brainer for me. AHI looked for an alternative to Oracle and IBM, which could match our demands and with our budget. We wanted to give our clients frictionless access to their healthcare information without portals crashing and potentially losing their personal data. Now we have a solution that allows us to be agile when competing in the fast-moving US healthcare market.”

Read the press release

ClusterControl Tips & Tricks: Best practices for database backups

Backups - one of the most important things to take care of while managing databases. It is said there are two types of people - those who back up their data and those who will back up their data. In this new blog post in the Tips & Tricks series, we discuss good practices around backups and show you how you can build a reliable backup system using ClusterControl.

Read the blog

That’s it for this week! Feel free to share these resources with your colleagues and follow us in our social media channels.

Have a good end of the week,

Jean-Jérôme Schmidt
Planets9s Editor
Severalnines AB

by Severalnines at July 21, 2016 01:33 PM

July 20, 2016

Peter Zaitsev

The Value of Database Support

In this post, I’ll discuss how database support is good for your enterprise.

Years ago when I worked for the MySQL Support organization at the original MySQL AB, we spoke about MySQL Support as insurance and focused on a value proposition similar to that of car insurance. You must purchase car insurance before the incident happens, or insurance won’t cover the damage. In fact, most places around the world require automobile insurance. Similarly, many organizations that leverage production-use technology have their own “insurance” by means of 24/7 support.

In my opinion, this is a very one-sided view that does not capture the full value (and ROI) that a database support contract with Percona provides. With a Percona support contract, you are assured that your database environment (virtual, physical, or in the cloud) is fully covered – whether it’s one server or many.

Increasingly – especially with the introduction of cloud-based database environments – database servers are being spun up and torn down on a day-to-day basis. However briefly these databases exist, they need support. One of the challenges modern businesses face is providing support for a changing database infrastructure, while still maintaining a viable cost structure.

Let’s look at the different dimensions of value offered by Percona Support based on the different support cases we have received throughout the years.

Reduce and Prevent Downtime

If your database goes down, the time to recover will be significantly shorter with a support agreement than without it. The cost of downtime varies widely between organizations. A recent Ponemon study estimates the average cost of downtime can be up to $8,800 per minute.

With our clients, we’ve found preventing even one significant downtime event a year justifies support costs. Even when the client’s in-house team is very experienced, our help is often invaluable as we are exposed to a great variety of incidents from hundreds of companies. It is much more likely we have encountered the same incident before and have a solution ready. Helping to recover from downtime quickly is a reactive part of support – you can realize even more value by proactively working with support to get advice on your HA options as well as ensure that you’re following the best database backup and security practices.

Better Security

Having a database support contract by itself is not enough to prevent all security incidents. Databases are only one of the attack vectors, and it takes a lot of everyday work to stay secure. There is nothing that can guarantee complete security. Database support, however, can be a priceless resource for your security team. It can apply security and compliance practices to your database environment and demonstrate how to avoid typical mistakes.

The cost of data breaches can be phenomenal, as well as impact business reputations much more than downtime or performance issues. Depending on the company size and market, costs vary. Recent studies estimate direct costs ranging on average from $1.6M to $7.01M. Everyone agrees leaving rising security risks and costs unchecked is a recipe for disaster.

Fix Database Software Bugs

While you might have great DBAs on your team who are comfortable with best practices and downtime recovery, most likely you do not have a development team comfortable with fixing bugs in the database kernel or supporting tools. Getting up-to-date software fixes reduces downtime. It also helps keep development and operations teams efficient, and avoids complex workarounds and other commonly faced issues.

Reduce Resources

We deal with a large number of performance-related questions. When we address such problems, we provide a better user experience, save costs, and minimize environmental impact by reducing resource use.

Savings vary depending on your application scale and how optimized the environment is already. In the best cases, our support team helped customers make applications more than 10x more efficient. In most cases, we can help make things at least 30% more efficient. If you’re spending $100K or more on your database environment, this benefit alone makes a support agreement well worth it.

Efficient Developers

You cannot minimize the importance of development efficiency. Too often customers don’t give their developers support access, even though developers critically help realize an application’s full value. Developers make database decisions about schema design all the time. These include query writing, stored procedures, triggers, sharding, document storage, or foreign keys. Without a database support contract, developers often have to resort to “Google University” to find an answer – and often end up with inapplicable, outdated or simply wrong information. On top of this, they often resort to time-consuming trial and error.

With the help of a Percona Support team, developers can learn proven practices that apply to their specific situation. This saves a lot of time and gets better applications to market faster. Even with a single US-based developer intensively working within the database environment, a support agreement might justify the cost based on increased developer efficiency alone. Larger development teams simply cannot afford to not have support.

Efficient Operations

Your operations staff (DBAs, DevOps, Sysadmins) are in the same boat – if your database environment is significant, chances are you are always looking for ways to save time, make operations more efficient and reduce mistakes. Our support team can provide you with specific actionable advice for your challenges.

Chances are we have seen environments similar to yours and know which software, approaches and practices work well (and which do not). This knowledge helps prevent and reduce downtime. It also helps with team efficiency. Percona Support’s help allows you to handle operations with a smaller team, or address issues with a less experienced staff.

Better Applications

Percona Support access not only makes developers more productive, it also results in better application quality, because best practices for the application's database interface design, schema, queries and so on are followed. The Percona team has supported many applications over many years, so we often think about problems before you do, such as:

  • “How will this design play with replication or sharding?”
  • “Will it scale with large amounts of users or data?”
  • “How flexible is such a design as the application inevitably evolves over the years?”

While a better application is hard to quantify, it really is quite important.

Faster Time to Market

Yet another benefit that comes from developers having access to a database support team is faster time-to-market. For many agile applications, being able to launch new features faster is even more important than cost savings – this is how businesses succeed against the competition. At Percona, we love helping businesses succeed.

Conclusion

As you see, there are a lot of ways Percona Support can contribute to the success of your business. Support is much more than “insurance” that you should consider purchasing for compliance reasons. Percona Support provides a great return on investment. It allows you to minimize risks and costs while delivering the highest quality applications or services. Our flexible plans can cover your database environment, even if it is an ever-changing one, while still allowing you to plan your operations costs.

by Peter Zaitsev at July 20, 2016 09:31 PM

Jean-Jerome Schmidt

Press Release: Severalnines boosts US healthcare provider’s IT operations

Accountable Health Inc uses ClusterControl to outcompete larger rivals

Stockholm, Sweden and anywhere else in the world - 20/07/2016 - Severalnines, the provider of database infrastructure management software, today announced its latest customer, Accountable Health INC (AHI). This move comes at a time when technology is disrupting healthcare globally with the introduction of self-service medicine kiosks and virtual medical consultations.

AHI is a US-based company which helps pharmaceutical and healthcare firms to enhance their business and technical performance. Subsidiaries of AHI, such as Connect Health Solutions and Accountable Health Solutions, help employers build health and wellness programmes to facilitate a stronger return on investment in employees. Severalnines’ ClusterControl enables AHI to remedy database issues affecting business performance.

This is the second time the IT team at AHI has chosen Severalnines’ ClusterControl over rivals such as Oracle, Microsoft and Rackspace to provide database infrastructure. With the acquisition of Connect Health Solutions, AHI learnt that the existing database infrastructure was inadequate. The pressure on the database caused severe overloads, and it failed to handle the massive queries the business required: client portals crashed regularly, and employees waited for hours for the server to upload claims documents. AHI estimated the previous IT set-up was losing the business thousands of dollars each day in lost productivity.

To compete in a highly competitive US healthcare market, AHI needed to achieve high database uptime, availability and reliability for all businesses across its portfolio. Having successfully deployed ClusterControl in the past, AHI deployed the database management platform again to improve technology performance and customer satisfaction. Other solutions were seen as unattainable due to technical complexity and prohibitive costs.

It took 10 days to fully deploy ClusterControl and migrate to a clustered database setup. Severalnines assisted AHI with the migration. AHI can now access Severalnines’ database experts with one phone call, unlike the tiered support systems offered by the large software vendors.

ClusterControl is now the database management platform for all wholly-owned subsidiaries of AHI, who themselves currently deploy clusters on commodity off-the-shelf hardware. The ease of deployment and management along with competitive pricing meant AHI could be agile in its growth strategy and compete with US healthcare rivals such as Trizetto, Optum and Cognizant Healthcare Consulting.

Greg Sarrica, Director of IT development at AHI, said: “Using ClusterControl was an absolute no-brainer for me. AHI looked for an alternative to Oracle and IBM, which could match our demands and with our budget. We wanted to give our clients frictionless access to their healthcare information without portals crashing and potentially losing their personal data. Now we have a solution that allows us to be agile when competing in the fast-moving US healthcare market.”

Vinay Joosery, Severalnines CEO said: “The security and availability of data is indicative of the performance of healthcare providers. The US healthcare industry is so compact that smaller businesses need to punch harder than their bigger rivals to gain market share. We are happy to be working with AHI and help the team there deliver fast and accurate customer service.”

About Severalnines

Severalnines provides automation and management software for database clusters. We help companies deploy their databases in any environment, and manage all operational aspects to achieve high-scale availability.

Severalnines' products are used by developers and administrators of all skill levels to provide the full 'deploy, manage, monitor, scale' database cycle, thus freeing them from the complexity and learning curves that are typically associated with highly available database clusters. The company has enabled over 8,000 deployments to date via its popular online database configurator, and currently counts BT, Orange, Cisco, CNRS, Technicolour, AVG, Ping Identity and Paytrail among its customers. Severalnines is a private company headquartered in Stockholm, Sweden with offices in Singapore and Tokyo, Japan. To see who is using Severalnines today, visit http://www.severalnines.com/company.

About Accountable Health Solutions

Accountable Health Solutions was founded in 1992 as the Molloy Wellness Company and acquired by the Principal Financial Group in 2004; ownership was later transferred to Accountable Health, Inc., and the company was renamed Accountable Health Solutions in 2013.

Accountable Health Solutions offers comprehensive health and wellness programs to employers and health plan clients. Accountable Health combines smart technology, healthcare and behavior change expertise to deliver solutions that improve health, increase efficiencies and reduce costs in the delivery of healthcare. The company's product suite ranges from traditional wellness products to health improvement programs. Accountable Health Solutions is an industry leader with more than 20 years in the health and wellness industry and a 97% client retention rate. More at accountablehealthsolutions.com.

Press contact:

Positive Marketing
Steven de Waal/Camilla Nilsson
severalnines@positivemarketing.com
0203 637 0647/0643

by Severalnines at July 20, 2016 11:20 AM

July 19, 2016

Jean-Jerome Schmidt

All the Webinar Replays for the MySQL, MongoDB or PostgreSQL DBA

Those of you who already follow us know that we’ve been running a number of blog series for some time now that we also deliver as webinar series.

These blog series include:

And for those who are looking to automate the management of all of the above, we also have:

These blog series as well as related topics are covered monthly in our live webinars, which are available to view as replays.

Watch our webinar replays

So whether you’re interested in open source datastores such as MySQL, MariaDB, Percona Server, MongoDB or PostgreSQL; load balancers such as HAProxy, MaxScale or ProxySQL; whether you’re in DB Ops or DevOps; looking to automate and manage your databases… Chances are that we have a relevant webinar replay for you!

These are the categories you can select from:

  • ClusterControl
  • DB Ops
  • DevOps
  • Galera
  • MongoDB
  • MySQL
  • PostgreSQL
  • Tips & Tricks

Or just perform a search for the topic or keyword of your choice and see what comes up.

We trust that these resources are useful!

If you have a topic you’d like to see covered that we haven’t dealt with yet, please do let us know.

by Severalnines at July 19, 2016 01:42 PM

Peter Zaitsev

Upcoming Webinar Wednesday July 20, 11 am PDT: Practical MySQL Performance Optimization

MySQL Performance Optimization

Are you looking to improve your MySQL performance? Application success is often limited by poor MySQL performance. Please join Percona CEO and Founder Peter Zaitsev for this exclusive webinar on Wednesday, July 20th, 2016 at 11:00 AM PDT (UTC – 7) as he presents “Practical MySQL Performance Optimization“.

Peter Zaitsev discusses how to get excellent MySQL performance while being practical. In other words, spending time on what gives you the best return. The webinar updates Peter’s ever-popular Practical MySQL Performance Optimization presentation. It covers the important points for improving MySQL performance. It also includes a discussion of the new tools and features in the latest MySQL 5.7 release, as well as their most important aspects – so you can employ them for maximum database performance.

Areas covered:

  • Hardware
  • MySQL Configuration
  • Schema and Queries
  • Application Architecture
  • MySQL 5.7 New Tools and Features

Peter will highlight practical approaches and techniques for optimizing your time. He will also focus on the queries that are most important for your application. At the end of this webinar, you will know how to optimize MySQL performance in the most practical way.

register-now

Peter Zaitsev, CEO

Peter Zaitsev co-founded Percona and assumed the role of CEO in 2006. As one of the foremost experts on MySQL strategy and optimization, Peter leveraged both his technical vision and entrepreneurial skills to grow Percona from a two-person shop to one of the most respected open source companies in the business. With over 150 professionals in 20 plus countries, Peter’s venture now serves over 3000 customers – including the “who’s who” of internet giants, large enterprises and many exciting startups. Percona was named to the Inc. 5000 in 2013, 2014 and 2015.

Peter was an early employee at MySQL AB, eventually leading the company’s High Performance Group. A serial entrepreneur, Peter co-founded his first startup while attending Moscow State University where he majored in Computer Science. Peter is a co-author of High Performance MySQL: Optimization, Backups, and Replication, one of the most popular books on MySQL performance. Peter frequently speaks as an expert lecturer at MySQL and related conferences, and regularly posts on the Percona Data Performance Blog. He has also been tapped as a contributor to Fortune and DZone, and his recent ebook Practical MySQL Performance Optimization Volume 1 is one of percona.com’s most popular downloads. Peter lives in North Carolina with his wife and two children. In his spare time, Peter enjoys travel and spending time outdoors.

by Dave Avery at July 19, 2016 10:51 AM

July 18, 2016

MariaDB Foundation

MariaDB 10.1.16 now available

The MariaDB project is pleased to announce the immediate availability of MariaDB 10.1.16. See the release notes and changelog for details on this release.

  • Download MariaDB 10.1.16
  • Release Notes
  • Changelog
  • What is MariaDB 10.1?
  • MariaDB APT and YUM Repository Configuration Generator

Thanks, and enjoy MariaDB!

The post MariaDB 10.1.16 now available appeared first on MariaDB.org.

by Daniel Bartholomew at July 18, 2016 03:28 PM

Jean-Jerome Schmidt

ClusterControl Tips & Tricks - Best Practices for Database Backups

Backups - one of the most important things to take care of while managing databases. It is said there are two types of people - those who backup their data and those who will backup their data. In this blog post, we will discuss good practices around backups and show you how you can build a reliable backup system using ClusterControl.

Backup types

There are two main types of backup:

  • Logical backup - backup of data is stored in a human-readable format like SQL
  • Physical backup - backup contains binary data

Both complement each other - logical backup allows you to (more or less easily) retrieve up to a single row of data. Physical backups would require more time to accomplish that, but, on the other hand, they allow you to restore an entire host very quickly (something which may take hours or even days when using logical backup).

ClusterControl supports both types of backup: mysqldump for logical backup and xtrabackup for physical backup.

Backup schedule

Obviously, you’d want to have a fixed schedule for your backups. How often should backups execute? That depends on your application, the importance of the data, the time needed to take a backup and so on. We’d recommend taking a backup at least daily. When possible, take both mysqldump and xtrabackup backups on a daily basis. To cover even more bases, you may want to schedule several incremental xtrabackup runs per day.

In ClusterControl you can easily schedule these different types of backups. There are a couple of settings to decide on. You can store a backup on the controller or locally, on the database node where the backup is taken. You need to decide on the location in which the backup should be stored, and which databases you’d like to back up - the entire data set or separate schemas? Depending on which backup type you’ve chosen, there are separate settings to configure. For Xtrabackup and Galera Cluster, you may choose whether a node should be desynced or not. Are you going to use backup locks, or should ‘FLUSH TABLES WITH READ LOCK’ be used instead? Should the backup be compressed or not?

Which options to use will depend on your particular setup and hardware. For Galera Cluster, to avoid impact on the rest of the cluster, we’d suggest at least considering desyncing the node for the duration of the backup. Keep in mind that this may also remove the backup node from rotation, especially if you use HAProxy or MaxScale proxies.
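
For illustration, here is a minimal sketch of what desyncing amounts to, using the standard Galera wsrep_desync variable (ClusterControl exposes this as a backup setting; the statements are shown only to explain the idea):

-- On the node chosen for the backup: let it lag without triggering cluster flow control
SET GLOBAL wsrep_desync = ON;

-- ... run mysqldump or xtrabackup here ...

-- Once the backup is done: return the node to normal cluster operation
SET GLOBAL wsrep_desync = OFF;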

Another popular way of minimizing the impact of a backup on a Galera Cluster or a replication master is to deploy a replication slave and then use it as a source of backups - this way Galera Cluster will not be affected at any point.

You can deploy such a slave in just a few clicks using ClusterControl.

Checking backup status

Taking a backup is not enough - you have to check if it actually completed correctly. ClusterControl can help with this. You can go to the Backup -> Reports tab and check the status of your backups. Have they completed successfully? What’s their size - is it as expected? This is a good way to do a quick sanity check - if your dataset is around 1GB of size, there’s no way a full backup can be as small as 100KB - something must have gone wrong at some point. To dig even deeper, you can take a look at the backup log and look for any errors.

Disaster Recovery

Storing backups within the cluster (either directly on a database node or on the ClusterControl host) comes in handy when you want to quickly restore your data: all backup files are in place and can be decompressed and restored promptly. When it comes to disaster recovery, this may not be the best option. Different issues may happen - servers may crash, networks may not work reliably, even whole data centers may become inaccessible due to some kind of outage. It may happen whether you work with a small service provider with a single data center, or a global vendor like Amazon or Rackspace. It is therefore not safe to keep all your eggs in a single basket - you should make sure you have a copy of your backup stored in some external location. ClusterControl supports Amazon S3 and Glacier services for that.

For those who would like to implement their own DR policies, ClusterControl backups are stored in a nicely structured directory. So it is perfectly fine to build and deploy your own set of scripts and handle DR according to your exact requirements.

Finally, another great way of implementing a Disaster Recovery policy is to use an asynchronous replication slave - something we mentioned earlier in this blog post. You can deploy such an asynchronous slave in a remote location, perhaps another data center, and then use it to take backups and store them locally on that slave. Of course, you’d also want to take a local backup of your cluster to have it around if you need to recover the cluster. Moving data between data centers may take a long time, so having backup files available locally can save you some time. In case you lose access to your main production cluster, you may still have access to the slave. This setup is very flexible - first, you have a running MySQL host with your production data, so it shouldn’t be too hard to deploy your full application in the DR site. You’ll also have backups of your production data which you could use to scale out your DR environment.

We hope this gives you enough information to build a safe and reliable backup system.

by Severalnines at July 18, 2016 09:40 AM

July 14, 2016

Peter Zaitsev

Percona Live Europe, Amsterdam 2016: Speaking Gets You the Whole Community Event!

Percona Live Europe Call for Papers

Come speak at Percona Live Europe, and get access to the entire conference.

The Percona Live Open Source Database Performance Conference Europe 2016 is the premier event for the rich and diverse MySQL, MongoDB and ODBMS ecosystems in Europe. Attendees include DBAs, SysAdmins, developers, architects, CTOs, CEOs, and vendors from around the world. It’s a great place to meet and participate with the open source community.

Want to go, but having a hard time getting the budget approved? We have a solution: be a speaker and get a complimentary full pass!

Submit your speaking proposal for a Percona Live session and share your MySQL, MongoDB and ODBMS ideas, case studies, best practices, and technical knowledge in front of an intelligent, engaged audience of open source technology users. If selected as a speaker by our Conference Committee, you will receive a complimentary full conference pass.

Speaking at Percona Live is a great way to further the goals of open source software, and give back to a community that is literally changing the world.

Below are examples of some of the outstanding speakers from this year’s Percona Live Conference in Santa Clara. Speakers are made up of CEOs, Directors, DBAs, and a celebrity or two:

Speaking at Percona Live puts you in some pretty great company, and pays for your pass! Submit your speaking proposal today! The submission deadline is Monday, July 18th.

See the interviews from some of our speakers from this year’s Percona Live Conference in Santa Clara below.

by Dave Avery at July 14, 2016 08:34 PM

MongoDB Data Durability

MongoDB Data Durability

In this post, I want to talk about MongoDB data durability options across MongoDB versions.

I consider a write durable if, once confirmed by the server, it becomes permanent at the node or cluster level (ignoring catastrophic failures like all nodes on a cluster failing at the same time).

MongoDB lets you choose between different levels of data durability using Write Concern. Unlike server-side configured durability (as you get with Innodb using innodb_flush_log_at_trx_commit), the client specifies the Write Concern on each write operation.

As indicated in the linked manual page, the Write Concern specification can include a w and a j field (among other things).

The w field determines the number of nodes that must confirm a write before it is acknowledged to the client, with the following possible values:

  • 1: meaning the primary,
  • “majority”: meaning a majority of the nodes,
  • Any other integer value, meaning that many nodes.

The j field requests acknowledgement that, for every node determined by the w value, writes are confirmed to the on-disk journal. Otherwise, the write is confirmed only in memory.

How the client specifies Write Concern depends on the programming language and driver used. Here is how to do it in JavaScript, using the mongo command line client:

db.test.insert({_id: 1}, {writeConcern: {w:1, j:1}})

while to use the same write concern in C, with the mongo-c-driver, you must do this before the corresponding write operation:

mongoc_write_concern_t *wc = mongoc_write_concern_new();  /* new() returns a pointer */
mongoc_write_concern_set_w(wc, 1);
mongoc_write_concern_set_journal(wc, true);  /* the journal flag is a boolean */

To get a better understanding of what this means from a durability perspective I ran a few tests using the following environment:

  • A single client, using the mongo command line client, inserting an auto-incrementing integer as the single field (_id) of a collection.
  • Standalone mongod, and a replica set of 4 mongod instances, all on the same machine. You can repeat the tests using this script as a guide (the only prerequisite is that mongod and mongo are on the shell’s path).
  • SIGKILL sent to the Primary node while the writes are happening.
  • Comparing the last value for _id reported by the client, with the maximum value available in the collection, on the new Primary node after the replica set reconfigures (or on the standalone mongod, after I manually restarted it).
  • MongoDB 3.0.4 and 3.2.7, using WiredTiger as the storage engine.

(I’ll discuss performance perspectives in a future post.)

In all cases, I indicate “missing docs” if the value reported by the client is higher than the value reported by

db.collection.find().sort({_id:-1}).limit(1)

Here are the results for a standalone mongod:

Standalone
w   j   Missing docs
1   1   No
1   0   Yes
0   0   Yes
0   1   No

The first three don’t hold surprises, but the last one does. The mongo-c-driver does not let you specify a write concern of {w:0, j:1}, and a cursory inspection of the MongoDB code makes me believe that “w:0” is interpreted as “w:1”. This would explain the result.

Here are the results for a four node replica set:

Replica Set
w            j   Missing docs
“majority”   1   No
“majority”   0   No
0            1   Yes

Again, w:0, j:1 is transformed into w:1, j:1. How can no data get lost in a standalone mongod, but get lost in a replica set? The answer is that in the standalone case, after the SIGKILL, I restarted the same instance. In that case, WiredTiger performs crash recovery. Since we request acknowledgement that writes are confirmed to the on-disk journal, the last _id is recovered (if needed), and no docs go missing.

However, in my replica set tests, I did not restart the SIGKILLED instance. Instead, I let mongod do its thing and automatically reconfigure the set, promoting one of the Secondaries as a new Primary. In this context, having a write concern that only requests acknowledgements of writes on the master is a liability, and leads to lost data.

When specifying w:”majority”, it is important to note that the value j:0 gets replaced with j:1 since version 3.2. That explains the lack of lost documents. I also tested 3.0 and, in that case, docs went missing when using w:"majority", j:0. This probably explains why the behavior was changed in 3.2 and, depending on your use cases, might justify an upgrade if you’re on an older version.

In conclusion, MongoDB data durability options let you satisfy different requirements on a per-operation basis, with the client being responsible for using the desired setting. When using a Write Concern that does not guarantee full durability, a mongod crash is enough to cause the loss of unconfirmed documents. In this sense, the Write Concern values that include j:0 are analogous to running InnoDB with innodb_flush_log_at_trx_commit set to 0.
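
For MySQL readers, a minimal sketch of the InnoDB side of that analogy (shown only to illustrate the durability trade-off, not as a recommendation):

-- Relaxed durability: the redo log is flushed to disk roughly once per second,
-- so a mysqld crash can lose the last ~1 second of committed transactions
SET GLOBAL innodb_flush_log_at_trx_commit = 0;

-- Full durability (the default): flush and sync the redo log at every commit
SET GLOBAL innodb_flush_log_at_trx_commit = 1;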

The “majority” value for the w component is valid even in the standalone case (where it is treated as “1”), so I think {w:"majority", j:1} is a good value to use in the general case to guarantee data durability.

by Fernando Ipar at July 14, 2016 05:49 PM

Jean-Jerome Schmidt

Planets9s - Watch the replay: how to monitor MongoDB (if you’re really a MySQL DBA)

Welcome to this week’s Planets9s, covering all the latest resources and technologies we create around automation and management of open source database infrastructures.

Watch the replay: how to monitor MongoDB (if you’re really a MySQL DBA)

Thanks to everyone who joined us for this week’s webinar on how to monitor MongoDB (for the MySQL DBA). Art van Scheppingen, Senior Support Engineer at Severalnines, discussed the most important metrics to keep an eye on for MongoDB, described them in plain MySQL DBA language, and outlined the open source tools available for MongoDB monitoring and trending. The webinar also included a demo of ClusterControl’s MongoDB metrics, dashboards, custom alerting and other features to track and optimize the performance of your MongoDB system.

Watch the replay

Check out our updated ClusterControl documentation

If you haven’t upgraded to ClusterControl 1.3.1, you should! It’s full of great new features and enhancements. And we have lots of documentation to help you get started. Some of the updates include: Wizard - Create Replication Setups for Oracle MySQL, MariaDB and Percona Server; Wizard - Add Existing MySQL Cluster (NDB); ClusterControl troubleshooting with debug package … and more!

View the documentation

Download our MySQL Replication Blueprint whitepaper

The MySQL Replication Blueprint whitepaper includes all aspects of a Replication topology with the ins and outs of deployment, setting up replication, monitoring, upgrades, performing backups and managing high availability using proxies. All the tips & tricks to get you started and more in one convenient document.

Download the whitepaper

Become a MongoDB DBA: Monitoring and Trending (part 2)

Following our initial post that discussed various functions and commands in MongoDB to retrieve your metrics, we now dive a bit deeper into the metrics: group them together and see which ones are the most important ones to keep an eye on. This blog goes well with the webinar replay, which we also published this week (see above).

Read the blog

That’s it for this week! Feel free to share these resources with your colleagues and follow us in our social media channels.

Have a good end of the week,

Jean-Jérôme Schmidt
Planets9s Editor
Severalnines AB

by Severalnines at July 14, 2016 01:46 PM

How to monitor MongoDB (if you’re really a MySQL DBA) - Webinar Replay & Slides

Thanks to everyone who joined us for this week’s webinar on how to monitor MongoDB (for the MySQL DBA).

Art van Scheppingen, Senior Support Engineer at Severalnines, discussed the most important metrics to keep an eye on for MongoDB and described them in plain MySQL DBA language.

This webinar looked at answering the following questions (amongst others):

  • Which status overviews and commands really matter to you?
  • How do you trend and alert on them?
  • What is the meaning behind the metrics?

It also included a look at the open source tools available for MongoDB monitoring and trending. Finally, Art did a demo of ClusterControl’s MongoDB metrics, dashboards, custom alerting and other features to track and optimize the performance of your MongoDB system.

View the replay or read the slides

Agenda

  • How does MongoDB monitoring compare to MySQL
  • Key MongoDB metrics to know about
  • Trending or alerting?
  • Available open source MongoDB monitoring tools
  • How to monitor MongoDB using ClusterControl
  • Demo

Speaker

Art van Scheppingen is a Senior Support Engineer at Severalnines. He’s a pragmatic MySQL and database expert with over 16 years’ experience in web development. He previously worked at Spil Games as Head of Database Engineering, where he maintained a broad view of the whole database environment: from MySQL to Couchbase, Vertica to Hadoop and from Sphinx Search to SOLR. He regularly presents his work and projects at various conferences (Percona Live, FOSDEM) and related meetups.

This session is based upon the experience we have using MongoDB and implementing it for our database infrastructure management solution, ClusterControl. For more details, read through our ‘Become a MongoDB DBA’ blog series.

by Severalnines at July 14, 2016 08:27 AM

July 13, 2016

Peter Zaitsev

Using Ceph with MySQL

Ceph

Over the last year, the Ceph world drew me in, partly because of my taste for distributed systems, but also because I think Ceph represents a great opportunity for MySQL specifically and databases in general. The shift from local storage to distributed storage is similar to the shift from bare-disk host configurations to LVM-managed disk configurations.

Most of the work I’ve done with Ceph was in collaboration with folks from Red Hat (mainly Brent Compton and Kyle Bader). This work resulted in a number of talks presented at the Percona Live conference in April and the Red Hat Summit in San Francisco at the end of June. I could write a lot about using Ceph with databases, and I hope this post is the first in a long series on Ceph. Before starting with use cases, setup configurations and performance benchmarks, I think I should quickly review the architecture and principles behind Ceph.

Introduction to Ceph

Inktank created Ceph a few years ago as a spin-off of the hosting company DreamHost. Red Hat acquired Inktank in 2014 and now offers it as a storage solution. OpenStack uses Ceph as its dominant storage backend. This blog, however, focuses on a more general review and isn’t restricted to a virtual environment.

A simplistic way of describing Ceph is to say it is an object store, just like S3 or Swift. This is a true statement, but only up to a certain point. There are, at a minimum, two types of nodes in Ceph: monitors and object storage daemons (OSDs). The monitor nodes are responsible for maintaining a map of the cluster or, if you prefer, the Ceph cluster metadata. Without access to the information provided by the monitor nodes, the cluster is useless. Redundancy and quorum at the monitor level are important.

Any non-trivial Ceph setup has at least three monitors. The monitors are fairly lightweight processes and can be co-hosted on OSD nodes (the other node type needed in a minimal setup). The OSD nodes store the data on disk, and a single physical server can host many OSD nodes – though it would make little sense for it to host more than one monitor node. The OSD nodes are listed in the cluster metadata (the “crushmap”) in a hierarchy that can span data centers, racks, servers, etc. It is also possible to organize the OSDs by disk types to store some objects on SSD disks and other objects on rotating disks.

With the information provided by the monitors’ crushmap, any client can access data based on a predetermined hash algorithm. There’s no need for a relaying proxy. This becomes a big scalability factor since these proxies can be performance bottlenecks. Architecture-wise, it is somewhat similar to the NDB API, where – given a cluster map provided by the NDB management node – clients can directly access the data on data nodes.

Ceph stores data in a logical container called a pool. With the pool definition comes a number of placement groups. The placement groups are shards of data across the pool. For example, on a four-node Ceph cluster, if a pool is defined with 256 placement groups (pg), then each OSD will have 64 pgs for that pool. You can view the pgs as a level of indirection to smooth out the data distribution across the nodes. At the pool level, you define the replication factor (“size” in Ceph terminology).

The recommended values are a replication factor of three for spinners and two for SSD/Flash. I often use a size of one for ephemeral test VM images. A replication factor greater than one associates each pg with one or more pgs on the other OSD nodes.  As the data is modified, it is replicated synchronously to the other associated pgs so that the data it contains is still available in case an OSD node crashes.

So far, I have just discussed the basics of an object store. But the ability to update objects atomically in place makes Ceph different and better (in my opinion) than other object stores. The underlying object access protocol, rados, updates an arbitrary number of bytes in an object at an arbitrary offset, exactly as if it were a regular file. That update capability allows for much fancier usage of the object store – for things like the support of block devices, rbd devices, and even a network file system, cephfs.

When using MySQL on Ceph, the rbd disk block device feature is extremely interesting. A Ceph rbd disk is basically the concatenation of a series of objects (4MB objects by default) that are presented as a block device by the Linux kernel rbd module. Functionally it is pretty similar to an iSCSI device as it can be mounted on any host that has access to the storage network and it is dependent upon the performance of the network.

The benefits of using Ceph

Agility
In a world striving for virtualization and containers, Ceph makes it easy to move database resources between hosts.

IO scalability
On a single host, you have access only to the IO capabilities of that host. With Ceph, you basically put in parallel all the IO capabilities of all the hosts. If each host can do 1000 iops, a four-node cluster could reach up to 4000 iops.

High availability
Ceph replicates data at the storage level, and provides resiliency to storage node crashes. A kind of DRBD on steroids…

Backups
Ceph rbd block devices support snapshots, which are quick to make and have no performance impacts. Snapshots are an ideal way of performing MySQL backups.

Thin provisioning
You can clone and mount Ceph snapshots as block devices. This is a useful feature to provision new database servers for replication, either with asynchronous replication or with Galera replication.

The caveats of using Ceph

Of course, nothing is free. Ceph use comes with some caveats.

Ceph reaction to a missing OSD
If an OSD goes down, the Ceph cluster starts copying data with fewer copies than specified. Although good for high availability, the copying process significantly impacts performance. This implies that you cannot run a Ceph cluster with nearly full storage; you must have enough disk space to handle the loss of one node.

The “no out” OSD attribute mitigates this, and prevents Ceph from reacting automatically to a failure (but you are then on your own). When using the “no out” attribute, you must monitor and detect that you are running in degraded mode and take action. This resembles a failed disk in a RAID set. You can choose this behavior as default with the mon_osd_auto_mark_auto_out_in setting.

Scrubbing
Every day and every week (deep), Ceph runs scrub operations that, although throttled, can still impact performance. You can modify the interval and the hours that control the scrub action. Once per day and once per week are likely fine. But you need to set osd_scrub_begin_hour and osd_scrub_end_hour to restrict the scrubbing to off hours. Also, scrubbing throttles itself so as not to put too much load on the nodes. The osd_scrub_load_threshold variable sets the threshold.

Tuning
Ceph has many parameters, so tuning it can be complex and confusing. Since distributed systems push the hardware, properly tuning Ceph might require things like distributing interrupt load among cores, thread core pinning and handling of NUMA zones – especially if you use high-speed NVMe devices.

Conclusion

Hopefully, this post provided a good introduction to Ceph. I’ve discussed the architecture, the benefits and the caveats of Ceph. In future posts, I’ll present use cases with MySQL. These cases include performing Percona XtraDB Cluster SST operations using Ceph snapshots, provisioning async slaves and building HA setups. I also hope to provide guidelines on how to build and configure an efficient Ceph cluster.

Finally, a note for the ones who think cost and complexity put building a Ceph cluster out of reach. The picture below shows my home cluster (which I use quite heavily). The cluster comprises four ARM-based nodes (Odroid-XU4), each with a two TB portable USB-3 hard disk, a 16 GB EMMC flash disk and a gigabit Ethernet port.

I won’t claim record breaking performance (although it’s decent), but cost-wise it is pretty hard to beat (at around $600)!

Ceph


by Yves Trudeau at July 13, 2016 05:48 PM

July 12, 2016

Henrik Ingo

Authoring Impress.js presentations in Markdown

With the Euro Cup 2016 done, evenings can again be spent contributing to my favorite open source project: The impress.js presentation framework. In the series of blog posts about my additions to it, it is now time to unveil a feature I added by popular request: Markdown support.

Thanks again to the power of open source, this was easy to add. By integrating Markdown.js as an extra addon, you can now type Markdown instead of HTML in each impress.js step:

read more

by hingo at July 12, 2016 08:15 PM

Peter Zaitsev

Call for Percona Live Europe MongoDB Speakers

Percona Live Europe MongoDB Speakers

Want to become one of the Percona Live Europe MongoDB speakers? Read this blog for details.

The Percona Live Europe, Amsterdam call for papers is ending soon and we are looking for MongoDB speakers! This is a great way to build your personal and company brands. It also provides you with a complimentary full conference pass (which is good for your budget)!

If you haven’t submitted a paper yet, here is a list of ideas we would love to see covered at this conference:

If you find any of these ideas interesting, simply let @Percona know and we can help get you listed as the speaker. If nothing on this list strikes your fancy or piques your interest, please submit a similar talk of your own – we’d love to find out what you have to say!

Here are some other ideas that might get your thoughts bubbling:

  • Secret use of “hidden” and tagged ReplicaSets
  • To use a hashed shard key or not?
  • Understanding how a shard key is used in MongoDB
  • Using scatter-gathers to your benefit
  • WriteConcern and its use cases
  • How to quickly build a sharded environment for MongoDB in Docker
  • How to monitor and scale MongoDB in the cloud
  • MongoDB Virtualization: the good, the bad, and the ugly
  • MongoDB and VMware: a cautionary tale
  • Streaming MySQL bin logs to MongoDB and back again
  • How to ensure that other technologies can safely use the oplog for pipelining

The Percona team and conference committee would love to see what other ideas the community has that we haven’t covered. Anything helps: from tweeting @Percona with topics you would like to see, to sharing topics you like on Twitter, or even just sharing the link to the call for papers.

The call for papers closes next Monday (7/18), so let’s get some great things in this week and build a truly dynamic conference!

by David Murphy at July 12, 2016 04:23 PM

Percona Server for MongoDB 3.0.12-1.8 is now available

Percona announces the release of Percona Server for MongoDB 3.0.12-1.8 on July 12, 2016. Download the latest version from the Percona web site or the Percona Software Repositories.

Percona Server for MongoDB 3.0.12-1.8 is an enhanced, open source, fully compatible, highly scalable, zero-maintenance downtime database supporting the MongoDB v3.0 protocol and drivers. Based on MongoDB 3.0.12, it extends MongoDB with MongoRocks and PerconaFT storage engines, as well as features like external authentication and audit logging. Percona Server for MongoDB requires no changes to MongoDB applications or code.

NOTE: The MongoRocks storage engine is still under development. There is currently no officially released version of MongoRocks recommended for production.


This release includes all changes from MongoDB 3.0.12, and the following known issue that will be fixed in a future release:

  • The software version is incorrectly reported by the --version option.

You can find the release notes in the official documentation.

 

by Alexey Zhebel at July 12, 2016 03:33 PM

July 11, 2016

Peter Zaitsev

Webinar July 14, 10 am PDT: Introduction into storage engine troubleshooting

storage engine troubleshooting

Please join Sveta Smirnova for a webinar Thursday, July 14 at 10 am PDT (UTC-7) on an Introduction Into Storage Engine Troubleshooting.

The number of MySQL storage engines provides great flexibility for database users, administrators and developers. At the same time, engines add an extra level of complexity when it comes to troubleshooting issues. Before choosing the right troubleshooting tool, you need to answer the following questions (and often others):

  • What part of the server is responsible for my issue?
  • Was a lock set at the server or engine level?
  • Is a standard or engine-specific tool better?
  • Where are the engine-specific options?
  • How to know if an engine-specific command exists?

This webinar will discuss these questions and how to find the right answers across all storage engines in a general sense.

You will also learn:

  • How to troubleshoot issues caused by simple storage engines such as MyISAM or Memory
  • Why Federated is deprecated, and what issues affected that engine
  • How Blackhole can affect replication

. . . and more.

Register for the webinar here.

Note: We will hold a separate webinar specifically for InnoDB.

Sveta Smirnova, Principal Technical Services Engineer
Sveta joined Percona in 2015. Her main professional interests are problem-solving, working with tricky issues and bugs, finding patterns that can quickly solve typical issues, and teaching others how to deal with MySQL issues, bugs and gotchas effectively. Before joining Percona, Sveta worked as a Support Engineer in the MySQL Bugs Analysis Support Group at MySQL AB-Sun-Oracle. She is the author of the book “MySQL Troubleshooting” and of the JSON UDF functions for MySQL.

by Dave Avery at July 11, 2016 05:33 PM

Jean-Jerome Schmidt

What's new in ClusterControl Documentation

If you haven’t upgraded to ClusterControl 1.3.1, you should! It’s full of great new features and enhancements. We have lots of documentation to help you get started. Documentation on older versions is also available in our Github repository.

Wizard - Create Replication Setups for Oracle MySQL, MariaDB and Percona Server

It is now possible to create entire master-slave setups in one go via the deployment wizard. In previous versions, one had to first create a master, and afterwards, add slaves to it. Among other improvements, it is possible to encrypt client/server connections and let ClusterControl automatically set all slaves to read-only (auto_manage_readonly) to avoid accidental writes.

Wizard - Add Existing MySQL Cluster (NDB)

We recently added support for deployment of MySQL Cluster (NDB), and it is now also possible to import existing NDB Cluster deployments (2 MGMT nodes, x SQL nodes and y Data nodes).

Official Changelog

We now have two Changelog pages, one in our support forum (this is mostly for our development reference) and a new official one in the documentation. You can now easily browse all the changes between each release, including release features, type of release and package build numbers.

Check out the new Changelog page.

ClusterControl troubleshooting with debug package

ClusterControl Controller (cmon) now comes with a debuginfo package to help trace any crashes. It produces a core dump of the working memory of the server at the time the program crashed or terminated abnormally.

The ClusterControl Controller (CMON) package comes with a cron file installed under /etc/cron.d/ which will auto-restart cmon if the process is terminated abnormally. Typically, you may notice that the cmon process has crashed by looking at the “dmesg” output.

Check out the new debugging steps here.

Standby ClusterControl

It is possible to have several ClusterControl servers to monitor a single cluster. This is useful if you have a multi-datacenter cluster and need to have ClusterControl on the remote site to monitor and manage local nodes if the network connection between them goes down. However, the ClusterControl servers must be configured to be working in active/passive mode to avoid race conditions when digesting queries and recovering a failed node or cluster.

Check out the updated instructions to install the ClusterControl Standby server.

ClusterControl RPC key

ClusterControl v1.3.1 introduces and enforces an RPC key for any communication request to the RPC interface on port 9500. This authentication string is critical and must be included in any interaction between CMON controller and the client to obtain a correct response. The RPC key is distinct per cluster and stored inside CMON configuration file of the respective cluster.

ClusterControl Domain Specific Language (CCDSL)

The DSL syntax is similar to JavaScript, with extensions to provide access to ClusterControl’s internal data structures and functions. The CCDSL allows you to execute SQL statements, run shell commands/programs across all your cluster hosts, and retrieve results to be processed for advisors/alerts or any other actions.

Our javascript-like language to manage your database infrastructure has now been updated with several new features, for example:

  • Types:
    • CmonMongoHost
    • CmonMaxscaleHost
    • CmonJob
  • Functions:
    • JSON
    • Regular Expression
    • CmonJob
    • Cluster Configuration Job
  • Examples:
    • Interact with MongoDB

Check out the ClusterControl DSL page here.

We welcome any feedback, suggestions and comments regarding our documentation page, to make sure it serves its purpose well. Happy clustering!

by Severalnines at July 11, 2016 11:59 AM

July 08, 2016

Oli Sennhauser

Temporary tables and MySQL STATUS information

When analysing MySQL configuration and status information at customer sites, it is always interesting to see how the applications behave. This can partially be seen in the output of the SHOW GLOBAL STATUS command. See also Reading MySQL fingerprints.

Today we wanted to know where the high Com_create_table and the twice as high Com_drop_table values are coming from. One suspect was TEMPORARY TABLES. But are real temporary tables counted in Com_create_table and Com_drop_table at all? This is what we want to find out today. The tested MySQL version is 5.7.11.
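
The observations below were gathered by simply comparing the relevant status counters before and after each statement; a minimal sketch of how to sample them yourself:

-- Sample the counters of interest before the statement under test
-- (use SHOW SESSION STATUS in each session for the per-session values)
SHOW GLOBAL STATUS WHERE Variable_name IN
  ('Com_create_table', 'Com_drop_table', 'Open_table_definitions',
   'Open_tables', 'Opened_table_definitions', 'Opened_tables');

-- Run the statement under test, for example:
CREATE TEMPORARY TABLE ttemp (id INT);

-- Sample the same counters again and compare the deltas
SHOW GLOBAL STATUS WHERE Variable_name IN
  ('Com_create_table', 'Com_drop_table', 'Open_table_definitions',
   'Open_tables', 'Opened_table_definitions', 'Opened_tables');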

Caution: Different MySQL or MariaDB versions might behave differently!

Statements executed and the resulting counter changes, per session and globally:

Step 1 (Session 1): CREATE TABLE t1 (id INT);   -> Query OK, 0 rows affected
  Session 1: Com_create_table +1, Opened_table_definitions +1
  Global:    Com_create_table +1, Opened_table_definitions +1

Step 2 (Session 1): CREATE TABLE t1 (id INT);   -> ERROR 1050 (42S01): Table 't1' already exists
  Session 1: Com_create_table +1, Open_table_definitions +1, Open_tables +1, Opened_table_definitions +1, Opened_tables +1
  Global:    Com_create_table +1, Open_table_definitions +1, Open_tables +1, Opened_table_definitions +1, Opened_tables +1

Step 3 (Session 1): CREATE TABLE t1 (id INT);   -> ERROR 1050 (42S01): Table 't1' already exists
  Session 1: Com_create_table +1
  Global:    Com_create_table +1

Step 4 (Session 1): DROP TABLE t1;   -> Query OK, 0 rows affected
  Session 1: Com_drop_table +1, Open_table_definitions -1, Open_tables -1
  Global:    Com_drop_table +1, Open_table_definitions -1, Open_tables -1

Step 5 (Session 1): DROP TABLE t1;   -> ERROR 1051 (42S02): Unknown table 'test.t1'
  Session 1: Com_drop_table -1
  Global:    Com_drop_table -1

Step 6 (Session 1): CREATE TEMPORARY TABLE ttemp (id INT);   -> Query OK, 0 rows affected
  Session 1: Com_create_table +1, Opened_table_definitions +2, Opened_tables +1
  Global:    Com_create_table +1, Opened_table_definitions +2, Opened_tables +1

Step 7 (Session 1): CREATE TEMPORARY TABLE ttemp (id INT);   -> ERROR 1050 (42S01): Table 'ttemp' already exists
  Session 1: Com_create_table +1
  Global:    Com_create_table +1

Step 8 (Session 1): DROP TABLE ttemp;   -> Query OK, 0 rows affected
  Session 1: Com_drop_table +1
  Global:    Com_drop_table +1

Step 9 (Session 1 and Session 2, each): CREATE TEMPORARY TABLE ttemp (id int);   -> Query OK, 0 rows affected
  Session 1: Com_create_table +1, Opened_table_definitions +2, Opened_tables +1
  Global:    Com_create_table +2, Opened_table_definitions +4, Opened_tables +2
  Session 2: Com_create_table +1, Opened_table_definitions +2, Opened_tables +1

Step 10 (Session 1 and Session 2, each): DROP TABLE ttemp;   -> Query OK, 0 rows affected
  Session 1: Com_drop_table +1
  Global:    Com_drop_table +2
  Session 2: Com_drop_table +1

Conclusion

  • A successful CREATE TABLE command opens and closes a table definition.
  • An unsuccessful CREATE TABLE command opens the table definition and the file handle of the already existing table. So a faulty application can be quite expensive.
  • A further unsuccessful CREATE TABLE command has no additional impact.
  • A DROP TABLE command closes a table definition and the file handle.
  • A CREATE TEMPORARY TABLE command opens 2 table definitions and the file handle, and thus behaves differently than CREATE TABLE.
  • But a faulty CREATE TEMPORARY TABLE seems to be much less intrusive.
  • Open_table_definitions and Open_tables are always global, even in session context.

by Shinguz at July 08, 2016 04:42 PM

Peter Zaitsev

Percona Server 5.5.50-38.0 is now available

Percona announces the release of Percona Server 5.5.50-38.0 on July 8, 2016. Based on MySQL 5.5.50, including all the bug fixes in it, Percona Server 5.5.50-38.0 is now the current stable release in the 5.5 series.

Percona Server is open-source and free. Details of the release can be found in the 5.5.50-38.0 milestone on Launchpad. Downloads are available here and from the Percona Software Repositories.

New Features:
Bugs Fixed:
  • Querying the GLOBAL_TEMPORARY_TABLES table would cause a server crash if threads owning temporary tables executed new queries. Bug fixed #1581949.
  • The innodb_log_block_size feature attempted to diagnose the situation where the logs have been created with a log block value that differs from the current innodb_log_block_size setting. But this diagnostic came too late, and a misleading error No valid checkpoints found was produced first, aborting the startup. Bug fixed #1155156.
  • AddressSanitizer build with LeakSanitizer enabled was failing at gen_lex_hash invocation. Bug fixed #1580993 (upstream #80014).
  • ssl.cmake file was broken when custom OpenSSL build was used. Bug fixed #1582639 (upstream #61619).
  • mysqlbinlog did not free the existing connection before opening a new remote one. Bug fixed #1587840 (upstream #81675).
  • Fixed memory leaks in mysqltest. Bugs fixed #1582718 and #1588318.
  • Fixed memory leaks in mysqlcheck. Bug fixed #1582741.
  • Fixed memory leak in mysqlbinlog. Bug fixed #1582761 (upstream #78223).
  • Fixed memory leaks in mysqldump. Bugs fixed #1587873 and #1588845 (upstream #81714).
  • Fixed memory leak in innochecksum. Bug fixed #1588331.
  • Fixed memory leak in non-existing defaults file handling. Bug fixed #1588344.
  • Fixed memory leak in mysqlslap. Bug fixed #1588361.

Other bugs fixed: #1588169, #1588386, #1529885, #1587757, #1587426 (upstream, #81657), #1587527, #1588650, and #1589819.

The release notes for Percona Server 5.5.50-38.0 are available in our online documentation. Bugs can be reported on the launchpad bug tracker.

by Hrvoje Matijakovic at July 08, 2016 04:19 PM

Percona XtraBackup 2.3.5 is now available

Percona XtraBackup 2.3.5

Percona announces the release of Percona XtraBackup 2.3.5 on July 8, 2016. Downloads are available from our download site or Percona Software Repositories.

Percona XtraBackup enables MySQL backups without blocking user queries, making it ideal for companies with large data sets and mission-critical applications that cannot tolerate long periods of downtime. Offered free as an open source solution, Percona XtraBackup drives down backup costs while providing unique features for MySQL backups.

This release is the current GA (Generally Available) stable release in the 2.3 series.

Bugs fixed:
  • Backup process would fail if --throttle option was used. Bug fixed #1554235.
  • .ibd files for remote tablespaces were not copied back to the original location pointed by the .isl files. Bug fixed #1555423.
  • When called with insufficient parameters, like specifying the empty --defaults-file option, Percona XtraBackup could crash. Bug fixed #1566228.
  • Documentation states that the default value for --ftwrl-wait-query-type is all; however, it was actually update. Changed the default value to reflect the documentation. Bug fixed #1566315.
  • Free Software Foundation address in copyright notices was outdated. Bug fixed #1222777.
  • Backup process would fail if the datadir specified on the command line was not the same as the one reported by the server. Percona XtraBackup now allows the datadir from my.cnf to override the one from SHOW VARIABLES. xtrabackup will print a warning that they don’t match, but will continue. Bug fixed #1526467.
  • Backup process would fail on MariaDB if binary logs were in a non-standard directory. Bug fixed #1517629.
  • Output of --slave-info option was missing an apostrophe. Bug fixed #1573371.

Other bugs fixed: #1599397.

Release notes with all the bugfixes for Percona XtraBackup 2.3.5 are available in our online documentation. Bugs can be reported on the launchpad bug tracker.

by Hrvoje Matijakovic at July 08, 2016 04:17 PM

MariaDB Foundation

MariaDB at DebConf16

At MariaDB we often get to deal with tables, but this time we took on one of a different type, and climbed to the top. We’re in Cape Town for Debconf16, and managed to take some time to climb to the top of Table Mountain. It’s been a productive visit. Otto, who handles MariaDB packaging […]

The post MariaDB at DebConf16 appeared first on MariaDB.org.

by ian at July 08, 2016 09:27 AM

July 07, 2016

Peter Zaitsev

Percona Server 5.6.31-77.0 is now available

Percona announces the release of Percona Server 5.6.31-77.0 on July 7th, 2016. Download the latest version from the Percona web site or the Percona Software Repositories.

Based on MySQL 5.6.31, including all the bug fixes in it, Percona Server 5.6.31-77.0 is the current GA release in the Percona Server 5.6 series. Percona Server is open-source and free – this is the latest release of our enhanced, drop-in replacement for MySQL. Complete details of this release are available in the 5.6.31-77.0 milestone on Launchpad.

New Features:
  • Percona Server implemented protocol support for TLS 1.1 and TLS 1.2. This implementation turns off TLS v1.0 support by default.
  • TokuDB MTR suite is now part of the default MTR suite in Percona Server 5.6.
Bugs Fixed:
  • Querying the GLOBAL_TEMPORARY_TABLES table caused a server crash if threads owning temporary tables executed new queries. Bug fixed #1581949.
  • Audit Log Plugin would hang when trying to write a log record of audit_log_buffer_size length. Bug fixed #1588439.
  • Audit log in ASYNC mode could skip log records that don’t fit into the log buffer. Bug fixed #1588447.
  • The innodb_log_block_size feature attempted to diagnose the situation where the logs have been created with a log block value that differs from the current innodb_log_block_size setting. But this diagnostic came too late, and a misleading error No valid checkpoints found was produced first, aborting the startup. Bug fixed #1155156.
  • Some transaction deadlocks did not increase the INFORMATION_SCHEMA.INNODB_METRICS lock_deadlocks counter. Bug fixed #1466414 (upstream #77399).
  • InnoDB tablespace import failed when trying to import a table with different data directory. Bug fixed #1548597 (upstream #76142).
  • Audit Log Plugin truncated SQL queries to 512 bytes. Bug fixed #1557293.
  • Regular user connections on the extra port failed if max_connections plus one SUPER user were already connected on the main port, even if connecting would not violate extra_max_connections. Bug fixed #1583147.
  • The error log warning Too many connections was only printed for connection attempts when max_connections plus one SUPER user had connected. If the extra SUPER user was not connected, the warning was not printed for a non-SUPER connection attempt. Bug fixed #1583553.
  • mysqlbinlog did not free the existing connection before opening a new remote one. Bug fixed #1587840 (upstream #81675).
  • Fixed memory leaks in mysqltest. Bugs fixed #1582718 and #1588318.
  • Fixed memory leaks in mysqlcheck. Bug fixed #1582741.
  • Fixed memory leak in mysqlbinlog. Bug fixed #1582761 (upstream #78223).
  • Fixed memory leaks in mysqldump. Bugs fixed #1587873 and #1588845 (upstream #81714).
  • Fixed memory leak in non-existing defaults file handling. Bug fixed #1588344.
  • Fixed memory leak in mysqlslap. Bug fixed #1588361.
  • Transparent Huge Pages check will now only happen if tokudb_check_jemalloc option is set. Bugs fixed #939 and #713.
  • Logging in ydb environment validation functions now prints more useful context. Bug fixed #722.

Other bugs fixed: #1588386, #1529885, #1541698 (upstream #80261), #1582681, #1583589, #1587426 (upstream, #81657), #1589431, #956, and #964.

Release notes for Percona Server 5.6.31-77.0 are available in the online documentation. Please report any bugs on the launchpad bug tracker.

by Hrvoje Matijakovic at July 07, 2016 01:53 PM

Shlomi Noach

Solving the non-atomic table swap, Take III: making it atomic

With the unintended impression of becoming live blogging, we now follow up on Solving the non-atomic table swap, Take II and Solving the Facebook-OSC non-atomic table swap problem with a safe, blocking, atomic solution.

Why yet another iteration?

The solution presented in Solving the non-atomic table swap, Take II was good, in that it was safe. No data corruption. Optimistic: if no connection is killed throughout the process, then completely blocking.

Two outstanding issues remained:

  • If something did go wrong, the solution reverted to a table-outage
  • On replicas, the table swap is non-atomic and non-blocking, so there is a table-outage scenario on the replica.

As it turns out, there's a simpler solution which overcomes both the above. As with math and physics, the simpler solution is often the preferred one. But it took those previous iterations to gather a few ideas together. So, anyway:

Safe, locking, atomic, asynchronous table swap

Do read the aforementioned previous posts; the quick-quick recap is: we want to be able to LOCK a table tbl, then do some stuff, then swap it out and put some ghost table in its place. MySQL does not allow us to rename tbl to tbl_old, ghost to tbl if we have locks on tbl in that session.

The solution we offer is now based on two connections only (as opposed to three, in the optimistic approach). "Our" connections will be C10, C20. The "normal" app connections are C1..C9, C11..C19, C21..C29. The full sequence follows, with a consolidated SQL sketch right after the list.

  • Connections C1..C9 operate on tbl with normal DML: INSERT, UPDATE, DELETE
  • Connection C10: CREATE TABLE tbl_old (id int primary key) COMMENT='magic-be-here'
  • Connection C10: LOCK TABLES tbl WRITE, tbl_old WRITE
  • Connections C11..C19, newly incoming, issue queries on tbl but are blocked due to the LOCK
  • Connection C20: RENAME TABLE tbl TO tbl_old, ghost TO tbl
    This is blocked due to the LOCK, but gets prioritized on top of connections C11..C19 and on top of C1..C9 or any other connection that attempts DML on tbl
  • Connections C21..C29, newly incoming, issue queries on tbl but are blocked due to the LOCK and due to the RENAME, waiting in queue
  • Connection C10: checks that C20's RENAME is applied (looks for the blocked RENAME in processlist)
  • Connection C10: DROP TABLE tbl_old
    Nothing happens yet; tbl is still locked. All other connections still blocked.
  • Connection C10: UNLOCK TABLES
    BAM!
    The RENAME is first to execute, ghost table is swapped in place of tbl, then C1..C9, C11..C19, C21..C29 all get to operate on the new and shiny tbl
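
As promised, the consolidated sketch as plain SQL (a sketch of the exact sequence above; tbl, tbl_old and ghost are the table names used throughout this post):

-- Connection C10:
CREATE TABLE tbl_old (id int primary key) COMMENT='magic-be-here';
LOCK TABLES tbl WRITE, tbl_old WRITE;

-- Connection C20 (this statement blocks until C10 unlocks):
RENAME TABLE tbl TO tbl_old, ghost TO tbl;

-- Connection C10, after verifying C20's RENAME is waiting in SHOW PROCESSLIST:
DROP TABLE tbl_old;
UNLOCK TABLES;   -- the queued RENAME executes first, then the blocked DML resumes on the new tbl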

Some notes

  • We create tbl_old as a blocker for a premature swap
  • It is allowed for a connection to DROP a table it has under a WRITE LOCK
  • A blocked RENAME is always prioritized over a blocked INSERT/UPDATE/DELETE, no matter who came first

What happens on failures?

Much fun. Just works; no rollback required.

  • If C10 errors on the CREATE we do not proceed.
  • If C10 errors on the LOCK statement, we do not proceed. The table is not locked. App continues to operate as normal.
  • If C10 dies just as C20 is about to issue the RENAME:
    • The lock is released, the queries C1..C9, C11..C19 immediately operate on tbl.
    • C20's RENAME immediately fails because tbl_old exists.
      The entire operation fails, but nothing terrible happens; some queries were blocked for some time, is all. We will need to retry everything
  • If C10 dies while C20 is blocked on RENAME: Mostly similar to the above. Lock released, then C20 fails the RENAME (because tbl_old exists), then all queries resume normal operation
  • If C20 dies before C10 drops the table, we catch the error and let C10 proceed as planned: DROP, UNLOCK. Nothing terrible happens, some queries were blocked for some time. We will need to retry
  • If C20 dies just after C10 DROPs the table but before the unlock, same as above.
  • If both C10 and C20 die, no problem: LOCK is cleared; RENAME lock is cleared. C1..C9, C11..C19, C21..C29 are free to operate on tbl.

No matter what happens, at the end of operation we look for the ghost table. Is it still there? Then we know the operation failed, "atomically". Is it not there? Then it has been renamed to tbl, and the operation worked atomically.
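
A minimal sketch of that post-run check, assuming the tables live in a schema named test (the schema name is a placeholder):

SELECT COUNT(*) AS ghost_still_exists
  FROM information_schema.tables
 WHERE table_schema = 'test' AND table_name = 'ghost';
-- 1 means the swap failed and tbl was never touched; 0 means ghost was renamed to tbl and the swap succeeded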

A side note on failure is the matter of cleaning up the magic tbl_old table. This is a matter of taste: you can just let it live and avoid recreating it next time, or drop it if you like.

Impact on app

App connections are guaranteed to be blocked, either until ghost is swapped in, or until operation fails. In the former, they proceed to operate on the new table. In the latter, they proceed to operate on the original table.

Impact on replication

Replication only sees the RENAME. There is no LOCK in the binary logs. Thus, replication sees an atomic two-table swap. There is no table-outage.

Conclusion

This solution satisfies all we wanted to achieve. We're unlikely to give this another iteration. Well, if some yet-more-elegant solution comes along I'll be tempted, for the beauty of it, but the solution offered in this post is simple-enough, safe, atomic, replication friendly, and should make everyone happy.

by shlomi at July 07, 2016 12:54 PM

Jean-Jerome Schmidt

Planets9s - #MySQLHA CrowdChat Launch, HA for PostgreSQL and Live Upgrades to MySQL 5.7

Welcome to this week’s Planets9s, covering all the latest resources and technologies we create around automation and management of open source database infrastructures.

Join the Conversation: Severalnines Launches #MySQLHA CrowdChat

This new CrowdChat is brought to you by Severalnines and is hosted by a community of subject matter experts. CrowdChat is a community platform that works across Facebook, Twitter, and LinkedIn to allow users to discuss a topic using a specific #hashtag. This crowdchat focuses on the hashtag #MySQLHA. So if you’re a DBA, architect, CTO, or a database novice, sign up and become part of the conversation!

Sign up for #MySQLHA CrowdChat

Become a PostgreSQL DBA - How to Setup Streaming Replication for High Availability

Historically, PostgreSQL did not have support for replication, which was provided using external tools like Pgpool and Slony. These solutions did not come out of the box and most of them required a good bit of work to set them up. This was a serious drawback, and it made people look into MySQL, where replication had been available for a long time. Thankfully, with PostgreSQL 9.0, replication has been added natively to PostgreSQL and this post shows you how to set up streaming replication.

Read the blog

Performing a Live Upgrade to MySQL 5.7

After studying the differences between MySQL 5.6 and 5.7, and going through a rigorous regression test process in our two previous posts on this topic, it’s now time to perform the actual upgrade itself. How do we best introduce 5.7 in our live environment? How can we minimize risks? What do we do if something goes wrong? And what tools are available out there to assist us? This latest post provides answers to these questions as well as a link to the related whitepaper on how to upgrade to MySQL 5.7.

Read the blog

That’s it for this week! Feel free to share these resources with your colleagues and follow us in our social media channels.

Have a good end of the week,

Jean-Jérôme Schmidt
Planets9s Editor
Severalnines AB

by Severalnines at July 07, 2016 11:00 AM

July 06, 2016

Peter Zaitsev

Pipelining versus Parallel Query Execution with MySQL 5.7 X Plugin

In this blog post, we’ll look at pipelining versus parallel query execution when using the X Plugin for MySQL 5.7.

In my previous blog post, I showed how to use X Plugin for MySQL 5.7 for parallel query execution. The tricks I used to make it work:

  • Partitioning by hash
  • Open N connections to MySQL, where N = number of CPU cores

I had to do it manually (as well as to sort the result at the end) as X Plugin only supports “pipelining” (which only saves the round trip time) and does not “multiplex” connections to MySQL (MySQL does not use multiple CPU cores for a single query).

TL;DR version

In this (long) post I’m playing with MySQL 5.7 X Plugin / X Protocol and document store. Here is the summary:

  1. X Plugin does not “multiplex” connections/sessions to MySQL. Similar to the original protocol, one connection to X Plugin will result in one session open to MySQL
  2. An X Plugin query (if the library supports it) returns immediately and does not wait until the query is finished (async call). MySQL works like a queue.
  3. X Plugin does not have any additional server-level durability settings. Unless you check or wait for the acknowledgement (which is asynchronous) from the server, the data might or might not be written into MySQL (“fire and forget”).

At the same time, X Protocol can be helpful if:

  • We want to implement an asynchronous client (i.e., we do not want network communication such as downloads or API calls to be blocked while the MySQL table is locked).
  • We want to use MySQL as a queue and save the round-trip time.
Benchmark results: “pipelining” versus “parallelizing” versus a single query

I’ve done a couple of tests comparing the results between “pipelining” versus “parallelizing” versus a single query. Here are the results:

      1. Parallel queries with NodeJS:
        $ time node async_wikistats.js
        ...
        All done! Total: 17753
        ...
        real    0m30.668s
        user    0m0.256s
        sys     0m0.028s
      2. Pipeline with NodeJS:
        $ time node async_wikistats_pipeline.js
        ...
        All done! Total: 17753
        ...
        real 5m39.666s
        user 0m0.212s
        sys 0m0.024s

        In the pipeline with NodeJS, I’m reusing the same connection (and do not open a new one for each thread).
      3. Direct query – partitioned table:
        mysql> select sum(tot_visits) from wikistats.wikistats_by_day_spark_part where url like '%postgresql%';
        +-----------------+
        | sum(tot_visits) |
        +-----------------+
        | 17753           |
        +-----------------+
        1 row in set (5 min 31.44 sec)
      4. Direct query – non-partitioned table.
        mysql> select sum(tot_visits) from wikistats.wikistats_by_day_spark where url like '%postgresql%';
        +-----------------+
        | sum(tot_visits) |
        +-----------------+
        | 17753           |
        +-----------------+
        1 row in set (4 min 38.16 sec)
Advantages of pipelines with X Plugin 

Although pipelining with X Plugin does not significantly improve query response time (though it can reduce total latency), it might be helpful in some cases. For example, let’s say we are downloading something from the Internet and need to save the progress of the download as well as the metadata for the document. In this example, I use youtube-dl to search and download the metadata about YouTube videos, then save the metadata JSON into MySQL 5.7 Document Store. Here is the code:

var mysqlx = require('mysqlx');
// This is the same as running: $ youtube-dl -j -i ytsearch100:"mysql 5.7"
const spawn = require('child_process').spawn;
const yt = spawn('youtube-dl', ['-j', '-i', 'ytsearch100:"mysql 5.7"'], {maxBuffer: 1024 * 1024 * 128});
var mySession =
mysqlx.getSession({
    host: 'localhost',
    port: 33060,
    dbUser: 'root',
    dbPassword: '<your password>'
});
yt.stdout.on('data', (data) => {
        try {
                dataObj = JSON.parse(data);
                console.log(dataObj.fulltitle);
                mySession.then(session => {
                                                session.getSchema("yt").getCollection("youtube").add(  dataObj  )
                                                .execute(function (row) {
                                                }).catch(err => {
                                                        console.log(err);
                                                })
                                                .then( function (notices) { console.log("Wrote to MySQL: " + JSON.stringify(notices))  });
                                }).catch(function (err) {
                                              console.log(err);
                                              process.exit();
                                });
        } catch (e) {
                console.log(" --- Can't parse json" + e );
        }
});
yt.stderr.on('data', (data) => {
  console.log("Error receiving data");
});
yt.on('close', (code) => {
  console.log(`child process exited with code ${code}`);
  mySession.then(session => {session.close() } );
});

In the above example, I execute the youtube-dl binary (you need to have it installed first) to search for “MySQL 5.7” videos. Instead of downloading the videos, I only grab the video’s metadata in JSON format  (“-j” flag). Because it is JSON, I can save it into MySQL document store. The table has the following structure:

CREATE TABLE `youtube` (
  `doc` json DEFAULT NULL,
  `_id` varchar(32) GENERATED ALWAYS AS (json_unquote(json_extract(`doc`,'$._id'))) STORED NOT NULL,
  UNIQUE KEY `_id` (`_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4

Here is the execution example:

$ node yt.js
What's New in MySQL 5.7
Wrote to MySQL: {"_state":{"rows_affected":1,"doc_ids":["3f312c3b-b2f3-55e8-0ee9-b706eddf"]}}
MySQL 5.7: MySQL JSON data type example
Wrote to MySQL: {"_state":{"rows_affected":1,"doc_ids":["88223742-9875-59f1-f535-f1cfb936"]}}
MySQL Performance Tuning: Part 1. Configuration (Covers MySQL 5.7)
Wrote to MySQL: {"_state":{"rows_affected":1,"doc_ids":["c377e051-37e6-8a63-bec7-1b81c6d6"]}}
Dave Stokes — MySQL 5.7 - New Features and Things That Will Break — php[world] 2014
Wrote to MySQL: {"_state":{"rows_affected":1,"doc_ids":["96ae0dd8-9f7d-c08a-bbef-1a256b11"]}}
MySQL 5.7 & JSON: New Opportunities for Developers - Thomas Ulin - Forum PHP 2015
Wrote to MySQL: {"_state":{"rows_affected":1,"doc_ids":["ccb5c53e-561c-2ed5-6deb-1b325739"]}}
Cara Instal MySQL 5.7.10 NoInstaller pada Windows Manual Part3
Wrote to MySQL: {"_state":{"rows_affected":1,"doc_ids":["95efbd79-8d79-e7b6-a535-271640c8"]}}
MySQL 5.7 Install and Configuration on Ubuntu 14.04
Wrote to MySQL: {"_state":{"rows_affected":1,"doc_ids":["b8cfe132-aca4-1eba-c2ae-69e48db8"]}}

Now, here is what makes this example interesting: as NodeJS + X Plugin = Asynchronous + Pipelining, the program execution will not stop if the table is locked. I’ve opened two sessions:

  • session 1: $ node yt.js > test_lock_table.log
  • session 2:
    mysql> lock table youtube read; select sleep(10); unlock tables;
    Query OK, 0 rows affected (0.00 sec)
    +-----------+
    | sleep(10) |
    +-----------+
    |         0 |
    +-----------+
    1 row in set (10.01 sec)
    Query OK, 0 rows affected (0.00 sec)

Results:

...
Upgrade MySQL Server from 5.5 to 5.7
... => wrote to MySQL: {"_state":{"rows_affected":1,"doc_ids":["d4d62a8a-fbfa-05ab-2110-2fd5cf6d"]}}
OSC15 - Georgi Kodinov - Secure Deployment Changes Coming in MySQL 5.7
... => wrote to MySQL: {"_state":{"rows_affected":1,"doc_ids":["8ac1cdb9-1499-544c-da2a-5db1ccf5"]}}
MySQL 5.7: Create JSON string using mysql
FreeBSD 10.3 - Instalación de MySQL 5.7 desde Código Fuente - Source Code
Webinar replay: How To Upgrade to MySQL 5.7 - The Best Practices - part 1
How to install MySQL Server on Mac OS X Yosemite - ltamTube
Webinar replay: How To Upgrade to MySQL 5.7 - The Best Practices - part 4
COMO INSTALAR MYSQL VERSION 5.7.13
MySQL and JSON
MySQL 5.7: Merge JSON data using MySQL
... => wrote to MySQL: {"_state":{"rows_affected":1,"doc_ids":["a11ff369-6f23-11e9-187b-e3713e6e"]}}
... => wrote to MySQL: {"_state":{"rows_affected":1,"doc_ids":["06143a61-4add-79da-0e1d-c2b52cf6"]}}
... => wrote to MySQL: {"_state":{"rows_affected":1,"doc_ids":["1eb94ef4-db63-cb75-767e-e1555549"]}}
... => wrote to MySQL: {"_state":{"rows_affected":1,"doc_ids":["e25f15b5-8c19-9531-ed69-7b46807a"]}}
... => wrote to MySQL: {"_state":{"rows_affected":1,"doc_ids":["02b5a4c9-6a21-f263-90d5-cd761906"]}}
... => wrote to MySQL: {"_state":{"rows_affected":1,"doc_ids":["e0bef958-10af-b181-81cd-5debaaa0"]}}
... => wrote to MySQL: {"_state":{"rows_affected":1,"doc_ids":["f48fa635-fa63-7481-0668-addabbac"]}}
... => wrote to MySQL: {"_state":{"rows_affected":1,"doc_ids":["557fa5c5-3c8a-fe01-c17c-549c557e"]}}
MySQL 5.7 Install and Configuration on Ubuntu 14.04
... => wrote to MySQL: {"_state":{"rows_affected":1,"doc_ids":["456b11d8-ba03-0aec-8e06-9517c6e1"]}}
MySQL WorkBench 6.3 installation on Ubuntu 14.04
... => wrote to MySQL: {"_state":{"rows_affected":1,"doc_ids":["0b651987-9b23-b5e0-f8f7-49b8ba5c"]}}
Going through era of IoT with MySQL 5.7 - FOSSASIA 2016
... => wrote to MySQL: {"_state":{"rows_affected":1,"doc_ids":["e133746c-836c-a7e0-3893-292a7429"]}}
MySQL 5.7: MySQL JSON operator example
... => wrote to MySQL: {"_state":{"rows_affected":1,"doc_ids":["4d13830d-7b30-5b31-d068-c7305e0a"]}}

As we can see, the first two writes were immediate. Then I’ve locked the table, and no MySQL queries went through. At the same time the download process (which is the slowest part here) proceeded and was not blocked (we can see the titles above, which are not followed by lines “… => wrote to MySQL:”). When the table was unlocked, a pile of waiting queries succeeded.

This can be very helpful when running a “download” process and the network is a bottleneck. With traditional synchronous query execution, when we lock a table the application gets blocked (including its network communication). With NodeJS and the X Plugin, the download part proceeds, with MySQL acting as a queue.

Pipeline Durability

How “durable” is this pipeline, you might ask? In other words, what will happen if I kill the connection? To test it out, I (once again) locked the table (this time before starting the NodeJS script), killed the connection and finally unlocked the table. Here are the results:

Session 1:
----------
mysql> truncate table youtube_new;
Query OK, 0 rows affected (0.25 sec)
mysql> lock table youtube_new read;
Query OK, 0 rows affected (0.00 sec)
mysql> select count(*) from youtube_new;
+----------+
| count(*) |
+----------+
|        0 |
+----------+
1 row in set (0.00 sec)
Session 2:
----------
(when table is locked)
$ node yt1.js
11 03  MyISAM
Switching to InnoDB from MyISAM
tablas InnoDB a MyISAM
MongoDB vs MyISAM (MariaDB/MySQL)
MySQL Tutorial 35 - Foreign Key Constraints for the InnoDB Storage Engine
phpmyadmin foreign keys myisam innodb
Convert or change database manual from Myisam to Innodb
... >100 other results omitted ...
^C
Session 1:
----------
mysql> select count(*) from youtube_new;
+----------+
| count(*) |
+----------+
|        0 |
+----------+
1 row in set (0.00 sec)
     Id: 4916
   User: root
   Host: localhost:33221
     db: NULL
Command: Query
   Time: 28
  State: Waiting for table metadata lock
   Info: PLUGIN: INSERT INTO `iot`.`youtube_new` (doc) VALUES ('{"upload_date":"20140319","protocol":"
mysql> unlock table;
Query OK, 0 rows affected (0.00 sec)
mysql> select count(*) from youtube_new;
+----------+
| count(*) |
+----------+
|        2 |
+----------+
1 row in set (0.00 sec)
mysql>  select json_unquote(doc->'$.title') from youtube_new;
+---------------------------------+
| json_unquote(doc->'$.title')    |
+---------------------------------+
| 11 03  MyISAM                   |
| Switching to InnoDB from MyISAM |
+---------------------------------+
2 rows in set (0.00 sec)

Please note: in the above, there isn’t a single acknowledgement from the MySQL server. When code receives a response from MySQL it prints “Wrote to MySQL: {“_state”:{“rows_affected”:1,”doc_ids”:[“…”]}}“. Also, note that when the connection was killed the MySQL process is still there, waiting on the table lock.

What is interesting here is that only two rows have been inserted into the document store. Is there a “history length” here or some other buffer that we can increase? I asked Jan Kneschke, one of the authors of the X Protocol, and the answers were:

  • Q: Is there any history length or any buffer and can we tune it?
    • A: There is no “history” or “buffer” at all, it is all at the connector level.
  • Q: Then why were two rows finally inserted?
    • A: To answer this question, I collected a tcpdump on port 33060 (X Protocol); see below

This is very important information! Keep in mind that the asynchronous pipeline has no durability settings: if the application fails while there are pending writes, those writes may be lost (or may still be applied).

To fully understand how the protocol works, I’ve captured tcpdump (Jan Kneschke helped me to analyze it):

tcpdump -i lo -s0 -w tests/node-js-pipelining.pcap "tcp port 33060"

(see update below for the tcpdump visualization)

This is what is happening:

  • When I hit CTRL+C, nodejs closes the connection. As the table is still locked, MySQL can’t write to it and will not send the result of the insert back.
  • When the table is unlocked, it starts the first statement despite the fact that the connection has been closed. It then acknowledges the first insert and starts the second one.
  • However, at this point the script (client) has already closed the connection and the final packet (write done, here is the id) gets denied. The X Plugin then finds out that the client closed the connection and stops executing the pipeline.

Actually, this is very similar to how the original MySQL protocol works. If we kill the script/application, it doesn’t automatically kill the MySQL connection (unless you hit CTRL+C in the MySQL client, which sends the kill signal), and the connection waits for the table to get unlocked. When the table is unlocked, it inserts the first statement from the file.

Session 1
---------
mysql> select * from t_sql;
Empty set (0.00 sec)
mysql> lock table t_sql read;
Query OK, 0 rows affected (0.00 sec)
Session 2:
----------
$ mysql iot < t.sql
$ kill -9 ...
[3]   Killed                  mysql iot < t.sql
Session 1:
----------
mysql> show processlist;
+------+------+-----------------+------+---------+---------+---------------------------------+-----------------------------------------------+
| Id   | User | Host            | db   | Command | Time    | State                           | Info                                          |
+------+------+-----------------+------+---------+---------+---------------------------------+-----------------------------------------------+
| 4913 | root | localhost       | iot  | Query   |      41 | Waiting for table metadata lock | insert into t_sql  values('{"test_field":0}') |
+------+------+-----------------+------+---------+---------+---------------------------------+-----------------------------------------------+
4 rows in set (0.00 sec)
mysql> unlock tables;
Query OK, 0 rows affected (0.00 sec)
mysql> select * from t_sql;
+-------------------+
| doc               |
+-------------------+
| {"test_field": 0} |
+-------------------+
1 row in set (0.00 sec)

Enforcing unique checks

If I restart my script, it finds the same videos again. We will probably need to enforce the consistency of our data. By default, the plugin generates a unique key (_id) for the document, which prevents inserting duplicates.

Another way to enforce the unique checks is to create a unique key on the YouTube id. Here is the updated table structure:

CREATE TABLE `youtube` (
  `doc` json DEFAULT NULL,
  `youtube_id` varchar(11) GENERATED ALWAYS AS (json_unquote(json_extract(`doc`,'$.id'))) STORED NOT NULL,
  UNIQUE KEY `youtube_id` (`youtube_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4
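
If the collection already exists, the same constraint can presumably be added in place rather than by recreating the table (a sketch; it assumes every stored document already carries a valid, unique $.id value, otherwise the ALTER will fail):

ALTER TABLE youtube
  ADD COLUMN youtube_id varchar(11) GENERATED ALWAYS AS (json_unquote(json_extract(doc,'$.id'))) STORED NOT NULL,
  ADD UNIQUE KEY youtube_id (youtube_id);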

I’ve changed the default “_id” column to YouTube’s unique ID. Now, when I restart the script, it shows:

MySQL 5.7: Merge JSON data using MySQL
{ [Error: Document contains a field value that is not unique but required to be]
  info:
   { severity: 0,
     code: 5116,
     msg: 'Document contains a field value that is not unique but required to be',
     sql_state: 'HY000' } }
... => wrote to MySQL: undefined

…as this document has already been loaded.

Conclusion

Although X Plugin pipelining does not necessarily improve query response time significantly (it might save the round-trip time), it can be helpful for some applications. We might not want to block network communication (i.e., downloads or API calls) when the MySQL table is locked, for example. At the same time, unless you check or wait for the acknowledgement from the server, the data might or might not be written into MySQL.

Bonus: data analysis

Now we can see what we have downloaded. There are a number of interesting fields in the result:

"is_live": null,
	"license": "Standard YouTube License",
	"duration": 2965,
	"end_time": null,
	"playlist": ""mysql 5.7"",
	"protocol": "https",
	"uploader": "YUI Library",
	"_filename": "Douglas Crockford - The JSON Saga--C-JoyNuQJs.mp4",
	"age_limit": 0,
	"alt_title": null,
	"extractor": "youtube",
	"format_id": "18",
	"fulltitle": "Douglas Crockford: The JSON Saga",
	"n_entries": 571,
	"subtitles": {},
	"thumbnail": "https://i.ytimg.com/vi/-C-JoyNuQJs/hqdefault.jpg",
	"categories": ["Science & Technology"],
	"display_id": "-C-JoyNuQJs",
	"like_count": 251,
	"player_url": null,
	"resolution": "640x360",
	"start_time": null,
	"thumbnails": [{
		"id": "0",
		"url": "https://i.ytimg.com/vi/-C-JoyNuQJs/hqdefault.jpg"
	}],
	"view_count": 36538,
	"annotations": null,
	"description": "Yahoo! JavaScript architect Douglas Crockford tells the story of how JSON was discovered and how it became a major standard for describing data.",
	"format_note": "medium",
	"playlist_id": ""mysql 5.7"",
	"upload_date": "20110828",
	"uploader_id": "yuilibrary",
	"webpage_url": "https://www.youtube.com/watch?v=-C-JoyNuQJs",
	"uploader_url": "http://www.youtube.com/user/yuilibrary",
	"dislike_count": 5,
	"extractor_key": "Youtube",
	"average_rating": 4.921875,
	"playlist_index": 223,
	"playlist_title": null,
	"automatic_captions": {},
	"requested_subtitles": null,
	"webpage_url_basename": "-C-JoyNuQJs"

We can see the most popular videos. To do that I’ve added one more virtual field on view_count, and created an index on it:

CREATE TABLE `youtube` (
  `doc` json DEFAULT NULL,
  `youtube_id` varchar(11) GENERATED ALWAYS AS (json_unquote(json_extract(`doc`,'$.id'))) STORED NOT NULL,
  `view_count` int(11) GENERATED ALWAYS AS (json_unquote(json_extract(`doc`,'$.view_count'))) VIRTUAL,
  UNIQUE KEY `youtube_id` (`youtube_id`),
  KEY `view_count` (`view_count`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4

We can run the queries like:

mysql> select json_unquote(doc->'$.title'),
    -> view_count,
    -> json_unquote(doc->'$.dislike_count') as dislikes
    -> from youtube
    -> order by view_count desc
    -> limit 10;
+----------------------------------------------------------------------------------------------------+------------+----------+
| json_unquote(doc->'$.title')                                                                       | view_count | dislikes |
+----------------------------------------------------------------------------------------------------+------------+----------+
| Beginners MYSQL Database Tutorial 1 # Download , Install MYSQL and first SQL query                 |     664153 | 106      |
| MySQL Tutorial                                                                                     |     533983 | 108      |
| PHP and MYSQL - Connecting to a Database and Adding Data                                           |     377006 | 50       |
| PHP MySQL Tutorial                                                                                 |     197984 | 41       |
| Installing MySQL (Windows 7)                                                                       |     196712 | 28       |
| Understanding PHP, MySQL, HTML and CSS and their Roles in Web Development - CodersCult Webinar 001 |     195464 | 24       |
| jQuery Ajax Tutorial #1 - Using AJAX & API's (jQuery Tutorial #7)                                  |     179198 | 25       |
| How To Root Lenovo A6000                                                                           |     165221 | 40       |
| MySQL Tutorial 1 - What is MySQL                                                                   |     165042 | 45       |
| How to Send Email in Blackboard Learn                                                              |     144948 | 28       |
+----------------------------------------------------------------------------------------------------+------------+----------+
10 rows in set (0.00 sec)

Or if we want to find out the most popular resolutions:

mysql> select count(*) as cnt,
    -> sum(view_count) as sum_views,
    -> json_unquote(doc->'$.resolution') as resolution
    -> from youtube
    -> group by resolution
    -> order by cnt desc, sum_views desc
    -> limit 10;
+-----+-----------+------------+
| cnt | sum_views | resolution |
+-----+-----------+------------+
| 273 |   3121447 | 1280x720   |
|  80 |   1195865 | 640x360    |
|  18 |     33958 | 1278x720   |
|  15 |     18560 | 1152x720   |
|  11 |     14800 | 960x720    |
|   5 |      6725 | 1276x720   |
|   4 |     18562 | 1280x682   |
|   4 |      1581 | 1280x616   |
|   4 |       348 | 1280x612   |
|   3 |      2024 | 1200x720   |
+-----+-----------+------------+
10 rows in set (0.02 sec)

Special thanks to Jan Kneschke and Morgan Tocker from Oracle for helping with the X Protocol internals.

Update: Jan Kneschke also generated the visualization for the tcpdump I’ve collected (when connection was killed):

node-js-pipelining-early-close

by Alexander Rubin at July 06, 2016 07:14 PM

Percona Server 5.7.13-6 is now available

Percona announces the GA release of Percona Server 5.7.13-6 on July 6, 2016. Download the latest version from the Percona web site or from the Percona Software Repositories.

Based on MySQL 5.7.13, including all the bug fixes in it, Percona Server 5.7.13-6 is the current GA release in the Percona Server 5.7 series. Percona provides completely open-source and free software. All the details of the release can be found in the 5.7.13-6 milestone at Launchpad.

New Features:
  • TokuDB MTR suite is now part of the default MTR suite in Percona Server 5.7.
Bugs Fixed:
  • Querying the GLOBAL_TEMPORARY_TABLES table would cause a server crash if threads owning temporary tables executed new queries. Bug fixed #1581949.
  • IMPORT TABLESPACE and undo tablespace truncate could get stuck indefinitely with a writing workload in parallel. Bug fixed #1585095.
  • Requesting to flush the whole of the buffer pool with doublewrite parallel buffer wasn’t working correctly. Bug fixed #1586265.
  • Audit Log Plugin would hang when trying to write log record of audit_log_buffer_size length. Bug fixed #1588439.
  • Audit log in ASYNC mode could skip log records which don’t fit into the log buffer. Bug fixed #1588447.
  • In order to support innodb_flush_method being set to ALL_O_DIRECT, the log I/O buffers were aligned to innodb_log_write_ahead_size. That implementation missed the case where the variable is dynamic, and could still lead to a server crash. Bug fixed #1597143.
  • InnoDB tablespace import would fail when trying to import a table with different data directory. Bug fixed #1548597 (upstream #76142).
  • Audit Log Plugin was truncating SQL queries to 512 bytes. Bug fixed #1557293.
  • mysqlbinlog did not free the existing connection before opening a new remote one. Bug fixed #1587840 (upstream #81675).
  • Fixed a memory leak in mysqldump. Bug fixed #1588845 (upstream #81714).
  • Transparent Huge Pages check will now only happen if tokudb_check_jemalloc option is set. Bugs fixed #939 and #713.
  • Logging in ydb environment validation functions now prints more useful context. Bug fixed #722.

Other bugs fixed: #1541698 (upstream #80261), #1587426 (upstream, #81657), #1589431, #956, and #964.

The release notes for Percona Server 5.7.13-6 are available in the online documentation. Please report any bugs on the launchpad bug tracker.

by Hrvoje Matijakovic at July 06, 2016 04:07 PM

July 05, 2016

Peter Zaitsev

MySQL 8.0

If you haven’t heard the news yet, MySQL 8.0 is apparently the next release of the world-famous database server.

Obviously abandoning plans to name the next release 5.8, Percona Server’s upstream provider relabelled all 5.8-related bugs to 8.0 as follows:

Reported version value updated to reflect release name change from 5.8 to 8.0

What will MySQL 8.0 bring to the world?

While lossless RBR has been suggested by Simon Mudd (for example), the actual feature list (except a Boost 1.60.0 upgrade!) remains a secret.

As far as bug and feature requests go, a smart google query revealed which bugs are likely to be fixed in (or are feature requests for) MySQL 8.0.

Here is the full list:

  • MySQL Bug #79380: Upgrade to Boost 1.60.0
  • MySQL Bug #79037: get rid of dynamic_array in st_mysql_options
  • MySQL Bug #80793: EXTEND EXPLAIN to cover ALTER TABLE
  • MySQL Bug #79812: JSON_ARRAY and JSON_OBJECT return …
  • MySQL Bug #79666: fix errors reported by ubsan
  • MySQL Bug #79463: Improve P_S configuration behaviour
  • MySQL Bug #79939: default_password_lifetime > 0 should print …
  • MySQL Bug #79330: DROP TABLESPACE fails for missing general …
  • MySQL Bug #80772: Excessive memory used in memory/innodb …
  • MySQL Bug #80481: Accesses to new data-dictionary add confusing …
  • MySQL Bug #77712: mysql_real_query does not report an error for …
  • MySQL Bug #79813: Boolean values are returned inconsistently with …
  • MySQL Bug #79073: Optimizer hint to disallow full scan
  • MySQL Bug #77732: REGRESSION: replication fails for insufficient …
  • MySQL Bug #79076: make hostname a dynamic variable
  • MySQL Bug #78978: Add microseconds support to UNIX_TIMESTAMP
  • MySQL Bug #77600: Bump major version of libmysqlclient in 8.0
  • MySQL Bug #79182: main.help_verbose failing on freebsd
  • MySQL Bug #80627: incorrect function referenced in spatial error …
  • MySQL Bug #80372: Built-in mysql functions are case sensitive …
  • MySQL Bug #79150: InnoDB: Remove runtime checks for 32-bit file …
  • MySQL Bug #76918: Unhelpful error for mysql_ssl_rsa_setup when …
  • MySQL Bug #80523: current_memory in sys.session can go negative!
  • MySQL Bug #78210: SHUTDOWN command should have an option …
  • MySQL Bug #80823: sys should have a mdl session oriented view
  • MySQL Bug #78374: “CREATE USER IF NOT EXISTS” reports an error
  • MySQL Bug #79522: can mysqldump print the fully qualified table …
  • MySQL Bug #78457: Use gettext and .po(t) files for translations
  • MySQL Bug #78593: mysqlpump creates incorrect ALTER TABLE …
  • MySQL Bug #78041: GROUP_CONCAT() truncation should be an …
  • MySQL Bug #76927: Duplicate UK values in READ-COMMITTED …
  • MySQL Bug #77997: Automatic mysql_upgrade
  • MySQL Bug #78495: Table mysql.gtid_executed cannot be opened.
  • MySQL Bug #78698: Simple delete query causes InnoDB: Failing …
  • MySQL Bug #76392: Assume that index_id is unique within a …
  • MySQL Bug #76671: InnoDB: Assertion failure in thread 19 in file …
  • MySQL Bug #76803: InnoDB: Unlock row could not find a 2 mode …
  • MySQL Bug #78527: incomplete support and/or documentation of …
  • MySQL Bug #78732: InnoDB: Failing assertion: *mbmaxlen < 5 in file …
  • MySQL Bug #76356: Reduce header file dependencies for …
  • MySQL Bug #77056: There is no clear error message if …
  • MySQL Bug #76329: COLLATE option not accepted in generated …
  • MySQL Bug #79500: InnoDB: Assertion failure in thread …
  • MySQL Bug #72284: please use better options to …
  • MySQL Bug #78397: Subquery Materialization on DELETE WHERE …
  • MySQL Bug #76552: Cannot shutdown MySQL using JDBC driver
  • MySQL Bug #76532: MySQL calls exit(MYSQLD_ABORT_EXIT …
  • MySQL Bug #76432: handle_fatal_signal (sig=11) in …
  • MySQL Bug #41925: Warning 1366 Incorrect string value: … for …
  • MySQL Bug #78452: Alter table add virtual index hits assert in …
  • MySQL Bug #77097: InnoDB Online DDL should support change …
  • MySQL Bug #77149: sys should possibly offer user threads …

by Roel Van de Paar at July 05, 2016 10:18 PM

MySQL 5.7, utf8mb4 and the load data infile

In this post, I’ll discuss how MySQL 5.7 handles UTF8MB4 and the LOAD DATA INFILE statement.

Many of my clients have told me that they do not like using the LOAD DATA INFILE statement and prefer to manually parse and load the data. The main reason they do this is issues with character sets, specifically UTF8MB4. This was surprising to me, as nowadays everyone uses UTF8. MySQL 5.7 (as well as 5.6) has full support for UTF8MB4, which should fix any remaining issues (i.e., you can now load new emoji, like 🐬).

Last week I was investigating an interesting case where we were loading data and got the following error:

mysql -e 'select version()'
+-----------+
| version() |
+-----------+
| 5.7.12    |
+-----------+
$ mysql -vvv testdb < load_data.sql
ERROR 1300 (HY000) at line 1: Invalid utf8mb4 character string: 'Casa N'

The load data statement:

LOAD DATA LOCAL INFILE
                           'input.psv'
                        REPLACE INTO TABLE
                            input
                        CHARACTER SET
                            utf8mb4
                        FIELDS
                            TERMINATED BY '|'
                        LINES
                            TERMINATED BY '\r\n'
                        IGNORE
                            1 LINES

The table uses the correct character set (global character set applied to all varchar fields):

CREATE TABLE `input` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  ...
  `address` varchar(255) DEFAULT NULL,
  ...
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4

The string looked like “Casa Nº 24”. So this should be N + U+00BA (MASCULINE ORDINAL INDICATOR, hex code: c2ba). When I do “less input.tsv”, it shows N<BA> 24. So why can’t MySQL load it?

After further investigation, we discovered the original encoding was not UTF8. We found this out by running:

$ file -i input.tsv
input.tsv: text/plain; charset=iso-8859-1

So the code <BA> was misleading. Also, when I got the actual character from the file, it was just one byte (UTF8 for this character should be two bytes). When MySQL parsed the UTF8 input file, it found only the first part of the multibyte UTF8 code and stopped with an error.

The original character in hex is “ba”:

$ xxd -p char_ascii
ba0a

(0a is a newline, and “ba” is the “masculine ordinal indicator”)

The UTF8 equivalent:

$ xxd -p char_utf8
c2ba0a

This is now two bytes (plus the newline): c2ba
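
The same conversion can be double-checked from inside MySQL (a quick sanity check; the _latin1 introducer tells the server that the single byte is latin1):

mysql> SELECT HEX(CONVERT(_latin1 X'BA' USING utf8mb4)) AS utf8_bytes;
+------------+
| utf8_bytes |
+------------+
| C2BA       |
+------------+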

To solve the problem, we can simply change CHARACTER SET utf8mb4 to CHARACTER SET latin1 in the LOAD DATA statement. This fixed the issue.
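
The corrected statement would look like this (a sketch; everything except the CHARACTER SET clause is unchanged from the statement shown earlier):

LOAD DATA LOCAL INFILE 'input.psv'
    REPLACE INTO TABLE input
    CHARACTER SET latin1
    FIELDS TERMINATED BY '|'
    LINES TERMINATED BY '\r\n'
    IGNORE 1 LINES

With that change, the load went through: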

Query OK, 2 rows affected (0.00 sec)
Records: 2  Deleted: 0  Skipped: 0  Warnings: 0
mysql> set names utf8mb4;
Query OK, 0 rows affected (0.00 sec)
mysql> select address from input;
+--------------------------------+
| consignee_address              |
+--------------------------------+
| Casa Nº 24 ................... |
...
+--------------------------------+
2 rows in set (0.00 sec)

Another option is to detect the file’s character set encoding (iconv can do this) and convert it to UTF8 before loading.

But it worked before…?

It worked a bit differently in MySQL 5.6:

$ mysql -e 'select version()'
+-------------+
| version()   |
+-------------+
| 5.6.25-73.0 |
+-------------+
$ mysql -vvv testdb < load_data.sql
...
Query OK, 2 rows affected, 2 warnings (0.00 sec)
Records: 2  Deleted: 0  Skipped: 0  Warnings: 2
--------------
show warnings
--------------
+---------+------+---------------------------------------------------------------------+
| Level   | Code | Message                                                             |
+---------+------+---------------------------------------------------------------------+
| Warning | 1366 | Incorrect string value: '\xBA 24 ...' for column 'address' at row 1 |
| Warning | 1366 | Incorrect string value: '\xBA 24 ...' for column 'address' at row 2 |
+---------+------+---------------------------------------------------------------------+
2 rows in set (0.00 sec)

MySQL 5.7 is stricter and doesn’t allow you to insert data in the wrong format. However, it is not 100% consistent: for some characters, MySQL 5.7 only downgrades the error to a warning when strict SQL mode is disabled.

Another character that caused the same issue was \xC9. When loading to MySQL 5.7 with the default sql_mode (ONLY_FULL_GROUP_BY, STRICT_TRANS_TABLES, NO_ZERO_IN_DATE, NO_ZERO_DATE, ERROR_FOR_DIVISION_BY_ZERO, NO_AUTO_CREATE_USER, NO_ENGINE_SUBSTITUTION) it throws an error:

ERROR 1366 (HY000) at line 1: Incorrect string value: '\xC9' for column 'address' at row 1

When strict mode is disabled, the error is downgraded to a warning:

mysql> set global sql_mode = '';
Query OK, 0 rows affected (0.00 sec)
Query OK, 2 rows affected, 1 warning (0.00 sec)
Records: 1  Deleted: 1  Skipped: 0  Warnings: 1
--------------
show warnings
--------------
+---------+------+--------------------------------------------------------------+
| Level   | Code | Message                                                      |
+---------+------+--------------------------------------------------------------+
| Warning | 1366 | Incorrect string value: '\xC9' for column 'address' at row 1 |
+---------+------+--------------------------------------------------------------+
1 row in set (0.00 sec)

Emoji in MySQL

With UTF8MB4 support (in MySQL 5.6 and 5.7), you can also insert a little dolphin into a MySQL table:

CREATE TABLE `test_utf8mb4` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `v` varchar(100) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
mysql> set names utf8mb4;
Query OK, 0 rows affected (0.00 sec)
mysql> insert into test_utf8mb4 (v) values ('Dolphin:🐬');
Query OK, 1 row affected (0.00 sec)
mysql> select * from test_utf8mb4;
+----+--------------+
| id | v            |
+----+--------------+
|  1 | Dolphin:🐬   |
+----+--------------+
1 row in set (0.00 sec)

This should help you clear up issues with UTF8MB4 and the load data infile. Have fun!

by Alexander Rubin at July 05, 2016 06:49 PM

Colin Charles

Speaking in July 2016

  • Texas LinuxFest – July 8-9 2016 – Austin, Texas – I’ve never spoken at this event before but have heard great things about it. I’ve got a morning talk about what’s in MariaDB Server 10.1, and what’s coming in 10.2.
  • db tech showcase – July 13-15 2016 – Tokyo, Japan – I’ve regularly spoken at this event and it’s a case of a 100% pure database conference, with a very captive audience. I’ll be talking about the lessons one can learn from other people’s database failures (this is the kind of talk that keeps on changing and getting better as the software improves).
  • The MariaDB Tokyo Meetup – July 21 2016 – Tokyo, Japan – Not the traditional meetup timing, since it runs 1.30pm-7pm; there will be many talks, and it’s organised by the folk behind the SPIDER storage engine. It should be fun to see many people, and food is being provided too. In Japanese: MariaDB コミュニティイベント in Tokyo, MariaDB Community Event in TOKYO.

by Colin Charles at July 05, 2016 03:57 PM

Jean-Jerome Schmidt

Sign up for our webinar on monitoring MongoDB (if you’re really a MySQL DBA)

In this new webinar on July 12th, we’ll discuss the most important metrics MongoDB offers and will describe them in ordinary plain MySQL DBA language. We’ll have a look at the open source tools available for MongoDB monitoring and trending. And finally, we’ll show you how to leverage ClusterControl’s MongoDB metrics, dashboards, custom alerting and other features to track and optimize the performance of your system.

To operate MongoDB efficiently, you need to have insight into database performance. And with that in mind, we’ll dive into monitoring in this second webinar in the ‘Become a MongoDB DBA’ series.

Date, Time & Registration

Europe/MEA/APAC

Tuesday, July 12th at 09:00 BST / 10:00 CEST (Germany, France, Sweden)
Register Now

North America/LatAm

Tuesday, July 12th at 09:00 Pacific Time (US) / 12:00 Eastern Time (US)
Register Now

Agenda

  • How does MongoDB monitoring compare to MySQL
  • Key MongoDB metrics to know about
  • Trending or alerting?
  • Available open source MongoDB monitoring tools
  • How to monitor MongoDB using ClusterControl
  • Demo

Speaker

Art van Scheppingen is a Senior Support Engineer at Severalnines. He’s a pragmatic MySQL and database expert with over 16 years’ experience in web development. He previously worked at Spil Games as Head of Database Engineering, where he kept a broad view of the whole database environment: from MySQL to Couchbase, Vertica to Hadoop and from Sphinx Search to SOLR. He regularly presents his work and projects at various conferences (Percona Live, FOSDEM) and related meetups.

We look forward to “seeing” you there!


This session is based upon the experience we have using MongoDB and implementing it for our database infrastructure management solution, ClusterControl. For more details, read through our ‘Become a MongoDB DBA’ blog series.

by Severalnines at July 05, 2016 08:00 AM

July 04, 2016

MariaDB Foundation

MariaDB 10.2.1 Alpha and other releases now available

The MariaDB project is pleased to announce the immediate availability of MariaDB 10.2.1 Alpha, MariaDB Connector/C 2.3.0, MariaDB Galera Cluster 5.5.50, and MariaDB Galera Cluster 10.0.26. See the release notes and changelogs for details on these releases. Download MariaDB 10.2.1 Alpha Release Notes Changelog What is MariaDB 10.2? MariaDB APT and YUM Repository Configuration Generator […]

The post MariaDB 10.2.1 Alpha and other releases now available appeared first on MariaDB.org.

by Daniel Bartholomew at July 04, 2016 01:13 PM

Jean-Jerome Schmidt

Performing a Live Upgrade to MySQL 5.7

After studying the differences between MySQL 5.6 and 5.7, and going through a rigorous regression test process, it’s now time to perform the actual upgrade itself. How do we best introduce 5.7 in our live environment? How can we minimize risks? What do we do if something goes wrong? And what tools are available out there to assist us?

The upgrade process

You will most likely perform a rolling upgrade - this means that you will upgrade one slave at a time, taking each out of rotation for the time needed to complete the upgrade. As a binary, in-place upgrade is supported for 5.6 -> 5.7, we can save a lot of time by avoiding long dump and reload operations. This makes the upgrade process quick and easy to perform.

One very important step when performing a binary upgrade is to disable innodb_fast_shutdown (i.e., set it to 0) on the slave before stopping it for the upgrade. This is needed to avoid potential problems with InnoDB incompatibilities. You also have to remember to execute mysql_upgrade afterwards - another step required to fix some of the incompatibilities which may otherwise impact your workload.
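
A minimal sketch of that per-slave sequence (the SQL part only; the binary swap itself happens outside the client and is shown as comments):

STOP SLAVE;
SET GLOBAL innodb_fast_shutdown = 0;   -- force a full, slow shutdown so InnoDB files are left in a clean state
-- stop mysqld, install the 5.7 binaries in place, start mysqld again,
-- then run mysql_upgrade and START SLAVE once the server is back up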

The switchover process

Once you have upgraded the slaves, you’ll have to execute a switchover and promote one of the slaves as the new master. How to do that is up to you. If you use GTID in your replication setup, switchover and subsequent reconfiguration will be easier. Some external tools can also be used to make the switchover process smooth - for example, ClusterControl can do the switchover for you, along with all preparations like reslaving hosts, setting up grants, etc. The only requirement is that GTID is used.
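
For example, with GTID in place, repointing one of the remaining slaves to the newly promoted master is roughly (a sketch; hostname and user are placeholders):

STOP SLAVE;
CHANGE MASTER TO
    MASTER_HOST = 'new-master.example.com',   -- hypothetical hostname
    MASTER_USER = 'repl',                     -- hypothetical replication user
    MASTER_AUTO_POSITION = 1;                 -- GTID handles the positioning; no binlog file/offset needed
START SLAVE;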

In the proxy layer, you can think about using ProxySQL which, when configured correctly, allows for fully graceful master switches where no error is sent to the application. When combined with ClusterControl, you can execute a fully automated switchover without any rollbacked transaction and very slight impact to the application.

The upgrade process along with how to achieve a graceful switchover is covered in more details in our ebook “Upgrading to MySQL 5.7”.

by Severalnines at July 04, 2016 07:59 AM

July 01, 2016

MariaDB Foundation

2016 MariaDB Developers Meetup

The 2016 MariaDB Developers Meetup will be taking place in Amsterdam, from 6 – 8 October. The meetup will last for three days and you can join for the whole time, or as little time as you wish. The schedule of this unconference will be drafted in a public spreadsheet. Initially, anyone attending can help set […]

The post 2016 MariaDB Developers Meetup appeared first on MariaDB.org.

by ian at July 01, 2016 09:42 PM

Peter Zaitsev

Amazon RDS and pt-online-schema-change

In this blog post, I discuss some of the insights needed when using Amazon RDS and pt-online-schema-change together.

The pt-online-schema-change tool runs DDL queries (ALTER) online so that the table is not locked for reads and writes. It is a commonly used tool by community users and customers. Using it on Amazon RDS requires knowing about some specific details. First, a high-level explanation of how the tool works.

This is an example from the documentation:

pt-online-schema-change --alter "ADD COLUMN c1 INT" D=sakila,t=actor

The tool runs an ALTER on the table “actor” from the database “sakila.” The alter adds a column named “c1” of type “integer.” In the background, the tool creates a new empty table similar to “actor” but with the new column already added. It then creates triggers on the original table to update the corresponding rows in the new table. Afterward, it starts copying rows to the new table (this is the phase that takes the longest amount of time). When the copy is done, the tables are swapped, the triggers removed and the old table dropped.

As we can see, it is a tool that uses the basic features of MySQL. You can run it on MySQL, Percona Server, MariaDB, Amazon RDS and so on. But when using Amazon, there is a hidden issue: you don’t have SUPER privileges. This means that if you try to run the tool on an RDS with binary logs enabled, you could get the following error:

DBD::mysql::db do failed: You do not have the SUPER privilege and binary logging is enabled (you might want to use the less safe log_bin_trust_function_creators variable) [for Statement "CREATE TRIGGER `pt_osc_db_table_del` AFTER DELETE ON `db`.`table` FOR EACH ROW DELETE IGNORE FROM `db`.`_table_new` WHERE `db`.`_table_new`.`table_id` <=> OLD.`table_id` AND `db`.`_table_new`.`account_id` <=> OLD.`account_id`"] at /usr/bin/pt-online-schema-change line 10583.

The following documentation page explains the reason for this message:

http://dev.mysql.com/doc/refman/5.7/en/stored-programs-logging.html

The bottom line is that creating triggers on a server with binary logs enabled requires a user with SUPER privileges (which is impossible in Amazon RDS). The error message specifies the workaround. We need to enable the variable log_bin_trust_function_creators. Enabling it is like saying to the server:

“I trust regular users’ triggers and functions, and that they won’t cause problems, so allow my users to create them.”

Since the database functionality won’t change, it becomes a matter of trusting your users. log_bin_trust_function_creators is a global variable that can be changed dynamically:

mysql> SET GLOBAL log_bin_trust_function_creators = 1;

Run the tool again. This time, it will work.

After you’re done with the ALTER process, you can change the variable to “0” again.
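
That is, where SET GLOBAL is permitted (see the update below for how to handle this on RDS):

mysql> SET GLOBAL log_bin_trust_function_creators = 0;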

UPDATE:

As Marc pointed out in the comments, in RDS the variable must be set via the instance parameter group instead of SET GLOBAL.

by Miguel Angel Nieto at July 01, 2016 05:30 PM

June 30, 2016

Peter Zaitsev

Rescuing a crashed pt-online-schema-change with pt-archiver

This article discusses how to salvage a crashed pt-online-schema-change by leveraging pt-archiver and executing queries to ensure that the data gets accurately migrated. I will show you how to continue the data copy process, and how to safely close out the pt-online-schema-change via manual operations such as RENAME TABLE and DROP TRIGGER commands. The normal process to recover from a crashed pt-online-schema-change is to drop the triggers on your original table and drop the new table created by the script. Then you would restart pt-online-schema-change. In this case, this wasn’t possible.

A customer recently needed to add a primary key column to a very busy table (with around 200 million rows). The table only had a unique key on one column (called our_id below). The customer had concerns about slave lag, and wanted to ensure there was little or no lag. This, as well as the fact that you can’t add a primary key as an online DDL in MySQL and Percona Server 5.6, meant the obvious answer was using pt-online-schema-change.

Due to the sensitivity of their environment, they could only afford one short window for the initial metadata locks, and needed to manually do the drop swap that pt-online-schema-change normally does automatically. This is where no-drop-triggers and no-swap-tables come in. The triggers will theoretically run indefinitely to keep the new and old tables in sync once pt-online-schema-change is complete. We crafted the following command:

pt-online-schema-change \
--execute \
--alter-foreign-keys-method=auto \
--max-load Threads_running=30 \
--critical-load Threads_running=55 \
--check-slave-lag mysql-slave1,mysql-slave2,mysql-slave3 \
--max-lag=10 \
--chunk-time=0.5 \
--set-vars=lock_timeout=1 \
--tries="create_triggers:10:2,drop_triggers:10:2" \
--no-drop-new-table \
--no-drop-triggers \
--no-swap-tables \
--chunk-index "our_id" \
--alter "ADD newcol BIGINT(20) UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY FIRST" \
D=website,t=largetable \
--nocheck-plan

You can see some of the specifics of other flags and why we used them in the Percona Toolkit Manual.

Once we ran the command the customer got concerned, as their monitoring tools weren’t showing any work done (which is by design, pt-online-schema-change doesn’t want to hurt your running environment). The customer ran strace -p to verify it was working. This wasn’t a great choice as it crashed pt-online-schema-change.

At this point, we knew that the application (and management) would not allow us to take new metadata locks to create triggers on the table, as we had passed our metadata lock window.

So how do we recover?

First, let’s start with a clean slate. We issued the following commands to create a new table, where __largetable_new is the table created by pt-online-schema-change:

CREATE TABLE mynewlargetable LIKE __largetable_new;
RENAME TABLE __largetable_new TO __largetable_old, mynewlargetable TO __largetable_new;
DROP TABLE __largetable_old;

Now the triggers on the original table, largetable are updating the new empty table that has our new schema.

Now let’s address the issue of actually moving the data that’s already in largetable to __largetable_new. This is where pt-archiver comes in. We crafted the following command:

pt-archiver \
--execute \
--max-lag=10 \
--source D=website,t=largetable,i=our_id \
--dest D=website,t=__largetable_new \
--where "1=1" \
--no-check-charset \
--no-delete \
--no-check-columns \
--txn-size=500 \
--limit=500 \
--ignore \
--statistics

We use pt-archiver to slowly copy records non-destructively to the new table based on our_id and WHERE 1=1 (all records). At this point, we periodically checked the MySQL data directory over the course of a day with ls -l to compare table sizes.

Once the table files were close to the same size, we ran counts on the tables. We noticed something interesting: the new table had thousands more records than the original table.

This concerned us. We wondered if our “hack” was a mistake. At this point we ran some verification queries:

select min(our_id) from __largetable_new;
select max(our_id) from __largetable_new;
select min(our_id) from largetable;
select max(our_id) from largetable;

We learned that there were older records in the new table that didn’t exist in the live table. This means that pt-archiver and the DELETE trigger missed each other: pt-archiver had already read rows inside a transaction, but only inserted them into the new table after the DELETE trigger had already fired (so the trigger had nothing to remove, and the deleted rows were re-inserted afterwards).

We verified with more queries:

SELECT COUNT(*) FROM largetable l WHERE NOT EXISTS (SELECT our_id FROM __largetable_new n WHERE n.our_id=l.our_id);

This returned zero: no rows in the original table are missing from the new one.

SELECT COUNT(*) FROM __largetable_new n WHERE NOT EXISTS (SELECT our_id FROM largetable l WHERE n.our_id=l.our_id);

Our result showed roughly 4,000 extra records in the new table: rows that had been deleted from the original table but still existed in the new one. We ran other queries based on their data to verify this as well.

This wasn’t a huge issue for our application, and it could have been easily dealt with using a simple DELETE query based on the unique index (i.e., if it doesn’t exist in the original table, delete it from the new one).

Now to complete the pt-online-schema-change actions. All we need to do is the atomic rename or drop swap. This should be done as soon as possible to avoid running in a degraded state, where all writes to the old table are duplicated on the new one.

RENAME TABLE largetable TO __largetable_old , __largetable_new TO largetable;

Then drop the triggers for safety:

DROP TRIGGER pt_osc_website_largetable_ins;
DROP TRIGGER pt_osc_website_largetable_upd;
DROP TRIGGER pt_osc_website_largetable_del;

At this point it is safer to wait for the old table to clear out of the buffer pool before dropping it, just to ensure there is no impact on the server (maybe a week to be safe). You can check information_schema for a more accurate reading on this:

SELECT COUNT(*) FROM INFORMATION_SCHEMA.INNODB_BUFFER_PAGE WHERE TABLE_NAME = '`website`.`__largetable_old`';
+----------+
| count(*) |
+----------+
|   279175 |
+----------+
1 row in set (8.94 sec)

Once this goes to 0 you can issue:

DROP TABLE __largetable_old;

by Manjot Singh at June 30, 2016 09:20 PM

Jean-Jerome Schmidt

Planets9s - MySQL on Docker: Building the Container Images, Monitoring MongoDB and more

Welcome to this week’s Planets9s, covering all the latest resources and technologies we create around automation and management of open source database infrastructures.

MySQL on Docker: Building the Container Image

Building a docker image for MySQL is essential if you’d like to customize MySQL to suit your needs. In this second post of our ‘MySQL on Docker’ series, we show you two ways to build your own MySQL Docker image - changing a base image and committing, or using Dockerfile. We show you how to extend the Docker team’s MySQL image, and add Percona XtraBackup to it.

Read the blog

Sign up for our webinar on Monitoring MongoDB - Tuesday July 12th

MongoDB offers many metrics through various status overviews or commands, and as a MySQL DBA, you might find it a little unfamiliar ground to get started with. In this webinar on July 12th, we’ll discuss the most important ones and describe them in ordinary plain MySQL DBA language. We’ll have a look at the open source tools available for MongoDB monitoring and trending. And we’ll show you how to leverage ClusterControl’s MongoDB metrics, dashboards, custom alerting and other features to track and optimize the performance of your system.

Sign up for the webinar

StreamAMG chooses ClusterControl to support its online European football streaming

This week we’re delighted to announce a new ClusterControl customer, StreamAMG (Advanced Media Group), Europe’s largest player in online video solutions, helping football teams such as Liverpool FC, Aston Villa, Sunderland AFC and the BBC keep fans watching from across the world. StreamAMG replaced its previous environment, based on a master-slave replication topology, with a multi-master Galera Cluster; and Severalnines’ ClusterControl platform was applied to automate operational tasks and provide visibility of uptime and performance through monitoring capabilities.

Read the story

That’s it for this week! Feel free to share these resources with your colleagues and follow us in our social media channels.

Have a good end of the week,

Jean-Jérôme Schmidt
Planets9s Editor
Severalnines AB

by Severalnines at June 30, 2016 08:33 AM

June 29, 2016

Peter Zaitsev

2016 MySQL User Group Leaders Summit

MySQL User Group Leaders Summit

In this post, I’ll share my experience attending the annual MySQL User Group Leaders Summit in Bucharest, Romania.

The MySQL User Group Leaders Summit gathers together as many of the global MySQL user group leaders as possible. At the summit, we discuss further actions on how we can better act for their local communities. This year, it focused primarily on cloud technologies.

As the Azerbaijan MySQL User Group leader, I felt a keen responsibility to go. I wanted to represent our group and learn as much as possible to take back with me. Mingling and having conversations with other group leaders gives me more ideas about how to spread the MySQL word!

The Conference

I attended three MySQL presentations:

  • Guided tour on the MySQL source code. In this session, we reviewed the layout of the MySQL code base, roughly following the query execution path. We also covered how to extend MySQL with both built-in and pluggable add-ons.
  • How profiling SQL works in MySQL. This session gave an overview of the performance monitoring tools in MySQL: performance counters, performance schema and SYS schema. It also covered some of the details in analyzing MySQL performance with performance_schema.
  • What’s New in MySQL 5.7 Security. This session presented an overview of the new MySQL Server security-related features, as well as the MySQL 5.6 Enterprise edition tools. This session detailed the shifting big picture of secure deployments, along with all of the security-related MySQL changes.

I thought that the conference was very well organized, with uniformly great discussions. We also participated in some city activities and personal interactions. I even got to see Le Fred!

I learned a lot from the informative sessions I attended. The MySQL source code overview showed me the general layout of the MySQL source code, including the most important directories, functions and classes. The session about MySQL profiling instrumentation covered the recent profiling improvements, and reviewed some useful tools and metrics that you can use to get information from the server. The last session, about MySQL security, covered improved defaults, tablespace encryption and authentication plugins.

In conclusion, my time was well spent. Meeting and communicating with other MySQL user group leaders gives me insight into the MySQL community. Consequently, I highly recommend everyone gets involved in your local user groups and attend get-togethers like the MySQL User Group Leaders Summit when you can find the time.

Below you can see some of the pics from the trip. Enjoy!

MySQL User Group Leaders Summit (trip photos)

by Shahriyar Rzayev at June 29, 2016 05:20 PM

Percona Server for MongoDB 3.2.7-1.1 is now available


Percona announces the release of Percona Server for MongoDB 3.2.7-1.1 on June 29, 2016. Download the latest version from the Percona web site or the Percona Software Repositories.

Percona Server for MongoDB 3.2.7-1.1 is an enhanced, open-source, fully compatible, highly scalable, zero-maintenance downtime database supporting the MongoDB v3.2 protocol and drivers. Based on MongoDB 3.2.7, it extends MongoDB with MongoRocks and PerconaFT storage engines, as well as enterprise-grade features like external authentication and audit logging at no extra cost. Percona Server for MongoDB requires no changes to MongoDB applications or code.

Note:

The PerconaFT storage engine has been deprecated and will not be available in future releases.


This release includes all changes from MongoDB 3.2.7 as well as the following:

  • Fixed the software version incorrectly reported by the --version option.
  • Added recommended ulimit values for the mongod process

The release notes are available in the official documentation.

 

by Alexey Zhebel at June 29, 2016 04:45 PM

June 28, 2016

Henrik Ingo

Features I've added to Impress.js (and the Plugin API)

This is the third post in my series of blog posts on Impress.js. As I've already mentioned, I've also written some new code to impress.js. Now I want to finally present the new features I've added (currently available in my github fork).

Prologue

read more

by hingo at June 28, 2016 06:02 PM

Jean-Jerome Schmidt

Press Release: Severalnines kicks off online European football streaming

Award-winning database management platform scores deal with continent’s largest online video solutions provider

Stockholm, Sweden and anywhere else in the world - 28/06/2016 - Severalnines, Europe’s leading database performance management provider, today announced its latest customer, StreamAMG (Advanced Media Group), a UK-based pioneer in the field of bespoke online video streaming and content management. StreamAMG is Europe’s largest player in online video solutions, helping football teams such as Liverpool FC, Aston Villa, Sunderland AFC and the BBC keep fans watching from across the world.

Long hailed as the future of online content, analysts predict that 90% of all consumer internet traffic will be video by 2019. This poses a challenge to streaming providers, both in terms of the amount of online video data to handle and the variety of ways the content is consumed. Customers expect a seamless viewing experience across any device on any operating system. Downtime, lag or disturbances to streaming can have serious repercussions for customer loyalty. Streaming providers should provide a secure and reliable media platform to maintain the interest of fans and attract new viewers, casting database performance in a starring role.

Founded in 2001, StreamAMG builds bespoke solutions for its customers to host and manage online video content. Its software delivers the high-availability needed for on-demand streaming or live broadcasting on any device. Loss of customer trust and damage to brand reputation are likely consequences of database failures, especially for those companies which operate in the online sports, betting and gaming industries.

Growing at 30% year on year required StreamAMG to have a scalable IT system to meet new customer demands and to maintain its leadership position in the market. StreamAMG reviewed its database performance as part of its IT infrastructure renewal project to encompass new online channels, such as social media, and to embed marketing analytics to help its customers better understand and react to customer behaviour. It needed a solution to monitor and optimise its database management system, with detailed metrics to predict database failures.

After reviewing options provided by Oracle and AWS, amongst others, StreamAMG chose Severalnines to help future-proof its databases. The previous environment, based on a master-slave replication topology, was replaced with a multi-master Galera Cluster; and Severalnines’ ClusterControl platform was applied to automate operational tasks and provide visibility of uptime and performance through monitoring capabilities.

Thom Holliday, Marketing Manager StreamAMG, said: “With ClusterControl in place, StreamAMG’s flagship product is now backed with a fully automated database infrastructure which allows us to ensure excellent uptime. Severalnines increased our streaming speed by 76% and this has greatly improved the delivery of content to our customers. The implementation took only two months to complete and saved us 12% in costs. Expanding the current use of ClusterControl is definitely in the pipeline and we would love to work with Severalnines to develop new features.”

Vinay Joosery, Severalnines Founder and CEO, said: “Online video streaming is growing exponentially, and audiences expect quality, relevant content and viewing experiences tailor-made for each digital platform. I’m a big football fan myself and like to stay up to date with games whenever I can. Right now I’m following the European Championships and online streaming is key so I can watch the matches wherever I am. New types of viewerships place certain requirements on modern streaming platforms to create experiences that align with consumer expectations. StreamAMG is leading the way there, and helps its customers monetise online channels through a solidly architected video platform. We’re happy to be part of this.“

About Severalnines

Severalnines provides automation and management software for database clusters. We help companies deploy their databases in any environment, and manage all operational aspects to achieve high-scale availability.

Severalnines' products are used by developers and administrators of all skill levels to provide the full 'deploy, manage, monitor, scale' database cycle, thus freeing them from the complexity and learning curves that are typically associated with highly available database clusters. The company has enabled over 8,000 deployments to date via its popular online database configurator. Its customers currently include BT, Orange, Cisco, CNRS, Technicolour, AVG, Ping Identity and Paytrail. Severalnines is a private company headquartered in Stockholm, Sweden with offices in Singapore and Tokyo, Japan. To see who is using Severalnines today visit: http://www.severalnines.com/customers

About StreamAMG

StreamAMG helps businesses manage their online video solutions, such as Hosting video, integrating platforms, monetizing content and delivering live events. Since 2001, it has enabled clients across Europe to communicate through webcasting by building online video solutions to meet their goals.

For more information visit: https://www.streamamg.com

Media Contact

Positive Marketing
Steven de Waal / Camilla Nilsson
severalnines@positivemarketing.com
0203 637 0647/0645

by Severalnines at June 28, 2016 09:24 AM

June 27, 2016

Peter Zaitsev

Webinar Wednesday June 29: Percona XtraDB Cluster Reference Architecture

Percona XtraDB Cluster Reference Architecture

Please join Jay Janssen for the webinar Percona XtraDB Cluster Reference Architecture on Wednesday, June 29 at 10:00 AM PDT (UTC-7).

A reference architecture shows a typical, common, best-practice deployment of a system with all the surrounding infrastructure. In the case of database clusters, this can include the hosting platform, load balancing, monitoring, backups, etc.

Percona published a commonly referred to Percona XtraDB Cluster reference architecture on the Percona blog in 2012 (which is included in the current manual). However, this architecture is out of date.

This talk will present a revised and updated Percona XtraDB Cluster reference architecture for 2016, including:

  • Load balancing
  • Read/Write splitting
  • Monitoring
  • Backups

This will include some variants, such as:

  • AWS hosting
  • WAN deployments
  • Async slaves

Register now.

Jay Janssen, Managing Principal Architect

Jay came to Percona in 2011 after working seven years for Yahoo! Jay worked in a variety of fields including High Availability architectures, MySQL training, tool building, global server load balancing, multi-datacenter environments, operationalization, and monitoring. Jay holds a B.S. in Computer Science from Rochester Institute of Technology. He and his wife live with their four children in upstate New York.

by Dave Avery at June 27, 2016 06:56 PM

The need for parallel crash recovery in MySQL

parallel crash recovery in MySQL

In this blog, I will discuss how parallel crash recovery in MySQL would benefit several processes.

I recently filed an Oracle feature request to make crash recovery faster by running in multiple threads.

This might not seem very important, because MySQL does not crash that often. When it does crash, however, crash recovery can take 45 mins – as I showed in this post:

What is a big innodb_log_file_size?

Even in that case, it still might not be a big issue as you often failover to a slave.

However, crash recovery plays an important part in the following processes:

  • Backups with Percona XtraBackup (and MySQL Enterprise Backups) and backups with filesystem snapshots.
    • Crash recovery is part of the backup process, and it is important to make the backup task faster.
  • State Snapshot Transfer in Percona XtraDB Cluster.
    • SST, either XtraBackup or rsync based, also relies on the crash recovery process – so the faster it is done, the faster a new node joins the cluster.
    • It might seem that Oracle shouldn’t care about Percona XtraDB Cluster. But they are working on MySQL Group Replication. I suspect that when Group Replication copies data to a new node, it will also rely on some kind of snapshot technique (unless they aren’t serious about this feature and will recommend mysqldump/mysqlpump for data copying).
  • My recent proof of concept for Automatic Slave propagation in Docker environment also uses Percona XtraBackup, and therefore crash recovery for new slaves.

In general, any process that involves MySQL/InnoDB data transfer will benefit from a faster crash recovery. In its current state, crash recovery uses just one thread to read and process data. This limits performance on modern hardware, which has multiple CPU cores and fast SSD drives.

It is also important to consider that the crash recovery time affects how big log files can be. If we improve the crash recovery time, we can store very big InnoDB log files (which positively affects performance in general).
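As a purely illustrative my.cnf sketch (the values below are not a recommendation for any particular workload), faster crash recovery would make settings like these much less scary to run with:

[mysqld]
innodb_log_file_size      = 10G
innodb_log_files_in_group = 2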

Percona is working on ways to make it faster. However, if faster recovery times are important to your environment, I encourage you to let Oracle know that you want to see parallel crash recovery in MySQL.

by Vadim Tkachenko at June 27, 2016 06:14 PM

Mr. D vs crayon plugin

this is a common sql query highlighted by crayon:

select * from blog where user like '%anyone%'

let’s try to cheat crayon

select * from blog where user like '%anyone%'

by Leonardo Erpi at June 27, 2016 05:00 PM

Jean-Jerome Schmidt

MySQL on Docker: Building the Container Image

Building a Docker image for MySQL is essential if you’d like to customize MySQL to suit your needs. The image can then be used to quickly start running MySQL containers, which can be operated individually. In our previous post, we covered the basics of running MySQL as a container. For that purpose, we used the MySQL image from the official Docker repository. In this blog post, we’ll show you how to build your own MySQL image for Docker.

What is a container image?

A container requires an image to run. A container image is like a virtual machine template. It has all the required stuff to run the container. That includes operating system, software packages, drivers, configuration files and helper scripts packed in one bundle.

When running MySQL on a physical host, here is what you would normally do:

  1. Prepare a host with proper networking
  2. Install operating system of your choice
  3. Install MySQL packages via package repository
  4. Modify the MySQL configuration to suit your needs
  5. Start the MySQL service

Running a MySQL Docker image would look like this:

  1. Install Docker engine on the physical host
  2. Download a MySQL image from public (Docker Hub) or private repository, or build your own MySQL image
  3. Run the MySQL container based on the image, which is similar to starting the MySQL service

As you can see, the Docker approach involves fewer deployment steps to get MySQL up and running. 99% of the time, the MySQL service running in a container will work in any kind of environment as long as you have the Docker engine running. Building a MySQL container image requires a defined process flow, since Docker expects only one process per container.

Consider the following actions:

  1. The image is pulled from Docker Hub on the machine host by using:

    $ docker pull mysql
  2. Spin up two MySQL containers and map them with their respective volume:

    $ docker run -d --name=mysql1 -e MYSQL_ROOT_PASSWORD='mypassword' -v /storage/mysql1/mysql-datadir:/var/lib/mysql mysql
    $ docker run -d --name=mysql2 -e MYSQL_ROOT_PASSWORD='mypassword' -v /storage/mysql2/mysql-datadir:/var/lib/mysql mysql

How to build a MySQL image?

Take note that in this exercise, we are going to extend an existing MySQL image by adding Percona Xtrabackup onto it. We will then publish our image to Docker Hub and setup an automated build.

Base Image

To build a MySQL container image, we’ll start by pulling a base image. You can pull an image which contains a vanilla operating system of your choice, and start building the MySQL image from scratch:

$ docker pull debian
Using default tag: latest
Trying to pull repository docker.io/library/debian ... latest: Pulling from library/debian

17bd2058e0c6: Pull complete
f854eed3f31f: Pull complete
Digest: sha256:ff779f80153d8220904ec3ec6016ac6dd51bcc77e217587689feffcd7acf96a0
Status: Downloaded newer image for docker.io/debian:latest

However, this is perhaps not best practice. There are tons of MySQL container images available on Docker Hub that we can re-use and enhance with more functionality. For example, the MySQL image created by the Docker team may not contain things that we need, e.g., Percona Xtrabackup (PXB). PXB needs to have access to the local file system in order to perform hot backups. Thus, we have to install PXB on top of the MySQL container images created by Docker. Inheriting this MySQL image allows us to leverage the work done by the Docker team. We do not need to maintain the MySQL server parts, especially when a new version is released.

For now, let’s pull the MySQL image of our choice to the machine host. We are going to use MySQL 5.6 as the base image:

$ docker pull mysql:5.6
Trying to pull repository docker.io/library/mysql ... 5.6: Pulling from library/mysql

3153a44fc5c3: Pull complete
ac82a224055d: Pull complete
e3ce3c9ce67d: Pull complete
57c790472a9d: Pull complete
49772bf40877: Pull complete
73f07a1d187e: Pull complete
3446fa8ab4bb: Pull complete
70c40ffe6275: Pull complete
54672d2ddb6f: Pull complete
Digest: sha256:03646869dfecf96f443640f8b9040fbde78a96269aaf47bbfbb505a4c1adcad9
Status: Downloaded newer image for docker.io/mysql:5.6

Let’s verify what images we do have now:

$ docker images
REPOSITORY                    TAG                 IMAGE ID            CREATED             VIRTUAL SIZE
docker.io/debian              latest              f854eed3f31f        5 days ago          125.1 MB
docker.io/mysql               latest              b0e2c14c7e92        3 weeks ago         378.4 MB
docker.io/mysql               5.6                 54672d2ddb6f        3 weeks ago         329 MB

There are three Docker images available in the host, debian:latest, mysql:latest (MySQL 5.7) and mysql:5.6 (MySQL 5.6).

There are two different ways to build a new image, we’ll cover this in the next section.

Building the image

Essentially, there are two ways to build the image:

  1. Make changes to the base image and commit
  2. Use Dockerfile - A text file that contains all the commands to build an image

1) Change & Commit

By using this approach, you make the changes directly into the container image and commit. The commit operation will not include any data contained in volumes mounted inside the container. By default, the container being committed and its processes will be paused while the image is committed. This reduces the likelihood of encountering data corruption during the process of creating the commit.

Let’s run a MySQL container with a volume. Create a volume directory on the machine host and spin a MySQL instance based on the MySQL 5.6 image we have downloaded:

$ mkdir -p /storage/test-mysql/datadir
$ docker run -d --name=test-mysql -e MYSQL_ROOT_PASSWORD='mypassword' -v /storage/test-mysql/datadir:/var/lib/mysql mysql:5.6

Verify if the container is running:

$ docker ps
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS               NAMES
4e75117289ea        mysql:5.6           "docker-entrypoint.sh"   5 seconds ago       Up 5 seconds        3306/tcp            test-mysql

Enter the container’s interactive shell:

$ docker exec -it test-mysql /bin/bash
root@4e75117289ea:/#

This is a Debian 8 (Jessie) image with minimal installation. Some common commands like wget are not available and we need to install these beforehand:

$ apt-get update && apt-get install wget

Install the appropriate Percona apt repository and install the latest PXB 2.3:

$ wget https://repo.percona.com/apt/percona-release_0.1-3.jessie_all.deb
$ dpkg -i percona-release_0.1-3.jessie_all.deb
$ apt-get update
$ apt-get install percona-xtrabackup-23
$ mkdir -p /backup/xtrabackups
$ exit

That’s it. Those were changes we’ve made to the container. Let’s commit this container as another image so we can use it later. First, retrieve the container ID:

$ docker ps
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS               NAMES
4e75117289ea        mysql:5.6           "docker-entrypoint.sh"   12 minutes ago      Up 12 minutes       3306/tcp            test-mysql

Then, commit and push the changes to another image called “local/mysql-pxb:5.6”:

$ docker commit 4e75117289ea mysql-pxb:5.6
830fea426cfb27d4a520c25f90de60517b711c607dda576fca93ff3a5b03c48f

We can now see that our new image is ready:

$ docker images
REPOSITORY                    TAG                 IMAGE ID            CREATED             VIRTUAL SIZE
local/mysql-pxb               5.6                 830fea426cfb        9 seconds ago       589.7 MB
docker.io/debian              latest              f854eed3f31f        5 days ago          125.1 MB
docker.io/mysql               latest              b0e2c14c7e92        3 weeks ago         378.4 MB
docker.io/mysql               5.6                 54672d2ddb6f        3 weeks ago         329 MB

This is just an example to show you how to create an image. We are not going to test this image further, nor upload it to the Docker Hub. Instead, we’ll use Dockerfile as described in the next section.

2) Dockerfile

Contrary to the change and commit approach, another way is to compile all the necessary steps that we used above in a text file called Dockerfile. Generally, it is better to use Dockerfiles to manage your images in a documented and maintainable way. We are not going to cover each of the syntax used in Dockerfile, you can refer to the Docker documentation for details.

Let’s start by creating a directory to place the Dockerfile:

$ mkdir -p ~/docker/severalnines/mysql-pxb
$ cd ~/docker/severalnines/mysql-pxb

Create a new file called Dockerfile:

$ vim Dockerfile

And add the following lines:

## MySQL 5.6 with Percona Xtrabackup

## Pull the mysql:5.6 image
FROM mysql:5.6

## The maintainer name and email
MAINTAINER Ashraf Sharif <ashraf@s9s.com>

## List all packages that we want to install
ENV PACKAGE percona-xtrabackup-23

# Install requirement (wget)
RUN apt-get update && apt-get install -y wget

# Install Percona apt repository and Percona Xtrabackup
RUN wget https://repo.percona.com/apt/percona-release_0.1-3.jessie_all.deb && \
    dpkg -i percona-release_0.1-3.jessie_all.deb && \
    apt-get update && \
    apt-get install -y $PACKAGE

# Create the backup destination
RUN mkdir -p /backup/xtrabackups

# Allow mountable backup path
VOLUME ["/backup/xtrabackup"]

Save the file. What we are doing here is exactly the same with “change and commit” approach, where we extend the functionality of the existing MySQL image downloaded from Docker Hub by installing Percona apt repository and Percona Xtrabackup into it.

We can now build a new image from this Dockerfile:

$ docker build --rm=true -t severalnines/mysql-pxb:5.6 .

We saved the image with a proper naming format “username/image_name:tag”, which is required by Docker Hub if you would like to push and store the image there.

Let’s take a look on what we have now:

$ docker images
REPOSITORY                    TAG                 IMAGE ID            CREATED             VIRTUAL SIZE
severalnines/mysql-pxb        5.6                 c619042c5b91        18 minutes ago      591.4 MB
local/mysql-pxb               5.6                 830fea426cfb        15 hours ago        589.7 MB
docker.io/debian              latest              f854eed3f31f        6 days ago          125.1 MB
docker.io/mysql               latest              b0e2c14c7e92        3 weeks ago         378.4 MB
docker.io/mysql               5.6                 54672d2ddb6f        3 weeks ago         329 MB

There are now two similar images, "local/mysql-pxb" and "severalnines/mysql-pxb", where the latter was built using the Dockerfile. We are going to push the latter to Docker Hub.

This is what we have now in our machine host:

Next, we are going to test the image and make further modifications if necessary.

Testing

Let’s run a MySQL container and perform a backup using innobackupex. Create volume path directories on the machine host beforehand:

$ mkdir -p /storage/mysql-server/datadir
$ mkdir -p /storage/backups
$ docker run -d \
--name mysql-server \
-v /storage/mysql-server/datadir:/var/lib/mysql \
-e MYSQL_ROOT_PASSWORD=mypassword \
severalnines/mysql-pxb:5.6

The above command runs a MySQL container called “mysql-server” from the newly built image, severalnines/mysql-pxb:5.6, which can be illustrated as below:

From Percona Xtrabackup documentation, the simplest innobackupex command is:

$ innobackupex --user="[user]" --password="[password]" [backup path]

Based on the above, we can execute the backup command in another container (mysql-run-backup), link it to the running MySQL container (mysql-server) and take advantage of the environment variables available for linked containers. By doing this, we don’t have to specify the credentials like host, port, username and password when running the innobackupex command.

Let’s run a linked container and perform the backup by using Docker’s environment variable:

$ docker run -it \
--link mysql-server:mysql \
--name=mysql-run-backup \
-v /storage/mysql-server/datadir:/var/lib/mysql \
-v /storage/backups:/backups \
--rm=true \
severalnines/mysql-pxb:5.6 \
sh -c 'exec innobackupex --host="$MYSQL_PORT_3306_TCP_ADDR" --port="$MYSQL_PORT_3306_TCP_PORT" --user=root --password="$MYSQL_ENV_MYSQL_ROOT_PASSWORD" /backups'

Both of our containers are now working together like this:

However, the “run” command was pretty long and not very user friendly. We can simplify this by using a bash script. Create a bash script under the same path as the Dockerfile:

$ cd ~/docker/severalnines/mysql-pxb
$ vim run_backup.sh

And add the following lines:

#!/bin/bash
# Run innobackupex
BACKUP_PATH=/backups

innobackupex --host="$MYSQL_PORT_3306_TCP_ADDR" \
--port="$MYSQL_PORT_3306_TCP_PORT" \
--user=root \
--password="$MYSQL_ENV_MYSQL_ROOT_PASSWORD" \
$BACKUP_PATH

Give the script an executable permission:

$ chmod 755 run_backup.sh

Then, use the ADD command to copy the bash script into the image when building it. Our final version of Dockerfile is now:

## MySQL 5.6 with Percona Xtrabackup

## Pull the mysql:5.6 image
FROM mysql:5.6

## The maintainer name and email
MAINTAINER Your Name <email@domain.xyz>

## List all packages that we want to install
ENV PACKAGE percona-xtrabackup-23

# Install requirement (wget)
RUN apt-get update && apt-get install -y wget

# Install Percona apt repository and Percona Xtrabackup
RUN wget https://repo.percona.com/apt/percona-release_0.1-3.jessie_all.deb && \
        dpkg -i percona-release_0.1-3.jessie_all.deb && \
        apt-get update && \
        apt-get install -y $PACKAGE

# Create backup directory
RUN mkdir -p /backups

# Copy the script to simplify backup command
ADD run_backup.sh /run_backup.sh

# Mountable backup path
VOLUME ["/backups"]

Rebuild the image:

$ docker build --rm=true -t severalnines/mysql-pxb:5.6 .

Run a new container with the new simplified command “/run_backup.sh”:

$ docker run -it \
--link mysql-server:mysql \
--name=mysql-run-backup \
-v /storage/mysql-server/datadir:/var/lib/mysql \
-v /storage/backups:/backups \
--rm=true \
severalnines/mysql-pxb:5.6 \
sh -c 'exec /run_backup.sh'

You should see Xtrabackup output on the screen. Ensure you get the “completed OK” line indicating the backup is successfully created:

...
innobackupex: Backup created in directory '/backups/2016-06-17_17-07-54'
160617 17:07:57  innobackupex: Connection to database server closed
160617 17:07:57  innobackupex: completed OK!

The container will then exit (we ran the image in the foreground without the --detach parameter) and is automatically removed by Docker, since we specified "--rm=true" on the "run" command line. On the machine host, we can see the backups are there:

$ ls -1 /storage/backups/
2016-06-17_13-01-58
2016-06-17_13-07-28
2016-06-17_14-02-50

Now our Dockerfile and the image is ready to be published.

Upload to Docker Hub

Docker Hub greatly facilitates Docker image distribution and collaboration. It comes with a bunch of great features like image storage, automatic build and test, link to code repositories and integration with Docker Cloud. It also supports private image repository for those who don’t want to publish the images to the public.

In order to share the images onto Docker Hub, you must first register. After an account is created, run the following command on the machine host to authenticate to the site:

$ docker login
Username: severalnines
Password:
Email: ashraf@domain.com
WARNING: login credentials saved in /root/.docker/config.json
Login Succeeded

Now you can push this repository to the registry designated by its name or tag:

$ docker push severalnines/mysql-pxb:5.6

Do you really want to push to public registry? [y/n]: y
The push refers to a repository [docker.io/severalnines/mysql-pxb] (len: 1)
c619042c5b91: Pushed
cb679f373840: Pushed
29db1ab3b7c2: Pushed
eadfe1149bcf: Pushed
73f07a1d187e: Pushed
d574478a62d8: Pushed
9acd57cae939: Pushed
b14dd0099b51: Pushed
c03269bf1687: Pushed
f50b60617e9c: Pushed
ed751d9dbe3b: Pushed
23286f48d129: Pushed
5.6: digest: sha256:5cf2d7c339a9902ac3059b9ddac11beb4abac63e50fc24e6dfe0f6687cdfa712 size: 20156

Once completed, the image will be available publicly in Docker Hub for user “severalnines”:

Great! Our image is now live and hosted on the Docker Hub where the community can start pulling it.
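For example, anyone can now pull and run it just like the upstream MySQL image (the container name below is arbitrary):

$ docker pull severalnines/mysql-pxb:5.6
$ docker run -d --name=my-mysql-pxb -e MYSQL_ROOT_PASSWORD=mypassword severalnines/mysql-pxb:5.6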

Automated Build

It’s great to have Docker Hub store the image for you, for free. But, what if a newer version of Percona Xtrabackup is available and you want the image to have it? Or, what if we just want to make a slight modification to our Dockerfile? Should we repeat the building steps over and over again? The answer is no - if you make use of the automated build. Just tell Docker Hub where the code repository is (Github or BitBucket), it will keep an eye for any changes in the revision control of the repository and trigger the build process automatically

Automated builds have several advantages:

  • Images built in this way are built exactly as specified.
  • The Dockerfile is available to anyone with access to your Docker Hub repository.
  • Your repository is kept up-to-date with code changes automatically.

In this example, we use Github. Firstly, create a Github repository. Clone the Github repository into the machine host, and then push the Dockerfile and the run_backup.sh script into Github repository:

$ git clone https://github.com/severalnines/docker-mysql-pxb.git
$ cd docker-mysql-pxb
$ git add *
$ git commit -m 'first release'
$ git push origin master

Next, link your Docker account with the Github repository. Refer to the instructions here. Once linked, go to Docker Hub > Create Automated Build > Create Auto-build Github > choose the repository and then specify the Docker repository name “mysql-pxb” and add a short description of the repository, similar to the following screenshot:

Then, configure the Build Settings according to the code branch, Dockerfile location and image tag:

In the above example, we also created a Dockerfile for MySQL 5.7 and set it as the default using the “latest” tag. Click on the “Trigger” button to immediately trigger a build job. Docker Hub will then put it in a queue and build the image accordingly.

Here is the final look of the Docker Hub page once you have linked the account and configured automated build:

That’s all. Now what you have to do is just make a change and commit to the Github repository, then Docker will re-build the image for you automatically. If you have a README.md file inside the Github repository, Docker Hub will also pull the content of it into the “Full Description” section of the Docker repository page. Here is what the build status should look like under “Build Details” tab:

We strongly suggest you have a look at the Best Practices for Writing Dockerfiles. In fact, if you’re creating an official image, you must stick to those practices.

That concludes this blog post. In the next post, we will look into Docker internetworking for MySQL containers.

by Severalnines at June 27, 2016 10:51 AM

June 24, 2016

MariaDB Foundation

MariaDB 10.0.26 now available

The MariaDB project is pleased to announce the immediate availability of MariaDB 10.0.26. See the release notes and changelog for details on this release. Download MariaDB 10.0.26 Release Notes Changelog What is MariaDB 10.0? MariaDB APT and YUM Repository Configuration Generator Thanks, and enjoy MariaDB!

The post MariaDB 10.0.26 now available appeared first on MariaDB.org.

by Daniel Bartholomew at June 24, 2016 05:28 PM

Peter Zaitsev

Percona Toolkit 2.2.18 is now available

Percona Toolkit

Percona announces the availability of Percona Toolkit 2.2.18, released on June 24, 2016.

Percona Toolkit is a collection of advanced command-line tools to perform a variety of MySQL server and system tasks that are too difficult or complex for DBAs to perform manually. Percona Toolkit, like all Percona software, is free and open source.

This release is the current stable release in the 2.2 series. It includes new features and bug fixes as well as improved MySQL 5.7 compatibility. Full details are below. Downloads are available here and from the Percona Software Repositories.

New features:

  • 1537416: pt-stalk now sorts the output of transactions by id
  • 1553340: Added “Shared” memory info to pt-summary
  • PT-24: Added the --no-vertical-format option for pt-query-digest, allowing compatibility with non-standard MySQL clients that don’t support the \G directive at the end of a statement

Bug fixes:

  • 1402776: Fixed error when parsing tcpdump capture with pt-query-digest
  • 1521880: Improved pt-online-schema-change plugin documentation
  • 1547225: Clarified the description of the --attribute-value-limit option for pt-query-digest
  • 1569564: Fixed all PERL-based tools to return a zero exit status when run with the --version option
  • 1576036: Fixed an error that sometimes prevented the primary key from being chosen as the index when using the --where option for pt-table-checksum
  • 1585412: Fixed the inability of pt-query-digest to parse the general log generated by a MySQL (and Percona Server) 5.7 instance
  • PT-36: Clarified the description of the --verbose option for pt-slave-restart

You can find release details in the release notes and the 2.2.18 milestone on Launchpad. Bugs can be reported on the Percona Toolkit launchpad bug tracker.

by Alexey Zhebel at June 24, 2016 04:18 PM

Percona Server for MongoDB 3.0.12-1.7 is now available


Percona announces the release of Percona Server for MongoDB 3.0.12-1.7 on June 24, 2016. Download the latest version from the Percona web site or the Percona Software Repositories.

Percona Server for MongoDB 3.0.12-1.7 is an enhanced, open source, fully compatible, highly scalable, zero-maintenance downtime database supporting the MongoDB v3.0 protocol and drivers. Based on MongoDB 3.0.12, it extends MongoDB with MongoRocks and PerconaFT storage engines, as well as features like external authentication and audit logging. Percona Server for MongoDB requires no changes to MongoDB applications or code.

NOTE: The MongoRocks storage engine is still under development. There is currently no officially released version of MongoRocks that can be recommended for production.


This release includes all changes from MongoDB 3.0.12, and the following known issue that will be fixed in a future release:

  • The --version option does not correctly report the software version. The affected binaries are:
    • bsondump
    • mongodump
    • mongoexport
    • mongofiles
    • mongoimport
    • mongooplog
    • mongorestore
    • mongostat
    • mongotop

The release notes are available in the official documentation.

 

by Alexey Zhebel at June 24, 2016 04:14 PM

June 23, 2016

Peter Zaitsev

Migrate from MS SQL Server to MySQL

Migrate from MS SQL Server to MySQL

In this blog series, I will share my experiences as I migrate commercial databases (i.e., Microsoft SQL or Oracle) to open source (MySQL). More specifically, we will look at how you can migrate from MS SQL Server to MySQL.

For this first blog post I’ve chosen the Jira database as an example, and used three different tools to migrate the Jira database from Microsoft SQL Server to MySQL:

  1. MySQL Workbench (opensource)
  2. Amazon DMS (cloud tool)
  3. Ispirer MnMTK 2015 (commercial tool)

When I started my research, I was under the impression that Jira database would be easy to migrate (no stored procedures, no triggers, etc.). It turned out that there were some problems that I was able to fix.

One of the reasons I chose Jira as opposed to some standard MS SQL database (such as AdventureWorks2014) is that it is a non-standard choice. Most of the software vendors use standard databases to test their software, and it works perfectly on those standard databases. Jira is not a usual choice and will be closer to real life.

MySQL Workbench

MySQL Workbench supports Microsoft SQL Server migration. The migration is straightforward except for issues with character sets. I experienced the error “Could not successfully convert UCS-2 string to UTF-8”.

workbench_errors

It turns out (with the help of Martin Brennan’s blog) that we need to use the “ODBC (FreeTDS)” driver for MS SQL, and enable sending Unicode data as UTF8:

workbench_drive_params
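If you configure FreeTDS at the system level rather than in the Workbench connection dialog, the equivalent freetds.conf entry would look roughly like this (a sketch only; the server name, port and TDS version here are assumptions for illustration):

[jira-mssql]
    host = mssql.example.com
    port = 1433
    tds version = 7.2
    client charset = UTF-8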

After changing those settings, I was able to successfully migrate Jira database from MS SQL to MySQL.

Advantages and disadvantages:

  • Plus: free and open source tool, multi-platform
  • Plus: successful migration for Jira
  • Plus: supports multi-threaded migrations (increase worker tasks if needed, default value is 2)
  • Minus: needed some tweaks to work with character sets
  • Minus: not very easy to debug errors

Amazon DMS

AWS Database Migration Service supports migrating from MS SQL to MySQL, but the actual migration method is different from other tools. It uses the replication feature of the source database (MS SQL Server in this case) to stream the data to the target database (MySQL). Amazon DMS starts a temporary “migration” instance that is used to stream data. Both the source and destination database can be in AWS (EC2 or RDS) or outside AWS (no restriction).

amazon_dms

The important limitation for MS SQL migration: it only works with MS SQL Server versions that support replication (subscription service). It doesn’t work with MS SQL Express edition. Also, if the subscription service is not enabled the DMS can’t even see the schema(s) to migrate (full list of limitations for MS SQL Server migration).

I’ve also gotten errors around the “constraint” name being too long:

2016-04-02T18:20:23 [TARGET_LOAD ]E: Failed to execute statement: 'ALTER TABLE `dbo`.`AO_38321B_CUSTOM_CONTENT_LINK`
ADD CONSTRAINT `AO_38321B_CUSTOM_CONTENT_LINK_pk_AO_38321B_CUSTOM_CONTENT_LINK_ID` PRIMARY KEY ( `ID` )'
[122502] ODBC general error. (ar_odbc_stmt.c:4048)

The problem here is that “AO_38321B_CUSTOM_CONTENT_LINK_pk_AO_38321B_CUSTOM_CONTENT_LINK_ID” is too long a string for MySQL. At the same time, this name does not really matter as this is the PRIMARY KEY.

After changing the “constraint” name in MS SQL to smaller strings, I could migrate all tables to MySQL.
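Renaming such a constraint on the SQL Server side can be done with sp_rename; something along these lines (the shorter target name is just an example):

EXEC sp_rename 'dbo.AO_38321B_CUSTOM_CONTENT_LINK_pk_AO_38321B_CUSTOM_CONTENT_LINK_ID', 'PK_AO_38321B_CUSTOM_CONTENT_LINK', 'OBJECT';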

Amazon DMS notes: Amazon DMS lets you migrate from a database located anywhere (not necessarily in AWS) to another database located anywhere (not necessarily in AWS); however, the traffic will go through AWS. Thus the migration path is fastest and most beneficial if either the source or target (or both) instances are in AWS (for example, EC2 or RDS instances).

Advantages and disadvantages:

  • Plus: successful migration for Jira
  • Plus: Multithreaded execution (this is a huge advantage for migrating large databases)
  • Plus: In addition to migration, you can also use a replication link between the SQL Server and the new MySQL to fetch the new changes. This is a huge advantage when migrating a large database with lots of traffic and tight downtime requirements.
  • Minus: replication should be enabled to perform a migration, which means that migrating from SQL Server Express isn’t supported.
  • (Can be plus and minus): All traffic is going through a cloud environment.

Potential issues

(This section has been updated) I’ve figured out that the table structures generated by Workbench, Amazon DMS and SQLWays are different. For example:

Workbench generated:

CREATE TABLE `AO_38321B_CUSTOM_CONTENT_LINK` (
  `CONTENT_KEY` varchar(255) DEFAULT NULL,
  `ID` int(11) NOT NULL AUTO_INCREMENT,
  `LINK_LABEL` varchar(255) DEFAULT NULL,
  `LINK_URL` varchar(255) DEFAULT NULL,
  `SEQUENCE` int(11) DEFAULT '0',
  PRIMARY KEY (`ID`),
  KEY `index_ao_38321b_cus1828044926` (`CONTENT_KEY`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

AWS DMS generated:

CREATE TABLE `AO_38321B_CUSTOM_CONTENT_LINK` (
  `CONTENT_KEY` varchar(255) CHARACTER SET ucs2 DEFAULT NULL,
  `ID` int(11) NOT NULL,
  `LINK_LABEL` varchar(255) CHARACTER SET ucs2 DEFAULT NULL,
  `LINK_URL` varchar(255) CHARACTER SET ucs2 DEFAULT NULL,
  `SEQUENCE` int(11) DEFAULT NULL,
  PRIMARY KEY (`ID`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

SQLWays wizard generated:

CREATE TABLE `AO_38321B_CUSTOM_CONTENT_LINK` (
  `CONTENT_KEY` varchar(255) CHARACTER SET utf8 DEFAULT NULL,
  `ID` int(11) NOT NULL AUTO_INCREMENT,
  `LINK_LABEL` varchar(255) CHARACTER SET utf8 DEFAULT NULL,
  `LINK_URL` varchar(255) CHARACTER SET utf8 DEFAULT NULL,
  `SEQUENCE` int(11) DEFAULT '0',
  PRIMARY KEY (`ID`),
  KEY `index_ao_38321b_cus1828044926` (`CONTENT_KEY`(191))
) ENGINE=InnoDB DEFAULT CHARSET=latin1

In the AWS DMS version, the index on CONTENT_KEY is missing and ID is not declared as auto_increment. The Workbench version, however, “lost” the character set. The SQLWays wizard created a partial key (the first 191 characters).
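If you keep the AWS DMS copy, the missing pieces can be added back afterwards; a sketch based on the Workbench definition above:

ALTER TABLE `AO_38321B_CUSTOM_CONTENT_LINK`
  MODIFY `ID` int(11) NOT NULL AUTO_INCREMENT,
  ADD KEY `index_ao_38321b_cus1828044926` (`CONTENT_KEY`);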

At the same time, I was able to start Jira on top of two versions of the MySQL database (migrated by using Workbench and Amazon DMS).

Ispirer MnMTK 2015

Ispirer MnMTK 2015 toolkit is a commercial (not open-source) software application that lets you migrate from MS SQL Server to MySQL (among other databases). Ispirer has provided me with a demo license so I can test the migration.

I was able to migrate the Jira database from MS SQL to MySQL with the Ispirer SQLWays Wizard:

sql_ways_wizard

One issue with this process is that SQL Ways Wizard relies on the MySQL command line utility (“mysql”), which should be in the path. If you do not have MySQL installed on the migration machine, or it is not in the path, the migration will fail:

sql_ways_no_path

To fix simply add the MySQL “bin” directory to the path. In addition, you can use the SQL Ways Wizard to generate scripts and run those scripts on the destination host where the utilities are installed.

Advantages and disadvantages:

  • Plus: successful migration for Jira
  • Plus: support from SQL ways: can work on fixing potential migration issues (requires paid license)
  • Plus: can convert stored procedures, triggers, foreign key constraints
  • Minus: commercial, not open source software.
  • Minus: only runs on Windows (however, target machine for the database migration can be Linux)
  • Minus: no multi-threaded migration support (you can manually run multiple instances of SQL Ways)

Conclusion

All tools I tried finally worked, but at the same time I was surprised with the number of issues I found. Migrating a very simple database (no stored procedures, no triggers, no foreign key constraints) should be easier.

Another surprise was that all tools are focused on a nice GUI with “next” buttons. For migrating one database to another, I would prefer a command line tool interface (maybe similar to Percona Toolkit or iconv):

# dbmigrate --source user:pass@sqlserverhost:1433 --target user:pass@mysqlhost:3309 --parallel 8 --verbose --overwrite

Actually, Ispirer MnMTK does have a command line migration utility included, but it only works on Windows.

Until somebody develops a better command line tool, any of the above solutions will help you migrate from MS SQL Server to MySQL.

by Alexander Rubin at June 23, 2016 10:56 PM

Erkan Yanar

Another awesome Conference: IT-Tage 2016

Ahoi,
There will be the IT-Tage 2016 conference. Where I'm going to give two talks.
One will be about Docker being more then a technical revolution.
The other one is about Best Practices for Docker+MySQL/MariaDB \o/

As always: fun first!
Erkan

by erkan at June 23, 2016 11:49 AM

Jean-Jerome Schmidt

Planets9s - How to monitor MongoDB, what to test when upgrading to MySQL 5.7 and more

Welcome to this week’s Planets9s, covering all the latest resources and technologies we create around automation and management of open source database infrastructures.

New webinar on how to monitor MongoDB (if you’re really a MySQL DBA)

As you may know, MongoDB offers many metrics through various status overviews and commands, but which ones really matter to you? How do you trend and alert on them? What is the meaning behind the metrics? In this new webinar on July 12th, we’ll discuss the most important ones and describe them in ordinary plain MySQL DBA language. And we’ll have a look at the (open source) solutions available for MongoDB monitoring and trending, including how to leverage ClusterControl’s MongoDB metrics, dashboards, custom alerting and other features to track and optimize the performance of your system.

Sign up for the webinar

What to test before upgrading to MySQL 5.7

In this blog post, we look at some of the important things to keep in mind when preparing and performing tests for an upgrade to MySQL 5.7. As we saw in a previous post, there are some important changes between MySQL 5.6 and 5.7. Since the behaviour of some existing features been altered, some detailed testing of the upgrade is in order. This new blog post (and associated whitepaper) show you how.

Read the blog

Deploy & monitor MySQL Galera clusters on Digital Ocean with NinesControl

Designed with the needs of developers in mind, NinesControl enables users to easily deploy and monitor (MySQL) Galera clusters on DigitalOcean. Droplets are launched and managed using your own DigitalOcean account. We’d love to get your feedback on this new solution if you haven’t tested it yet, so please check it out and let us know what you think.

Try NinesControl

Become a PostgreSQL DBA: Provisioning & Deployment

In this blog post, we address the following questions from the MySQL DBA standpoint: why use both MySQL and PostgreSQL in one environment? Is there any value in having a replicated PostgreSQL setup running alongside a Galera Cluster? We also discuss different methods of deploying PostgreSQL.

Read the blog

That’s it for this week! Feel free to share these resources with your colleagues and follow us in our social media channels.

Have a good end of the week,

Jean-Jérôme Schmidt
Planets9s Editor
Severalnines AB

by Severalnines at June 23, 2016 11:01 AM

June 22, 2016

Peter Zaitsev

Troubleshooting configuration issues: Q & A

Troubleshooting configuration issues

In this blog, I will provide answers to the Q & A for the Troubleshooting configuration issues webinar.

First, I want to thank you for attending the June 9 webinar. The recording and slides for the webinar are available here. Below is the list of your questions that I wasn’t able to answer during the webinar, with responses:

Q: What are the predefined roles installed per default?

A: MySQL does not support roles by default. If you need roles, you can either use MariaDB or emulate roles with the Percona PAM Plugin and proxy users. MySQL does, however, ship with pre-built user accounts. One of them is root@localhost, which has all available privileges, and another is the anonymous account that can access the test database without a password. See the user manual for more information.
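To see exactly which accounts a given server ships with, you can query the grant tables directly:

SELECT User, Host FROM mysql.user;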

Q: How do you determine the root cause for a query that just hangs?

A: There are several possible reasons for a hanging query. They include:

  1. Poorly optimized query – it doesn’t really “hang,” but just requires a lot of time to execute. Dealing with these types of queries was discussed in the Troubleshooting Slow Queries webinar.
  2. The query is waiting for a lock set by another query or transaction. These issues were discussed in the Troubleshooting locking issues webinar
  3. A bug in MySQL.

When I see hanging queries I look into all the above possibilities. But I prefer to start at query optimization.

Q: Can we get the powerpoint slides?

A: I do not use PowerPoint to create slides. Slides are available in PDF format only.

Q: 1) Is it safe to use SQL_LOG_BIN = 0 for specific connections (statements), for example for DELETE, when we need to keep old data on a slave but delete it from the master? What are the side effects? Can it break replication?

A: Using SQL_LOG_BIN = 0 itself is safe, but you need to understand what you are doing. For example, if you delete data in a table that has a unique key on the master, and then insert a row that has the same unique value that existed in one of the rows you deleted, replication will fail with a “Duplicate key” error.
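In practice this is a session-scoped switch (it requires the SUPER privilege), so the usual pattern is to disable it only around the statement in question. A sketch, with an illustrative table name and condition:

SET SESSION sql_log_bin = 0;
-- this DELETE runs on the master but is never written to the binary log
DELETE FROM mytable WHERE created < '2015-01-01';
SET SESSION sql_log_bin = 1;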

Q: Is it reasonable to disable query_cache_type (set 0) on MySQL instances with very big (huge) amount of data?

A: Yes. I would recommend it.

Q: 3) How does the number of innodb_buffer_pool_instances affect performance? Does a lot of innodb_buffer_pool_instances = high performance?

A: InnoDB introduced buffer pool instances to reduce contention “as different threads read and write to cached pages“. However, they improve performance only if you have many concurrent threads inside InnoDB.
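As a configuration sketch (values purely illustrative), the usual guideline is to size instances so that each one gets at least 1GB of buffer pool:

[mysqld]
innodb_buffer_pool_size      = 32G
innodb_buffer_pool_instances = 8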

Q: I have a question, where can I download the threadpool plugin at Percona? I checked your download page and couldn’t find it. Is it bundled in the Percona official release? We have 5.6.28, 29 and 30, and there is no thread.so in the plugin directory. Can you let me know how to get it?

A: Percona built thread pool into Percona Server; a separate download isn’t necessary. See the user manual for instructions.


by Sveta Smirnova at June 22, 2016 11:18 PM

Sneak peek at the Percona Live Europe Amsterdam 2016 talks

Percona Live Europe Amsterdam 2016 talks

On behalf of the Percona Live Conference Committee, I am excited to announce the sneak peek schedule for the Percona Live Europe Amsterdam 2016 talks!

Percona Live Europe will feature a variety of formal tracks and sessions related to MySQL, MongoDB and ODBMS from members of the open source community. With many slots to fill, there will be no shortage of great content this year. Though we won’t finalize the entire conference schedule until mid-July, this preview list of talks is sure to whet your appetite! So without further ado, here is the SNEAK PEEK SCHEDULE!

Want to Present at Percona Live Europe Amsterdam?

We are still looking for people to give talks! The committee has begun their work rating talks, but there is still time to submit. We are looking for a range of interesting talks and subjects around MySQL, MongoDB, and ODBMS. Some topics of interest include:

  • Database Virtualization
  • Integration or coexistence between SQL and NoSQL
  • Automating service life-cycle
  • Load balancing
  • Cluster control
  • Multiple Data Store Technologies and Management Solutions

But hurry, the call for papers closes July 18th! If selected, you get a free pass, and the chance to dazzle your open source community peers. Apply now!

Become a Percona Live Europe Amsterdam Sponsor

Sponsor the Percona Live Europe Amsterdam conference. There are still sponsorship opportunities at Percona Live Europe. Being a sponsor allows you to demonstrate thought leadership, promote brand awareness and support the open source community. Need a reason? Here are ten reasons to become a sponsor.

Sponsor Percona Live Europe now.

by Kortney Runyan at June 22, 2016 05:46 PM

Jean-Jerome Schmidt

Webinar: Become a MongoDB DBA - What to Monitor (if you’re really a MySQLer)

To operate MongoDB efficiently, you need to have insight into database performance. And with that in mind, we’ll dive into monitoring in this second webinar in the ‘Become a MongoDB DBA’ series.

MongoDB offers many metrics through various status overviews and commands, but which ones really matter to you? How do you trend and alert on them? What is the meaning behind the metrics?

We’ll discuss the most important ones and describe them in ordinary plain MySQL DBA language. And we’ll have a look at the open source tools available for MongoDB monitoring and trending.

Finally, we’ll show you how to leverage ClusterControl’s MongoDB metrics, dashboards, custom alerting and other features to track and optimize the performance of your system.

Date, Time & Registration

Europe/MEA/APAC

Tuesday, July 12th at 09:00 BST / 10:00 CEST (Germany, France, Sweden)
Register Now

North America/LatAm

Tuesday, July 12th at 09:00 Pacific Time (US) / 12:00 Eastern Time (US)
Register Now

Agenda

  • How does MongoDB monitoring compare to MySQL
  • Key MongoDB metrics to know about
  • Trending or alerting?
  • Available open source MongoDB monitoring tools
  • How to monitor MongoDB using ClusterControl
  • Demo

Speaker

Art van Scheppingen is a Senior Support Engineer at Severalnines. He’s a pragmatic MySQL and Database expert with over 16 years experience in web development. He previously worked at Spil Games as Head of Database Engineering, where he kept a broad vision upon the whole database environment: from MySQL to Couchbase, Vertica to Hadoop and from Sphinx Search to SOLR. He regularly presents his work and projects at various conferences (Percona Live, FOSDEM) and related meetups.

We look forward to “seeing” you there!

This session is based upon the experience we have using MongoDB and implementing it for our database infrastructure management solution, ClusterControl. For more details, read through our ‘Become a MongoDB DBA’ blog series.

by Severalnines at June 22, 2016 01:44 PM

June 21, 2016

Peter Zaitsev

Docker automatic MySQL slave propagation

Docker automatic MySQL slave propagation

In this post, we’ll discuss Docker automatic MySQL slave propagation for help with scaling.

In my previous posts on the Docker environment, I covered Percona XtraDB Cluster. Percona XtraDB Cluster can automatically scale by conveniently adding new nodes using the highly automated State Snapshot Transfer. State Snapshot Transfer allows a new node to copy data from an existing node (I still want to see how this is possible with MySQL Group Replication).

This is not the case with regular MySQL Replication. With MySQL Replication, the slave setup still requires manual steps (well, unless you’ve already scripted it for your environment). At least these are “simple” steps (ha!). Percona XtraBackup can set up replication with less work (see this link for details: https://www.percona.com/doc/percona-xtrabackup/2.4/howtos/setting_up_replication.html), but it still requires poking around and switching between servers.

However, nothing stops us from making it more automatic (similar to SST in Percona XtraDB Cluster), especially with Docker images. Why Docker? Because Docker provides a highly-controlled environment where we can orchestrate how scripts get executed. Severalnines provides a great intro into MySQL with Docker.

There are a few more components for this setup:

Before jumping to my solution, I should point to some work in this area by Joyent: https://www.joyent.com/blog/dbaas-simplicity-no-lock-in.

I propose my image https://hub.docker.com/r/perconalab/ps-master-slave/, with sources on GitHub https://github.com/percona/percona-docker/tree/master/percona-server-master-slave.

First, we need to start a master node:

docker run -d -p 3306:3306 --net=replicaset_net \
 --name=replicaset_master \
 -e MYSQL_ROOT_PASSWORD=Theistareyk \
 perconalab/ps-master-slave --innodb-buffer-pool-size=2G

I assume that we’ve created the network replicaset_net already, either bridge or overlay.

You can create a slave by pointing to the master container:

docker run -d -p 3306 --net=replicaset_net \
 --name=replicaset_slave1 \
 -e MYSQL_ROOT_PASSWORD=Theistareyk \
 -e MASTER_HOST=replicaset_master \
 perconalab/ps-master-slave --innodb-buffer-pool-size=2G

The started node will automatically connect to MASTER_HOST, copy the data and perform all the steps needed to start the slave.

You can even copy data from a running slave, instead of the master, like this:

docker run -d -p 3306 --net=replicaset_net \
 --name=replicaset_slave2 \
 -e MYSQL_ROOT_PASSWORD=Theistareyk \
 -e MASTER_HOST=replicaset_master \
 -e SLAVE_HOST=replicaset_slave1 \
 perconalab/ps-master-slave --innodb-buffer-pool-size=2G

This node will copy data from SLAVE_HOST, but then will point itself to MASTER_HOST.

Docker Network lets you use container names "replicaset_master" and "replicaset_slave1" instead of IP addresses, which is very convenient.

As a result of the above, we have one master and two slaves running. We can start as many slave nodes as needed.
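To verify that a slave came up correctly, you can check the replication status inside the container; a quick sketch using the names and password from the commands above:

docker exec -it replicaset_slave1 mysql -uroot -pTheistareyk -e "SHOW SLAVE STATUS\G" | grep -E "Slave_(IO|SQL)_Running|Seconds_Behind_Master"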

Please remember, these images are more proof-of-concept than production-ready, but they give a good direction for implementation.

by Vadim Tkachenko at June 21, 2016 11:42 PM

Jean-Jerome Schmidt

Severalnines Launches #MySQLHA CrowdChat

Today we launch our live CrowdChat on everything #MySQLHA!

This CrowdChat is brought to you by Severalnines and is hosted by a community of subject matter experts. CrowdChat is a community platform that works across Facebook, Twitter, and LinkedIn to allow users to discuss a topic using a specific #hashtag. This crowdchat focuses on the hashtag #MySQLHA. So if you’re a DBA, architect, CTO, or a database novice register to join and become part of the conversation!

Join this online community to interact with experts on Galera clusters. Get your questions answered and join the conversation around everything #MySQLHA.

Register free

Meet the experts

Art van Scheppingen is a Senior Support Engineer at Severalnines. He’s a pragmatic MySQL and Database expert with over 15 years experience in web development. He previously worked at Spil Games as Head of Database Engineering, where he kept a broad vision upon the whole database environment: from MySQL to Couchbase, Vertica to Hadoop and from Sphinx Search to SOLR. He regularly presents his work and projects at various conferences (Percona Live, FOSDEM) and related meetups.

Krzysztof Książek is a Senior Support Engineer at Severalnines, is a MySQL DBA with experience managing complex database environments for companies like Zendesk, Chegg, Pinterest and Flipboard.

Ashraf Sharif is a System Support Engineer at Severalnines. He has previously worked as principal consultant and head of support team and delivered clustering solutions for big websites in the South East Asia region. His professional interests focus on system scalability and high availability.

Vinay Joosery is a passionate advocate and builder of concepts and businesses around Big Data computing infrastructures. Prior to co-founding Severalnines, Vinay held the post of Vice-President EMEA at Pentaho Corporation - the Open Source BI leader. He has also held senior management roles at MySQL / Sun Microsystems / Oracle, where he headed the Global MySQL Telecoms Unit, and built the business around MySQL's High Availability and Clustering product lines. Prior to that, Vinay served as Director of Sales & Marketing at Ericsson Alzato, an Ericsson-owned venture focused on large scale real-time databases.

by Severalnines at June 21, 2016 02:50 PM

June 20, 2016

Peter Zaitsev

Running Percona XtraDB Cluster nodes with Linux Network namespaces on the same host

Percona XtraDB Cluster Reference Architecture

This post is a continuation of my Docker series, and examines running Percona XtraDB Cluster nodes with Linux Network namespaces on the same host.

In this blog I want to look into a lower-level building block: Linux Network Namespace.

As with cgroups, Docker uses Linux Network Namespaces for resource isolation. I looked into cgroups a year ago, and now I want to understand more about Network Namespaces.

The goal is to both understand a bit more about Docker internals, and to see how we can provide network isolation for different processes within the same host. You might need to isolate processes when running several MySQL or MongoDB instances on the same server (which might come in handy during testing). In this case, I needed to test ProxySQL without Docker.

We can always use different ports for different MySQL instances (such as 3306, 3307, 3308), but it quickly gets complicated.

We could also use IP address aliases for an existing network interface, and use bind=<IP.ADD.RE.SS> for each instance. But since Percona XtraDB Cluster can use three different IP ports and network channels for communications, this also quickly gets complicated.

Linux Network Namespaces provide greater network isolation for resources, so they can be a better fit for Percona XtraDB Cluster nodes. Now, setting up Network Namespaces in and of itself can be confusing; my recommendation is: if you can use Docker, use Docker instead. It provides isolation on process IDs and mount points, and takes care of all the script plumbing to create and destroy networks. As you will see in our scripts, we still need to handle the datadir locations ourselves.

Let’s create a network for Percona XtraDB Cluster with Network Namespaces.

I will try to do the following:

  • Start four nodes of Percona XtraDB Cluster
  • For each node, create separate network namespace so the nodes will be able to allocate network ports 3306, 4567, 4568 without conflicts
  • Assign the nodes IP addresses: 10.200.10.2-10.200.10.5
  • Create a “bridge interface” for the nodes to communicate, using IP address 10.200.10.1.

For reference, I took ideas from this post: Linux Switching – Interconnecting Namespaces

First, we must create the bridge interface on the host:

BRIDGE=br-pxc
brctl addbr $BRIDGE
brctl stp $BRIDGE off
ip addr add 10.200.10.1/24 dev $BRIDGE
ip link set dev $BRIDGE up

Next, we create four namespaces (one per Percona XtraDB Cluster node) using the following logic:

for i in 1 2 3 4
do
 ip netns add pxc_ns$i
 ip link add pxc-veth$i type veth peer name br-pxc-veth$i
 brctl addif $BRIDGE br-pxc-veth$i
 ip link set pxc-veth$i netns pxc_ns$i
 ip netns exec pxc_ns$i ip addr add 10.200.10.$((i+1))/24 dev pxc-veth$i
 ip netns exec pxc_ns$i ip link set dev pxc-veth$i up
 ip link set dev br-pxc-veth$i up
 ip netns exec pxc_ns$i ip link set lo up
 ip netns exec pxc_ns$i ip route add default via 10.200.10.1
done
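For completeness, tearing the setup down is roughly the reverse (a sketch; deleting a namespace also removes the veth end that lives inside it):

for i in 1 2 3 4
do
 ip netns del pxc_ns$i
 ip link del br-pxc-veth$i 2>/dev/null
done
ip link set dev $BRIDGE down
brctl delbr $BRIDGE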

We see the following interfaces on the host:

1153: br-pxc: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
 link/ether 32:32:4c:36:22:87 brd ff:ff:ff:ff:ff:ff
 inet 10.200.10.1/24 scope global br-pxc
 valid_lft forever preferred_lft forever
 inet6 fe80::2ccd:6ff:fe04:c7d5/64 scope link
 valid_lft forever preferred_lft forever
1154: br-pxc-veth1@if1155: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master br-pxc state UP qlen 1000
 link/ether c6:28:2d:23:3b:a4 brd ff:ff:ff:ff:ff:ff link-netnsid 8
 inet6 fe80::c428:2dff:fe23:3ba4/64 scope link
 valid_lft forever preferred_lft forever
1156: br-pxc-veth2@if1157: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master br-pxc state UP qlen 1000
 link/ether 32:32:4c:36:22:87 brd ff:ff:ff:ff:ff:ff link-netnsid 12
 inet6 fe80::3032:4cff:fe36:2287/64 scope link
 valid_lft forever preferred_lft forever
1158: br-pxc-veth3@if1159: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master br-pxc state UP qlen 1000
 link/ether 8a:3a:c1:e0:8a:67 brd ff:ff:ff:ff:ff:ff link-netnsid 13
 inet6 fe80::883a:c1ff:fee0:8a67/64 scope link
 valid_lft forever preferred_lft forever
1160: br-pxc-veth4@if1161: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master br-pxc state UP qlen 1000
 link/ether aa:56:7f:41:1d:c3 brd ff:ff:ff:ff:ff:ff link-netnsid 11
 inet6 fe80::a856:7fff:fe41:1dc3/64 scope link
 valid_lft forever preferred_lft forever

We also see the following network namespaces:

# ip netns
pxc_ns4 (id: 11)
pxc_ns3 (id: 13)
pxc_ns2 (id: 12)
pxc_ns1 (id: 8)

After that, we can check the namespace IP address:

# ip netns exec pxc_ns3 bash
# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
 inet 127.0.0.1/8 scope host lo
 valid_lft forever preferred_lft forever
 inet6 ::1/128 scope host
 valid_lft forever preferred_lft forever
1159: pxc-veth3@if1158: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
 link/ether 4a:ad:be:6a:aa:c6 brd ff:ff:ff:ff:ff:ff link-netnsid 0
 inet 10.200.10.4/24 scope global pxc-veth3
 valid_lft forever preferred_lft forever
 inet6 fe80::48ad:beff:fe6a:aac6/64 scope link
 valid_lft forever preferred_lft forever

To enable communication from inside the network namespace to the external world, we should add some iptables rules, e.g.:

iptables -t nat -A POSTROUTING -s 10.200.10.0/255.255.255.0 -o enp2s0f0 -j MASQUERADE
iptables -A FORWARD -i enp2s0f0 -o $BRIDGE -j ACCEPT
iptables -A FORWARD -o enp2s0f0 -i $BRIDGE -j ACCEPT

where enp2s0f0 is an interface that has an external IP address (for some reason, modern Linux distros decided to use names like enp2s0f0 for network interfaces instead of the good old "eth0").

To start a node (or mysqld instance) inside a network namespace, we should use the ip netns exec prefix for the commands.

For example to start Percona XtraDB Cluster first node, in the namespace pxc_ns1, with IP address 10.200.10.2, we use:

ip netns exec pxc_ns1 mysqld --defaults-file=node.cnf --datadir=/data/datadir/node1 --socket=/tmp/node1_mysql.sock --user=root --wsrep_cluster_name=cluster1

To start the following nodes:

NODE=2 ip netns exec pxc_ns${NODE} mysqld --defaults-file=node${NODE}.cnf --datadir=/data/datadir/node${NODE} --socket=/tmp/node${NODE}_mysql.sock --user=root --wsrep_cluster_address="gcomm://10.200.10.2" --wsrep_cluster_name=cluster1  
NODE=3 ip netns exec pxc_ns${NODE} mysqld --defaults-file=node${NODE}.cnf --datadir=/data/datadir/node${NODE} --socket=/tmp/node${NODE}_mysql.sock --user=root --wsrep_cluster_address="gcomm://10.200.10.2" --wsrep_cluster_name=cluster1  
etc

As a result of this procedure, we have four Percona XtraDB Cluster nodes running in individual network namespaces, without worrying about IP address and port conflicts. We also allocated a dedicated IP range for our cluster.
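A quick way to confirm that the nodes actually joined the cluster is to check wsrep_cluster_size from inside one of the namespaces (a sketch, using the socket path from above):

ip netns exec pxc_ns1 mysql --socket=/tmp/node1_mysql.sock -e "SHOW GLOBAL STATUS LIKE 'wsrep_cluster_size'"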

This procedure isn’t trivial, but it is easy to script. I also think it provides a good understanding of what Docker, LXC and other containerization technologies do behind the scenes with networks.

 

by Vadim Tkachenko at June 20, 2016 10:52 PM

Webinar Thursday June 23: Choosing a MySQL High Availability Solution Today

MySQL High Availability

Please join Percona Technical Account Manager Michael Patrick on Thursday, June 23, 2016 at 10 AM PDT (UTC-7) as he presents “Choosing a MySQL High Availability Solution Today.”

High availability (HA) is one of the solutions to improve performance, avoid data outages, and recover quickly from disasters. An HA environment helps guarantee that your database doesn’t have a single point of failure, accommodates rapid growth and exponentially increasing database size, and enables the applications that power your business.

Michael will discuss various topologies for achieving High Availability with MySQL.

Topics include:

  • Percona XtraDB Cluster
  • DRBD
  • MHA
  • MySQL Orchestrator

Each solution has advantages and challenges. Attendees will gain a deeper understanding of how to choose the best solution for their needs while avoiding some of the pitfalls of making the wrong choices. Avoid the costly mistakes that commonly cause outages and lost revenue. Plus get the latest and greatest developments in the technologies!

Register now.

Michael Patrick
Technical Account Manager

Mike came to Percona in 2015 after working for a variety of large corporations running hundreds of MySQL and Percona XtraDB Clusters in production environments. He is skilled in performance tuning, server auditing, high availability, multi-data center replication, migration, and other MySQL-related activities. Mike holds a B.S. in Computer Science from East Tennessee State University. In his off time, he enjoys Martial Arts and Cave Exploration. He lives in East Tennessee with his wife and he has four children.

by Dave Avery at June 20, 2016 07:28 PM

Shlomi Noach

Solving the non-atomic table swap, Take II

Following up and improving on Solving the Facebook-OSC non-atomic table swap problem, we present a better, safe solution.

Quick, quickest recap:

We are working on a triggerless online schema migration solution. It is based on an asynchronous approach, similarly to the FB osc and as opposed to the synchronous solution as used by pt-online-schema-change.

We asynchronously synchronize (is that even a valid statement?) between some table tbl and a ghost table ghost, and at some time we want to cut-over: swap the two; kick out tbl and put ghost in its place and under its name.

However, we cannot use the single statement rename tbl to tbl_old, ghost to tbl, because we use the asynchronous approach, where at the time we lock tbl for writes, we still have some events we need to process and apply onto ghost before swapping the two.

And MySQL does not allow a lock tables tbl write; ... ; rename tbl to tbl_old, ghost to tbl.

In Solving the Facebook-OSC non-atomic table swap problem we suggested a way that works, unless when it doesn't work. Read the caveat at the end of the post. Premature death of a connection that participates in the algorithm causes a chain reaction that leads to the premature execution of the rename statement, potentially before we've applied those remaining events. This leads to data inconsistency between the old table and the new table, and is unacceptable.

To that effect, we were more inclined to go with the Facebook solution, which makes a two-step: lock tables tbl write; alter table tbl rename to tbl_old; ... ; alter table ghost rename to tbl;

This two-step solution is guaranteed not to have data inconsistency. Alas, it also implies an outage. There's a brief moment, in between the two renames, and during that time where we apply those last changes, where the table tbl is simply not there.

Not all applications will fail gracefully on such a scenario.

UDF

We looked at a solution based on UDFs, where we would create global wait conditions, that are not connection based.

We don't like UDFs. You need to compile them for every new version. Puppetizing their setup is not fun. We wouldn't like maintaining this. We wouldn't like doing the operations for this. Neither would the community.

We want to make this a community solution. Can we do without UDF?

Rewriting MySQL

We wish to avoid forking our own version of MySQL. It's not what we do and it's a pain.

A pure MySQL solution?

We found a solution to embrace; it is optimistic, and safe. What optimistic means is explained further on, but let's discuss safe:

The previous solution we came up with was unsafe because breakage of a single component in the algorithm would lead to inconsistent data. The algorithm itself was fine, as long as no one broke it from the outside. This is the concern: what if some crazy cronjob that cleans up connections (kills idle connections, kills long running transactions) or some unfortunate user command kills one of the connections involved in the cut-over phase? This is not something that would happen every day, but can we protect against it? Our priority is to keep our data intact.

The solution allows breakage. Even in the face of death of connections, data is not lost/corrupted, and at worst -- causes a FB-like, recoverable outage scenario.

A step towards the solution, a flawed one

I wish to illustrate something that looks like it would work, but in fact has a hidden flaw. We will later improve on that solution.

Let's assume we have the tables tbl and ghost. We execute the following from multiple connections; we call them C1, C2, C3, ...:

  • C1: lock tables tbl write;
  • C2, C3, ..., C17: normal app connections, issuing insert, delete, update on tbl. Because of the lock, they are naturally blocked.
  • We apply those last event we need to apply onto ghost. No new events are coming our way because tbl is blocked.
  • C18: rename table tbl to tbl_old, ghost to tbl; (blocked as well)
  • C1: unlock tables (everything gets released)

Let's consider the above, and see why it is flawed. But first, why it would typically work in the first place.

  • Connections C2, ..., C17 came first, and C18 came later. Nevertheless MySQL prioritizes C18 and moves it up the queue of waiting queries on tbl. When we unlock, C18 is the first to execute.
  • We only issue the rename once we're satisfied we've applied those changes. We only unlock once we're satisfied that the rename has been executed.
  • If for some reason C1 disconnects before we issue the rename - no problem, we just retry from scratch.

What's the flaw?

We rename when C1 holds the lock. We check with C1 that it is alive and kicking. Yep, it's connected and holding the lock. Are you sure? Yep, I'm good! Really really sure? Yep! OK then, let's rename!

"Oh darn", says C1, "now that you went ahead to rename, but just before you actually sent the request, I decided to take time off and terminate". Or, more realistically, some job would kill C1.

What happens now? The rename is not there yet. All those queries get released, and are immediately applied onto tbl, and then the rename applies, kicks all those changes into oblivion, and puts ghost in place, where it immediately receives further writes.

Those blocking queries were committed but never to be seen again.

So here's another way to look at the problem: the rename made it through even though the connection C1 died just prior to that, whereas we would have loved the rename to abort upon such case.

Is there a way in MySQL to cause an operation to fail or block when another connection dies? It's the other way around! Connections hold locks, and those get released when they die!

But there's a way...

Three step, safe, optimistic solution

Here are the steps to a safe solution:

  • C1: lock tables tbl write;
  • C2, C3, ..., C17: normal app connections, issuing insert, delete, update on tbl. Because of the lock, they are naturally blocked.
  • We apply those last event we need to apply onto ghost. No new events are coming our way because tbl is blocked.
  • C18: checking that C1 is still alive, then rename table tbl to tbl_old
  • C19: checking to see that C18's rename is in place (via show processlist), and that C1 is still alive; then issues: rename table ghost to tbl
  • (meanwhile more queries approach tbl, it doesn't matter, they all get deprioritized, same as C2...C17)
  • C1: unlock tables
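If it helps, here is the same cut-over written as a plain SQL sketch; the connection labels are comments, and the liveness checks and sequencing live in the migration tool, not in SQL:

-- C1
lock tables tbl write;
-- (apply the remaining events onto ghost)
-- C18, only after confirming C1 is alive; blocks on tbl's metadata lock
rename table tbl to tbl_old;
-- C19, only after C18 shows up in show processlist and C1 is still alive;
-- valid even though tbl still exists, because tbl's metadata lock is held
rename table ghost to tbl;
-- C1
unlock tables;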

What just happened? Let's first explain some stuff:

  • C18's rename gets prioritized over the DMLs, even though it came later. That is how MySQL prioritizes queries on metadata-locked tables.
  • C18 checks C1 is still alive, but as before, there's always the chance C1 will die just at the wrong time -- we're going to address that.
  • C19 is interested to see that C18 began execution, but potentially C18 will crash by the time C19 actually issues its own rename -- we're going to address that
  • C19's query sounds weird. At that time tbl still exists. You'd expect it to fail immediately -- but it does not. It's valid. This is because tbl's metadata lock is in use.
  • C19 gets prioritized over all the DMLs, but is known to be behind C18. The two stay in same order of arrival. So, C18 is known to execute before C19.
  • When C1 unlocks, C18 executes first.
  • Metadata lock is still in place on tbl even though it doesn't actually exist, because of C19.
  • C19 operates next.
  • Finally all the DMLs execute.

What happens on failures?

  • If C1 dies just as C18 is about to issue the rename, we get an outage: tbl is renamed to tbl_old, and the queries get released and complain the table is just not there.
    • C19 will not initiate because it is executed after C18 and checks that C1 is alive -- which turns to be untrue.
    • So we know we have outage, and we quickly rename tbl_old to tbl; and go drink coffee, then begin it all again.
    • The outage is unfortunate, but does not put our data in danger.
  • If C1 happens to die just as C19 is about to issue its rename, there's no data integrity issue: at this point we've already asserted the tables are in sync. As C1 dies, C18 will immediately rename tbl to tbl_old. An outage will occur, but not for long, because C19 will next issue rename ghost to tbl, and close the gap. We suffered a minor outage, but no rollback. We roll forward.
  • If C18 happens to die just as C19 is about to issue its rename, nothing bad happens: C19 is still blocking for as long as C1 is running. We find out C18 died, and release C1. C19 attempts to rename ghost onto tbl, but tbl exists and the query fails. The metadata lock is released and all the queries resume operation on the original tbl. The operation failed but without error. We will need to try the entire cycle again.
  • If both C1 and C18 fail at the time C19 is about to begin its rename, same as above.
  • If C18 fails as C19 is already in place, same as above.
  • If C1 fails as C19 is already in place, it's as good as having it issue the unlock tables. We're happy.
  • If C19 fails at any given point, we suffer outage. We revert by rename tbl_old to tbl

This solution relies on the notion that if a previous connection failed, we would not be able to rename ghost to tbl because the table would still be there. That's what we were looking for; but instead of looking at locks, which get released when a connection terminates, we used a persistent entity: a table.

Conclusion

The algorithm above is optimistic: if no connections get weirdly killed, it's a valid locking solution, and queries & app are unaware that anything happened (granted, app will notice write latency). If connections do get weirdly killed, we get table-outage at worst case -- an outage that is already considered to be a valid solution anyhow. The algorithm will not allow data corruption.

by shlomi at June 20, 2016 09:26 AM

Jean-Jerome Schmidt

What to test before upgrading to MySQL 5.7

As we saw in a previous post, there are some important changes between MySQL 5.6 and 5.7. Since the behaviour of some existing features has been altered, some detailed testing of the upgrade is in order. In this blog post, we will look at some of the important things to keep in mind when preparing and performing these tests.

How to design a test environment?

Your test environment has to be as similar to your production environment as possible. We are talking here about using the same hardware, using the same dataset and running the same query mix. If you have a complex replication topology, try to replicate it in test as well. Of course, sometimes it’s not possible to use real data or real queries because of security concerns. But keep in mind that the results get less and less reliable as you have more differences between test and production.

Collecting queries for regression tests

To run real-world queries, you need to collect them first. There are many ways to do that - you can enable the slow query log and log all queries there (long_query_time=0), or you can use tcpdump and capture the traffic somewhere between MySQL and the application. It’s very important to keep in mind that you need to collect all types of queries. If your query mix is stable all the time, probably 20-30 minutes of traffic will suffice. If you have some variation in it, though, things will be different. Maybe your application runs different queries at different times of the day. Maybe there’s some ETL process that gets kicked off at a certain scheduled time. Maybe you run some reporting. So on and so forth - the bottom line is, you have to collect all types of queries in order to make the tests as realistic as possible.
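A minimal sketch of the slow log approach (remember to restore your previous settings afterwards, and watch the disk space):

SET GLOBAL slow_query_log = 1;
SET GLOBAL log_output = 'FILE';
SET GLOBAL long_query_time = 0;
-- ... collect traffic for a representative period ...
SET GLOBAL long_query_time = 1;
SET GLOBAL slow_query_log = 0;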

Performance regression tests

Some simple tests can be executed by a DBA - tools like Percona Playback or pt-upgrade can help here. pt-upgrade is especially useful, as it can collect performance data about queries executed on different MySQL hosts and prepare nice reports about them. When running pt-upgrade, you need to make sure your test nodes are in the exact same state - ideally cold, with no data in caches or buffers. You also need to keep in mind that query performance can be affected by network latency, therefore it’s better to execute pt-upgrade locally.
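A basic pt-upgrade invocation might look like this (a sketch; the host names are placeholders) - it replays the collected log against both servers and reports differences in results and response times:

pt-upgrade slow.log h=test-mysql56 h=test-mysql57 > pt_upgrade_report.txt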

Tests on the application side

Simple tests like pt-upgrade won’t replace detailed acceptance tests executed from the application. This step is a must for any upgrade runbook. The more data you can collect while running tests on the application, the better. Exact process will differ between applications but the idea is simple - check everything you can and make sure all aspects of your application work correctly with a new version.

The above is a brief overview of the testing process.  You can find more details in our ebook “Upgrading to MySQL 5.7”.

by Severalnines at June 20, 2016 07:45 AM

June 18, 2016

Henrik Ingo

Creating Impress.js presentations in colored JSON with Highlight.js

Last month I wrote about impress.js, and how I've started using it for my presentations. This has been going well, and during the past month I've actually given 2 more presentations using impress.js:

Dreams and fears of a database consultant
MongoDB and GIS

(You really have to click those links; embedding these presentations wouldn't do them justice!)

read more

by hingo at June 18, 2016 02:54 PM

June 17, 2016

Peter Zaitsev

InnoDB locks and transaction isolation level

Troubleshooting configuration issues

What is the difference between InnoDB locks and transaction isolation level? We’ll discuss it in this post.

Recently I received a question from a user about one of my earlier blog posts. Since it wasn’t sent as a comment, I will answer it here. The question:

> I am reading your article:
> https://www.percona.com/resources/technical-presentations/troubleshooting-locking-issues-percona-mysql-webinar

> Full table scan locks whole table.

> Some bad select (read) query can do full table scan on InnoDB, does it lock whole table please?

> My understanding was that SELECT (read) blocks another DML only in MyISAM.

To answer this question, we need to understand two different concepts: locking and transaction isolation level.

In the default transaction isolation mode for InnoDB, REPEATABLE READ, and the lower TRANSACTION ISOLATION levels, SELECT doesn’t block any DML unless it uses SELECT ... FOR UPDATE or SELECT ... LOCK IN SHARE MODE. On slide 20 (31) of my presentation, I use the SELECT ... FOR UPDATE statement (this is why a lock is seen).
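For example, under REPEATABLE READ a locking read does set row locks; a sketch using the same employees sample database as in the example below:

mysql1> begin;
mysql1> select * from titles where emp_no=15504 for update;
mysql2> update titles set title='Engineer' where emp_no=15504;
-- mysql2 waits until mysql1 commits, rolls back, or the lock wait times out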

However, if the transaction isolation mode is SERIALIZABLE, then SELECT can block updates. You can see this in the example below:

mysql1> set transaction isolation level serializable;
Query OK, 0 rows affected (0,00 sec)
mysql1> begin;
Query OK, 0 rows affected (0,00 sec)
mysql1> select * from employees join titles using(emp_no);
c936e6fc4c6cbaf77679ba5013339dff  -
443308 rows in set (29,69 sec)
mysql2> begin;
Query OK, 0 rows affected (0,00 sec)
mysql2> update titles set title='Engineer' where emp_no=15504;
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction

Regarding the different level locks set by InnoDB, I recommend that you follow the links on slide 37 (67) of the presentation, especially “InnoDB Locking Explained With Stick Figures” by Bill Karwin. InnoDB table locks are also explained very well in this blog post by Peter Zaitsev.

by Sveta Smirnova at June 17, 2016 06:49 PM

MariaDB Foundation

MariaDB 5.5.50 and updated connectors now available

The MariaDB project is pleased to announce the immediate availability of MariaDB 5.5.50, MariaDB Connector/J 1.4.6, and MariaDB Connector/ODBC 2.0.11. See the release notes and changelogs for details on these releases. Download MariaDB 5.5.50 (Release Notes, Changelog, What is MariaDB 5.5?, MariaDB APT and YUM Repository Configuration Generator). Download MariaDB Connector/J 1.4.6 (Release Notes, Changelog) […]

The post MariaDB 5.5.50 and updated connectors now available appeared first on MariaDB.org.

by Daniel Bartholomew at June 17, 2016 05:32 PM

Peter Zaitsev

MongoDB Security: Why pay for Enterprise when Open Source has you covered?

MongoDB Security

Does ensuring MongoDB security justify the cost of the Enterprise version? In my opinion, the answer is no.

MongoDB Inc© blasted an email with a study showing that the average cost of a data breach can be $5.9M. You can find the key finding in IBM’s 2015 report here:

NH Learning Solutions

Key findings:

Data breaches cost the most in the US and Germany and the lowest in Brazil and India. The average per capita cost of a data breach is $217 in the US and $211 in Germany. The lowest cost is in Brazil ($78) and India ($56). The average total organizational cost in the US is $6.5 million and in Germany $4.9 million. The lowest organizational cost is in Brazil ($1.8 million) and India ($1.5 million).

So it might be even worse than you thought! The study presented a clear per-record cost, which helps companies determine the real cost to them. This brings us to the recently-published MongoDB Security Architecture white paper. While it is a great document, it raises some questions that should be addressed. We should dissect exactly what paying for an enterprise build gets you that is not already available in the open source community version.

The key security features in above white paper are:

  • Authentication. LDAP Authentication centralizes things with your company directory (for PCI)
  • Authorization. What role-based access controls the database provides
  • Encryption. Broken into “At-Rest” and “In-Transit” as part of regular PCI requirements
  • Governance. Document validation and even checking for sensitive data such as an SSN or birth data
  • Auditing. The ability to see who did what in the database (also required for PCI).

That list lets us break down each into why they are important, and is it something that should be free in the MongoDB ecosystem.

Authentication

MongoDB has built-in users (off by default). It misses things, however, like password complexity, age-based rotation, centralization, and identification of user roles versus service functions. These are essential to passing PCI. PCI requires that people don’t use old passwords, easy-to-break passwords, and that user access gets revoked when there is a change in status (such as leaving a department or the company). Thankfully LDAP is an open-source project of its own. Many connectors allow the use of Windows Active Directory (AD) systems to talk with LDAP.

Using LDAP and AD, you can tie users in with your corporate directory. When they change roles or leave the company, they can be removed by HR from your database group. Practically, this means there are automated systems in place to ensure only those you want to access the data manually can do so, without accidently missing something. It is important to note that the MongoDB Community Edition© does not have LDAP support. For this you need MongoDB Inc’s© Enterprise build. You can also use Percona Server© for MongoDB. Percona’s build is open source and free, and we offer a support contract so that if you require support or help it is available.

Authorization

Role-based Authorization or (RBAC) is core to MongoDB Community©, MongoDB Enterprise© and Percona Server for MongoDB. In Mongo 2.6+ you can use built-in roles, or even craft your own down to what actions someone might be able to do – only exposing exactly what you want users to be able to do with razor precision. This is such a core MongoDB feature that it is everywhere and in every build regardless of vendor.
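As an illustration (the database, collection, role and user names below are made up), a custom role that only allows finding and inserting documents in a single collection could be created like this in the mongo shell:

use admin
db.createRole({
  role: "appInsertOnly",
  privileges: [
    { resource: { db: "app", collection: "events" }, actions: [ "find", "insert" ] }
  ],
  roles: []
})
db.createUser({ user: "appwriter", pwd: "s3cret", roles: [ "appInsertOnly" ] })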

Encryption

As I mentioned before, this is broken into two areas of discussion: At-Rest and In-Transit

At-Rest.: This is defined as on-disk, and typically refers to data saved to an encrypted storage location. This is to prevent theft by physical means and create backups that are stored in a fashion not easily read by any third party. There are practical limits to this. The biggest are trusting your sys-admins and assuming that a hacker has not been able to get access to the system to pretend they are one of your admins. This is not an issue unique to MongoDB or Percona. Such systems used more globally work here as well. They might include things like LUKS/cryptfs, or might go into even more secure methods such as signing encryption keys with LDAP, Smart Cards, and RSA type tokens.

When doing this level of encryption, you need to consider things like automounting and decrypting of drives. However, the general point is this is not something new to your system administrators, and they can manage this requirement in the same way they manage it in other parts of the system. The added benefit is a single procedure for storage encryption, not one per whatever technology a particular function uses.

In-Transit: To keep this simple, we are just talking about using SSL on all connections (it can be as simple as that in the end). Depending on your internal requirements, MongoDB Community©, MongoDB Enterprise© and Percona Server for MongoDB all additionally support custom Signing Authorities (CA), x509 client and member certificates. The x509 system is very useful when you want to make sure only authorized machines are allowed to talk to your system, before they can even attempt to send a user/password to the system.

Governance

Put simply, this is the ability to enforce complex standards on the system by using Document Validation. This is an important feature that is available to MongoDB Community©, MongoDB Enterprise© and Percona Server for MongoDB. Governance is about the insertion and updating of data. It is also useful for checking whether a field name like bday, birthday, ssn, social, etc. is defined. We are not limited to those: you could also run string regexes on fields like user_id, checking for a $regex such as “^\d{3}-\d{2}-\d{4}$” (which would match a Social Security Number), or checking for a credit card number. These examples are ways your DBAs and security architects can help prevent developers from exposing the company to added risk.

You can also ensure schema changes only occur when properly vetted by your DBA staff, as the developer code could fail if they change the format of what you’re storing in the database. This brings an additional layer of control to MongoDB’s dynamic production schema (allowing itself to store anything even if it should not).
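As a rough sketch of what such a rule can look like (the collection and field names are purely illustrative), MongoDB 3.2 document validation can reject documents that carry an ssn field or whose user_id does not match a simple pattern:

db.createCollection("customers", {
  validator: {
    ssn: { $exists: false },
    user_id: { $regex: /^[A-Za-z0-9_-]+$/ }
  },
  validationLevel: "strict",
  validationAction: "error"
})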

Auditing

Central to any good security design – and required by PCI – is being able to track which user did what action in the database (very similar to how you need to do it on your actual servers). At this time, MongoDB’s Community© build can’t track this. However, both MongoDB Enterprise© and Percona Server for MongoDB support this feature. Both work in similar ways, allowing you to filter output to a particular user, database, collection, or source location. This gives you a log to review in any security incident and, more importantly, shows your PCI auditor that you’ve taken the correct steps to both protect your database from an intrusion and understand an incursion’s depth (should one occur).
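For reference, turning auditing on in Percona Server for MongoDB (MongoDB Enterprise© uses the same option names) is a matter of startup options; a sketch, with the filter shown only as an example:

mongod --auth --dbpath /data/db \
       --auditDestination file --auditFormat JSON \
       --auditPath /var/log/mongodb/auditLog.json \
       --auditFilter '{ atype: "authenticate" }'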

Hopefully, this has been a good overview of the security options in MongoDB Community© and Enterprise© versions and Percona Server for MongoDB. Even without an enterprise-style contract, you can fulfill all your PCI compliance needs and protect your company using reasonable and known methods. Please note Percona strives to bring enterprise features to the community, but not to the point of wanting to lock you into a non-open source build. If you need support with anything MongoDB, we have the community and its users as our first priority!

by David Murphy at June 17, 2016 04:55 PM

June 16, 2016

Peter Zaitsev

Scaling Percona XtraDB Cluster with ProxySQL in Kubernetes

Percona XtraDB Cluster nodes with Linux Network namespaces

How do you scale Percona XtraDB Cluster with ProxySQL in Kubernetes?

In my previous post I looked at how to run Percona XtraDB Cluster in a Docker Swarm orchestration system, and today I want to review how we can do it in the more advanced Kubernetes environment.

There are already some existing posts from Patrick Galbraith (https://github.com/kubernetes/kubernetes/tree/release-1.2/examples/mysql-galera) and Raghavendra Prabhu (https://github.com/ronin13/pxc-kubernetes) on this topic. For this post, I will show how to run as many nodes as I want, see what happens if we add/remove nodes dynamically and handle incoming traffic with ProxySQL (which routes queries to one of working nodes). I also want to see if we can reuse the ReplicationController infrastructure from Kubernetes to scale nodes to a given number.

These goals should be easy to accomplish using our existing Docker images for Percona XtraDB Cluster (https://hub.docker.com/r/percona/percona-xtradb-cluster/), and I will again rely on the running service discovery (right now the images only work with etcd).

The process of setting up Kubernetes can be pretty involved (but it can be done; check out the Kubernetes documentation to see how: http://kubernetes.io/docs/getting-started-guides/ubuntu/). It is much more convenient to use a cloud that supports it already (Google Cloud, for example). I will use Microsoft Azure, and follow this guide: http://kubernetes.io/docs/getting-started-guides/coreos/azure/. Unfortunately, the scripts from the guide install a previous version of Kubernetes (1.1.2), which does not allow me to use ConfigMaps. To compensate, I will duplicate the ENVIRONMENT variable definitions for the Percona XtraDB Cluster and ProxySQL pods. This can be done more optimally in recent versions of Kubernetes.

After getting Kubernetes running, starting Percona XtraDB Cluster with ProxySQL is easy using the following pxc.yaml file (which you can also find with our Docker sources https://github.com/percona/percona-docker/tree/master/pxc-57/kubernetes):

apiVersion: v1
kind: ReplicationController
metadata:
  name: pxc-rc
  app: pxc-app
spec:
  replicas: 3 # tells deployment to run N pods matching the template
  selector:
    app: pxc-app
  template: # create pods using pod definition in this template
    metadata:
      name: pxc
      labels:
        app: pxc-app
    spec:
      containers:
      - name: percona-xtradb-cluster
        image: perconalab/percona-xtradb-cluster:5.6test
        ports:
        - containerPort: 3306
        - containerPort: 4567
        - containerPort: 4568
        env:
        - name: MYSQL_ROOT_PASSWORD
          value: "Theistareyk"
        - name: DISCOVERY_SERVICE
          value: "172.18.0.4:4001"
        - name: CLUSTER_NAME
          value: "k8scluster2"
        - name: XTRABACKUP_PASSWORD
          value: "Theistare"
        volumeMounts:
        - name: mysql-persistent-storage
          mountPath: /var/lib/mysql
      volumes:
      - name: mysql-persistent-storage
        emptyDir: {}
      imagePullPolicy: Always
---
apiVersion: v1
kind: ReplicationController
metadata:
  name: proxysql-rc
  app: proxysql-app
spec:
  replicas: 1 # tells deployment to run N pods matching the template
  selector:
    front: proxysql
  template: # create pods using pod definition in this template
    metadata:
      name: proxysql
      labels:
        app: pxc-app
        front: proxysql
    spec:
      containers:
      - name: proxysql
        image: perconalab/proxysql
        ports:
        - containerPort: 3306
        - containerPort: 6032
        env:
        - name: MYSQL_ROOT_PASSWORD
          value: "Theistareyk"
        - name: DISCOVERY_SERVICE
          value: "172.18.0.4:4001"
        - name: CLUSTER_NAME
          value: "k8scluster2"
        - name: MYSQL_PROXY_USER
          value: "proxyuser"
        - name: MYSQL_PROXY_PASSWORD
          value: "s3cret"
---
apiVersion: v1
kind: Service
metadata:
  name: pxc-service
  labels:
    app: pxc-app
spec:
  ports:
  # the port that this service should serve on
  - port: 3306
    targetPort: 3306
    name: "mysql"
  - port: 6032
    targetPort: 6032
    name: "proxyadm"
  # label keys and values that must match in order to receive traffic for this service
  selector:
    front: proxysql

Here is the command to start the cluster:

kubectl create -f pxc.yaml

The command will start three pods with Percona XtraDB Cluster and one pod with ProxySQL.

Percona XtraDB Cluster nodes will register themselves in the discovery service and we will need to add them to ProxySQL (it can be done automatically with scripting, for now it is a manual task):

kubectl exec -it proxysql-rc-4e936 add_cluster_nodes.sh

Increasing the cluster size can be done with the scale command:

kubectl scale --replicas=6 -f pxc.yaml

You can connect to the cluster using a single connection point provided by ProxySQL. You can find it this way:

kubectl describe -f pxc.yaml
Name: pxc-service
Namespace: default
Labels: app=pxc-app
Selector: front=proxysql
Type: ClusterIP
IP: 10.23.123.236
Port: mysql 3306/TCP
Endpoints: <none>
Port: proxyadm 6032/TCP
Endpoints: <none>
Session Affinity: None

It exposes the endpoint IP address 10.23.123.236 and two ports: 3306 for the MySQL connection and 6032 for the ProxySQL admin connection.
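From that point, applications connect through ProxySQL as if it were a single MySQL server; a sketch using the proxy user defined in pxc.yaml and the service IP above:

mysql -h 10.23.123.236 -P 3306 -uproxyuser -ps3cret -e "SELECT @@hostname"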

So you can see that scaling Percona XtraDB Cluster with ProxySQL in Kubernetes is pretty easy. In the next post, I want to run benchmarks in the different Docker network environments.

by Vadim Tkachenko at June 16, 2016 06:38 PM

Why MongoRocks: Deprecating PerconaFT and MongoDB Optimistic locking

deprecating PerconaFT

In this post, we’ll discuss the rationale behind deprecating PerconaFT and embracing RocksDB.

Why is Percona deprecating PerconaFT in favor of RocksDB?

Many of you may have seen Peter Zaitsev’s recent post about Percona embracing RocksDB and deprecating PerconaFT. I’m going to shed a bit more light on the issues between the locking models of PerconaFT and MongoDB’s core server. When making this decision, we looked at how the differences between the engines measure up and impact other improvements we could make. In the end, we can do more for the community by focusing on engines that are in line with the assumptions the core server makes every second in your daily operations. Then we have more resources available for improving the users’ experience by adding new tools and features, and improving the core server.

What is pessimistic locking?

Pessimistic locking locks an entity in the database for the entire time that it is actively used in application memory. A lock either limits or prevents other users from working with the entity in the database. A write lock indicates that the holder of the lock intends to update the entity and disallows anyone from reading, updating, or deleting the object. A read lock means that the owner of the lock does not want the object to change while it holds the lock, allowing others to read the entity but not update or delete it. The scope of a lock might be the entire database, a table, a collection of ranges of documents or a single document.

You can order pessimistic locks as follows (from broad to granular):

  1. Database locks
  2. Collection locks
  3. Range locks
  4. Document locks

The advantage of pessimistic locking is that changes to the database get made consistently and safely. The primary disadvantage is that this approach isn’t as scalable. The chance of waiting for a lock to be released increases when:

  • A system has a lot of users
  • The transactions (in MongoDB, there are transactions in the engine but not at the user level) involve a greater number of entities
  • When transactions are long-lived

Therefore, pessimistic locks limit the practical number of simultaneous users that your system can support.

What is optimistic locking?

Most database systems (NoSQL and RDBMS) expect collisions to be relatively uncommon. For example, although two clients are working with user objects, one might be working with the Bob Vader object while another works with the Luke Vader object. These won’t collide. In this case, optimistic locking becomes the most viable concurrency control strategy. If you accept the premise that collisions infrequently occur, instead of trying to prevent them you can choose to detect and then resolve them when they do occur.

MongoDB has something called a Write Conflict Exception (WCE). A WCE is like an engine-level deadlock. If a record inside the engine changes due to thread #1, thread #2 must wait for a time that it is safe to change the record, and retry then. Typically this occurs when a single document gets updated frequently. It can also occur when there are many updates, or there are ranges of locks happening concurrently. This is a perfect case of optimistic locking, preferring to resolve or retry operations when they occur rather than prevent them from happening.

Can you make these play well while limiting the amount of development resources needed?

These views are as polar opposite as you can get in the database world. In one view you lock as much as possible, preventing anyone else from making a change. In the other view you let things be as parallel as possible, and accept you will retry if two clients are updating the same document. With the nature of how many documents fit in a single block of memory, this has some real-world concerns. When you have more than one document in a memory block, you could have a situation where locking one document means 400% more documents get affected. For example, if we have an update using the IN operator with 25 entries, you could be blocking 125 documents (not 25 documents)!

That escalated rather quickly, don’t you think? Using optimistic locking in the same situation, you at most would have to retry five document write locks as the data changed. The challenge for optimistic locking is that if I have five clients that are all updating all documents, you get a flurry of updates. WCE’s come in and eventually resolve things. If you use pessimistic locking, everybody waits their turn, and each one would finish before the next could run.

Much of Percona’s engineering effort goes into what types of systems we should put in place to simulate cheap latches or locks in optimistic locking to allow pessimistic locking to work (without killing performance). This requires an enormous amount of work just to get on-par behavior from the system – specifically in update type workloads, given delete/inserts are very similar in the systems. As a result, we’ve spent more time improving the engine rather than adding additional variables and instrumentation.

Looking forward, MongoRocks aligns more to WiredTiger in its locking structure (they both run as log sequence merges or LSMs), and this means more time working on new optimizer patterns, building things to improve diagnostics or tuning the engine/system to your specific needs. We think you will be excited to see some of the progress we have been discussing for Mongo 3.4 (some of which might even make it directly into 3.2).

What is the MongoRocks anyhow and how does it compare to PerconaFT?

The last concept I want to cover is what RocksDB is exactly, what its future is and how it stacks up to PerconaFT. The most important news is Facebook is working on the core engine, which is used both by MyRocks and MongoRocks (you might have seen some of their talks on the engine). This means Percona can leverage some of the brilliant people working on RocksDB inside Facebook and focus instead on the API linking the engine into place, as well as optimizing how it uses the engine – rather than building the engine completely. Facebook is making some very significant bets on the backend use of RocksDB in several parts of the system, and potentially some user parts of the system (which have historically used InnoDB).

So what is RocksDB, and how does it work? Its core is an LSM system, which means it puts new data into the newest files as an append. Over time, the files get merged into five different levels (not covered here). As part of this process, when you have an update a new copy of the data is saved at the end of the latest file, and a memory table points a record to that location for the latest “version” of that data. In the background, the older records are marked for deletion by something called a “tombstone.” There is a background thread merging still-relevant data from old files into new files so that empty old files get unlinked from the filesystem.

This streamlines the process better than B-Tree’s constant rebalancing and empty blocks in files that need to be compacted or re-used over time. Being write-optimized means that, like PerconaFT previously, it will be faster for write-based operations than something like WiredTiger. (WiredTiger in some cases can be faster on reads, with MMAP being the fastest possible approach for reads.) This also means things like TTL’s can work pretty well in an LSM since all the items that were inserted in time order age out, and the engine can just delete the unneeded file. This solves some of the reasons people needed partitions to begin with, and it still allows sharding to work well.

We are also very excited about creating additional tools that let you query your backups in MongoRocks, as well as some of the very simple ways it will take binary-based backups quickly, consistently (even when sharded) and continually.

I hope this explains more about lock types and what their implications mean as a follow up to Peter’s blog post about the direction Percona is moving regarding PerconaFT and MongoRocks. If you want to ask any more questions, or would like another blog that covers some areas of the system more deeply, please let us know via this blog, email, twitter or even pigeon!

by David Murphy at June 16, 2016 01:51 PM

Jean-Jerome Schmidt

Planets9s - MySQL Performance Tuning, Upgrading to 5.7, Docker Containers & more

Welcome to this week’s Planets9s, covering all the latest resources and technologies we create around automation and management of open source database infrastructures.

Watch the replay of our MySQL Database Performance Tuning webinar

Thanks to everyone who participated in this week’s popular webinar on MySQL Database Performance Tuning, which discussed the following topics: database tuning overview, principles of the tuning process, tuning the Operating System configuration and the MySQL configuration, useful tools such as pt-summary and pt-mysql-summary and what to avoid when tuning OS and MySQL configuration. The replay of this webinar is now available online!

Watch the replay

Upgrading to MySQL 5.7 - The Database Upgrade Guide

During this week’s webinar on MySQL database performance tuning, a lot of participants expressed their interest in upgrading to MySQL 5.7. If that’s the case for you as well, then our new whitepaper on how to do so will hopefully be of help. Upgrading to a new major version involves risk, and it is important to plan the whole process carefully. In this whitepaper, we look at the important new changes in MySQL 5.7 and how to plan the test process, do a live system upgrade without downtime, avoid connection failures during slave restarts and switchover, or how to leverage ProxySQL to achieve a graceful upgrade process.

Download the whitepaper

Try NinesControl: deploy and monitor MySQL Galera clusters on Digital Ocean

We recently announced the public availability of NinesControl (beta), our database infrastructure management solution for the cloud. Designed with the needs of developers in mind, NinesControl enables users to easily deploy and monitor (MySQL) Galera clusters on DigitalOcean. Droplets are launched and managed using your own DigitalOcean account. We’d love to get your feedback on this new solution if you haven’t tested it yet, so please check it out and let us know what you think.

Try NinesControl

MySQL Docker containers: understanding the basics

Welcome to our new blog series - “MySQL on Docker” - in which we will touch upon swarms, shared volumes, data-only-containers, security and configuration management, multi-host networking, service discovery and implications on monitoring when we move from host-centric to role-centric services with shorter life cycles. In this initial post, we cover some basics around running MySQL in a container.

Read the blog

That’s it for this week! Feel free to share these resources with your colleagues and follow us on our social media channels.

Have a good end of the week,

Jean-Jérôme Schmidt
Planets9s Editor
Severalnines AB

by Severalnines at June 16, 2016 12:49 PM

June 15, 2016

Peter Zaitsev

Troubleshooting hardware resource usage webinar: Q & A


In this blog, I provide answers to the Q & A for the Troubleshooting hardware resource usage webinar.

First, I want to thank everybody who attended the May 26 webinar. The recording and slides for the webinar are available here. Below is the list of your questions that I wasn’t able to answer during the webinar, with responses:

Q: How did you find the memory IO LEAK?

A: Do you mean the replication bug I was talking about in the webinar? I wrote about this bug here. See also comments in the bug report itself.

Q: Do you have common formulas you use to tune MySQL?

A: There are best practices: relate thread concurrency to the number of CPU cores you have, set the InnoDB buffer pool size large enough to contain your working dataset (which is not always possible), and do not set the Query Cache size larger than 512MB (or, even better, turn it off) to avoid issues with the global lock that is set when it needs to be de-fragmented. I prefer not to call them “formulas,” because all options need to be adjusted to match the workload. If this weren’t the case, MySQL Server would configure itself automatically. There is also a separate webinar on configuration (Troubleshooting configuration issues) where I discuss these practices.
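
As a rough sketch only (the values below are placeholders and must be adjusted to your workload and hardware), those practices often translate into my.cnf settings along these lines:

[mysqld]
# Size the buffer pool to (ideally) hold the working dataset; 8G is a placeholder.
innodb_buffer_pool_size = 8G
# Relate thread concurrency to the number of CPU cores (0 means no limit).
innodb_thread_concurrency = 16
# Avoid the global lock taken when a large query cache is de-fragmented.
query_cache_type = 0
query_cache_size = 0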

Q: Slide 11: is this real time? Can we get this info for a job that has already finished?

A: Yes, this is real time. No, it is not possible to get this info for a thread that does not exist.

Q: Slide 11: what do negative numbers mean?

A: Numbers are taken from the CURRENT_NUMBER_OF_BYTES_USED field of the memory_summary_by_thread_by_event_name table in Performance Schema. These values, in turn, are calculated as (memory allocated by thread) – (memory freed by thread). Negative numbers here mean either a memory leak or incorrect calculation of memory used by the thread. I reported this behavior in the MySQL Bugs database. Please subscribe to the bug report and wait to see how InnoDB and Performance Schema developers answer.
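
For reference, a minimal query (assuming MySQL 5.7 with memory instrumentation enabled) that surfaces these per-thread, per-event totals, including any negative ones, could look like this:

-- Rows where more memory was freed than allocated (a leak or an accounting bug).
SELECT THREAD_ID,
       EVENT_NAME,
       CURRENT_NUMBER_OF_BYTES_USED
  FROM performance_schema.memory_summary_by_thread_by_event_name
 WHERE CURRENT_NUMBER_OF_BYTES_USED < 0
 ORDER BY CURRENT_NUMBER_OF_BYTES_USED
 LIMIT 10;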

Q: Are TokuDB memory usage stats also recorded in the sys.memory_by_thread_by_current_bytes table? Do we have to set something to enable this collection? I ran the query, but it shows 0 for everything.

A: TokuDB currently does not support Performance Schema, thus its memory statistics are not instrumented. See the user manual on how memory instrumentation works.

Q: With disk what we will check for disk I/O?

A: I don’t quite understand the question. Are you asking which disk we should check IO statistics on? Look at the datadir and at any other locations where MySQL stores data and log files (if you set custom locations).

Q: How can we put CPU in parallel to process multiple client requests? Put multiple requests In memory by parallel way. By defining transaction. Or there any query hints?

A: We cannot directly put the CPU in parallel, but we can affect it indirectly by tuning InnoDB thread-related options (innodb_thread_concurrency, innodb_read_io_threads, innodb_write_io_threads) and by using the thread pool.
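
A hedged my.cnf sketch of those options (values are placeholders; the IO thread settings are not dynamic and need a server restart):

[mysqld]
innodb_thread_concurrency = 16   # 0 means unlimited
innodb_read_io_threads    = 8    # background read IO threads (restart required)
innodb_write_io_threads   = 8    # background write IO threads (restart required)
# thread_handling = pool-of-threads   # thread pool, available in Percona Server/MariaDB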

Q: Is there any information in the Performance Schema that is not found in the sys schema?

A: Yes. For example, the sys schema does not have a view for statistics about prepared statements, while Performance Schema does: the sys schema takes its statement statistics from the digest tables, which make no sense for prepared statements.
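
For example, a quick look at prepared statement statistics straight from Performance Schema (MySQL 5.7) might look roughly like this:

-- These rows have no equivalent view in the sys schema.
SELECT STATEMENT_NAME,
       SQL_TEXT,
       COUNT_EXECUTE
  FROM performance_schema.prepared_statements_instances
 ORDER BY COUNT_EXECUTE DESC
 LIMIT 10;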

Q: What is your favorite tool to investigate a memory issue with a task/job that has already finished?

A: I don’t know that there is such a tool suitable for use in production. In a test environment, you can use valgrind or similar tools. You can also make core dumps of the mysqld process and investigate them after the issue is gone.
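
As a rough sketch of the core dump approach (paths are examples; gcore pauses mysqld while the dump is written, so be careful on a busy production server):

# Dump core from a running mysqld using gdb's gcore; the file is named <prefix>.<pid>.
gcore -o /tmp/mysqld.core $(pidof mysqld)
# Inspect it later, after the issue is gone (the binary path is an example).
gdb /usr/sbin/mysqld /tmp/mysqld.core.<pid>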

by Sveta Smirnova at June 15, 2016 10:12 PM

Installing MongoDB 3.2 and upgrading MongoDB replica set


In this post, we’ll examine a couple of ways of upgrading a MongoDB replica set.

With the release of MongoDB 3.2 comes a raft of new features and improvements. One of these enhancements is improved replica sets. From MongoDB: “A replica set in MongoDB is a group of mongod processes that maintain the same data set. Replica sets provide redundancy and high availability, and are the basis for all production deployments.”

Config servers are replica sets!

This is HUGE. It signals a significant advancement in backups, metadata stability and overall maturity. It is a very long-awaited feature. It means:

  • mongos processes can retry connections instead of erroring out
  • Unified and consistent backups!
  • Up to 50 secondaries
  • Removes the bugs seen when mongos instances are not near the config servers!

How do we activate all these new awesome features? Let’s do it!

Upgrading to 3.2

  • Replace binaries and restart one secondary at a time
  • Then primaries as well
  • Restart configs in reverse order
    • If configdb=con1, con2, con3
      • Restart con3, con2, and then finally con1 with 3.2
      • Do con1 as FAST as possible, while the balancer is also disabled
    • You no longer need to restart a mongos with --upgrade (as of 3.2)
    • Restart all mongos processes; this will reset ALL connections at some point (whether you do it all at once or space it out). A shell sketch for checking the balancer state and binary versions follows this list.
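
A minimal mongo shell sketch of those checks (the sh.* helpers are run through a mongos; db.version() is run on each restarted node):

sh.stopBalancer()       // disable the balancer before touching the config servers
sh.getBalancerState()   // should now return false
db.version()            // confirm each restarted node reports 3.2.x
sh.startBalancer()      // re-enable the balancer once everything is on 3.2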

Upgrading the replset to the new protocol

This is by far the easiest part of the upgrade, but DON’T do it until you know you’re stable on 3.2. Log into each primary and run:

>cfg = rs.conf();
 {
     "_id" : "r1",
     "version" : 2,
     "members" : [
         {
             "_id" : 0,
             "host" : "localhost:17001"
         },
         {
             "_id" : 1,
             "host" : "localhost:17002"
         },
         {
             "_id" : 2,
             "host" : "localhost:17003"
         }
     ]
}
>cfg.protocolVersion=1;
>rs.reconfig(cfg);
{
    "ok" : 1,
    "$gleStats" : {
        "lastOpTime" : Timestamp(1464947003, 1),
        "electionId" : ObjectId("7fffffff0000000000000018")
    }
}

Or:

>db.getSiblingDB('config').shards.forEach(function(shard){
    x = new Mongo(shard.host); /* Assumes no auth is needed */
    conf = x.getDB("local").system.replset.findOne();
    conf.protocolVersion = 1; conf.version++;
    x.getDB('admin').runCommand({ replSetReconfig: conf });
});

The quick upgrade scripts

  • Upgrade_all_to_3.2.4.sh

Does what it says: kills every process and relaunches it on the 3.2 binaries with no other changes.

  • Upgrade_replica_proto_version.sh

Simply runs the quick rs.reconfig() on each primary and adds the new settings that enable the new replication features.

Let’s upgrade the configs the right way!

This is not included as part of a normal upgrade, so only do it AFTER you’re stable on 3.2, and don’t do it before upgrading the protocolVersion we just talked about. (I mean it! Disregard this advice and your life will seriously not be awesome!)

Upgrading to a Config ReplicaSet (the official way)

  1. Run rs.initiate() on the first config server in the list (must be 3.2.4+)
    • Must be a fully configured document with configsvr: true defined (see the sketch after this list)
  2. Restart same config server adding
    • configsvrMode = sccc
    • replSet = <name used in rs.initiate()>
    • storageEngine = wiredTiger
  3. Start the new config servers for the other two nodes (should be a new dbpath and port)
  4. Add those nodes to the replSet and check their status
  5. Remove the second original config server from service
  6. Restart the 1st node you set “sccc” on without that setting
  7. At this point, the 1st node will transition to REMOVED if it is using MMAPv1.
  8. Restart a mongos with a new configdb line
    • --configdb <replSetName>/node1:port,node2:port,…
    • Only replset members should be listed
  9. Verify you can work and query through mongos
  10. Repeat on all mongos
  11. Remove the 1st node with rs.remove()
  12. Shut down the final original config server and re-enable the balancer
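
A minimal sketch of step 1 (the replica set name and host are hypothetical; the point is the shape of the document):

rs.initiate({
    _id: "csReplSet",                 // name you will later use in --configdb
    configsvr: true,                  // marks this as a config server replica set
    members: [ { _id: 0, host: "cfg1.example.com:27019" } ]
})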

There is also an easy way.

The easy way requires a small maintenance window, but it lets you simply restore a good backup and gives you a nice and simple rollback plan:

  1. Stop all mongos after backing up the config directory
  2. Run rs.initiate on first config server
  3. Stop the 2nd, then the 3rd, restarting them with an empty dbpath directory
  4. Check rs.status() now
  5. Stop the 1st config server and restart with an empty dbpath directory
  6. Check Status
  7. Restart all mongos, adding <replSetName>/ to the front of the configdb line.
  8. Done!

Oh look, there is a quick script we have made for you:

  • Upgrade_config_to_repliaset.sh
    • Kill all  config and mongos processes
    • Restart the first config server on non-standard port
    • Mongodump config database
    • Restart c1 as WiredTiger, clearing that data path
    • Import dump back into first config server
    • Restart on normal port
    • Initialize Replica Set
    • Restart second and third config server after clearing dbpath folder
    • After the initial sync, start all the mongos.
    • Done and script exits!

 

by David Murphy at June 15, 2016 08:55 PM

Jean-Jerome Schmidt

The Holy Grail Webinar Replay: MySQL Database Performance Tuning

Thanks to everyone who participated in this week’s popular webinar on MySQL Database Performance Tuning, which looked at answering the following questions:

  • You’re running MySQL as backend database, how do you tune it to make best use of the hardware?
  • How do you optimize the Operating System?
  • How do you best configure MySQL for a specific database workload?

The replay of this webinar is now available online:

Watch the replay

In this webinar, Krzysztof Książek, Senior Support Engineer at Severalnines, discussed some of the settings that are most often tweaked and which can bring you significant improvement in the performance of your MySQL database. Krzysztof also covered some of the variables that are frequently modified even though they should not be. Performance tuning is not easy, but you can go a surprisingly long way with a few basic guidelines.

Agenda

  • Database tuning - the what and why
  • Principles of the tuning process
  • Tuning the Operating System configuration
  • Tuning the MySQL configuration
  • Useful tools
    • pt-summary
    • pt-mysql-summary
  • What to avoid when tuning OS and MySQL configuration

Speaker

Krzysztof Książek, Senior Support Engineer at Severalnines, is a MySQL DBA with experience managing complex database environments for companies like Zendesk, Chegg, Pinterest and Flipboard. This webinar builds upon recent blog posts and related webinar series by Krzysztof on how to become a MySQL DBA.

To view all the blogs of the ‘Become a MySQL DBA’ series visit: http://www.severalnines.com/blog-categories/db-ops

by Severalnines at June 15, 2016 03:11 PM

June 14, 2016

Peter Zaitsev

MongoDB and non-existent collections


In this blog, I will discuss how I found that some of my basic SQL assumptions don’t hold true when dealing with MongoDB and non-existent collections.

Coming from a MySQL background, I have some assumptions about databases that don’t apply to MongoDB (or other kinds of databases that are neither SQL-based nor relationally-inspired).

An example of this is the assumption that data is organized in rows that are part of a table, with all tables having a strict schema (i.e., a single type of row). When working with MongoDB, this assumption must be transformed into the idea that data is organized in documents that are part of a collection and have a flexible schema so that different types of documents can reside in the same collection.
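
For instance, documents with different shapes can happily share a collection (the collection and field names here are made up):

> db.people.insert({ name: "Ada", languages: ["ocaml", "c"] })
> db.people.insert({ name: "Grace", rank: "RADM", retired: true })
> db.people.find().count()
2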

That’s an easy adjustment to make because a dynamic schema is one of the defining features of MongoDB. There are other less-obvious assumptions that need to be adjusted or redefined as you get familiar with a new product like MongoDB (for example, MySQL does not currently support built-in sharding, while MongoDB does).

There is a more fundamental kind of assumption, and by “fundamental” I mean an assumption that is deeply ingrained because you rely on it so often it’s automatic (i.e., unconscious). We’re usually hit by these when changing programming languages, especially in dynamic ones (“Will I be able to add a number to a string? If so, how will it behave?”). These can make it hard to adjust to a new database (or programming language, operating system, etc.) because we don’t consciously think about them and so we may forget to verify if they hold in the new system. This can happen in “real” life too: try going to a country where cars drive on the other side of the road from yours!

While working on a MongoDB benchmark recently, I was hit by one of these assumptions. I thought sharing my mistake may help others who are also coming to MongoDB from an SQL background.

One of my computing assumptions can be summarized as “reading from a non-existent source will fail with an error.”

Sure enough, it seems to be true for my operating system:

telecaster:~ fernandoipar$ cat notexists
cat: notexists: No such file or directory

And for MySQL:

mysql> select 1 from notexists;
ERROR 1146 (42S02): Table 'test.notexists' doesn't exist

But what happens in MongoDB?

> db.notexists.find()
> db.notexists.find().count()
0

No errors. You get no results. Interestingly, you can even count the number of documents in a cursor that is associated with a non-existent collection.

As I said, I hit this while working on a benchmark. How? I was comparing the throughput for different engines and various configurations, and after preparing the graphs they all showed the same behavior. While it’s not impossible for this to be accurate, it was very unlikely given what I was trying to measure. Some investigation led me to discover a mistake in the preparation phase of my benchmarks. To save time and to use the same data baseline for all tests, I was only running sysbench prepare once, backing up the data directory for each engine, and then restoring this backup before each experiment. The error was that I was restoring one subdirectory below MongoDB’s expected dbpath (i.e., to /data/db/db instead of /data/db), and so my scripts were reading from non-existent collections.

On a MySQL experiment, this would have immediately blown up in my face; with MongoDB, that is not the case.

On reflection, this behavior makes sense for MongoDB in that it is consistent with the write behavior. You don’t need to create a new collection, or even a new database. It’s enough that you write a document to it, and it gets created for you. If writing to a non-existent collection produces no errors, reading from one shouldn’t either.
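
A quick illustration of that symmetry (the collection name is arbitrary):

> // Writing to a non-existent collection implicitly creates it (and the database).
> db.stillnotthere.insert({ a: 1 })
WriteResult({ "nInserted" : 1 })
> db.stillnotthere.find().count()
1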

Still, sometimes an application needs to know if a collection exists. How can you do this? There are multiple ways, and I think the best approach is to verify existence during the application’s initialization stage. Here are a couple of examples:

> db.notexists.stats()
{ "ok" : 0, "errmsg" : "Collection [test.notexists] not found." }
> db.getCollectionNames().indexOf("notexists") >= 0
false

I hope you find this useful, and if you’ve been hit by similar problems (such as MongoDB and non-existent collections) due to broken assumptions when moving to MongoDB with an SQL background, I’d love to read about them in the comments!

by Fernando Ipar at June 14, 2016 10:29 PM