Daniël van EedenMySQL User Group NL Meetup May 31 at Snow (14.5.2013, 12:00 UTC)
The third meeting for the MySQL User Group NL will be hosted by Snow B.V. in the Snow office in Geldermalsen.
 
The Agenda:
  • Choosing the Best Sharding Policy - Doran Levari (ScaleBase, using a video link)
  • Performance Monitoring with Statsd and Graphite - Art van Scheppingen (Spil Games)
  • Basic MySQL performance tuning for sysadmins - Daniël van Eeden (Snow)
Please RSVP on the meetup.com page.

The user group now has more than 100 members!
Link
Peter ZaitsevIs Synchronous Replication right for your app? (14.5.2013, 10:00 UTC)

I talk with lot of people who are really interested in Percona XtraDB Cluster (PXC) and mostly they are interested in PXC as a high-availability solution.  But, what they tend not to think too much about is if moving from async to synchronous replication is right for their application or not.

Facts about Galera replication

There’s a lot of different facts about Galera that come into play here, and it isn’t always obvious how they will affect your database workload.  For example:

  • Transaction commit takes approximately the worst packet round trip time (RTT) between any two nodes in your cluster.
  • Transaction apply on slave nodes is still asynchronous from client commit (except on the original node where the transaction is committed)
  • Galera prevents writing conflicts to these pending transactions while they are inflight in the form of deadlock errors.  (This is actually a form of Eventual Consistency where the client is forced to correct the problem before it can commit.  It is NOT the typical form of Eventual Consistency, known as asynchronous repair, that most people think of).

Callaghan’s Law

But what does that all actually mean?  Well, at the Percona Live conference a few weeks ago I heard a great maxim that really helps encapsulate a lot of this information and puts it into context with your application workload:

[In a Galera cluster] a given row can’t be modified more than once per RTT

This was attributed to Mark Callaghan from Facebook by Alexey Yurchenko from Codership at his conference talk.  Henceforth this will be known as “Callaghan’s law” in Galera circles forever, though Mark didn’t immediately recall saying it.

Applied to a standalone Innodb instance

Let’s break it down a bit.  Our unit of locking in Innodb is a single row (well, the PRIMARY KEY index entry for that row).  This means typically on a single Innodb node we can have all sorts modifications floating around as long as they don’t touch the same row.  Row locks are held for modifications until the transaction commits and that takes an fsync to the redo log by default, so applying Callaghan’s law to single-server Innodb, we’d get:

[On a single node Innodb server] a given row can’t be modified more than the time to fsync

You can obviously relax that by simply not fsyncing every transaction (innodb_flush_log_at_trx_commit != 1), or work around it with by fsyncing to memory (Battery or capacitor-backed write cache), etc., but the principle is basically the same.  If we want this transaction to persist after a crash, it has to get to disk.

This has no effect on standard MySQL replication from this instance, since MySQL replication is asynchronous.

What about semi-sync MySQL replication?

It’s actually much worse than Galera.  As I illustrated in a blog post last year, semi-sync must serialize all transactions and wait for them one at a time.  So, Callaghan’s law applied to semi-sync is:

[On a semi-sync replication master] you can’t commit (at all) more than once per RTT. 

Applied to a Galera cluster

In the cluster we’re protecting the data as well, though not by ensuring it goes to disk (though you can do that).  We protect the data by ensuring it gets to every node in the cluster.

But why every node and not just a quorum?  Well, it turns out transaction ordering really, really matters (really!).  By enforcing replication to all nodes, we can (simultaneously) establish global ordering for the transaction, so by the time the original node gets acknowledgement of the transaction back from all the other nodes, a GTID will also (by design) be established.  We’ll never end up with non-deterministic ordering of transactions as a result.

So this brings us back to Callaghan’s law for Galera.  We must have group communication to r

Truncated by Planet PHP, read more at the original (another 7525 bytes)

Link
Peter ZaitsevWebinar: MySQL 5.6 Performance Schema (13.5.2013, 13:40 UTC)

Using MySQL 5.6 Performance Schema to Troubleshoot Typical Workload BottlenecksThis Wednesday, May 15 at 10 a.m. Pacific, I’ll be leading  a Webinar titled, “Using MySQL 5.6 Performance Schema to Troubleshoot Typical Workload Bottlenecks.

In this Webinar I will offer an overview of Performance Schema, focusing on new features that have been added in MySQL 5.6, go over the configuration and spend most time showing how you can use the wealth of information Performance Schema gathers to understand some of the typical performance bottlenecks.

 

Other areas of focus include:

  • Bottlenecks with Disk IO
  • Problems with excessive temporary tables and external sorts
  • Excessive internal mutex contention
  • Slow queries due to waits on InnoDB locks and Meta Data locks

Interested ? Sign up today!

The post Webinar: MySQL 5.6 Performance Schema appeared first on MySQL Performance Blog.

Link
Open QueryLEVENSHTEIN MySQL stored function (13.5.2013, 00:10 UTC)

At Open Query we steer clear of code development for clients. We sometimes advise on code, but as a company we don’t want to be in the programmer role. Naturally we do write scripts and other necessities to do our job.

Assisting with an Open Source project, I encountered three old UDFs. User Defined Functions are native functions that are compiled and then loaded by the server similar to a plugin. As with plugins, compiling can be a pest as it requires some of the server MySQL header files and matching build switches to the server it’s going to be loaded in. Consequentially, binaries cannot be considered safely portable and that means that you don’t really want to have a project rely on UDFs as it can hinder adoption quite severely.

Since MySQL 5.0 we can also use SQL stored functions and procedures. Slower, of course, but functional and portable. By the way, there’s one thing you can do with UDFs that you (at least currently) can’t do with stored functions, and that’s create a new aggregate function (like SUM or COUNT).

The other two functions were very specific to the app, but the one was a basic levenshtein implementation. A quick google showed that there were existing SQL and even MySQL stored function implementations, most derived from a single origin which was actually broken (and the link is now dead, as well). I grabbed one that appeared functional, and reformatted it for readability then cleaned it up a bit as it was doing some things in a convoluted way. Given that the stored function is going to be much slower than a native function anyway, doing things inefficiently inside loops can really hurt.

The result is below. Feel free to use, and if you spot a bug or can improve the code further, please let me know!
Given the speed issue, I’m actually thinking this should perhaps be added as a native function in MariaDB. What do you think?

-- core levenshtein function adapted from
-- function by Jason Rust (http://sushiduy.plesk3.freepgs.com/levenshtein.sql)
-- originally from http://codejanitor.com/wp/2007/02/10/levenshtein-distance-as-a-mysql-stored-function/
-- rewritten by Arjen Lentz for utf8, code/logic cleanup and removing HEX()/UNHEX() in favour of ORD()/CHAR()
-- Levenshtein reference: http://en.wikipedia.org/wiki/Levenshtein_distance

-- Arjen note: because the levenshtein value is encoded in a byte array, distance cannot exceed 255;
-- thus the maximum string length this implementation can handle is also limited to 255 characters.

DELIMITER $$
DROP FUNCTION IF EXISTS LEVENSHTEIN $$
CREATE FUNCTION LEVENSHTEIN(s1 VARCHAR(255) CHARACTER SET utf8, s2 VARCHAR(255) CHARACTER SET utf8)
  RETURNS INT
  DETERMINISTIC
  BEGIN
    DECLARE s1_len, s2_len, i, j, c, c_temp, cost INT;
    DECLARE s1_char CHAR CHARACTER SET utf8;
    -- max strlen=255 for this function
    DECLARE cv0, cv1 VARBINARY(256);

    SET s1_len = CHAR_LENGTH(s1),
        s2_len = CHAR_LENGTH(s2),
        cv1 = 0x00,
        j = 1,
        i = 1,
        c = 0;

    IF (s1 = s2) THEN
      RETURN (0);
    ELSEIF (s1_len = 0) THEN
      RETURN (s2_len);
    ELSEIF (s2_len = 0) THEN
      RETURN (s1_len);
    END IF;

    WHILE (j <= s2_len) DO
      SET cv1 = CONCAT(cv1, CHAR(j)),
          j = j + 1;
    END WHILE;

    WHILE (i <= s1_len) DO
      SET s1_char = SUBSTRING(s1, i, 1),
          c = i,
          cv0 = CHAR(i),
          j = 1;

      WHILE (j <= s2_len) DO
        SET c = c + 1,
            cost = IF(s1_char = SUBSTRING(s2, j, 1), 0, 1);

        SET c_temp = ORD(SUBSTRING(cv1, j, 1)) + cost;
        IF (c > c_temp) THEN
          SET c = c_temp;
        END IF;

        SET c_temp = ORD(SUBSTRING(cv1, j+1, 1)) + 1;
        IF (c > c_temp) THEN
          SET c = c_temp;
        END IF;

        SET cv0 = CONCAT(cv0, CHAR(c)),
            j = j + 1;
      END WHILE;

      SET cv1 = cv0,
          i = i + 1;
    END WHILE;

    RETURN (c);
  END $$

DELIMITER ;
Link
Peter ZaitsevOpen Source, the MySQL market (and TokuDB in particular) (10.5.2013, 14:31 UTC)

Open Source & the MySQL marketI was reviewing the Percona Live sponsors list the other day and pondering the potential success stories associated with this product or that one…. and as I was preparing to put more thought on the topic, a PlanetMySQL post caught my eye. It was penned by Mike Hogan and titled, “Thoughts on Xeround and Free!

For some reason the author of that post makes a connection between a free account in a cloud-based service and Open Source software. I think it’s an incorrect analogy, as they are two totally different things. A “free account” in this case is really just a marketing tool. Well, I admit there are companies that also use the “Open Source” mark as a marketing tool, too – we often can see this in products based on Open Core models. But in my opinion Open Core is not Open Source, and the Open Source model is something different.

Now let me state that I am not an Open Source fanatic and I totally accept different models.

Open Source should be considered as a way of providing additional value to customers of  your product. Namely, if your product is Open Source, you provide the following benefits to your customers:

  • No vendor lock-in. And this is significant. Customers often choose products that allow them to avoid being locked in to one product. For example, I believe there is no way a closed-source software product will ever be deployed in a Facebook data center.
  • No service provider lock-in. Customers should be able to choose who provides services for your product.
  • Independent expertise. Customers like to get trustworthy information that does not come from the vendor of the product.
  • A wide user base and community around your product.
  • Growing public knowledge: discussions on blogs, forums, social networks.

Well, of course, some or all of these values are achievable with proprietary products, but for me Open Source is the easiest path to all of them.

Now, I understand that you as a vendor may not like some of above. I expect that some vendors tend to love “vendor lock-in,” and it also helps your bottom line if you are the one and only service provider for a given product. But the question here is: Do you care more about your customers – or do you and your investors come first?

And it is fine (it really is) with me if you decide to take the proprietary license path, but in that case you should expect that users will choose an alternative Open Source solution, even if this solution is less functional and somewhat lower quality. And this choice is not because it is “free” as in “a free cloud database account,” rather it’s because it is “free” as in proving the freedom to choose vendors, service providers, expertise, etc.

TokuDB is a good example. Even with a great technology, before becoming Open Source they had “no fewer than 12 customers” (source: Forbes). And I name that “struggling.” I think it was the same for Schooner and Kickfire, two companies that based their products on a closed-source version of MySQL… and you know their fate – sold for assets with a loss for investors as their products did not reach a sustainable number of customers.

As for TokuDB: after its Open Source announcement, as anecdotal evidence, their website got so much traffic all at once that it went down, and within two weeks they have had more interest from the community than during all previous years. I expect that TokuDB will see a tenfold increase in users during the next year.

However, becoming Open Source, by itself, is by no means enough to have a successful business. There are two examples: First, the PBXT storage engine from PrimeBase – even with an open source engine, the company eventually could not fund further development, and this engine is pretty much dead today. Second is a recent example from the Monty Program – they have the Open Source product MariaDB with raising popularity. But as a business they failed to attract paying customers and had to merge with SkySQL.  I name that a business failure (even though it is widely publicized as a “success” by their marketing teams) – it is quite hidden among all buzz, but you can find some grains of truth in the article

Truncated by Planet PHP, read more at the original (another 1624 bytes)

Link
SkySQL TeamWhere to find the reunited Original MySQL Gang in the next few weeks! (9.5.2013, 18:56 UTC)

SkySQL & Monty Program’s MariaDB Team coming to a city near you!

read more

Link
Cédric PEINTREYour databases in a glance with Glimpsee (sign up now!) (9.5.2013, 17:53 UTC)

glimpsee_plus

During the last months we’ve worked hard on a new and exiting project!

“We’re making it easier

to show your databases in a glance”

 

I’m very proud to reveal the project for the first time because I am convinced that Glimpsee will change the way you “watch” the status of your databases…

Glimpsee is not another monitoring tool, it is the easiest way we’ve found to make DBA’s life easier.

I can’t say more about Glimpsee at this point but keep in mind these few words: simple, simple and… simple!

An alpha release of Glimpsee is already running for a lucky few. Register now to be the firsts to discover Glimpsee!
 

www.glimpsee.net

 

<script type="text/javascript">// !function(d,s,id){var js,fjs=d.getElementsByTagName(s)[0],p=/^http:/.test(d.location)?'http':'https';if(!d.getElementById(id)){js=d.createElement(s);js.id=id;js.src=p+'://platform.twitter.com/widgets.js';fjs.parentNode.insertBefore(js,fjs);}}(document, 'script', 'twitter-wjs'); //

Link
Peter ZaitsevHow to create a new (or repair a broken) GTID based slave with Percona XtraBackup (9.5.2013, 10:00 UTC)

Percona XtraBackup 2.0.7 has been published with support for GTID based replication. As promised, here is the step-by-step guide on how to create a new GTID based slave (or repair a broken one) using XtraBackup. The process is pretty straightforward.

1- Take a backup from any server on the replication environment, master or slave:

# innobackupex /destination/

In the destination folder there will be a file with the name xtrabackup_binlog_info:

# cat xtrabackup_binlog_info
mysql-bin.000002        1232            c777888a-b6df-11e2-a604-080027635ef5:1-4

Now it contains both, binary log coordinates and GTID information.

That information is also printed by innobackupex after backup is taken:

innobackupex: MySQL binlog position: filename 'mysql-bin.000002', position 1232, gtid_executed c777888a-b6df-11e2-a604-080027635ef5:1-4

2- Apply the logs to the backup:

# innobackupex --apply-log /destination/2013-05-07_08-33-33/

3- Move the backup to the destination server and put the content on the mysql’s datadir. Follow the usual restore procedure, for example remember to change the permissions to mysql:mysql.

4- Start the new slave from that GTID position:

slave1 > SET GLOBAL gtid_purged="c777888a-b6df-11e2-a604-080027635ef5:1-4";
slave1 > CHANGE MASTER TO MASTER_HOST="10.0.1.1", master_user="msandbox", master_password="msandbox", MASTER_AUTO_POSITION = 1;

5- Check the replication status:

slave1 > show slave status\G
[...]
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
[...]
           Retrieved_Gtid_Set: c777888a-b6df-11e2-a604-080027635ef5:5
            Executed_Gtid_Set: c777888a-b6df-11e2-a604-080027635ef5:1-5

We can see that the slave has retrieved a new transaction with number 5, so transactions from 1 to 5 are already on this slave.

That’s all, we have created a new slave in our GTID based replication environment.

The post How to create a new (or repair a broken) GTID based slave with Percona XtraBackup appeared first on MySQL Performance Blog.

Link
Peter ZaitsevMySQL and Percona Server in LinkBench benchmark (8.5.2013, 13:48 UTC)

Around month ago Facebook has announced the Linkbench benchmark that models the social graph OLTP workload. Sources, along with a very nice description of how to setup and run this benchmark, can be found here. We decided to run this benchmark for MySQL Server 5.5.30, 5.6.11 and Percona Server 5.5.30 and check how these servers will handle such OLTP workloads in the CPU and IO-bound cases. For this test we used a PowerEdge R720 box with a fast PCI-e flash card as storage.

By default linkbench dataset has 10M ids(after load of data size of datadir ~10GB). We used this dataset to check server behavior when data fully fits buffer pool(size of buffer pool is 30GB). So basically this is a CPU-bound case.

linkbench.1x.v3

As you can see there is a very slight difference between servers at 64 threads but a much more notable drop for 5.6.11 at 128 threads.

Then we loaded 10x dataset – 100M ids (size of datadir ~100GB), size of the buffer pool is the same – 30GB. So now we explore the IO-bound scenario.

linkbench.10x.v3

Percona Server 5.5 outperforms MySQL in about 2x times.
Both MySQL 5.5.30 and MySQL 5.6.11 demonstrate notable drops in performance. What can be the reason for that?
Below is a chart with top mutexes for each server at 64 threads:

linkbench.10x.mutexes

For MySQL 5.5.30 top mutex is &doublewrite->mutex: trx0sys.c:196. And most likely this symptom is related to BUG#67808.

For MySQL 5.6.11 top mutexes is &buf_pool->mutex,&new_index->lock. I profiled 5.6.11 in this IO bound scenario with the perf – see profile below:

# Overhead  Samples    Command        Shared Object
# ........ ..........  .......  ...................  ..................................................................................................................................
#
    35.85%   17738833   mysqld  mysqld               [.] buf_LRU_free_block(buf_page_t*, unsigned long)
             |
             --- buf_LRU_free_block(buf_page_t*, unsigned long)
                |
                |--99.94%-- buf_LRU_scan_and_free_block(buf_pool_t*, unsigned long)
                |          buf_LRU_get_free_block(buf_pool_t*)
                |          |
                |          |--94.84%-- buf_page_init_for_read(dberr_t*, unsigned long, unsigned long, unsigned long, unsigned long, long, unsigned long)
...
    31.41%   15534570   mysqld  mysqld               [.] rw_lock_x_lock_func(rw_lock_t*, unsigned long, char const*, unsigned long)
             |
             --- rw_lock_x_lock_func(rw_lock_t*, unsigned long, char const*, unsigned long)
                |
                |--99.14%-- buf_LRU_free_block(buf_page_t*, unsigned long)
                |          |
                |          |--100.00%-- buf_LRU_scan_and_free_block(buf_pool_t*, unsigned long)
                |          |          buf_LRU_get_free_block(buf_pool_t*)
     2.53%    1338484   mysqld  mysqld               [.] ut_delay(unsigned long)

So basically most of the time 5.6.11 spent in LRU_scan. I tried to increase innodb_lru_scan_depth variable to 8k,16k,32k but that had no notably impact on result. In the best possible combination I got ~15k operations per second for MySQL 5.6.11.

Conclusion:

In CPU-bounds case MySQL performs quite well, though we can see small performance drop in MySQL 5.6.
In IO-bound cases MySQL still has performance issues around mutexes and Percona Server shows much better results

Configurations and how to run benchmark:

[mysqld]
user=root
port=3306

innodb_buffer_pool_size = 30G
innodb_flush_method = O_DIRECT
innodb_log_file_size = 2000M
inno

Truncated by Planet PHP, read more at the original (another 1675 bytes)

Link
Peter ZaitsevPercona XtraBackup 2.1.0 ‘release candidate’ for MySQL available for download (7.5.2013, 22:14 UTC)

Percona XtraBackup for MySQLPercona is glad to announce the release of Percona XtraBackup 2.1.0-rc1 on May 7, 2013. Downloads are available from our download site here. For this RC release, we will not be making APT and YUM repositories available, just base deb and RPM packages

This is an Release Candidate quality release and is not intended for production. If you want a high-quality, generally available release, the current stable version should be used (currently 2.0.7 in the 2.0 series at the time of writing).

New Features:

  • This version of Percona XtraBackup has implemented full support for new MySQL 5.6 features (GTID, remote/transportable tablespaces, separate undo tablespace, 5.6-style buffer pool dump files).
  • Percona XtraBackup has implemented support for the InnoDB Buffer Pool Preloading introduced in MySQL 5.6. Starting with MySQL 5.6 buffer pool dumps can be produced and loaded for faster server warmup after the start. This feature is similar to the Dump/Restore of the Buffer Pool in Percona Server. MySQL 5.6 buffer pool dump is copied into backup directory during the backup stage. During the copy back stage (restore) it is copied back to data directory. After the backup is restored buffer pool dump can be loaded by the server either automatically on startup or on demand.
  • Time interval between checks done by log copying thread is now configurable by innobackupex –log-copy-interval. Making the interval configurable allows to reduce the time between checks which can prevent XtraBackup failures that are caused by the log records in the transactional log being overwritten before they are copied by the log copying thread.
  • Percona XtraBackup now stores the GTID value in the xtrabackup_binlog_info when doing the backup of MySQL and Percona Server 5.6 with the GTID mode enabled. Example of how this information can be used to create/restore a slave can be found in this blogpost.
  • Percona XtraBackup option xtrabackup –export now supports transportable tablespaces introduced in MySQL 5.6. This option can be used to produce 5.6-style metadata files, that can be imported by ALTER TABLE IMPORT TABLESPACE on MySQL and Percona Server 5.6 as described in Exporting and Importing Tables guide.

Bugs Fixed:

  • Percona XtraBackup would crash when preparing the 5.6 backup with partitioned tables. Bug fixed #1169169.
  • Tables that were dropped between taking a full backup and an incremental one were present in the full backup directory, and were not removed when incremental backups has been merged. Fixed by removing files corresponding to tables that are missing in the incremental backup directory. Bug fixed #856400.
  • Percona XtraBackup would leave stale xtrabackup_tmp* files in the datadir after applying incremental backups. Bug fixed #1079135.
  • If there are thousands of tables and slow IO then XtraBackup can spend a lot of time opening all the tablespaces. Optimization has been implemented and XtraBackup now avoids loading non-relevant tablespaces when partial backup is being taken which speeds up the backup process. Bug fixed #1130145.
  • Due to different implementation in MyS

Truncated by Planet PHP, read more at the original (another 1954 bytes)

Link
LinksRSS 0.92   RDF 1.
Atom Feed