Daniël van Eeden: When your query is blocked, but there is no blocking query - Part 2 (25.10.2014, 13:05 UTC)
In my previous post I wrote about a transaction which blocked other transactions without doing anything, and about finding data from the blocking transaction using SYS and performance_schema.

But what are the possible solutions?

The first solution is to (automatically) kill the blocking transactions. Long-running transactions can also stall purging in InnoDB. See this blog post by Mark Leith about a possible solution.
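
A rough sketch of how such candidates could be found (my own illustration, not from Mark's post; the 60-second threshold is arbitrary) -- each returned trx_mysql_thread_id can then be fed to KILL:

mysql> select trx_mysql_thread_id, trx_started
    ->   from information_schema.innodb_trx
    ->  where trx_state = 'RUNNING'
    ->    and trx_query is null
    ->    and trx_started < now() - interval 60 second;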

The second solution would be to make the application end its transactions sooner and/or commit more often. Depending on your application this might or might not work. I consider this the best solution.

The third solution is to change the transaction isolation level of the blocking transaction to READ COMMITTED.

mysql [test] > set transaction isolation level read committed;
Query OK, 0 rows affected (0.00 sec)

mysql [test] > start transaction;
Query OK, 0 rows affected (0.00 sec)

mysql [test] > insert into t2 select * from t1;
Query OK, 6 rows affected (0.00 sec)
Records: 6 Duplicates: 0 Warnings: 0

This works. The transaction won't need to lock as much as it did with REPEATABLE READ, and the INSERT/DELETE on t1 from another transaction won't be blocked. Be careful to set binlog_format to ROW before trying this, as READ COMMITTED requires it to work.

mysql [test] > set session binlog_format=STATEMENT;
ERROR 1559 (HY000): Cannot switch out of the row-based binary log
format when the session has open temporary tables
mysql [test] > drop table t2;
Query OK, 0 rows affected (0.01 sec)

mysql [test] > set session binlog_format=STATEMENT;
Query OK, 0 rows affected (0.00 sec)

mysql [test] > create temporary table t2 (id int);
Query OK, 0 rows affected (0.01 sec)

mysql [test] > set transaction isolation level read committed;
Query OK, 0 rows affected (0.00 sec)

mysql [test] > insert into t2 select * from t1;
ERROR 1665 (HY000): Cannot execute statement: impossible to write to
binary log since BINLOG_FORMAT = STATEMENT and at least one table
uses a storage engine limited to row-based logging. InnoDB is limited
to row-logging when transaction isolation level is READ COMMITTED or
READ UNCOMMITTED.

Setting the transaction isolation level can also be done in my.cnf.
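
For example, a minimal fragment (covering the ROW requirement shown above):

[mysqld]
binlog_format         = ROW
transaction-isolation = READ-COMMITTED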
Link
Daniël van Eeden: When your query is blocked, but there is no blocking query (25.10.2014, 10:22 UTC)
When I queried information_schema.innodb_trx (introduced in 5.1 with the InnoDB Plugin) I noticed there were a few transactions in LOCK WAIT state.

Example:
mysql [information_schema] > select trx_id,trx_state 
    -> from information_schema.innodb_trx;
+--------+-----------+
| trx_id | trx_state |
+--------+-----------+
|   7974 | LOCK WAIT |
|   7962 | RUNNING   |
+--------+-----------+
2 rows in set (0.00 sec)

Then I wrote a query joining a few information_schema and performance_schema tables to find out which query was blocking my transactions. It turned out that the blocking transaction had trx_query=NONE. So my query was blocked by a transaction doing nothing. That's not really helpful.
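
The join can be sketched with the information_schema side alone (a reconstruction, not necessarily the exact query I used; the tables involved are shown below):

mysql [information_schema] > select w.requesting_trx_id, r.trx_query waiting_query,
    ->        w.blocking_trx_id, b.trx_query blocking_query
    ->   from innodb_lock_waits w
    ->   join innodb_trx r on r.trx_id = w.requesting_trx_id
    ->   join innodb_trx b on b.trx_id = w.blocking_trx_id;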

Let's try to recreate the situation and see exactly what happened. I'll use two sessions for the transactions and a third to monitor the server.

The first session:
mysql [test] > create table t1 (id int);
Query OK, 0 rows affected (0.01 sec)

mysql [test] > insert into t1 values(1),(2),(3);
Query OK, 3 rows affected (0.00 sec)
Records: 3 Duplicates: 0 Warnings: 0

mysql [test] > create temporary table t2 (id int);
Query OK, 0 rows affected (0.01 sec)

mysql [test] > start transaction;
Query OK, 0 rows affected (0.00 sec)

mysql [test] > insert into t2 select * from t1;
Query OK, 3 rows affected (0.00 sec)
Records: 3 Duplicates: 0 Warnings: 0

The second session:
mysql [test] > insert into t1 values(4);
Now the second session is waiting.

Let's start a third session to see what's happening. If the second session hits a wait timeout, you can just restart it.
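
To buy yourself more time for inspection you can first raise the lock wait timeout in the second session (the default innodb_lock_wait_timeout is 50 seconds):

mysql [test] > set session innodb_lock_wait_timeout = 3600;
Query OK, 0 rows affected (0.00 sec)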

mysql [information_schema] > select * from innodb_trx\G
*************************** 1. row ***************************
trx_id: 8022
trx_state: LOCK WAIT
trx_started: 2014-10-25 11:38:25
trx_requested_lock_id: 8022:18:3:1
trx_wait_started: 2014-10-25 11:38:25
trx_weight: 2
trx_mysql_thread_id: 19
trx_query: insert into t1 values(4)
trx_operation_state: inserting
trx_tables_in_use: 1
trx_tables_locked: 1
trx_lock_structs: 2
trx_lock_memory_bytes: 360
trx_rows_locked: 1
trx_rows_modified: 0
trx_concurrency_tickets: 0
trx_isolation_level: REPEATABLE READ
trx_unique_checks: 1
trx_foreign_key_checks: 1
trx_last_foreign_key_error: NULL
trx_adaptive_hash_latched: 0
trx_adaptive_hash_timeout: 10000
trx_is_read_only: 0
trx_autocommit_non_locking: 0
*************************** 2. row ***************************
trx_id: 8020
trx_state: RUNNING
trx_started: 2014-10-25 11:35:07
trx_requested_lock_id: NULL
trx_wait_started: NULL
trx_weight: 6
trx_mysql_thread_id: 1
trx_query: NULL
trx_operation_state: NULL
trx_tables_in_use: 0
trx_tables_locked: 0
trx_lock_structs: 3
trx_lock_memory_bytes: 360
trx_rows_locked: 4
trx_rows_modified: 3
trx_concurrency_tickets: 0
trx_isolation_level: REPEATABLE READ
trx_unique_checks: 1
trx_foreign_key_checks: 1
trx_last_foreign_key_error: NULL
trx_adaptive_hash_latched: 0
trx_adaptive_hash_timeout: 10000
trx_is_read_only: 0
trx_autocommit_non_locking: 0
2 rows in set (0.00 sec)

mysql [information_schema] > select * from innodb_lock_waits\G
*************************** 1. row ***************************
requesting_trx_id: 8022
requested_lock_id: 8022:18:3:1
blocking_trx_id: 8020
blocking_lock_id: 8020:18:3:1
1 row in set (0.00 sec)

mysql [information_schema] > select * from innodb_locks\G
*************************** 1. row ***************************
lock_id: 8022:18:3:1
lock_trx_id: 8022
lock_mode: X
lock_type: RECORD
lock_table: `test`.`t1`
lock_index: GEN_CLUST_INDEX
lock_space: 18
lock_page: 3
lock_rec: 1
lock_data: supremum pseudo-record
*************************** 2. row ***************************
lock_id: 8020:18:3:1
lock_trx_id: 8020
lock_mode: S
lock_type: RECORD
lock_table: `test`.`t1`
lock_index: GEN_CLUST_INDEX
lock_space: 18
lock_page: 3
lock_rec: 1
lock_data: supremum pseudo-record
2 rows in set (0.00 sec)

Truncated by Planet PHP, read more at the original (another 4370 bytes)

Link
Peter Zaitsev: MySQL 5.6 Full Text Search Throwdown: Webinar Q&A (23.10.2014, 13:01 UTC)

Yesterday (Oct. 22) I gave a presentation titled “MySQL 5.6 Full Text Search Throwdown.” If you missed it, you can still register to view the recording and my slides.

Thanks to everyone who attended, and especially to folks who asked the great questions. I answered as many as we had time for during the session, but here are all the questions with my complete answers:

Q: Does Solr automatically maintain its index against MySQL? Do you have to hit the Solr server with a specific query to keep the index ‘warm’?

There are several strategies for updating a Solr index. In my examples, I showed only a “full import” which is what you would do to create an index by reading all the source data.

You can also perform a “delta import” periodically, to add a subset of the source data to an existing index, for example to add data that has changed since the last time you updated the Solr index. See the documentation for Using delta-import command and also Using query attribute for both full and delta import.

The delta import would typically be something you would invoke from a cron job, perhaps every hour. But that means that a Solr search might not find data that has changed in MySQL more recently than the last delta import. Depending on the application, a delay of up to 60 minutes might be acceptable, or else maybe you have strict requirements that all data must be in sync instantly.

You could also update the Solr index one document at a time using its Java API or web service API. This would require you to write code in your application. Every time you INSERT or UPDATE or DELETE a document in MySQL that you want to be kept in sync with the Solr index, you would write more code to do a similar operation in the Solr index. That way every single text change would be searchable nearly immediately.

Q: Did you test Elasticsearch? (several people asked about this)

I did not test Elasticsearch, but according to their technology overview: “Elasticsearch uses Lucene under the covers.” So I expect that this part of Elasticsearch performs similarly to what I saw from Apache Solr, which also uses Lucene internally.

Q: One question I could not understand: how to maintain the Sphinx index in sync with the data? Can it be done in real time?

The Sphinx Search index does not automatically refresh as your MySQL data changes. You would have to write application code to invoke the indexing process. There’s a page in the Sphinx Search documentation about Live Index Updates, that gives an overview of the two methods, and links to further reading.

This is definitely the most inconvenient aspect of Sphinx Search. Queries are very fast, but it’s expensive to do incremental updates to an index. So it’s ideal for indexing an archive of text that doesn’t change very frequently, but not as easy to use it for indexing rapidly-changing content.

Q: I have over 800,000 PDF documents to index (in several languages), any recommendations?

I said during the webinar that I recalled there exist tools to extract searchable text from a PDF file. I found that one such project, Apache PDFBox, includes this capability, and they have a page describing a helper class for doing PDF parsing and extraction combined with Lucene indexing. I haven’t used it myself, so I can’t comment on its performance for indexing 800,000 PDF documents, but it seems like you could write a Java program to iterate over your collection of PDFs and index them using this class.

Q: What is your suggestion to use Sphinx

Truncated by Planet PHP, read more at the original (another 4344 bytes)

Link
Jean-Jerome Schmidt: Resources for Database Clusters: 9 DevOps Tips, ClusterControl 1.2.8 Release, HAProxy Webinar Replay & More (23.10.2014, 11:24 UTC)
October 23, 2014
By Severalnines

Check Out Our Latest Technical Resources for MySQL, MariaDB & MongoDB Clusters


Here is a summary of resources & tools that we’ve made available to you in the past weeks. If you have any questions on these, feel free to contact us!


New Technical Webinar

If you are in DevOps, you will know that deploying and managing databases has its challenges! Monitoring, managing schema changes and pushing them into production, performance optimizations, configurations, version upgrades, backups; these are all aspects to consider – preferably before going live!

In this new webinar, we will walk you through 9 key tips to consider before specifically taking Galera Cluster into production. So if you are in devops, then this webinar is for you ;-)

Read more details and register here!


Technical Webinar - Replay

If you missed this webinar with our guest speaker from HAProxy Technologies (or would just like to view it again), it’s online on demand.

read more

Link
Shlomi Noach: Refactoring replication topology with Pseudo GTID (23.10.2014, 10:37 UTC)

This post describes in detail the method of using Pseudo GTID to achieve unplanned replication topology changes, i.e. connecting two arbitrary slaves, or recovering from a master failure even as all its slaves are hanging in different positions.

Please read Pseudo GTID and Pseudo GTID, RBR as an introduction.

Consider the following case: the master dies unexpectedly, and its three slaves are all hanging, not necessarily at the same binary log file/position (the network broke down while some slaves managed to salvage more entries into their relay logs than others).

[Image: orchestrator-failed-master]

(Did you notice the "Candidate for master" message? To be discussed shortly)

GTID

With GTID each transaction (and entry in the binary log) is associated with a unique mark -- the Global Transaction ID. Just pick the slave with the most advanced GTID to be the next master, issue CHANGE MASTER TO MASTER_HOST='chosen_slave' on the other slaves, and everything magically works. A slave knows which GTID it has already processed, and can look up that entry in its master's binary logs, resuming replication at the one that follows.
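
As a sketch, with MySQL 5.6 GTID the repointing boils down to (host name illustrative):

mysql> CHANGE MASTER TO MASTER_HOST='chosen_slave', MASTER_AUTO_POSITION=1;
mysql> START SLAVE;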

How does that work? The master's binary logs are searched for that GTID entry. I'm not sure how brute-force this is: I'm aware of a subtlety which requires a brute-force scan of all binary logs, but I don't actually know whether it's always like that.

Pseudo GTID

We can mimic the above, but our solution can't be as fine-grained. With the injection of Pseudo GTID we mark the binary log with unique entries. But instead of having a unique identifier for every entry, we have a unique identifier for every second, 10 seconds, or what have you, with otherwise normal, non-unique entries in between our Pseudo GTID entries.

Recognizing which slave is more up to date

Given two slaves, which is more up to date?

  • If both replicate(d) from the same master, a SHOW SLAVE STATUS comparison answers it (safe method: wait till the SQL thread catches up with the broken IO thread, then compare relay_master_log_file, exec_master_log_pos on both machines; see the sketch after this list). This is the method by which the above "Candidate for master" message is produced.
  • If one is/was a descendant of the other, then obviously it is less advanced than or equal to its ancestor.
  • Otherwise we're unsure - still solvable via bi-directional trial & error, as explained later on.
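
A sketch of that SHOW SLAVE STATUS comparison from the first bullet (the field names are the real ones; the values are made up):

S1:  Relay_Master_Log_File: mysql-bin.000042   Exec_Master_Log_Pos: 1200
S2:  Relay_Master_Log_File: mysql-bin.000042   Exec_Master_Log_Pos:  800

Comparing the (Relay_Master_Log_File, Exec_Master_Log_Pos) pairs as tuples, S1 has executed more of the master's binary log.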

For now, let's assume we know which slave is more up to date (has received and executed more relay logs). Let's call it S1, whereas the less up-to-date one will be S2. This will make our discussion simpler.

Prerequisites

  • We require a Pseudo GTID in place: a periodic injection of a known-to-be-unique query, which we know how to intercept
  • We require log_slave_updates on all slaves. We will need to parse the binary logs on slaves. I have little trust in the availability of relay logs: these are flushed, rotated and auto-erased all too quickly. The proposed solution does not require any daemon running on the MySQL servers themselves, so there will be nothing to back up the relay logs, and I can't trust these to exist. Binary logs, on the other hand, have an expiry period measured in days, and so I can trust them to exist for a duration of a few minutes.
  • Normal replication. Not multi threaded. Not multi-source.

The process of rematching slaves

S1 is more up to date, hence we want to make S2 a slave of S1. We expect the statements/entries found in S2's binary logs to exist in S1, in the same order, but somewhere back in the past, padded by additional entries (zero or more) that are not found in S2. Steps are:

  • Find the latest Pseudo-GTID entry in S2's logs. This can be done by iterating S2's binary logs newest to oldest. The first (time DESC) binary log where such an entry is found is then searched for the last (latest) such entry. Keep a record of the binlog file2:pos2 coordinates.
  • Take note of the exact entry made in the above. This is the unique value.
  • Search for said unique value in S1's binary logs. Since it is unique, your method of search is arbitrary; you just need to find it. Br

Truncated by Planet PHP, read more at the original (another 3376 bytes)

Link
Shlomi Noach: Pseudo GTID, Row Based Replication (23.10.2014, 04:18 UTC)

This post continues Pseudo GTID, in a series of posts describing an alternative to using MySQL GTIDs.

The solution offered in the last post does not work too well for row based replication. The binary log entries for the INSERT statement look like this:

# at 1020
# at 1074
#141020 12:36:21 server id 1  end_log_pos 1074  Table_map: `test`.`pseudo_gtid` mapped to number 33
#141020 12:36:21 server id 1  end_log_pos 1196  Update_rows: table id 33 flags: STMT_END_F

BINLOG '
lddEVBMBAAAANgAAADIEAAAAACEAAAAAAAEABHRlc3QAC3BzZXVkb19ndGlkAAMDBw8CQAAE
lddEVBgBAAAAegAAAKwEAAAAACEAAAAAAAEAA///+AEAAACL10RUJDg2ZmRhMDk1LTU4M2MtMTFl
NC05NzYyLTNjOTcwZWEzMWVhOPgBAAAAlddEVCQ4Y2YzOWMyYy01ODNjLTExZTQtOTc2Mi0zYzk3
MGVhMzFlYTg=
'/*!*/;

Where's our unique value? Encoded within something that cannot be trusted to be unique. Issuing mysqlbinlog --verbose helps out:
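
That is, for example (the file name is taken from the SHOW BINLOG EVENTS listing further down, whose positions 183 and 237 match the output below):

shell> mysqlbinlog --verbose mysql-bin.000058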

BEGIN
/*!*/;
# at 183
# at 237
#141020 12:35:51 server id 1  end_log_pos 237   Table_map: `test`.`pseudo_gtid` mapped to number 33
#141020 12:35:51 server id 1  end_log_pos 359   Update_rows: table id 33 flags: STMT_END_F

BINLOG '
d9dEVBMBAAAANgAAAO0AAAAAACEAAAAAAAEABHRlc3QAC3BzZXVkb19ndGlkAAMDBw8CQAAE
d9dEVBgBAAAAegAAAGcBAAAAACEAAAAAAAEAA///+AEAAABt10RUJDc1MWJkYzEwLTU4M2MtMTFl
NC05NzYyLTNjOTcwZWEzMWVhOPgBAAAAd9dEVCQ3YjExZDQzYy01ODNjLTExZTQtOTc2Mi0zYzk3
MGVhMzFlYTg=
'/*!*/;
### UPDATE `test`.`pseudo_gtid`
### WHERE
###   @1=1
###   @2=1413797741
###   @3='751bdc10-583c-11e4-9762-3c970ea31ea8'
### SET
###   @1=1
###   @2=1413797751
###   @3='7b11d43c-583c-11e4-9762-3c970ea31ea8'

and that's something we can work with. However, I like to do stuff from within MySQL, and rely as little as possible on external tools. How do the binary log entries look via SHOW BINLOG EVENTS? Not good.

master [localhost] {msandbox} (test) > show binlog events in 'mysql-bin.000058' limit 20;
+------------------+------+-------------+-----------+-------------+---------------------------------------+
| Log_name         | Pos  | Event_type  | Server_id | End_log_pos | Info                                  |
+------------------+------+-------------+-----------+-------------+---------------------------------------+
| mysql-bin.000058 |    4 | Format_desc |         1 |         107 | Server ver: 5.5.32-log, Binlog ver: 4 |
| mysql-bin.000058 |  107 | Query       |         1 |         183 | BEGIN                                 |
| mysql-bin.000058 |  183 | Table_map   |         1 |         237 | table_id: 33 (test.pseudo_gtid)       |
| mysql-bin.000058 |  237 | Update_rows |         1 |         359 | table_id: 33 flags: STMT_END_F        |
| mysql-bin.000058 |  359 | Xid         |         1 |         386 | COMMIT /* xid=5460 */                 |
| mysql-bin.000058 |  386 | Query       |         1 |         462 | BEGIN                                 |
| mysql-bin.000058 |  462 | Table_map   |         1 |         516 | table_id: 33 (test.pseudo_gtid)       |
| mysql-bin.000058 |  516 | Update_rows |         1 |         638 | table_id: 33 flags: STMT_END_F        |
| mysql-bin.000058 |  638 | Xid         |         1 |         665 | COMMIT /* xid=5471 */                 |
| mysql-bin.000058 |  665 | Query       |         1 |         741 | BEGIN                                 |
| mysql-bin.000058 |  741 | Table_map   |         1 |         795 | table_id: 33 (test.pseudo_gtid)       |
| mysql-bin.000058 |  795 | Update_rows |         1 |         917 | table_id: 33 flags: STMT_END_F        |
| mysql-bin.000058 |  917 | Xid         |         1 |         944 | COMMIT /* xid=5474 */                 |
| mysql-bin.000058 |  944 | Query       |         1 |        1020 | BEGIN                                 |
| mysql-bin.000058 | 1020 | Table_map   |         1 |        1074 | table_id: 33 (test.pseudo_gtid)       |
| mysql-bin.000058 | 1074 | Update_rows |         1 |        1196 | table_id: 33 flags: STMT_END_F        |
| mysql-bin.000058 | 1196 | Xid         |         1 |        1223 | COMMIT /* xid=5476 */                 |
| mysql-bin.000058 | 1223 | Query       |     

Truncated by Planet PHP, read more at the original (another 17963 bytes)

Link
Oli Sennhauser: Get rid of wrongly deleted InnoDB tables (22.10.2014, 20:10 UTC)

Precaution: Before you try this out on your production system do a BACKUP first! FromDual Backup Manager can help you with this.

Situation

A MySQL user has deleted the files of an InnoDB table, for example like this:

shell> rm -f $datadir/test/test.*

Analysis

We do some analysis first:

mysql> DROP TABLE test;
ERROR 1051 (42S02): Unknown table 'test'

mysql> CREATE TABLE test (id INT) ENGINE = InnoDB;
ERROR 1050 (42S01): Table '`test`.`test`' already exists

The MySQL error log shows us the following information:

141022 17:09:04  InnoDB: Operating system error number 2 in a file operation.
InnoDB: The error means the system cannot find the path specified.
InnoDB: If you are installing InnoDB, remember that you must create
InnoDB: directories yourself, InnoDB does not create them.
141022 17:09:04  InnoDB: Error: trying to open a table, but could not
InnoDB: open the tablespace file './test/test.ibd'!
InnoDB: Have you moved InnoDB .ibd files around without using the
InnoDB: commands DISCARD TABLESPACE and IMPORT TABLESPACE?
InnoDB: It is also possible that this is a temporary table #sql...,
InnoDB: and MySQL removed the .ibd file for this.
InnoDB: Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.5/en/innodb-troubleshooting-datadict.html
InnoDB: for how to resolve the issue.

Fix

The user claims that he does NOT need the table and/or the data any more, but wants to get rid of the error messages and/or be able to create a new table with the same name. The trick is to hand MySQL a matching .frm file again, so that DROP TABLE can complete and clean up the stale InnoDB data dictionary entry:

mysql> CREATE SCHEMA recovery;
mysql> use recovery
mysql> CREATE TABLE test (id INT) ENGINE = InnoDB;  -- same table name: produces a fresh test.frm
mysql> \! cp $datadir/recovery/test.frm $datadir/test/
mysql> DROP SCHEMA recovery;
mysql> use test
mysql> DROP TABLE test;  -- now succeeds and removes the orphaned dictionary entry

Prove

To prove it works we create a new table and fill in some records:

mysql> CREATE TABLE test (id int UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY, data VARCHAR(64), ts TIMESTAMP) ENGINE = InnoDB;
mysql> INSERT INTO test VALUES (NULL, 'Test data', NULL);

Literature

Link
Peter Zaitsev: MySQL community set to meet at Percona Live London 2014 (22.10.2014, 07:00 UTC)

The countdown is on for Europe’s largest annual MySQL event, Percona Live London 2014. The two days of technical tutorials and sessions, November 3-4, will focus on the latest MySQL industry trends, news, best practices – and a look at what’s on the near- and long-term horizon within the global MySQL ecosystem.

Percona Live London 2014 will bring attendees up to date on key areas including MySQL 5.7, database security, database as a service (DBaaS), Hadoop and high availability (HA), disaster recovery, replication and backup, performance and scalability, WebScaleSQL and much, much more.

Team Oracle will be in London, led by Tomas Ulin, vice president of Oracle’s MySQL engineering team, who will explain why MySQL just keeps getting better with the latest news for the MySQL Database, MySQL Cluster, MySQL Workbench… and more. Oracle’s Luis Soares, principal software engineer, and Andrew Morgan, MySQL HA product management, will provide insight into what’s in the latest 5.7 development milestone release and also what’s going on in the labs… particularly around MySQL replication. Seize the opportunity to learn how to leverage MySQL 5.7 replication to grow your business from their session, “MySQL Replication: What’s New in 5.7 and Beyond.”

If anything keeps DBAs up at night it’s database security – especially with recent revelations of vulnerabilities like the POODLE SSLv3 security flaw (CVE-2014-3566) and “Bash Bug,” also known as Shellshock (CVE-2014-6271). Attendees will have the opportunity to talk face-to-face with database security expert David Busby of Percona, who will also lead a session titled, “Security it’s more than just your database you should worry about.”

[Image: The official Percona Live London 2014 t-shirt]

Observe how to incorporate semi-synchronous replication to achieve failover – without data loss. Facebook’s Yoshinori Matsunobu and Santosh Banda will share how they did it at scale (across data centers) by extending MySQL internals along with some handy new self-made tools.

Meet the next-generation C connector for MySQL: libAttachSQL. It’s a new lightweight async C connector library for MySQL being developed from scratch by HP’s Advanced Technology Group. Andrew Hutchings, principal software engineer at Hewlett-Packard, will be on hand to share the latest on libAttachSQL.

Successful applications often become limited by MySQL performance. But tracking down and fixing those issues can be a huge drain on time and resources. Unless you think smart – spending time on what gives you the best return. Percona CEO Peter Zaitsev will explain how in his session, “Practical MySQL Performance Optimization.”

Percona Live London attendees will also learn from the real-life experiences of MySQL experts who share case studies. Shake hands with Art van Scheppingen, head of database engineering at Spil Games, who will explain how to serve out any page with an HA Sphinx environment.

Save yourself a quarter century by absorbing Tim Callaghan’s MySQL performance benchmarking tips, tricks and lessons learned. Tim, vice president of engineering at Tokutek, will share what he’s learned in the past 25 years maintaining the performance of database applications.

And of course there will be a MySQL community dinner! But be sure to

Truncated by Planet PHP, read more at the original (another 740 bytes)

Link
Shlomi Noach: Pseudo GTID (22.10.2014, 05:22 UTC)

Pseudo GTID is a method to implement a GTID-like solution where slaves are easily connected to one another. This blog post and the following ones will describe work in progress (some 80% completed), where simulation of GTID makes for a good enough basis for refactoring replication topologies. I'm coding this in orchestrator, which already provides substantial infrastructure support for this.

The final goal: orchestrator will allow you to move a slave below another, using only the data available on those two slaves. The usage is obvious:

  • Easy master failover (master dead? Orchestrator will choose the most advanced slave to promote and make it master of its siblings)
  • Slave promotion in complex topologies (with deep nested topologies, be able to move a slave up the hierarchy even if its local master is corrupted).

This can all happen with your normal, non GTID, MySQL replication, using your normal binary log files & positions.

This work in progress is inspired by Sam Lambert at GitHub, who has worked on a similar solution with a different implementation. I also recall discussions with other DBAs about similar solutions.

Pseudo GTID

First things first: the basis for the proposed solution is a pseudo-GTID, a unique entry in the binary logs (not necessarily sequential; not necessarily in ascending order). While in GTID implementations we have a unique identifier for each entry in the binary log, with pseudo-GTID we accept an occasional (or frequent) unique entry in the binary log.

There are many ways to do so. Certainly a client can generate a unique ID and invoke some statement on MySQL involving that ID. That would serve as valid grounds for the proposed solution. But I like things to be contained within MySQL. Consider, for example, the following event, which would be my preferred choice under Statement Based Replication (for an RBR solution, see the next post):

drop table if exists test.pseudo_gtid;
create table if not exists test.pseudo_gtid (
  id int unsigned not null primary key,
  ts timestamp,
  gtid varchar(64) charset ascii
);


drop event if exists test.update_pseudo_gtid_event;

delimiter ;;
create event if not exists
  test.update_pseudo_gtid_event
  on schedule every 10 second starts current_timestamp
  on completion preserve
  enable
  do
    begin
      set @pseudo_gtid := uuid();
      insert into test.pseudo_gtid (id, ts, gtid) values (1, NOW(), @pseudo_gtid) on duplicate key update ts=NOW(), gtid=VALUES(gtid);
    end
;;

delimiter ;
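
Note that the event scheduler must be enabled for the event to actually fire:

master [localhost] {msandbox} (test) > set global event_scheduler = 1;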

The above is based on Making UUID() and RAND() replication safe. What do we get? Once every 10 seconds (or whatever interval you choose), a unique entry is written to the binary log.

Assume the event is already running by now, and consider the following conventional statements executed by the application:

master [localhost] {msandbox} (test) > create table test.vals(id int);
master [localhost] {msandbox} (test) > insert into test.vals (id) values (17);
master [localhost] {msandbox} (test) > insert into test.vals (id) values (18);
master [localhost] {msandbox} (test) > insert into test.vals (id) values (19);
master [localhost] {msandbox} (test) > insert into test.vals (id) values (23);

master [localhost] {msandbox} (test) > show master logs;
+------------------+-----------+
| Log_name         | File_size |
+------------------+-----------+
| mysql-bin.000036 |       531 |
| mysql-bin.000037 |      1269 |
| mysql-bin.000038 |      6627 |
| mysql-bin.000039 |      3313 |
+------------------+-----------+

Let's look at the binary logs content:

master [localhost] {msandbox} (test) > show binlog events in 'mysql-bin.000039';
+------------------+------+-------------+-----------+-------------+---------------------------------------------------------------------------------------------------------------------------------------------+
| Log_name         | Pos  | Event_type  | Server_id | End_log_pos | Info                                                                                                                                        |
+------------------+------+-------------+-----------+-

Truncated by Planet PHP, read more at the original (another 26593 bytes)

Link
Peter Zaitsev: Percona XtraDB Cluster: How to run a 2-node cluster on a single server (21.10.2014, 13:53 UTC)
I reckon there’s little sense in running 2 or more Percona XtraDB Cluster (PXC) nodes in a single physical server other than for educational and testing purposes – but doing so is still useful in those cases. The most popular way of achieving this seems to be with server virtualization, such as making use of Vagrant boxes. But in the same way you can have multiple instances of MySQL running in parallel on the OS level in the form of concurrent mysqld processes, so too can you have multiple Percona XtraDB Cluster nodes. And the way to achieve this is precisely the same: using dedicated datadirs and different ports for each node.


Which ports?

4 TCP ports are used by Percona XtraDB Cluster:
  • the regular MySQL port (default 3306)
  • port for group (Galera) communication (default 4567)
  • port for State Transfer (default 4444)
  • port for Incremental State Transfer (default is: port for group communication (4567) + 1 = 4568)
Of course, when you have multiple instances on the same server the default values won't work for all of them, so we need to define new ports for the additional instances and make sure the local firewall is open to them, if one is active (iptables, SELinux, …).
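
For instance, with iptables the extra ports of the second node (configured further down) could be opened along these lines, though loopback traffic is typically allowed anyway:

$ iptables -I INPUT -p tcp --dport 5020:5021 -j ACCEPT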


Installing Percona XtraDB Cluster, configuring and starting the first node

My test server was a fresh CentOS 6.5 configured with the Percona yum repository, from which I installed the latest Percona XtraDB Cluster (5.6.20-25.7.888.el6); note that you’ll need the EPEL repository as well to install socat, which is a dependency (see this bug). To avoid confusion, I’ve prevented the mysql service from starting automatically:
chkconfig --level 3 mysql off
chkconfig --del mysql

I could have installed PXC from the tarball but I decided to do it from the repositories to have all dependencies covered by yum. This is what my initial /etc/my.cnf looked like (note the use of default values):
[mysqld]
datadir = /var/lib/mysql
port=3306
socket=/var/lib/mysql/mysql-node1.sock
pid-file=/var/lib/mysql/mysql-node1.pid
log-error=/var/lib/mysql/mysql-node1.err
binlog_format=ROW
innodb_autoinc_lock_mode=2
wsrep_provider=/usr/lib64/libgalera_smm.so
wsrep_cluster_name = singlebox
wsrep_node_name = node1
wsrep_cluster_address=gcomm://
I’ve started by manually bootsrapping the cluster with this single node with the command:
$ mysqld_safe --defaults-file=/etc/my.cnf --wsrep-new-cluster

You should then be able to access this node through the local socket:
$ mysql -S /var/lib/mysql/mysql-node1.sock
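
A quick check that the node is up and has formed a (so far single-node) cluster:

$ mysql -S /var/lib/mysql/mysql-node1.sock -e "show status like 'wsrep_cluster_size'"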


Configuring and starting the second node

Then I created a similar configuration file for the second instance, which I named /etc/my2.cnf, with the following modifications:
[mysqld]
datadir = /var/lib/mysql2
port=3307
socket=/var/lib/mysql2/mysql-node2.sock
pid-file=/var/lib/mysql2/mysql-node2.pid
log-error=/var/lib/mysql2/mysql-node2.err
binlog_format=ROW
innodb_autoinc_lock_mode=2
wsrep_provider=/usr/lib64/libgalera_smm.so
wsrep_cluster_name = singlebox
wsrep_node_name = node2
wsrep_cluster_address=gcomm://127.0.0.1:4567,127.0.0.1:5020
wsrep_provider_options = "base_port=5020;"

Note the use of base_port: by having it defined, port 5020 is used for group communication and 5021 (the one above it) is reserved for IST (it’s the same as using gmcast.listen_addr=tcp://127.0.0.1:5021, just simpler).
You need to create the datadir for this second instance and set the right permissions on it, otherwise MySQL won’t be able to create some files (like .pid and .err); you don’t need to run the mysql_install_db script, though:
$ chown -R mysq

Truncated by Planet PHP, read more at the original (another 3368 bytes)

Link