Peter Zaitsev: OpenStack Trove Day 2014 Recap: MySQL and DBaaS (28.8.2014, 08:00 UTC)
OpenStack Trove Day

I just returned from a week in Cambridge, Massachusetts where I was attending the OpenStack Trove Day and the Trove mid-cycle meetup, both sponsored by the great folks at Tesora.

I am relatively new to the OpenStack and Trove arenas so this was a fantastic opportunity for me to learn more about the communities, the various components within OpenStack, and what part Trove plays. I found the entire event very worthwhile – I met a lot of key people in the community, learned more about Trove and its potential, and in general felt a great energy and excitement surrounding Trove and OpenStack as a whole.

There were more than 120 attendees at Trove Day. That is almost four times the initial estimate! I think I would call that a success. There were seven very high-quality talks, covering material ranging from new and upcoming features within Trove, to a deep look at how it is currently used at several big-name companies, to an investor’s perspective on the OpenStack market. There were also two panel-style discussions that covered a lot of ground, with all participants being ‘guys on the ground’ actively working with OpenStack deployments, including one of my fellow Perconians, Mr. Tim Sharp.

One of the main takeaways for me from the entire day was the forward-looking adoption estimates for Trove. This came up over and over through the various talks and panels. There seems to be a tremendous amount of interest in Trove deployments for late 2014/2015 but very few actual live users today. There also seems to be a bit of a messaging issue and confusion amongst potential users as to what Trove really is and is not. Simply reading the Trove Mission Statement should quickly clarify things:

The OpenStack Open Source Database as a Service Mission: To provide scalable and reliable Cloud Database as a Service provisioning functionality for both relational and non-relational database engines, and to continue to improve its fully-featured and extensible open source framework.

So allow me to expand on that a bit based on some specific comments or questions that I overheard:
- Trove is NOT a database abstraction layer nor any sort of database unification tool; all applications still communicate with their respective datastores directly through their native APIs.
- Trove is NOT a database monitoring, management or analysis tool; all of your favorite debugging and monitoring tools like Percona Toolkit will still work exactly as advertised, and yes, you do need a monitoring tool.
- Although Trove does have some useful backup scheduling options, Trove is NOT a complete backup and recovery tool that can accommodate every backup strategy; you may still use 3rd party options such as scripting your own around Percona XtraBackup or make your life a lot easier and sign up for the Percona Backup Service.
- Trove IS a very nice way to add resource provisioning for many disparate datastores and has some ‘smarts’ built in for each. This ensures a common user experience when provisioning and managing datastore instances.

To that final point, our friends at Tesora introduced their new Database Certification Program at Trove Day. This new program will ensure a high level of compatibility between the various participating database vendors and the Trove project. Of course, Percona Server has already been certified.

I see the future of Trove as being very bright with a huge potential for expansion into other areas, once it is stabilized. I am very excited to begin contributing to this project and watch it grow.

Until next time…

The post OpenStack Trove Day 2014 Recap: MySQL and DBaaS appeared first on MySQL Performance Blog.

Link
Jean-Jerome Schmidt: Resources for Database Clusters: Performance Tuning for HAProxy, Support for MariaDB 10, Technical Blogs & More (28.8.2014, 07:28 UTC)
August 28, 2014
By Severalnines

Check Out Our Latest Resources for MySQL, MariaDB & MongoDB Clusters

 

Here is a summary of resources & tools that we’ve made available to you in the past weeks. If you have any questions on these, feel free to contact us!

 

New Technical Webinars

 


Performance Tuning of HAProxy for Database Load Balancing

09 September 2014 - with Baptiste Assmann of HAProxy Technologies

Do you know what HAProxy can tell you about your application and database instances? Do you know the difference between short-lived connections and persistent connections, and how they affect the performance and availability of your applications? As a follow-up to our previous webinar on MySQL Load Balancing and HAProxy, we are glad to organize a deep dive into Performance Tuning of HAProxy.

Register here!

 


Automation & Management of MariaDB Galera Clusters: European Webinars with SkySQL - The MariaDB Company

30 Sept. (English) / 07 Oct. (French) / 08 Oct. (German) - with the MariaDB Team

MariaDB Galera Cluster involves more effort and resources to administer than standalone MariaDB systems. In this webinar, we will give you practical advice on how to introduce clusters into your MariaDB environment, automate deployment, and make it easier for operational staff to manage and monitor the cluster using ClusterControl.

Register here!

 

Technical Webinar - Replay

 


Webinar Replay & Slides: How To Set Up SQL Load Balancing with HAProxy

In this webinar, we covered the concepts around the popular open-source HAProxy load balancer, and demonstrated how to use it with your SQL-based database clusters. We also discussed HA strategies for HAProxy with Keepalived and Virtual IP.

View the replay & read the slides!

 

read more

Link
Peter Zaitsev: Trawling the binlog with FlexCDC and new FlexCDC plugins for MySQL (27.8.2014, 15:15 UTC)

Swanhart-Tools includes FlexCDC, a change data capture tool for MySQL. FlexCDC follows a server’s binary log and usually writes “changelogs” that track the changes to tables in the database. I say usually because the latest version of Swanhart-Tools (only on GitHub for now) supports FlexCDC plugins, which allow you to send the updates to a remote data source, or to any other place of your liking. You can find out more about FlexCDC basics in a previous blog post.

Please note that FlexCDC still needs to have source and destination instances defined in the configuration, even when using plugins. This is because the FlexCDC state (how far into which binary log FlexCDC has progressed, and which tables are being captured) is stored in the “dest”. Normally when using a plugin, the source and destination instances will be the same. FlexCDC will create a ‘flexviews’ database with a number of state tables in the destination instance. This also means you still have to use the create_mvlog.php or add_table.php script, or Flexviews’ create_mvlog(…) procedure, to mark which tables to capture! See the previous blog post about FlexCDC.

When you create the mvlog, there will still be a changelog table created in the dest, just like when not using a plugin. This is because the INFORMATION_SCHEMA is used to get column datatypes and additional information (such as whether an int is signed or unsigned), and this lookup is done against the table in the dest. The reason this is needed is that mysqlbinlog, the utility used to scrape the binlog, produces strange output for large signed integers (it provides both the signed and unsigned versions), so FlexCDC must figure out the right one to choose from the actual DDL of the changelog table. FlexCDC can’t look at the DDL of the source table though, because the consumer may be behind, and the current structure may not match the structure of the rows in the log.
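
To illustrate that lookup (a sketch only; the schema and changelog table names below are placeholders), the COLUMN_TYPE column in INFORMATION_SCHEMA exposes the signedness that the mysqlbinlog output leaves ambiguous:

SELECT COLUMN_NAME, COLUMN_TYPE
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = 'flexviews' AND TABLE_NAME = 'example_changelog';
-- COLUMN_TYPE distinguishes e.g. 'int(10) unsigned' from 'int(11)'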

The new plugin system allows you to do a lot of nifty things like:

  • Replicate to external databases
  • Publish changes to a message queue (this is like Facebook’s Wormhole)
  • Keep a remote cache server in sync
  • and more…

The latest version of Swanhart-Tools includes an Example plugin (in flexviews/consumer/include/example_plugin.php) that simply prints the events that come through it, not logging them into the changelog table at all. There is an example of the output at the end of the post.

The example plugin looks like this:

<?php
class FlexCDC_Plugin {
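        // Transaction-state callbacks: invoked when a transaction begins, commits, or rolls back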
        static function begin_trx($uow_id, $gsn) {
                echo "START TRANSACTION: trx_id: $uow_id, Prev GSN: $gsn";
        }
        static function commit_trx($uow_id, $gsn) {
                echo "COMMIT: trx_id: $uow_id, Last GSN: $gsn";
        }
        static function rollback_trx($uow_id) {
                echo "ROLLBACK: trx_id: $uow_id";
        }
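        // Row-change callbacks: invoked once per affected row (old and new images for UPDATE)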
        static function insert($row, $db, $table, $trx_id, $gsn) {
                echo "TRX_ID: $trx_id, Schema:$db, Table: $table, DML: INSERT, AT: $gsn"; print_r($row);
        }
        static function delete($row, $db, $table, $trx_id, $gsn) {
                echo "TRX_ID: $trx_id, Schema:$db, Table: $table, DML: DELETE, AT: $gsn"; print_r($row);
        }
        static function update_before($row, $db, $table, $trx_id, $gsn) {
                echo "TRX_ID: $trx_id, Schema:$db, Table: $table, DML: UPDATE (OLD), AT: $gsn"; print_r($row);
        }
        static function update_after($row, $db, $table, $trx_id, $gsn) {
                echo "TRX_ID: $trx_id, Schema:$db, Table: $table, DML: UPDATE (NEW), AT: $gsn"; print_r($row);
        }
}

Important Note: You must define all seven of these functions in your plugin, even if you do not plan to have actions for each of the callbacks – just leave the function body empty to do no action (the call is simply a no-op in that case). Note that the plugin functions must be declared STATIC. This is due to the way that FlexCDC calls the functions.

Transaction state callbacks
There are three callback functions which notify the plugin of changes in transaction state. Before I go into what they do, I want to note the $trx_id and $gsn parameters which are present in every callback. Each transaction is assigned a monotonically increasing…

Truncated by Planet PHP, read more at the original (another 4043 bytes)

Link
Peter Zaitsev: mysqld_multi: How to run multiple instances of MySQL (26.8.2014, 14:42 UTC)

The need to have multiple instances of MySQL (the well-known mysqld process) running in the same server concurrently in a transparent way, instead of having them executed in separate containers/virtual machines, is not very common. Yet from time to time the Percona Support team receives a request from a customer to assist in the configuration of such an environment. MySQL provides a tool to facilitate the execution of multiple instances called mysqld_multi:

“mysqld_multi is designed to manage several mysqld processes that listen for connections on different Unix socket files and TCP/IP ports. It can start or stop servers, or report their current status.”

For tests and development purposes, MySQL Sandbox might be more practical and I personally prefer to use it for my own tests. Both tools work around launching and managing multiple mysqld processes, but Sandbox has, as the name suggests, a “sandbox” approach, making it easy to both create and dispose of a new instance (including all data inside it). It is more usual to see mysqld_multi being used in production servers: it’s provided with the server package and uses the same single configuration file that people are used to looking for when setting up MySQL. So, how does it work? How do we configure and manage the instances? And, as importantly, how do we back up all the instances we create?

Understanding the concept of groups in my.cnf

You may have noticed already that MySQL’s main configuration file (or “option file”), my.cnf, is arranged under what is called group structures: sections defining configuration options specific to a given program or purpose. Usually, the program itself gives name to the group, which appears enclosed by brackets. Here’s a basic my.cnf showing three such groups:

[client]
port            = 3306
socket          = /var/run/mysqld/mysqld.sock
user            = john
password        = p455w0rd
[mysqld]
user            = mysql
pid-file        = /var/run/mysqld/mysqld.pid
socket          = /var/run/mysqld/mysqld.sock
port            = 3306
datadir         = /var/lib/mysql
[xtrabackup]
target_dir = /backups/mysql/

The options defined in the group [client] above are used by the mysql command-line tool. As such, if you don’t specify any other option when executing mysql it will attempt to connect to the local MySQL server through the socket in /var/run/mysqld/mysqld.sock and using the credentials stated in that group. Similarly, mysqld will look for the options defined under its section at startup, and the same happens with Percona XtraBackup when you run a backup with that tool. However, the operating parameters defined by the above groups may also be stated as command-line options during the execution of the program, in which case they replace the ones defined in my.cnf.
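
For instance (a sketch; the socket path and user are illustrative), the following one-off invocation overrides the socket and user defined in the [client] group:

$ mysql --socket=/tmp/alternate.sock --user=root -p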

Getting started with multiple instances

To have multiple instances of MySQL running we must replace the [mysqld] group in the my.cnf configuration file by as many [mysqldN] groups as we want instances running, with “N” being a positive integer, also called the option group number. This number is used by mysqld_multi to identify each instance, so it must be unique across the server. Apart from the distinct group name, the same options that are valid for [mysqld] apply to [mysqldN] groups, the difference being that while stating them is optional for [mysqld] (it’s possible to start MySQL with an empty my.cnf as default values are used if not explicitly provided), some of them (like socket, port, pid-file, and datadir) are mandatory when defining multiple instances – so they don’t step on each other’s toes. Here’s a simple modified my.cnf showing the original [mysqld] group plus two other instances:

[mysqld]
user            = mysql
pid-file        = /var/run/mysqld/mysqld.pid
socket          = /var/run/mysqld/mysqld.sock
port            = 3306
datadir         = /var/lib/mysql
[mysqld1]
user            = mysql
pid-file        = /var/run/mysqld/mysqld1.pid
socket          = /var/run/mysqld/mysqld1.sock
port            = 3307
datadir         = /data/mysql/mysql1
[mysqld7]
user            = mysql
pi

Truncated by Planet PHP, read more at the original (another 13984 bytes)

Link
Peter Zaitsev: OpenStack’s Trove: The benefits of this database as a service (DBaaS) (25.8.2014, 12:00 UTC)

In a previous post, my colleague Dimitri Vanoverbeke discussed at a high level the concepts of database as a service (DBaaS), OpenStack and OpenStack’s implementation of a DBaaS, Trove. Today I’d like to delve a bit further into Trove and discuss where it fits in, and who benefits.

Just to recap, Trove is OpenStack’s implementation of a database as a service for its cloud infrastructure as a service (IaaS). And as the mission statement declares, the Trove project seeks to provide a scalable and reliable cloud database service providing functionality for both relational and non-relational database engines. With the current release of Icehouse, the technology has begun to show maturity providing both stability and a rich feature set.

In my opinion, there are two primary markets that will benefit from Trove: the first being service providers such as RackSpace who provide cloud-based services similar to Amazon’s AWS. These are companies that wish to expand beyond the basic cloud services of storage and networking and provide their customer base with a richer cloud experience by providing higher level services such as DBaaS functionality. The other players are those companies that wish to “cloudify” their own internal systems. The reasons for this decision are varied, ranging from the desire to maintain complete control over all the architecture and the cloud components to legal constraints limiting the use of public cloud infrastructures.

With Trove, much of the management of your database system is taken care of for you by automating a significant portion of the configuration and initial setup steps required when launching a new server. This includes deployment, configuration, patching, backups, restores, and monitoring, all of which can be administered from a CLI, RESTful APIs or OpenStack’s Horizon dashboard. At this point, what Trove doesn’t provide is failover, replication and clustering. This functionality is slated to be implemented in the Kilo release of OpenStack, due out in April 2015.

The process flow is relatively simple. The OpenStack administrator first configures the basic infrastructure by installing the database service. He or she would then create an image for each type of database they wish to support, such as MySQL or MongoDB. They would then import the images and offer them to their tenants. From the end user’s perspective, only a few commands are necessary to get up and running: first issue the <trove create> command to create a database service instance, followed by the <trove list> command to get the ID of the instance, and finally the <trove show> command to get its IP address.

For example to create a database, you first start off by creating a database instance. This is an isolated database environment with compute and storage resources in a single tenant environment on a shared physical host machine. You can run a database instance with a variety of database engines such as MySQL or MongoDB.

From the Trove client I can issue the following command to create a database instance called PS_troveinstance, with a volume size of 2 GB, a user called PS_user, a password PS_password and the MySQL datastore (or database engine):

$ trove create --size 2 --users PS_user:PS_password --datastore MySQL PS_troveinstance

Next I issue the following command to get the ID of the database instance:

$ trove list PS_troveinstance

And finally, to create a database called PS_trovedb, I execute:

$ trove database-create PS_troveinstance PS_trovedb

Alternatively, I could have just combined the above commands as:

$ trove create --size 2 --database PS_trovedb --users PS_user:PS_password --datastore MySQL PS_troveinstance

And thus we now have a MySQL database server containing a database called PS_trovedb.
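
And, as mentioned earlier, the instance’s IP address can then be retrieved with the <trove show> command (a sketch, using the instance name created above):

$ trove show PS_troveinstance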

In our next post on OpenStack/Trove, we’ll dig even further and discuss the software and hardware requirements, and how to actually set up Trove.

On a related note, Percona has several experts attending this week’s OpenStack Operations Summit in San Antonio, Texas. One of them is Matt Griffin, director of product management, who pointed out in a recent post that many OpenStack operators use Percona open source software including the MySQL drop-in compatible Percona Server and Galera-based Percona XtraDB Cluster as well as tools such as

Truncated by Planet PHP, read more at the original (another 987 bytes)

Link
Jean-Jerome Schmidt: 9 Tips for Going in Production with Galera Cluster for MySQL (23.8.2014, 00:04 UTC)
August 25, 2014
By Severalnines

Are you going in production with Galera Cluster for MySQL? Here are 9 tips to consider before going live. These are applicable to all 3 Galera versions (Codership, Percona XtraDB Cluster and MariaDB Galera Cluster). 

 

1. Galera strengths and weaknesses

 

There are multiple types of replication and cluster technologies for MySQL, so make sure you understand how Galera works and set the right expectations. Applications that run on single-instance MySQL might not work well on Galera: you might need to make some changes to the application, or the workload might not be appropriate. We’d suggest you have a look at these resources: 

 

2. Database schema

 

Synchronous replication has implications on how well some transactions will run, so consider the following:

  • Each table must have at least one explicit primary key defined.
  • Each table must run under InnoDB or XtraDB storage engine.
  • Chunk up your big transactions into batches. For example, rather than having one transaction insert 100,000 rows, break it up into smaller chunks of e.g. 1,000 rows per transaction (see the sketch after this list).
  • Your application can tolerate non-sequential auto-increment values.
  • Schema changes are handled differently. Watch this webinar for details.
  • Handle hotspots/Galera deadlocks by sending writes to a single node.
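
A minimal sketch of the batching tip above (the table, column and ranges are illustrative):

-- instead of one huge transaction moving 100,000 rows at once,
-- commit a bounded range at a time and repeat with the next range
START TRANSACTION;
INSERT INTO orders_archive SELECT * FROM orders WHERE id BETWEEN 1 AND 1000;
DELETE FROM orders WHERE id BETWEEN 1 AND 1000;
COMMIT;
-- next iteration: WHERE id BETWEEN 1001 AND 2000, and so on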

 

3. Hardware and network

 

Galera will only perform as fast as its slowest node. This is due to the nature of synchronous replication, where all nodes need to acknowledge a writeset before committing. Choosing uniform hardware for all nodes is recommended. If you’re running in virtualized environments, monitor CPU Steal Time (the percentage of time a virtual CPU waits for a real CPU while the hypervisor is servicing another virtual processor). 

The disk subsystem is important (and this also applies to single-instance MySQL). Remember that upgrading the disk layer is a bigger hassle than upgrading other components like RAM or the network interface card. Hardware upgrades can be performed in round-robin fashion, i.e., upgrading one node at a time.

read more

Link
Peter Zaitsev: When (and how) to move an InnoDB table outside the shared tablespace (22.8.2014, 14:29 UTC)

In my last post, “A closer look at the MySQL ibdata1 disk space issue and big tables,” I looked at the growing ibdata1 problem from the perspective of having big tables residing inside the so-called shared tablespace. In the particular case that motivated that post, we had a customer running out of disk space on his server who was looking for a way to make the ibdata1 file shrink. As you may know, that file (or, as explained there, the set of ibdata files composing the shared tablespace) stores all InnoDB tables created when innodb_file_per_table is disabled, but also other InnoDB structures, such as undo logs and the data dictionary.

For example, when you run a transaction involving InnoDB tables, MySQL will first write all the changes it triggers to an undo log, in case you later decide to “roll them back”. Long-standing, uncommitted transactions are one of the causes of a growing ibdata file. Of course, if you have innodb_file_per_table disabled then all your InnoDB tables live inside it. That was what happened in that case.
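
A quick way to spot such long-running transactions (a sketch; the INNODB_TRX table is available in MySQL 5.5 and later):

mysql> SELECT trx_id, trx_started, trx_mysql_thread_id FROM information_schema.INNODB_TRX ORDER BY trx_started;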

So, how do you move a table outside the shared tablespace and change the storage engine it relies on? As importantly, how does that affect disk space use? I’ve explored some of the options presented in the previous post and now share my findings with you below.

The experiment

I created a very simple InnoDB table inside the shared tablespace of a freshly installed Percona Server 5.5.37-rel35.1 with support for TokuDB and configured it with a 1GB buffer pool. I’ve used a 50G partition to host the ‘datadir’ and another one for ‘tmpdir’:

Filesystem                 Size Used Avail Use% Mounted on
/dev/mapper/vg0-lvFernando1 50G 110M  47G   1%  /media/lvFernando1  # datadir
/dev/mapper/vg0-lvFernando2 50G  52M  47G   1%  /media/lvFernando2  # tmpdir

Here’s the table structure:

CREATE TABLE `joinit` (
  `i` int(11) NOT NULL AUTO_INCREMENT,
  `s` varchar(64) DEFAULT NULL,
  `t` time NOT NULL,
  `g` int(11) NOT NULL,
  PRIMARY KEY (`i`)
) ENGINE=InnoDB  DEFAULT CHARSET=latin1;

I populated it with 134 million rows using the following routine:

INSERT INTO joinit VALUES (NULL, uuid(), time(now()),  (FLOOR( 1 + RAND( ) *60 )));
INSERT INTO joinit SELECT NULL, uuid(), time(now()),  (FLOOR( 1 + RAND( ) *60 )) FROM joinit;  # repeat until you reach your target number of rows
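# note: starting from the single seeded row, each INSERT ... SELECT run doubles the row count, so roughly 27 runs reach 2^27 ≈ 134 million rows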

which resulted in the table below:

mysql> show table status from test like 'joinit'\G
*************************** 1. row ***************************
Name: joinit
Engine: InnoDB
Version: 10
Row_format: Compact
Rows: 134217909
Avg_row_length: 72
Data_length: 9783214080
Max_data_length: 0
Index_length: 0
Data_free: 1013972992
Auto_increment: 134872552
Create_time: 2014-07-30 20:42:42
Update_time: NULL
Check_time: NULL
Collation: latin1_swedish_ci
Checksum: NULL
Create_options:
Comment:
1 row in set (0.00 sec)

The resulting ibdata1 file was 11G in size, which in practice accounted for 100% of the datadir partition use at that point. What follows are a few experiments I did: converting that table to use a different storage engine, moving it outside the shared tablespace, compressing it, and dumping and restoring the database back to see the effects on disk space use. I haven’t timed how long each command took to run and focused mostly on the size of the generated files. As a bonus, I’ve also looked at how to extend the shared tablespace by adding an extra ibdata file.
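
For reference, that last (bonus) step is driven by the innodb_data_file_path setting; a minimal my.cnf sketch (the sizes here are illustrative, and the size declared for the existing ibdata1 must match its actual size on disk):

[mysqld]
innodb_data_file_path = ibdata1:11G;ibdata2:10M:autoextend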

#1) Converting to MyISAM

Technical characteristics and features apart, MyISAM tables are known to occupy less disk space than InnoDB ones. How much less depends on the actual table structure. Here I made the conversion in the simplest way possible:

mysql> ALTER TABLE test.joinit ENGINE=MYISAM;

which created the following files (the .frm file already existed):

$ ls -lh /media/lvFernando1/data/test/
total 8.3G
-rw-rw---- 1 fernando.laudares admin 8.5K Jul 31 16:21 joinit.frm
-rw-rw---- 1 fernando.laudares admin 7.0G Jul 31 16:27 joinit.MYD
-rw-rw---- 1 fernando.laudares admin 1.3G Jul 31 16:27 joinit.MYI

The resulting MyISAM files amounted to an additional 8.3G of disk space use:

/dev/mapper/vg0-lvFernando1 50G 19G 29G 40% /media/lvFernando1

I was expecting smaller files but, of course, the result depends largely on the data types of the columns composing the table. The problem (or the consequence) is that we end up with close to double the initial disk space being used:

As it happens with th

Truncated by Planet PHP, read more at the original (another 13357 bytes)

Link