Planet MariaDB

March 23, 2017

Jean-Jerome Schmidt

Video: MySQL Replication & ClusterControl Product Demonstration

The video below details the features and functions that are available in ClusterControl for MySQL Replication.  Included in the video are…

  • How to Deploy Master-Slave Replication
  • How to Deploy Multi-Master Replication
  • MySQL Replication overview including metrics
  • Individual Node overview & management
  • Backup management from Slaves or Masters
  • Adding Nodes
  • Adding Load Balancers

ClusterControl for MySQL Replication

ClusterControl provides advanced deployment, management, monitoring, and scaling functionality to get your MySQL replication instances up and running, using proven methodologies that you can depend on to work. It makes MySQL Replication easy and secure with point-and-click interfaces and no need to have specialized knowledge about the technology or multiple tools. It covers all aspects one might expect for a production-ready replication setup.

ClusterControl delivers on an array of features to help deploy, manage, monitor, and scale your MySQL Replication environments.

  • Point-and-Click Deployment:  Point-and-click, automatic deployment for MySQL replication is available in both community and enterprise versions of ClusterControl.
  • Management & Monitoring: ClusterControl provides management features to repair and recover broken nodes, as well as test and automate MySQL upgrades. It also provides a unified view of all MySQL nodes across your data centers and lets you drill down into individual nodes for more detailed statistics.
  • Automatic Failure Detection and Handling: ClusterControl takes care of your replication cluster’s health. If a master failure is detected, ClusterControl automatically promotes one of the available slaves to ensure your cluster is always up.
  • Proxy Integration: ClusterControl makes it easy to build a proxy layer over your replication setup; it shields applications from replication topology changes, server failures and changed writable masters. With just a couple of clicks you can improve the availability of your stack.

Learn more about how ClusterControl can simplify deployment and enhance performance here.

by Art at March 23, 2017 11:00 AM

MariaDB AB

Go Big with Analytics at M|17

Analytics is at the heart of extracting value out of data. Adopting an effective analytics strategy can be a distinctive competitive advantage, particularly with the growth of big data and IoT in the enterprise.

At M|17, we’ve got a whole track dedicated to helping you learn about a powerful, new approach to big data analytics.

Intro to Analytics with ColumnStore: MariaDB ColumnStore is a new technology for analytic use cases. It’s a powerful open source columnar storage engine option for MariaDB that supports a wide variety of analytical use cases with ANSI SQL in highly scalable distributed environments. Attendees will also get a sneak peek at what’s coming in the next version of ColumnStore.

Analytics deep dive: This three-hour workshop will cover everything you need to know to get started using MariaDB ColumnStore for high-performance analytics, from simple queries to complex analytics like cross-engine joins, aggregations and window functions.

Real-world big data analytics use cases: This session will cover real-world big data analytics use cases from various industries to show how to achieve deep analytical insights that drive business growth and change.

Ingesting data into ColumnStore: Best practices on loading data into the powerful columnar storage engine, ColumnStore, along with showcase examples, guidelines and demos.

Storing and querying in ColumnStore: Learn how ColumnStore stores data on disk in a multi-node configuration and how it accesses that data during query processing. This session will give you the tools to optimize your data and queries to take advantage of the ColumnStore data storage architecture.

The entire engineering team, product management, consulting team and many users of MariaDB ColumnStore will be at M|17. Come to learn new technical skills, ask questions of the experts, hear about a broad variety of analytic use cases, and be inspired to find an easier and faster approach to big data analytics!

Check out the full detailed agenda and register for M|17 today at https://m17.mariadb.com. We look forward to seeing you in New York!

by MariaDB Team at March 23, 2017 12:57 AM

Peter Zaitsev

The Puzzling Performance of the Samsung 960 Pro

In this blog post, I’ll take a look at the performance of the Samsung 960 Pro SSD NVME.

First, I know the Samsung 960 Pro is a consumer SSD NVME drive, not intended for sustained data center workloads. But the AnandTech review looked good enough that I decided to take it for a test spin to see if it would work well with MySQL benchmarks.

Before that, I decided to do a simple sysbench file IO test to see how the drive handled a sustained workload, and whether it would start acting up.

My expectation for a consumer SSD drive is that its write consistency will suffer. Many of those drives can sustain high bursts for short periods of time but have to slow down to keep up with wear leveling (and other internal activities SSDs must do). This is not what I saw, however.

I did a benchmark on an E5-2630L V3 with 64GB RAM, Ubuntu 16.04 LTS, an XFS filesystem, and a Samsung 960 Pro 512GB (FW: 1B6QCXP7):

sysbench --num-threads=64 --max-time=86400 --max-requests=0 --test=fileio --file-num=1 --file-total-size=260G --file-io-mode=async --file-extra-flags=direct --file-test-mode=rndrd run

Note: I used asynchronous direct IO to keep it close to how MySQL (InnoDB) submits IO requests.

This is what the “Read Throughput” graph looks like in Percona Monitoring and Management (PMM):

Samsung 960 Pro

As you can see, in addition to some reasonable ebbs and flows, we have some major dips from about 1.5GB/sec of random reads to around 800MB/sec. This almost halves the performance. We can clearly see two of those dips, with a third one starting when the test ended.

What is really interesting is that when I did a read-write test, it performed much more uniformly:

sysbench --num-threads=64 --max-time=86400 --max-requests=0 --test=fileio --file-num=1 --file-total-size=260G --file-io-mode=async --file-extra-flags=direct --file-test-mode=rndrw run

Samsung 960 Pro

Any ideas on what the cause of such strange periodic IO performance regression for reads could be?

This does not look like overheating throttling. It is much too regular for that (and I checked the temperature – it wasn’t any different during this performance regression).

One theory I have is “read disturb management”: could the SSD need to rewrite the data after so many reads? By my calculations, every cell is read some 166 times during the eight hours between those gaps. This doesn’t sound like a lot.
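
For reference, the arithmetic behind that estimate: at roughly 1.5GB/sec of sustained random reads, eight hours of the test reads about 1.5 GB/s × 8 × 3600 s ≈ 43 TB, and spread over the 260GB test file that works out to about 166 reads of every cell.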

What are your thoughts?

by Peter Zaitsev at March 23, 2017 12:14 AM

March 22, 2017

Peter Zaitsev

Percona Server for MySQL 5.5.54-38.7 is Now Available

Percona announces the release of Percona Server for MySQL 5.5.54-38.7 on March 22, 2017. Based on MySQL 5.5.54, including all the bug fixes in it, Percona Server for MySQL 5.5.54-38.7 is now the current stable release in the 5.5 series.

Percona Server for MySQL is open-source and free. You can find release details in the 5.5.54-38.7 milestone on Launchpad. Downloads are available here and from the Percona Software Repositories.

Bugs Fixed:
  • Log tracking initialization did not find the last valid bitmap data correctly, potentially resulting in needless redo log retracking or a hole in the tracked LSN range. Bug fixed #1658055.

Other bugs fixed: #1652912, and #1655587.

Find the release notes for Percona Server for MySQL 5.5.54-38.7 in our online documentation. Report bugs on the launchpad bug tracker.

by Hrvoje Matijakovic at March 22, 2017 05:26 PM

MariaDB Foundation

MariaDB Galera Cluster 10.0.30 and Connector/J 1.5.9 now available

The MariaDB project is pleased to announce the immediate availability of MariaDB Galera Cluster 10.0.30 and MariaDB Connector/J 1.5.9. Both of these are stable (GA) releases. See the release notes and changelogs for details. Download MariaDB Galera Cluster 10.0.30 Release Notes Changelog What is MariaDB Galera Cluster? MariaDB APT and YUM Repository Configuration Generator Download […]

The post MariaDB Galera Cluster 10.0.30 and Connector/J 1.5.9 now available appeared first on MariaDB.org.

by Daniel Bartholomew at March 22, 2017 04:12 PM

Valeriy Kravchuk

Fun With Bugs #50 - On Bugs Tagged as "missing manual"

Back in January 2014, some time after many nice people kindly asked me to stop writing about MySQL bugs on Facebook several times per day, I decided to start reading the fine MySQL Manual more carefully than before and report not only typos there, but also any topic or detail not properly explained. Usually these reports, tagged as "missing manual", were the result of careful study of the documentation based on a real user question or customer issue. So, most of these reports came from real life, and the missing information badly affected poor MySQL users.

Today, for this issue #50 in my series of posts about MySQL bugs, I decided to list and summarize 20 currently active (of 66 total) bugs (mostly documentation requests) tagged as "missing manual", starting from the oldest:
  • Bug #71293 - "Manual page for P_S.FILE_INSTANCES table does not explain EVENT_NAME values". Performance Schema was one of my favorite topics back then, as I was working on my second talk and presentation about it. Not a single comment since the bug was verified by Umesh Shastry.
  • Bug #71294 - "Manual page for P_S.FILE_INSTANCES table does not explain '~' in FILE_NAME". The bug was re-classified as a server one, but there has been no further activity since then. Go figure what this output may mean:

    mysql> select * from performance_schema.file_instances where event_name like '%parse%';
    +-----------------------------+------------------------------+------------+
    | FILE_NAME                   | EVENT_NAME                   | OPEN_COUNT |
    +-----------------------------+------------------------------+------------+
    | /var/lib/mysql/test/ti.TRG  | wait/io/file/sql/file_parser |          0 |
    | /var/lib/mysql/test/v2.frm~ | wait/io/file/sql/file_parser |          0 |
    +-----------------------------+------------------------------+------------+
    2 rows in set (0,00 sec)
  • Bug #71521 - "Manual does not list all valid values for innodb_flush_method". Actually, it seems the manual now lists them all, but the bug was not properly closed.
  • Bug #71732 - "Garbage value in output when MASTER_LOG_FILE='' is set". The bug was re-classified as a replication one, but I doubt the current state is documented in detail.
  • Bug #71808 - "Manual does not explain what TICK timer is and why it's different on Windows". Still waiting for something... I had not checked whether anything was documented, but the TICK timer still exists in 5.7.17.
  • Bug #72368 - "Empty/zero statistics for imported tablespace until explicit ANALYZE TABLE". This is an InnoDB bug, and it seems there was some work performed on it internally, but the only information in the manual about the need to run ANALYZE is in a user comment dated October 2014. Have I already informed you that I hate persistent InnoDB statistics, the way they are implemented, for many reasons (including this bug)? Now you know. Statistics must be stored, engine-independent and re-estimated only upon explicit DBA request, if you ask me...
  • Bug #73299 - "DEFAULT value for PRIMARY KEY column depends on the way to declare it PRIMARY". It's probably a server bug, but maybe, until it is fixed, the manual should explain the current server behavior in some note?
  • Bug #73305 - "Manual does not explain all cases when SHOW VIEW privilege is needed". The SHOW VIEW privilege may be needed to run EXPLAIN against a query referring to the view. See also Bug #73306 ("Manual does not explain what privileges are needed for EXPLAIN explainable_stmt"). I still remember the user's confusion that led to these reports...
  • Bug #73413 - "Manual does not explain MTS implementation in details". Try to find out from the manual what threads are created for a multi-threaded slave, what their typical statuses are, and whether the replication event format (ROW vs STATEMENT) matters for MTS or not... (see the quick Performance Schema check after this list).
  • Bug #76563 - "Manual does not explain when exactly AUTO-INC lock is set for "bulk inserts"". There are reasons to think that when the target table is different from the source one, the AUTO-INC lock is set on the target table after reading the first row from the source one. Check my old blog post for more details. This is the first still "Verified" bug in this list that is explicitly devoted to InnoDB locking. You'll see several more below.
  • Bug #77390 - "Manual does not explain a "deadlock" case of online ALTER". Trust me, online ALTER sets a metadata lock at an early stage, but it is not exclusive. Check some of my posts about MDL and this documentation request: Bug #84004 - "Manual misses details on MDL locks set and released for online ALTER TABLE".
  • Bug #79665 - "Manual does not explain locks set by INSERT ... ON DUPLICATE KEY UPDATE properly". It would be great to see the manual describe all the locks set by INSERT ... ON DUPLICATE KEY UPDATE carefully and properly, covering both the duplicate on PRIMARY key case and the duplicate on a secondary UNIQUE key case.
  • Bug #80067 - "Index on BIT column is not used when column name only is used in WHERE clause". It's a pure optimizer bug/problem, but while it is not resolved it would be nice for the manual to describe the current behavior.
  • Bug #82127 - "Deadlock with 3 concurrent DELETEs by UNIQUE key". The manual does not explain locks set on secondary indexes properly, for too many cases, including this one. InnoDB does work as designed, and you can find some explanations (by my colleague Jan Lindström) of this design and the reasons behind it in MDEV-10962. Check also Bug #83640 - "Locks set by DELETE statement on already deleted record" for an idea of how one may (mis-)interpret what really happens in similar cases. This is because InnoDB's implementation of locking is not properly explained, including implicit locks (see some details and links here), locking of secondary indexes etc. This missing information has led to all kinds of misunderstandings and speculation about "lock upgrading" etc. for decades already.
  • Bug #82212 - "mysqlbinlog can produce events larger than max_allowed_packet for mysql". This is a server problem, but, as I put it, please describe clearly in the manual the "safe" setting of max_allowed_packet in the case of row-based replication, as well as any workarounds for the case when max_allowed_packet was 1G on the server that produced a binary log with a huge row-based event that one now needs to restore.
  • Bug #83024 - "Internals manual does not explain COM_SLEEP in details". One may argue that this is truly irrelevant for most users, but without that explanation, slow log content like the following is sometimes hard to explain:

    SET timestamp=1473712798;
    # administrator command: Sleep;
    # Time: 160912 20:39:59
    # User@Host: user[host] @ [192.168.1.51]
    # Thread_id: 36310042 Schema: somedb QC_hit: No
    # Query_time: 17.201526 Lock_time: 0.000000 Rows_sent: 0 Rows_examined: 0
  • Bug #85557 - "Manual does not explain locks set by UPDATE with subquery referring other table". I had to report it yesterday, as some users consider the current behavior (proper, but not documented at all) a bug and complained. My dear friend Sinisa Milivojevic verified it promptly.
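
As a quick illustration for the MTS item above (Bug #73413), one way to at least see which slave threads exist is to query Performance Schema. This is just a sketch, assuming performance_schema is enabled and the 5.6/5.7 thread instrument names:

SELECT NAME, PROCESSLIST_STATE
FROM performance_schema.threads
WHERE NAME LIKE 'thread/sql/slave%';

On a multi-threaded slave this should list the IO thread, the SQL/coordinator thread and the slave_worker threads, which at least shows what is running even if their statuses are not documented in detail.
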
I understand that it's hard to maintain the quality of the MySQL manual, and some of the documentation requests mentioned above stay active for years just because it's really a lot of work to document things properly. The documentation team is probably concentrating on describing the new and shiny features of MySQL 8.0 (one day I'll start to read its manual) or InnoDB Cluster/Group Replication.

If the team needs somebody to help, please get in touch with me, as I may have a suggestion whom you can hire. (It's not me, I am not qualified as I am not a native speaker. I'd better report problems and missing details here and there...)

by Valeriy Kravchuk (noreply@blogger.com) at March 22, 2017 12:02 PM

MariaDB Foundation

Who are you? The history of MySQL and MariaDB authentication protocols from 1997 to 2017

MySQL 3.20 to 4.0 In the good old days, when 32MB of RAM justified the name my-huge.cnf, when nobody knew Google and Facebook didn’t even exist, security was… how do I put it… kind of cute. Computer viruses didn’t steal millions and didn’t disrupt elections — they played Yankee Doodle or told you not to […]

The post Who are you? The history of MySQL and MariaDB authentication protocols from 1997 to 2017 appeared first on MariaDB.org.

by Sergei at March 22, 2017 11:13 AM

Jean-Jerome Schmidt

How to Secure MySQL/MariaDB Servers

After attacks on MongoDB databases, we have recently also seen that MySQL servers are being targeted by ransomware. This should not come as a surprise, given the increasing adoption of public and private clouds. Running a poorly configured database in the cloud can become a major liability.

In this blog post, we’ll share with you a number of tips on how to protect and secure your MySQL or MariaDB servers.

Understanding the Attack Vector

Quoting SCMagazine:
The attack starts with brute-forcing the root password for the MySQL database. Once logged in, the MySQL databases and tables are fetched. The attacker then creates a new table called ‘WARNING' that includes a contact email address, a bitcoin address and a payment demand.

Based on the article, the attack vector starts by guessing the MySQL root password via the brute-force method. A brute-force attack consists of an attacker trying many passwords or passphrases in the hope of eventually guessing correctly. This means short passwords can usually be discovered quite quickly, but longer passwords may take days or months.

Brute-force is a common attack that can happen to any service. Unfortunately for MySQL (and many other DBMSs), there is no out-of-the-box feature that detects and blocks brute-force attacks from specific addresses during user authentication. MySQL does capture authentication failures in the error log though.
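
There is no built-in blocking, but you can at least make those failures more visible. A minimal sketch, assuming the MySQL 5.6-style setting (MySQL 5.7 controls error log detail via log_error_verbosity instead):

SET GLOBAL log_warnings = 2;  -- write access-denied errors and aborted connections to the error log

An external tool such as fail2ban can then watch the error log and block addresses that fail repeatedly.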

Review your Password Policy

Reviewing the MySQL password policy is always the first step to protect your server. The MySQL root password should be strong, with a combination of letters, numbers and symbols (which makes it harder to remember), and stored in a safe place. Change the password regularly, at least every calendar quarter. Based on the attack vector, this is the weakest point that hackers target. If you value your data, don’t overlook this part.
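
On MySQL 5.6/5.7 and Percona Server, the validate_password plugin can enforce such a policy on the server side. A minimal sketch (MariaDB ships its own equivalents, such as the simple_password_check plugin):

INSTALL PLUGIN validate_password SONAME 'validate_password.so';
SET GLOBAL validate_password_policy = 'MEDIUM';
SET GLOBAL validate_password_length = 12;

With this in place, weak passwords are rejected at CREATE USER / SET PASSWORD time rather than discovered during an incident.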

MySQL deployments performed by ClusterControl always follow the vendor’s security best practices; for example, no wildcard host is defined during GRANT, and sensitive login credentials stored in the configuration file are readable only by the OS root user. We strongly recommend our users to specify a strong password during the deployment stage.

Isolate the MySQL Server

In a standard production environment, database servers are usually located in a lower-level tier. This layer should be protected and only accessible from the upper tier, such as the application or load balancer. If the database is co-located with the application, you can even lock it down against non-local addresses and use the MySQL socket file instead (less overhead and more secure).

Configuring the "bind-address" parameter is vital here. Take note that MySQL binding is limited to either none, one or all IP addresses (0.0.0.0) on the server. If you have no choice and need MySQL to listen to all network interfaces, restrict the access to the MySQL service from known good sources. Use a firewall application or security group to whitelist access only from hosts that need to access the database directly.

Sometimes, the MySQL server has to be exposed to a public network for integration purposes (e.g., monitoring, auditing, backup etc.). That’s fine as long as you draw a border around it. Don’t let unwanted sources “see” the MySQL server. Plenty of people know 3306 is the default port for the MySQL service, and by simply performing a port scan against a network address, an attacker can build a list of exposed MySQL servers in a subnet in less than a minute. Advisedly, use a custom MySQL port by configuring the "port" parameter in the MySQL configuration file to minimize the exposure risk.
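
A quick way to double-check what the server actually exposes is to ask it directly:

SHOW GLOBAL VARIABLES WHERE Variable_name IN ('bind_address', 'port', 'socket');

Anything beyond that still has to be enforced at the firewall or security group level, as described above.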

Review the User Policy

Limit the users who hold critical administration rights, especially GRANT, SUPER and PROCESS. You can also enable super_read_only if the server is a slave; it is only available in MySQL 5.7.8 and Percona Server 5.6.21 and later (sadly not in MariaDB). When enabled, the server will not allow any updates, besides updating the replication repositories if slave status logs are tables, even for users that have the SUPER privilege. Remove the default test database and any users with empty passwords to narrow the scope of penetration. This is one of the security checks performed by ClusterControl, implemented as a database advisor.
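
A minimal sketch of those checks (the mysql.user column is authentication_string in MySQL 5.7; use the password column on older versions):

-- make a slave refuse writes, even from SUPER users (MySQL 5.7.8+ / Percona Server 5.6.21+)
SET GLOBAL super_read_only = ON;

-- find anonymous accounts and accounts with an empty password
SELECT User, Host FROM mysql.user WHERE User = '' OR authentication_string = '';

-- drop the default test database
DROP DATABASE IF EXISTS test;

The mysql_secure_installation script that ships with MySQL covers much of the same ground interactively.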

It’s also a good idea to restrict the number of connections permitted to a single account. You can do so by setting the max_user_connections variable in mysqld (the default is 0, meaning unlimited) or by using the resource control options in GRANT/CREATE USER/ALTER USER statements. The GRANT statement supports limiting the number of simultaneous connections to the server by an account, for example:

mysql> GRANT ALL PRIVILEGES ON db.* TO 'db_user'@'localhost' WITH MAX_USER_CONNECTIONS 2;
Create MySQL account with MAX_USER_CONNECTIONS resource control option using ClusterControl

The default administrator username on the MySQL server is “root”. Hackers often attempt to gain access to its permissions. To make this task much harder, rename “root” to something else. MySQL user names can be up to 32 characters long (16 characters before MySQL 5.7.8). It is possible to use a longer username for the super admin user by using the RENAME USER statement as shown below:

mysql> RENAME USER root TO new_super_administrator_username;

A side note for ClusterControl users: ClusterControl needs to know the MySQL root user and password to automate and manage the database server for you. By default, it will look for ‘root’. If you rename the root user to something else, specify “monitored_mysql_root_user={new_user}” inside cmon_X.cnf (where X is the cluster ID) and restart the CMON service to apply the change.

Backup Policy

Even though the hackers stated that you would get your data back once the ransom was paid, this was usually not the case. Increasing the backup frequency increases the possibility of restoring your deleted data. For example, instead of a full backup once a week with daily incremental backups, you can schedule a full backup once a day with hourly incremental backups. You can do this easily with ClusterControl’s backup management feature, and restore your data if something goes wrong.

If you have binary logs (binlogs) enabled, that’s even better. You can create a full backup every day and back up the binary logs. Binlogs are important for point-in-time recovery and should be backed up regularly as part of your backup procedure. DBAs tend to miss this simple method, which is worth every cent. If you get hacked, you can always recover to the last point before it happened, provided the hackers did not purge the binary logs. Take note that binary log purging is only possible when the attacker has the SUPER privilege.
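
A minimal sketch of that routine from the SQL side (copying the listed files to off-server storage is left to your backup tooling):

FLUSH BINARY LOGS;    -- rotate, so the previous binlog file is closed and safe to copy
SHOW BINARY LOGS;     -- list the files (and sizes) to copy off the server
-- only once the copies have been verified:
PURGE BINARY LOGS BEFORE DATE_SUB(NOW(), INTERVAL 7 DAY);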

One more important thing is that the backup files must be restorable. Verify the backups every now and then, and avoid bad surprises when you need to restore.

Safeguard your Web/Application Server

Well, if you have isolated your MySQL servers, there is still a chance for attackers to access them via the web or application server. By injecting a malicious script (e.g., cross-site scripting, SQL injection) against the target website, one can get into the application directory and gain the ability to read the application files. These might contain sensitive information, for instance the database login credentials. With these, an attacker can simply log into the database, delete all tables and leave a “ransom” table behind. It doesn’t necessarily take the MySQL root user to ransom a victim.

There are thousands of ways to compromise a web server, and you can’t really close inbound ports 80 or 443 for this purpose. Another layer of protection is required to safeguard your web server from HTTP-based injections. You can use a Web Application Firewall (WAF) like Apache ModSecurity, NAXSI (WAF for nginx) or WebKnight (WAF for IIS), or simply run your web servers behind a secure Content Delivery Network (CDN) like CloudFlare, Akamai or Amazon CloudFront.

Always Keep Up-to-date

You have probably heard about the critical zero-day MySQL exploit, where a non-privileged user can escalate itself to super user? It sounds scary. Luckily, all known vendors have updated their repositories to include a bug fix for this issue.

For production use, it’s highly recommended for you to install the MySQL/MariaDB packages from the vendor’s repository. Don’t rely on the default operating system repository, where the packages are usually outdated. If you are running in a cluster environment like Galera Cluster, or even MySQL Replication, you always have the choice to patch the system with minimal downtime. Make this into a routine and try to automate the upgrade procedure as much as possible.

ClusterControl supports minor version rolling upgrades (one node at a time) for MySQL/MariaDB with a single click. A major version upgrade (e.g., from MySQL 5.6 to MySQL 5.7) commonly requires uninstalling the existing packages, and it is a risky task to automate. Careful planning and testing are necessary for that kind of upgrade.

Conclusion

Ransomware is an easy pot of gold for attackers. We will probably see more security breaches in the future, and it is better to take action before something happens. Hackers are targeting the many vulnerable servers out there, and very likely this attack will spread to other database technologies as well. Protecting your data is a constant challenge for database administrators. The real enemy is not the offender, but our attitude towards protecting our critical assets.

by ashraf at March 22, 2017 11:00 AM

Daniël van Eeden

Network attacks on MySQL, Part 3: What do you trust?

In my previous blogs I told you to enable SSL/TLS and force the connection to be secured. So I followed my advice and forced SSL. Great!

So now everything is 100% secure, isn't it?

No it isn't and I would never claim anything to be 100% secure.

There are important differences between the SSL/TLS implementations of browsers and the implementation in MySQL. One of these differences is that your browser has a trust store with a large set of trusted certificate authorities. If the website you visit has SSL enabled, then your browser will check if the certificate it presents is signed by a trusted CA. MySQL doesn't use a list of trusted CAs, and this makes sense for many setups.

The key difference is that a website has clients (browsers) which are not managed by the same organization. For MySQL connections, the set of clients is often much smaller and more or less managed by one organization. Adding a CA for a set of MySQL connections is ok; adding a CA for groups of websites is not.

The result is that a self-signed certificate or a certificate signed by an internal CA is ok. A public CA also won't issue a certificate for internal hostnames, so if your server has an internal hostname this isn't even an option. Note that organizations running public CAs sometimes offer a service where they manage your internal CA, but then your CA is not signed by the public CA.

But if you don't tell your MySQL client or application which CAs it should trust, it will trust all certificates. This allows an attacker to use a man-in-the-middle proxy which terminates the SSL connection between your client and the proxy, and then sets up another connection to the server, which may or may not be using SSL.

To protect against this attack:

  1. Use the --ssl-ca option for the client to specify the CA certificate.
  2. Use the --ssl-mode=VERIFY_CA option for the client.

You could use a CA for each server or one CA for all MySQL servers in your organization. If you use multiple CAs then you should bundle them in one file or use --ssl-capath instead.
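
Independently of these client options, it is worth confirming from inside an established session that TLS is actually used and which CA the server was started with:

SHOW SESSION STATUS LIKE 'Ssl_cipher';   -- a non-empty value means the connection is encrypted
SHOW GLOBAL VARIABLES LIKE 'ssl_ca';

Keep in mind this only confirms encryption; it is the --ssl-ca / --ssl-mode=VERIFY_CA options above that actually protect against the man-in-the-middle scenario.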

by Daniël van Eeden (noreply@blogger.com) at March 22, 2017 08:00 AM

March 21, 2017

Peter Zaitsev

Dropping the Foreign Key Constraint Using pt-online-schema-change

In this blog post, we’ll look at how to get rid of an unused Foreign Key (FK) constraint and/or related columns/keys with the help of pt-online-schema-change and the power of its plugins.

Before we proceed, here is a useful blog post written by Peter Zaitsev on Hijacking Innodb Foreign Keys.

If you are trying to get rid of an unused foreign key (FK) constraint and related columns on versions older than MySQL 5.6, or on tables that cannot be altered with ALTER TABLE ... ALGORITHM=INPLACE because of the limitations mentioned here (specifically, tables with 5.5 TIMESTAMP formats), you can use pt-online-schema-change.

Running DROP FOREIGN KEY constraint_name with pt-online-schema-change requires specifying _constraint_name rather than the real constraint_name. This is due to a limitation in MySQL: pt-online-schema-change adds a leading underscore to foreign key constraint names when creating the new table.

Here is a simple example of one such case:

CREATE TABLE `test3` (
  `Id` int(11) NOT NULL DEFAULT '0',
  `Firstname` varchar(32) DEFAULT NULL,
  `City` varchar(32) DEFAULT NULL,
  PRIMARY KEY (`Id`),
  CONSTRAINT `FKID` FOREIGN KEY (`Id`) REFERENCES `test4` (`Id`)
 ) ENGINE=InnoDB DEFAULT CHARSET=latin1

To drop the constraint, we are supposed to add an underscore prior to the constraint_name FKID:

[root@siddhant ~]# pt-online-schema-change --user=root --execute --set-vars=foreign_key_checks=0  --alter-foreign-keys-method=rebuild_constraints --alter="DROP FOREIGN KEY _FKID" D=apps02,t=test3 --socket=/tmp/mysql-master5520.sock
Operation, tries, wait:
analyze_table, 10, 1
copy_rows, 10, 0.25
...
Altering `apps02`.`test3`...
Creating new table...
Created new table apps02._test3_new OK.
Altering new table...
...
2017-02-11T12:45:12 Dropped old table `apps02`.`_test3_old` OK.
2017-02-11T12:45:12 Dropping triggers...
2017-02-11T12:45:12 Dropped triggers OK.
Successfully altered `apps02`.`test3`.

Below is one case where, if for some reason you already have an FK constraint name that starts with an underscore, the above method of adding an additional underscore to the already underscored _FK name will fail with an error while dropping it:

Error altering new table `apps02`.`_test3_new`: DBD::mysql::db do failed: Error on rename of './apps02/_test3_new' to './apps02/#sql2-697-19' (errno: 152) [for Statement "ALTER TABLE `apps02`.`_test3_new` DROP FOREIGN KEY ___FKID"] at /usr/bin/pt-online-schema-change line 9069.

In such cases, we will have to make use of the --plugin option, used along with a file that calls the pt_online_schema_change_plugin class and the after_alter_new_table hook, to drop the FK constraint. For example, a table with an underscored FK constraint is:

CREATE TABLE `test` (
  `Id` int(11) NOT NULL DEFAULT '0',
  `Firstname` varchar(32) DEFAULT NULL,
  `City` varchar(32) DEFAULT NULL,
  PRIMARY KEY (`Id`),
  CONSTRAINT `___fkId` FOREIGN KEY (`Id`) REFERENCES `test2` (`Id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

Here we have a table with the foreign key ___fkId using three underscores. Our plugin for dropping the constraint should be as follows:

[root@siddhant ~]# cat ptosc_plugin_drop_fk.pl
package pt_online_schema_change_plugin;
use strict;

sub new {
   my ($class, %args) = @_;
   my $self = { %args };   # keep the connection and other arguments passed in by pt-online-schema-change
   return bless $self, $class;
}

# Hook invoked by pt-online-schema-change right after the new (shadow) table has been altered
sub after_alter_new_table {
   my ($self, %args) = @_;
   my $new_tbl = $args{new_tbl};
   my $dbh     = $self->{cxn}->dbh;
   # Drop the FK on the new table; note the name has one underscore less than the original (see the NOTE below)
   my $sth = $dbh->prepare("ALTER TABLE $new_tbl->{name} DROP FOREIGN KEY __fkId");
   $sth->execute();
}
1;

NOTE: the DROP FOREIGN KEY name in the plugin has one underscore less than the original foreign key constraint, __fkId vs. ___fkId. Also, the alter statement will be a no-op ALTER (i.e., --alter="ENGINE=INNODB").

Here is the pt-online-schema-change execution example with the plugin:

[root@siddhant ~]#  pt-online-schema-change --user=root --execute  --set-vars=foreign_key_checks=0  --alter-foreign-keys-method=rebuild_constraints --alter="ENGINE=INNODB" --plugin=/root/ptosc_plugin_drop_fk.pl  D=apps01,t=test --socket=/tmp/mysql-master5520.sock
Created plugin from /root/ptosc_plugin_drop_fk.pl.
Operation, tries, wait:
  analyze_table, 10, 1
  copy_rows, 10, 0.25
  create_triggers, 10, 1
  drop_triggers, 10, 1
  swap_tables, 10, 1
  update_foreign_keys, 10, 1
Altering `apps01`.`test`...
Creating new table...
Created new table apps01._test_new OK.
Altering new table...
Altered `apps01`.`_test_new` OK.
2017-02-11T11:26:14 Creating triggers...
2017-02-11T11:26:14 Created triggers OK.
2017-02-11T11:26:14 Copied rows OK.
2017-02-11T11:26:14 Swapping tables...
2017-02-11T11:26:14 Swapped original and new tables OK.
2017-02-11T11:26:14 Dropping old table...
2017-02-11T11:26:14 Dropped old table `apps01`.`_test_old` OK.
2017-02-11T11:26:14 Dropping triggers...
2017-02-11T11:26:14 Dropped triggers OK.
Successfully altered `apps01`.`test`.

by Siddhant Sawant at March 21, 2017 10:35 PM

Webinar Wednesday March 22, 2017: TokuDB Troubleshooting

Please join Percona’s Principal Technical Services Engineer Sveta Smirnova, Senior Software Engineer George Lorch, and Software Engineer Vlad Lesin as they present TokuDB Troubleshooting on March 22, 2017 at 11:00 am PDT / 2:00 pm EDT (UTC-7).

 TokuDB is an alternative storage engine, designed for big data applications. It provides great write workload scalability features. While the feature set is similar to InnoDB’s, this engine has its own specific configuration settings and troubleshooting instruments. This webinar will look at how to use them for optimal performance.

We will discuss how to deal with:

  • Data corruption issues
  • Inconsistent data
  • Locks
  • Slow performance

We will cover well-known instruments and tools, and show how they work with the TokuDB storage engine.

Register for the webinar here.

Vladislav Lesin, Software Engineer

Vladislav Lesin is a software engineer at Percona, joining in April 2012. Before coming to Percona, he worked on improving the performance and reliability of high-load projects with LAMP architectures. His work consisted of developing fast servers and modules in C and C++, monitoring project state, finding bottlenecks, and patching open source projects including nginx, memcache, sphinx, php and ejabberd. He took part in developing not only server-side applications, but desktop and mobile ones too. He also has experience in project/product management, hiring, and partner negotiations.

Before that he worked in several IT companies, where he developed desktop applications in C++ for areas such as industrial automation, parallel computing and media production. He holds a Master’s Degree in Technique and Technology from Tula State University. Now he lives in Tula City with his wife and daughter.

Sveta Smirnova, Principal Technical Services Engineer

Sveta joined Percona in 2015. Her main professional interests are problem-solving, working with tricky issues, bugs, finding patterns that can solve typical issues quicker and teaching others how to deal with MySQL issues, bugs and gotchas effectively. Before joining Percona Sveta worked as a Support Engineer in the MySQL Bugs Analysis Support Group in MySQL AB-Sun-Oracle.

She is the author of the book “MySQL Troubleshooting” and JSON UDF functions for MySQL.

George Lorch, Software Engineer

George joined the Percona development team in April 2012. George has over 20 years of experience in software support, development, architecture and project management. Prior to joining Percona, George was focused on Windows-based enterprise application server development and network protocol classification and optimization, with heavy doses of database schema design, architecture and tuning.

by Dave Avery at March 21, 2017 02:02 PM

Jean-Jerome Schmidt

MySQL Replication: All the Severalnines Resources

MySQL Replication has become an instrumental part of scale-out architectures in LAMP environments. MySQL offers plenty of solutions when there is a need to scale out, the most common being to add read replicas.

Building a database HA stack for production can be daunting. It is not just about setting up replication between a master and some slave servers, it’s also about how to restore broken topologies and fail-over, how applications can keep track of the writable master and the read-only slaves, what to do when servers are corrupted, how to perform backups, and more.

We’ve produced a number of resources aimed at helping users to get started with MySQL Replication or to get more out of their existing setups.

The White Papers

The MySQL© Replication Blueprint

This is a great resource for anyone wanting to build or optimise a MySQL replication setup. The MySQL Replication Blueprint is about having a complete ops-ready solution from end to end. From monitoring, management and through to load balancing, all important aspects are covered.

Download the whitepaper

MySQL Replication for High Availability

This whitepaper covers MySQL Replication with information on the latest features introduced in 5.6 and 5.7. There is also a hands-on, practical section on how to quickly deploy and manage a replication setup using ClusterControl.

Download the whitepaper

The On-Demand Webinars

Top 9 Tips for Building a Stable MySQL© Replication Environment

MySQL replication is a widely known and proven solution to build scalable clusters of databases. It is very easy to deploy, even easier with GTID. However, ease of deployment doesn't mean you don't need knowledge and skills to operate it correctly. If you'd like to learn what is needed to build a stable environment using MySQL replication, then this webinar is for you.

Watch the replay!

Introducing the Severalnines MySQL© Replication Blueprint

The Severalnines Blueprint for MySQL Replication includes all aspects of a MySQL Replication topology, with the ins and outs of deployment, setting up replication, monitoring, upgrades, performing backups and managing high availability using proxies such as ProxySQL, MaxScale and HAProxy. This webinar provides an in-depth walk-through of this blueprint and explains how to make best use of it.

Watch the replay!

Managing MySQL Replication for High Availability

This webinar covers deployment and management of MySQL replication topologies using ClusterControl. We show you how to schedule backups, promote slaves, and which metrics are worth keeping a close eye on. We also demonstrate how you can deal with schema and topology changes and how to solve the most common replication issues.

Watch the replay!

Become a MySQL DBA: Schema Changes for MySQL Replication & Galera Cluster

Find out how to implement schema changes in the least impacting way to your operations and ensure availability of your database. This webinar also covers some real-life examples and discusses how to handle them.

Watch the replay!

Become a MySQL DBA: Replication Topology Changes for MySQL and MariaDB

Discover how to perform replication topology changes in MySQL / MariaDB, and what the failover process may look like. This webinar also discusses some external tools you may find useful when dealing with these operations.

Watch the replay!

Tutorials

MySQL Replication for High Availability - Tutorial

Learn about a smarter Replication setup that uses a combination of advanced replication techniques including mixed binary replication logging, auto-increment offset seeding, semi-sync replication, automatic fail-over/resynchronization and one-click addition of read slaves.  Our tutorial covers the concepts behind our MySQL Replication solution and explains how to deploy and manage it.

Read the Tutorial!

Top Blogs

How to deploy and manage MySQL multi-master replication setups with ClusterControl 1.4

MySQL replication, while simple and popular, may come in different shapes and flavors. Master-slave or master-master topologies can be configured to suit your environment. ClusterControl 1.4 brings a list of enhancements to deploy and manage different types of MySQL replication setups. This blog outlines the different topologies that can be deployed, the merits of each topology, and shows how each can be managed in a live environment.

Read More!

Automatic failover of MySQL Replication - New in ClusterControl 1.4

MySQL replication setups are inevitably related to failovers - what do you do when your master fails and your applications are not able to write to the database anymore? Automated failover is required if you need to quickly recover an environment to keep your database up 24x7. This blog post discusses this new replication feature recently introduced in ClusterControl 1.4.

Read More!

Automating MySQL Replication with ClusterControl 1.4.0 - what’s new

This blog post will go through new replication features in ClusterControl 1.4.0, including enhanced multi-master deployment, managing replication topology changes, automated failover and handling of replication errors.

Read More!

MySQL Replication failover: Maxscale vs MHA (A Four Part Series)

This series describes how you can implement automated failover with MariaDB MHA, how you can implement automated failover with MariaDB using Maxscale and MariaDB Replication Manager, how you can implement automated failover with MariaDB using Maxscale and MHA and compares the two with each other, and an addendum on the MariaDB Replication Manager covering the new improved

Read More!

ClusterControl for MySQL Replication

ClusterControl provides advanced deployment, management, monitoring, and scaling functionality to get your MySQL replication instances up and running, using proven methodologies that you can depend on to work. It makes MySQL Replication easy and secure with point-and-click interfaces and no need to have specialized knowledge about the technology or multiple tools. It covers all aspects one might expect for a production-ready replication setup.

ClusterControl delivers on an array of features to help deploy, manage, monitor, and scale your MySQL Replication environments.

  • Point-and-Click Deployment: Point-and-click, automatic deployment for MySQL replication is available in both community and enterprise versions of ClusterControl.
  • Management & Monitoring: ClusterControl provides management features to repair and recover broken nodes, as well as test and automate MySQL upgrades. It also provides a unified view of all MySQL nodes across your data centers and lets you drill down into individual nodes for more detailed statistics.
  • Automatic Failure Detection and Handling: ClusterControl takes care of your replication cluster’s health. If a master failure is detected, ClusterControl automatically promotes one of the available slaves to ensure your cluster is always up.
  • Proxy Integration: ClusterControl makes it easy to build a proxy layer over your replication setup; it shields applications from replication topology changes, server failures and changed writable masters. With just a couple of clicks you can improve the availability of your stack.

Learn more about how ClusterControl can simplify deployment and enhance performance here.

We trust that these resources prove useful!

Happy replicating!

by Severalnines at March 21, 2017 11:00 AM

March 20, 2017

Peter Zaitsev

Running Percona XtraBackup on Windows … in Docker

In this blog, we’ll look at running Percona XtraBackup on Windows via a Docker container.

The question whether Percona XtraBackup is available for Windows comes up every so often. While we are not planning to provide regular releases for Windows, I decided to share a way to run Percona XtraBackup in a Docker container (especially since Docker support for Windows has become more and more stable).

For this exercise, I created a playground Docker image: perconalab/percona-xtrabackup.

First, we need to prepare a few things to make it work:

  1. Install Docker on Windows (the current version I am running is 17.03)
  2. Enable the sharing of disk C in Docker settings
  3. Find out the IP address MySQL is running on (192.168.1.122 in my case)
  4. Grant backup-required privileges for the xtrabackup user:

GRANT RELOAD,PROCESS,LOCK TABLES,REPLICATION CLIENT ON *.* TO 'xtrabackup'@'192.%' IDENTIFIED by 'xtrapassword'

Now, let’s assume our datadir is in C:/mysqldata, and we want to back up to C:/mysqlbackup. Needless to say, the XtraBackup image must run on the same server as MySQL’s datadir (since XtraBackup needs to access the data to copy it).
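
If you are unsure where the datadir really is, ask the server before setting up the volume mappings:

SHOW GLOBAL VARIABLES LIKE 'datadir';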

Now to take a backup we execute:

docker run --rm -it -v //C/mysqldata:/var/lib/mysql -v //C/mysqlbackup:/xtrabackup_backupfiles perconalab/percona-xtrabackup --backup --host=192.168.1.122 --user=xtrabackup --password=xtrapassword

We find our backup in C:/mysqlbackup when it is done.

Enjoy!

by Vadim Tkachenko at March 20, 2017 11:06 PM

Prophet: Forecasting our Metrics (or Predicting the Future)

In this blog post, we’ll look at how Prophet can forecast metrics.

Facebook recently released a forecasting tool called Prophet. Prophet can forecast a particular metric in which we have an interest. It works by fitting time-series data to get a prediction of how that metric will look in the future.

For example, it could be used to:

  • Predict how much HTTP traffic we will get, and scale accordingly when needed
  • See if a particular feature of our application will have success or if its usage will decline
  • Get an approximate date when our database server’s resources will be exhausted
  • Forecast new customer’s sign up and resize the staff accordingly
  • See what next year’s Black Friday or Cyber Monday will look like, and if we have the resources to handle them
  • Predict how many animals will enter a shelter in the coming years, as I did in a personal project I will show here

At its core, it uses a Generalized Additive Model. It is basically the merging of two models. First, a generalized linear model that, in the case of Prophet, can be a linear or logistic regression (depending on what we choose). Second, an additive model applied to that regression. The final graph represents the combination of those two. That is, the smoothed regression area of the variable to predict. For more technical details of how it works, check out Prophet’s paper.
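
In the notation of the Prophet paper, the combined model is roughly

y(t) = g(t) + s(t) + h(t) + ε_t

where g(t) is the (linear or logistic) trend, s(t) the periodic seasonality, h(t) the holiday effects, and ε_t the error term.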

Most of the previous points can be summarized in a simple concept, capacity planning. Let’s see how it works.

Usage Example

Prophet provides either a Python or R library. The following example will use the Python one. You can install it using:

pip install prophet

Prophet expects the metrics with a particular structure: a Pandas DataFrame with two columns, ds and y:

           ds   y
0  2013-10-01  34
1  2013-10-02  43
2  2013-10-03  20
3  2013-10-04  12
4  2013-10-05  46

The data I am going to use here is from Kaggle Competition Shelter Animal Outcomes. The idea is to find out how Austin Animal Center‘s workload will evolve in the future by trying to predict the number of animal outcomes per day for the next three years. I am using this dataset because it has enough data, shows a very simple trend and it is a non-technical metric (no previous knowledge on the topic is needed). The same method can be applied to most of the services or business metrics you could have.

At this point, we have the metric stored in a local variable, called “series” in this particular example. Now we only need to fit it into our model:

from prophet import Prophet  # depending on the installed version, the import may be fbprophet instead

m = Prophet()
m.fit(series);

and define how far into the future we want to predict (three years in this case), then generate the forecast:

future = m.make_future_dataframe(periods=365*3)
forecast = m.predict(future)

Now, just plot the data:

import matplotlib.pyplot as plt  # needed for the title and label calls below

m.plot(forecast)
plt.title("Outcomes forecast per Year",fontsize=20)
plt.xlabel("Year",fontsize=20)
plt.ylabel("Number of outcomes",fontsize=20)
plt.show()

Prophet

The graph shows a smoothed regression surface. We can see that the data provided covers from the last months of 2013 to the beginning of 2016. From that point on, the values are predictions.

We can already find some interesting data. Our data shows a large increase during the summer months, and the model predicts this will continue in the future. But this representation also has some problems. As we can see, there are at least three outliers with values > 65. The fastest way to deal with outliers is to just remove them. 🙂

series[series["y"]>65]

           ds    y
0  2014-07-12  129
1  2015-07-18   97
2  2015-07-19   81

series.drop(series[series["y"]>60].index,inplace=True)

Now the graph looks much better. Let’s also add a horizontal line that will help to see the trend:

Prophet

From that forecast, Austin Animal Center should expect an increase in the next few years, but not a large one. Therefore, the year-over-year increase won’t cause problems in the near future. But there could come a moment when the shelter reaches its maximum capacity.

Recommendations

  • If we want to forecast a metric, we recommend you have at least one year of data to fit the model. If we have less data, we could miss some seasonal effects, such as the large increase of work during the summer months in our model above.
  • In some cases, you might only want information about particular holidays (for example Black Fridays or Christmas). In that case, it is possible to create a model for those particular days. The documentation explains how to do this. But in summary, you need to create a new Pandas DataFrame that includes all previous Black Friday dates, and those from the future that you want to predict. Then, create the model as before, but specify that you are interested in a holiday effect:
    m = Prophet(holidays=holidays)
  • We recommend you use daily data. The graph could show strange results if we want daily forecasts from non-daily data. If the metric is monthly, freq='M' can be used (as shown in the documentation).

Conclusion

When we want to predict the future of a particular metric, we can use Prophet to make that forecast, and then plan for it based on the information we get from the model. It can be used on very different types of problems, and it is very easy to use. Do you want to know how loaded your database will be in the future? Ask Prophet!

by Miguel Angel Nieto at March 20, 2017 07:54 PM

Valeriy Kravchuk

Testing MyRocks vs InnoDB Performance Using sysbench 1.x oltp_point_select.lua

It seems MyRocks is going to become a hot topic in April 2017. Previously (here and there) I tried to compare its performance and scalability vs InnoDB from MySQL 5.7.17 using a test case from the famous bug #68079. It's an interesting case that took a lot of effort from Oracle to make InnoDB scale properly, and InnoDB (on my QuadCore box at least; others reported different results on other hardware in comments) still outperformed MyRocks. But maybe it's a corner case that is not a big deal in general?

Earlier this month I decided to give MyRocks another chance and try it with "industry-standard" benchmarks, like those provided by the sysbench tool. At the same time, I studied the impact of adaptive hash indexing (AHI) on InnoDB (for a reason I am not yet ready to share), along the lines of this great post by Peter Zaitsev. The study is not yet complete, and I am not yet sure that it makes sense to continue doing it on my ages-old QuadCore box with Fedora 25, but in the process I got one interesting and repeatable result that I'd like to share in any case.

For that study I decided to use recent sysbench 1.1.x, so I had to build it from source to begin with. I did the following:
[openxs@fc23 git]$ git clone https://github.com/akopytov/sysbench.git
but then during the ./configure run I got a small problem:
...
checking for pkg-config... yes
checking for xxd... no
configure: error: "xxd is required to build sysbench (usually comes with the vim package)"
So, I had to install the vim package:
[openxs@fc23 sysbench]$ sudo yum install vim
...
Installed:
  gpm-libs.x86_64 1.20.7-9.fc24         vim-common.x86_64 2:8.0.386-1.fc25
  vim-enhanced.x86_64 2:8.0.386-1.fc25  vim-filesystem.x86_64 2:8.0.386-1.fc25

Complete!
and then the build and installation process (with all defaults and the MariaDB software provided by Fedora present) completed without any problems, and I ended up with a nice new sysbench version:
[openxs@fc23 sysbench]$ /usr/local/bin/sysbench --version
sysbench 1.1.0-2343e4b

[openxs@fc23 sysbench]$ ls /usr/local/share/sysbench/
bulk_insert.lua  oltp_point_select.lua  oltp_update_non_index.lua  tests
oltp_common.lua  oltp_read_only.lua     oltp_write_only.lua
oltp_delete.lua  oltp_read_write.lua    select_random_points.lua
oltp_insert.lua  oltp_update_index.lua  select_random_ranges.lua
As I use all default settings for both MyRocks and InnoDB, I decided to start testing with the simplest test, oltp_point_select.lua, and a table size that does NOT fit into the default 128M buffer pool in the InnoDB case:
[openxs@fc23 sysbench]$ sysbench /usr/local/share/sysbench/oltp_point_select.lua --report-interval=1 --oltp-table-size=1000000 --max-time=0 --oltp-read-only=off --max-requests=0 --num-threads=1 --rand-type=uniform --db-driver=mysql --mysql-db=test --mysql-socket=/tmp/mysql.sock --mysql-user=root --mysql-password= prepare
sysbench 1.1.0-2343e4b (using bundled LuaJIT 2.1.0-beta2)

invalid option: --oltp-table-size=1000000
Note that good old command lines copied verbatim from older sysbench versions may NOT work any more in 1.1.x. Some options changed; the names are now shorter:
[openxs@fc23 sysbench]$ sysbench /usr/local/share/sysbench/oltp_point_select.lua help
sysbench 1.1.0-2343e4b (using bundled LuaJIT 2.1.0-beta2)

oltp_point_select.lua options:
  --distinct_ranges=N           Number of SELECT DISTINCT queries per transaction [1]
  --sum_ranges=N                Number of SELECT SUM() queries per transaction [1]
  --skip_trx[=on|off]           Don't start explicit transactions and execute all queries in the AUTOCOMMIT mode [off]
  --secondary[=on|off]          Use a secondary index in place of the PRIMARY KEY [off]
  --create_secondary[=on|off]   Create a secondary index in addition to the PRIMARY KEY [on]
  --index_updates=N             Number of UPDATE index queries per transaction [1]
  --range_size=N                Range size for range SELECT queries [100]
  --auto_inc[=on|off]           Use AUTO_INCREMENT column as Primary Key (for MySQL), or its alternatives in other DBMS. When disabled, use client-generated IDs [on]
  --delete_inserts=N            Number of DELETE/INSERT combination per transaction [1]
  --tables=N                    Number of tables [1]
  --mysql_storage_engine=STRING Storage engine, if MySQL is used [innodb]
  --non_index_updates=N         Number of UPDATE non-index queries per transaction [1]
  --table_size=N                Number of rows per table [10000]
  --pgsql_variant=STRING        Use this PostgreSQL variant when running with the PostgreSQL driver. The only currently supported variant is 'redshift'. When enabled, create_secondary is automatically disabled, and delete_inserts is set to 0
  --simple_ranges=N             Number of simple range SELECT queries per transaction [1]
  --order_ranges=N              Number of SELECT ORDER BY queries per transaction [1]
  --range_selects[=on|off]      Enable/disable all range SELECT queries [on]
  --point_selects=N             Number of point SELECT queries per transaction [10]
I ended up creating the table like this for the InnoDB case:
[openxs@fc23 sysbench]$ sysbench /usr/local/share/sysbench/oltp_point_select.lua --report-interval=1 --table-size=1000000 --num-threads=1 --rand-type=uniform --db-driver=mysql --mysql-db=test --mysql-socket=/tmp/mysql.sock --mysql-user=root --mysql-password= prepare
sysbench 1.1.0-2343e4b (using bundled LuaJIT 2.1.0-beta2)

Creating table 'sbtest1'...
Inserting 1000000 records into 'sbtest1'
Creating a secondary index on 'sbtest1'
to end up with the following table:
mysql> show table status like 'sbtest1'\G
*************************** 1. row ***************************
           Name: sbtest1
         Engine: InnoDB
        Version: 10
     Row_format: Dynamic
           Rows: 986400
 Avg_row_length: 228
    Data_length: 225132544
Max_data_length: 0
   Index_length: 0
      Data_free: 4194304
 Auto_increment: 1000001
    Create_time: 2017-03-02 16:18:57
    Update_time: NULL
     Check_time: NULL
      Collation: latin1_swedish_ci
       Checksum: NULL
 Create_options:
        Comment:
1 row in set (0.00 sec)
For MyRocks I also had to specify the storage engine explicitly:
[openxs@fc23 fb56]$ sysbench /usr/local/share/sysbench/oltp_point_select.lua --table-size=1000000 --threads=1 --rand-type=uniform --db-driver=mysql --mysql-db=test --mysql-socket=/tmp/mysql.sock --mysql-user=root --mysql-password= --mysql_storage_engine=rocksdb prepare
sysbench 1.1.0-2343e4b (using bundled LuaJIT 2.1.0-beta2)

Creating table 'sbtest1'...
Inserting 1000000 records into 'sbtest1'
Creating a secondary index on 'sbtest1'...
to end up with the following table:
mysql> show table status like 'sbtest1'\G
*************************** 1. row ***************************
           Name: sbtest1
         Engine: ROCKSDB
        Version: 10
     Row_format: Fixed
           Rows: 1000000
 Avg_row_length: 198
    Data_length: 198545349
Max_data_length: 0
   Index_length: 16009534
      Data_free: 0
 Auto_increment: 1000001
    Create_time: NULL
    Update_time: NULL
     Check_time: NULL
      Collation: latin1_swedish_ci
       Checksum: NULL
 Create_options:
        Comment:
1 row in set (0.00 sec)
Note that in the case of InnoDB I used MySQL 5.7.17 from Oracle, while MyRocks was built from this commit using my usual cmake options:
[openxs@fc23 mysql-5.6]$ git log -1
commit 01c386be8b02e6469b934c063aefdf8403844d99
Author: Herman Lee <herman@fb.com>
Date:   Wed Mar 1 18:14:25 2017 -0800

[openxs@fc23 mysql-5.6]$ cmake . -DCMAKE_BUILD_TYPE=RelWithDebInfo -DWITH_SSL=system -DWITH_ZLIB=bundled -DMYSQL_MAINTAINER_MODE=0 -DWITH_EMBEDDED_SERVER=OFF -DENABLED_LOCAL_INFILE=1 -DENABLE_DTRACE=0 -DCMAKE_INSTALL_PREFIX=/home/openxs/dbs/fb56
I ran the tests for InnoDB with adaptive hash indexing set to ON (the default) and OFF (changed at run time), and then for MyRocks, using 1, 2, 4, 8, 16, 32 and 64 concurrent threads (all cases but InnoDB with AHI ON), with a sysbench command line like this to run the test for 60 seconds (note the new option syntax of sysbench 1.x: --time, --threads etc.; a scripted version of the whole sweep is sketched at the end of this post):
[openxs@fc23 fb56]$ sysbench /usr/local/share/sysbench/oltp_point_select.lua --table-size=1000000 --time=60 --threads=1 --rand-type=uniform --db-driver=mysql --mysql-db=test --mysql-socket=/tmp/mysql.sock --mysql-user=root run
sysbench 1.1.0-2343e4b (using bundled LuaJIT 2.1.0-beta2)

Running the test with following options:
Number of threads: 1
Initializing random number generator from current time


Initializing worker threads...

Threads started!

SQL statistics:
    queries performed:
        read:                            821511
        write:                           0
        other:                           0
        total:                           821511
    transactions:                        821511 (13691.77 per sec.)
    queries:                             821511 (13691.77 per sec.)
    ignored errors:                      0      (0.00 per sec.)
    reconnects:                          0      (0.00 per sec.)

General statistics:
    total time:                          60.0003s
    total number of events:              821511

Latency (ms):
         min:                                  0.06
         avg:                                  0.07
         max:                                  1.11
         95th percentile:                      0.08
         sum:                              59537.46

Threads fairness:
    events (avg/stddev):           821511.0000/0.00
    execution time (avg/stddev):   59.5375/0.00
and then summarized the results into the following chart:


One day I'll share raw results, as a gist or somehow else, but for now let me summarize my findings as of March 3, 2017:
  1. MyRocks really rocks with this oltp_point_select.lua --table-size=1000000 test of sysbench 1.1.0! With default settings of server variables it outperformed InnoDB from MySQL 5.7.17 at every thread count tested, from 1 to 64, and showed good scalability up to 64 threads on my QuadCore box. I got more than 45K QPS starting from 4 threads.
  2. InnoDB with AHI disabled is somewhat faster for this test than with AHI enabled; the highest result was almost 44K QPS with AHI OFF at 4 threads.
  3. It seems my QuadCore box is no longer adequate for serious benchmarks, as for quite some time people have been using 8 to 18 cores per socket and already start at 200K QPS with 8 threads.
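
For anyone who wants to reproduce the sweep mentioned above, a minimal bash sketch (my own illustration, not the author's actual script; the socket path and options are taken from the commands shown earlier) could look like this:

for t in 1 2 4 8 16 32 64; do
    # run the point-select test for 60 seconds at the given concurrency
    sysbench /usr/local/share/sysbench/oltp_point_select.lua \
        --table-size=1000000 --time=60 --threads=$t --rand-type=uniform \
        --db-driver=mysql --mysql-db=test --mysql-socket=/tmp/mysql.sock \
        --mysql-user=root run | tee point_select_${t}_threads.txt
done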

by Valeriy Kravchuk (noreply@blogger.com) at March 20, 2017 01:08 PM

Jean-Jerome Schmidt

High Availability in ProxySQL: new webinar with René Cannaò

Following the interest we saw in this topic during our recent introduction webinar to ProxySQL, we’re pleased to invite you to join this new webinar on high availability in ProxySQL.

As you may know, the proxy layer is crucial when building a highly available MySQL infrastructure. It is therefore imperative not to let it become a single point of failure on its own. And building a highly available proxy layer creates additional challenges, such as how to manage multiple proxy instances, how to keep their configuration in sync, and how to handle virtual IPs and failover.

In this new webinar with ProxySQL’s creator, René Cannaò, we’ll discuss building a solid, scalable and manageable proxy layer using ProxySQL. And we will demonstrate how you can make your ProxySQL highly available when deploying it from ClusterControl.

Date, Time & Registration

Europe/MEA/APAC

Tuesday, April 4th at 09:00 BST (UK) / 10:00 CEST (Germany, France, Sweden)

Register Now

North America/LatAm

Tuesday, April 4th at 9:00 Pacific Time (US) / 12:00 Eastern Time (US)

Register Now

Agenda

  • Introduction
  • High Availability in ProxySQL
    • Layered approach
    • Virtual IP
    • Keepalived
  • Configuration management in distributed ProxySQL clusters
  • Demo: ProxySQL + keepalived in ClusterControl
    • Deployment
    • Failover
  • Q&A

Speakers

René Cannaò, Creator & Founder, ProxySQL. René has 10 years of working experience as a System, Network and Database Administrator, mainly on Linux/Unix platforms. In the last 4-5 years his experience has focused mainly on MySQL, working as Senior MySQL Support Engineer at Sun/Oracle and then as Senior Operational DBA at Blackbird (formerly PalominoDB). In this period he built an analytic and problem-solving mindset, and he is always eager to take on new challenges, especially if they are related to high performance. And then he created ProxySQL …

Krzysztof Książek, Senior Support Engineer at Severalnines, is a MySQL DBA with experience managing complex database environments for companies like Zendesk, Chegg, Pinterest and Flipboard.

We look forward to “seeing” you there and to insightful discussions!

If you have any questions or would like a personalised live demo, please do contact us.

by jj at March 20, 2017 12:29 PM

MariaDB AB

CONNECT BY is dead, long live CTE! In MariaDB Server 10.2!

CONNECT BY is dead, long live CTE! In MariaDB Server 10.2! anderskarlsson4 Mon, 03/20/2017 - 06:50

Yes, you got that right: the old CONNECT BY, as used for recursive SQL with Oracle, has been replaced by Common Table Expressions, i.e. the WITH statement of SQL:1999, which is now also available in MariaDB Server 10.2.4 (RC). Now, the SQL WITH construct, using Common Table Expressions or CTEs, is useful for other things than just recursive queries, but recursion is the one feature that WITH enables that was previously very hard to do without some procedural code; the non-recursive use of Common Table Expressions could previously mostly be replaced by using temporary tables.
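
As a quick aside (my own example with hypothetical table names, not from the original post), a non-recursive CTE simply names an intermediate result that you would otherwise have put in a temporary table:

WITH big_customers AS (
  SELECT customer_id, COUNT(*) AS order_count
  FROM orders
  WHERE amount > 1000
  GROUP BY customer_id)
SELECT c.customer_name, b.order_count
FROM customers c JOIN big_customers b ON b.customer_id = c.customer_id;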


This blog post explains what recursive SQL is all about and why this is useful, and I will show some examples of both CONNECT BY and how the same SQL is written using the WITH clause.


The most common example of recursive SQL is probably a parts explosion, where we have a table of parts of some component, and each part is either a main, top-level part or a part of another part. For example, a car with an engine, where the engine consists of pistons, cylinders and a camshaft, and the latter also includes some camshaft bearings. I think you get the basic idea here. To query this data to create a list of components that make up some other component, you need to visit the data recursively, i.e. each row is evaluated using conditions from any other row already fetched (except the first row fetched, that is).


Now, let's look at some data first. I assume we have two tables here: one table, parts, that contains information on the individual parts, and one table, components, that contains the hierarchy of the parts. Like this:

CREATE TABLE parts(part_id INTEGER NOT NULL PRIMARY KEY,
  part_name VARCHAR(60) NOT NULL);

CREATE TABLE components(comp_id INTEGER NOT NULL PRIMARY KEY,
  comp_name VARCHAR(60),
  comp_count INTEGER NOT NULL,
  comp_part INTEGER NOT NULL,
  comp_partof INTEGER,
  FOREIGN KEY(comp_part) REFERENCES parts(part_id));
ALTER TABLE components ADD FOREIGN KEY(comp_partof) REFERENCES components(comp_id);


The two things to note here are that the components table has a column, comp_partof, that implements the hierarchy, and that there is a self-referencing FOREIGN KEY constraint on this table. Given these tables, and assuming that we are a small privately held car-manufacturing company in southern Germany, let's insert some data:

INSERT INTO parts VALUES(1, 'Car');
INSERT INTO parts VALUES(2, 'Bolt');
INSERT INTO parts VALUES(3, 'Nut');
INSERT INTO parts VALUES(4, 'V8 engine');
INSERT INTO parts VALUES(5, '6-cylinder engine');
INSERT INTO parts VALUES(6, '4-cylinder engine');
INSERT INTO parts VALUES(7, 'Cylinder block');
INSERT INTO parts VALUES(8, 'Cylinder');
INSERT INTO parts VALUES(9, 'Piston');
INSERT INTO parts VALUES(10, 'Camshaft');
INSERT INTO parts VALUES(11, 'Camshaft bearings');
INSERT INTO parts VALUES(12, 'Body');
INSERT INTO parts VALUES(13, 'Gearbox');
INSERT INTO parts VALUES(14, 'Chassie');
INSERT INTO parts VALUES(15, 'Rear axle');
INSERT INTO parts VALUES(16, 'Rear break');
INSERT INTO parts VALUES(17, 'Wheel');
INSERT INTO parts VALUES(18, 'Wheel bolts');

INSERT INTO components VALUES(1, '320', 1, 1, NULL);
INSERT INTO components VALUES(2, NULL, 1, 6, 1);
INSERT INTO components VALUES(3, NULL, 1, 7, 2);
INSERT INTO components VALUES(4, NULL, 4, 8, 3);
INSERT INTO components VALUES(5, NULL, 4, 9, 3);
INSERT INTO components VALUES(6, NULL, 1, 10, 3);
INSERT INTO components VALUES(7, NULL, 3, 11, 6);
INSERT INTO components VALUES(8, NULL, 1, 12, 1);
INSERT INTO components VALUES(9, NULL, 1, 14, 1);
INSERT INTO components VALUES(10, NULL, 1, 15, 9);
INSERT INTO components VALUES(11, NULL, 2, 16, 10);

INSERT INTO components VALUES(12, '323 i', 1, 1, NULL);
INSERT INTO components VALUES(13, NULL, 1, 5, 12);

If you are not into mechanics, let me tell you that there are more parts than this to a car; for example, I left out a few critical components, such as the cupholder, the dog that stands on the pickup cargo area and the insulting bumper-sticker, but I think you get the idea. Note that there are two "main" components, the '320' and the '323 i', and that their being top-level components is indicated by the comp_partof column being set to NULL.


Now, assume you want to list all the parts that make up a 320. The way this works with the CONNECT BY syntax is that you compose one single SQL statement and provide a CONNECT BY clause to indicate the relationship. Like this:

SELECT LPAD('-', level, '-')||'>' level_text, comp_count, NVL(comp_name, part_name) name
FROM components c, parts p
WHERE c.comp_part = p.part_id
START WITH c.comp_name = '320'
CONNECT BY PRIOR c.comp_id = c.comp_partof;

 

Let me explain this a bit, but there is nothing really magic here. We are selecting from the two tables and joining them just as usual. Then we use the START WITH clause to define the top-level component, and the rest of the components have a comp_partof that matches either the comp_id of the START WITH component or the comp_id of any other component that has already been fetched.

This way of writing recursive SQL has some advantages: it is relatively compact and easy to understand. The disadvantage is that there are some quirks and limitations to it, and once your queries get more complex, CONNECT BY gets a bit hairy. One sure sign that CONNECT BY is going away, even though I and many others tend to like it because of its ease of use, is that even Oracle, as of Oracle 11g, has also implemented the WITH construct, or Common Table Expressions (CTE). So, looking at the statement above, this is what it would look like in MariaDB 10.2 using the WITH construct:

WITH RECURSIVE comp(comp_id, comp_name, comp_partof, comp_count) AS (
  SELECT comp_id, comp_name, comp_partof, comp_count
    FROM components JOIN parts ON comp_part = part_id
    WHERE comp_partof IS NULL AND comp_name = '320'
  UNION ALL
  SELECT c1.comp_id, p.part_name, c1.comp_partof, c1.comp_count
  FROM components c1 JOIN parts p ON c1.comp_part = p.part_id
    JOIN comp c2 ON c1.comp_partof = c2.comp_id)
SELECT comp_count, comp_name FROM comp;

 

Comparing this CTE version to the CONNECT BY version above, it is a bit more complex, but how it works is actually pretty clear once you look at it carefully. To begin with, the top-level item, or anchor, is the first SELECT in the UNION ALL, and the following components are fetched using the second SELECT. The recursive aspect is handled by running this UNION until it returns no more rows. As you can see, although this requires more text and more complex SQL, it is also a fair bit more flexible. For example, the anchor point is defined by a completely separate SELECT, which means it can be whatever SELECT you want, selecting from any odd table. Secondly, the columns you use and the conditions that define the hierarchy can be as complex as you want. And thirdly, there is also the power of that last SELECT, which in the case above just gets the data from the UNION, but you can actually apply any kind of filter, ordering or column selection to this query. The result of the query above is this:

comp_count      comp_name
1               320
1               4-cylinder engine
1               Body
1               Chassie
1               Cylinder block
1               Rear axle
4               Cylinder
4               Piston
1               Camshaft
2               Rear break
3               Camshaft bearings

 

Before I finish this off: the WITH RECURSIVE statement above is somewhat more verbose than it needs to be. In MariaDB 10.2 you can, for example, skip listing the column names of the recursive table, like this:

WITH RECURSIVE comp AS (
  SELECT comp_id, comp_name, comp_partof, comp_count
    FROM components JOIN parts ON comp_part = part_id
    WHERE comp_partof IS NULL AND comp_name = '320'
  UNION ALL
  SELECT c1.comp_id, p.part_name, c1.comp_partof, c1.comp_count
  FROM components c1 JOIN parts p ON c1.comp_part = p.part_id
    JOIN comp c2 ON c1.comp_partof = c2.comp_id)
SELECT comp_count, comp_name FROM comp;
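
One thing the CONNECT BY version has that the CTE versions above do not is the level pseudo-column used for indentation. With a CTE you can carry an explicit level along yourself; a sketch of this (my own addition, using the tables defined above) could look like:

WITH RECURSIVE comp AS (
  SELECT comp_id, comp_name, comp_partof, comp_count, 1 AS level
    FROM components JOIN parts ON comp_part = part_id
    WHERE comp_partof IS NULL AND comp_name = '320'
  UNION ALL
  SELECT c1.comp_id, p.part_name, c1.comp_partof, c1.comp_count, c2.level + 1
  FROM components c1 JOIN parts p ON c1.comp_part = p.part_id
    JOIN comp c2 ON c1.comp_partof = c2.comp_id)
SELECT CONCAT(REPEAT('-', level), '>') AS level_text, comp_count, comp_name FROM comp;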

 

And although Oracle 11g and up support CTEs, they work a bit differently there. For one thing, the RECURSIVE keyword isn't supported (a query is assumed to be recursive by default), and the way I read the SQL standard this is actually wrong: for recursive queries you have to use the RECURSIVE keyword. Second, Oracle does require the SELECT-list. So in Oracle, you would see something like this:

WITH comp(comp_id, comp_name, comp_partof, comp_count) AS (
  SELECT comp_id, comp_name, comp_partof, comp_count
    FROM components JOIN parts ON comp_part = part_id
    WHERE comp_partof IS NULL AND comp_name = '320'
  UNION ALL
  SELECT c1.comp_id, p.part_name, c1.comp_partof, c1.comp_count
  FROM components c1 JOIN parts p ON c1.comp_part = p.part_id
    JOIN comp c2 ON c1.comp_partof = c2.comp_id)
SELECT comp_count, comp_name FROM comp;

Yes, we are all happily following the same SQL standard. Somewhat...
See the MariaDB Knowledge Base for more information on common table expressions.


Happy SQL'ing

/Karlsson

MariaDB Server 10.2 introduces Common Table Expressions (CTEs), which allow recursive SQL and replace the CONNECT BY syntax of older database systems. Here, we take a look at CTEs, how they relate to CONNECT BY and what they can be used for.


by anderskarlsson4 at March 20, 2017 10:50 AM

March 17, 2017

Peter Zaitsev

Percona XtraDB Cluster 5.7.17-27.20 is now available

Percona announces the release of Percona XtraDB Cluster 5.7.17-27.20 on March 16, 2017. Binaries are available from the downloads section or our software repositories.

NOTE: You can also run Docker containers from the images in the Docker Hub repository.

Percona XtraDB Cluster 5.7.17-27.20 is now the current release, based on the following:

All Percona software is open-source and free. Details of this release can be found in the 5.7.17-27.20 milestone on Launchpad.

There are no new features or bug fixes to the main components, besides upstream changes and the following fixes related to packaging:

  • BLD-512: Fixed startup of garbd on Ubuntu 16.04.2 LTS (Xenial Xerus).
  • BLD-519: Added the garbd debug package to the repository.
  • BLD-569: Fixed garbd script to return non-zero if it fails to start.
  • BLD-570: Fixed service script for garbd on Ubuntu 16.04.2 LTS (Xenial Xerus) and Ubuntu 16.10 (Yakkety Yak).
  • BLD-593: Limited the use of rm and chown by mysqld_safe to avoid exploits of the CVE-2016-5617 vulnerability. For more information, see 1660265.
    Credit to Dawid Golunski (https://legalhackers.com).
  • BLD-610: Added version number to the dependency requirements of the full RPM package.
  • BLD-643: Fixed systemctl to mark mysql process as inactive after it fails to start and not attempt to start it again. For more information, see 1662292.
  • BLD-644: Added the which package to PXC dependencies on CentOS 7. For more information, see 1661398.
  • BLD-645: Fixed mysqld_safe to support options with a forward slash (/). For more information, see 1652838.
  • BLD-647: Fixed systemctl to show correct status for mysql on CentOS 7. For more information, see 1644382.

Help us improve our software quality by reporting any bugs you encounter using our bug tracking system. As always, thanks for your continued support of Percona!

by Alexey Zhebel at March 17, 2017 06:29 PM

Column Store Database Benchmarks: MariaDB ColumnStore vs. Clickhouse vs. Apache Spark


This blog shares some column store database benchmark results, and compares the query performance of MariaDB ColumnStore v. 1.0.7 (based on InfiniDB), Clickhouse and Apache Spark.

I’ve already written about ClickHouse (Column Store database).

The purpose of the benchmark is to see how these three solutions work on a single big server, with many CPU cores and large amounts of RAM. All three solutions are massively parallel (MPP) systems, so they should use many cores for SELECT queries.

For the benchmarks, I chose three datasets:

  1. Wikipedia page counts, loaded in full for the year 2008, ~26 billion rows
  2. Query analytics data from Percona Monitoring and Management
  3. Online shop orders

This blog post shares the results for the Wikipedia page counts (same queries as for the Clickhouse benchmark). In the following posts I will use other datasets to compare the performance.

Databases, Versions and Storage Engines Tested

  • MariaDB ColumnStore v. 1.0.7, ColumnStore storage engine
  • Yandex ClickHouse v. 1.1.54164, MergeTree storage engine
  • Apache Spark v. 2.1.0, Parquet files and ORC files

Although all of the above solutions can run in a “cluster” mode (with multiple nodes), I’ve only used one server.

Hardware

This time I’m using newer and faster hardware:

  • CPU: physical = 2, cores = 32, virtual = 64, hyperthreading = yes
  • RAM: 256Gb
  • Disk: Samsung SSD 960 PRO 1TB, NVMe card

Data Sizes

I loaded the above data into Clickhouse, ColumnStore and MySQL. The MySQL tables are InnoDB with a primary key; the Wikistat dataset was not loaded into MySQL due to its size.

Dataset (size on disk)   ColumnStore   Clickhouse   MySQL          Spark / Parquet   Spark / ORC file
Wikistat                 374.24 Gb     211.3 Gb     n/a (> 2 Tb)   395 Gb            273 Gb
Query metrics            61.23 Gb      28.35 Gb     520 Gb
Store Orders             9.3 Gb        4.01 Gb      46.55 Gb

 

Query Performance

Wikipedia page counts queries

Test type (warm), query times in seconds Spark Clickhouse ColumnStore
Query 1: count(*) 5.37 2.14 30.77
Query 2: group by month 205.75 16.36 259.09
Query 3: top 100 wiki pages by hits (group by path) 750.35 171.22 1640.7

Test type (cold), query times in seconds Spark Clickhouse ColumnStore
Query 1: count(*) 21.93 8.01 139.01
Query 2: group by month 217.88 16.65 420.77
Query 3: top 100 wiki pages by hits (group by path) 887.434 182.56 1703.19


Partitioning and Primary Keys

All of the solutions have the ability to take advantage of data “partitioning,” and only scan needed rows.

Clickhouse has "primary keys" (for the MergeTree storage engine) and scans only the needed chunks of data (similar to partition "pruning" in MySQL). No changes to SQL or table definitions are needed when working with Clickhouse.

Clickhouse example:

:) select count(*), toMonth(date) as mon
:-] from wikistat where toYear(date)=2008
:-] and toMonth(date) = 1
:-] group by mon
:-] order by mon;
SELECT
    count(*),
    toMonth(date) AS mon
FROM wikistat
WHERE (toYear(date) = 2008) AND (toMonth(date) = 1)
GROUP BY mon
ORDER BY mon ASC
┌────count()─┬─mon─┐
│ 2077594099 │   1 │
└────────────┴─────┘
1 rows in set. Elapsed: 0.787 sec. Processed 2.08 billion rows, 4.16 GB (2.64 billion rows/s., 5.28 GB/s.)
:) select count(*), toMonth(date) as mon from wikistat where toYear(date)=2008 and toMonth(date) between 1 and 10 group by mon order by mon;
SELECT
    count(*),
    toMonth(date) AS mon
FROM wikistat
WHERE (toYear(date) = 2008) AND ((toMonth(date) >= 1) AND (toMonth(date) <= 10))
GROUP BY mon
ORDER BY mon ASC
┌────count()─┬─mon─┐
│ 2077594099 │   1 │
│ 1969757069 │   2 │
│ 2081371530 │   3 │
│ 2156878512 │   4 │
│ 2476890621 │   5 │
│ 2526662896 │   6 │
│ 2460873213 │   7 │
│ 2480356358 │   8 │
│ 2522746544 │   9 │
│ 2614372352 │  10 │
└────────────┴─────┘
10 rows in set. Elapsed: 13.426 sec. Processed 23.37 billion rows, 46.74 GB (1.74 billion rows/s., 3.48 GB/s.)

As we can see here, ClickHouse has processed ~two billion rows for one month of data, and ~23 billion rows for ten months of data. Queries that only select one month of data are much faster.

For ColumnStore we need to re-write the SQL query and use "between '2008-01-01' and '2008-01-10'" so it can take advantage of partition elimination (as long as the data is loaded in approximate time order). When using functions (i.e., year(dt) or month(dt)), the current implementation does not use this optimization. (This is similar to MySQL, in that if the WHERE clause has month(dt) or any other function, MySQL can't use an index on the dt field.)

ColumnStore example:

MariaDB [wikistat]> select count(*), month(date) as mon
    -> from wikistat where year(date)=2008
    -> and month(date) = 1
    -> group by mon
    -> order by mon;
+------------+------+
| count(*)   | mon  |
+------------+------+
| 2077594099 |    1 |
+------------+------+
1 row in set (2 min 12.34 sec)
MariaDB [wikistat]> select count(*), month(date) as mon
from wikistat
where date between '2008-01-01' and '2008-01-31'
group by mon order by mon;
+------------+------+
| count(*)   | mon  |
+------------+------+
| 2077594099 |    1 |
+------------+------+
1 row in set (12.46 sec)

Apache Spark does have partitioning, however it requires declaring the partitions with the Parquet format in the table definition. Without declaring partitions, even the modified query ("select count(*), month(date) as mon from wikistat where date between '2008-01-01' and '2008-01-31' group by mon order by mon") will have to scan all the data. A rough sketch of such a partitioned table definition follows.
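
A minimal sketch of such a declaration in Spark SQL (my own illustration, not the table definition used in this benchmark; the column list is abbreviated and the exact DDL may vary by Spark version):

CREATE TABLE wikistat_parquet (
  project STRING,
  path STRING,
  hits BIGINT,
  size BIGINT,
  dt DATE)
USING parquet
PARTITIONED BY (dt);  -- queries filtering on dt can then prune partitions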

The following table and graph show the performance of the updated query:

Test type / updated query, query times in seconds Spark Clickhouse ColumnStore
group by month, one month, updated syntax 205.75 0.93 12.46
group by month, ten months, updated syntax 205.75 8.84 170.81

 

Working with Large Datasets

With 1Tb of uncompressed data, doing a "GROUP BY" requires lots of memory to store the intermediate results (unlike MySQL, ColumnStore, Clickhouse and Apache Spark use hash tables to store the GROUP BY "buckets"). For example, this query requires a very large hash table:

SELECT
path,
count(*),
sum(hits) AS sum_hits,
round(sum(hits) / count(*), 2) AS hit_ratio
FROM wikistat
WHERE project = 'en'
GROUP BY path
ORDER BY sum_hits DESC
LIMIT 100

As “path” is actually a URL (without the hostname), it takes a lot of memory to store the intermediate results (hash table) for GROUP BY.

MariaDB ColumnStore does not allow us to "spill" GROUP BY data to disk for now (only disk-based joins are implemented). If you need to GROUP BY on a large text field, you can decrease the disk block cache setting in Columnstore.xml (e.g., set the disk cache to 10% of RAM) to make room for the intermediate GROUP BY:

<DBBC>
                <!-- The percentage of RAM to use for the disk block cache. Defaults to 86% -->
                <NumBlocksPct>10</NumBlocksPct>
</DBBC>

In addition, as the query has an ORDER BY, we need to increase max_length_for_sort_data in MySQL:

ERROR 1815 (HY000): Internal error: IDB-2015: Sorting length exceeded. Session variable max_length_for_sort_data needs to be set higher.
mysql> set global max_length_for_sort_data=8*1024*1024;

SQL Support

SQL Spark* Clickhouse ColumnStore
INSERT … VALUES ✅ yes ✅ yes ✅ yes
INSERT SELECT / BULK INSERT ✅ yes ✅ yes ✅ yes
UPDATE ❌ no ❌ no ✅ yes
DELETE ❌ no ❌ no ✅ yes
ALTER … ADD/DROP/MODIFY COLUMN ❌ no ✅ yes ✅ yes
ALTER … change partitions ✅ yes ✅ yes ✅ yes
SELECT with WINDOW functions ✅ yes ❌ no ✅ yes

 

*Spark does not support UPDATE/DELETE. However, Hive supports ACID transactions with UPDATE and DELETE statements. BEGIN, COMMIT, and ROLLBACK are not yet supported (only the ORC file format is supported).

ColumnStore is the only database out of the three that supports a full set of DML and DDL (almost all of the MySQL’s implementation of SQL is supported).

Comparing ColumnStore to Clickhouse and Apache Spark

 Solution  Advantages  Disadvantages
MariaDB ColumnStore
  • MySQL frontend (makes it easy to migrate from MySQL)
  • UPDATE and DELETE are supported
  • Window functions support
  • Select queries are slower
  • No replication from normal MySQL server (planned for the future versions)
  • No support for GROUP BY on disk
Yandex ClickHouse
  • Fastest performance
  • Better compression
  • Primary keys
  • Disk-based GROUP BY, etc.
  • No MySQL protocol support
Apache Spark
  • Flexible storage options
  • Machine learning integration (i.e., pyspark ML libraries run inside spark nodes)
  • No MySQL protocol support
  • Slower select queries (compared to ClickHouse)


Conclusion

Yandex ClickHouse is an absolute winner in this benchmark: it shows both better performance (>10x) and better compression than MariaDB ColumnStore and Apache Spark. If you are looking for the best performance and compression, ClickHouse looks very good.

At the same time, ColumnStore provides a MySQL endpoint (MySQL protocol and syntax), so it is a good option if you are migrating from MySQL. Right now it can't replicate directly from MySQL, but if this option becomes available in future versions we could attach a ColumnStore replication slave to any MySQL master and use the slave for reporting queries (i.e., BI or data science teams can use a ColumnStore database that is updated very close to real time).

Table Structure and List of Queries

Table structure (MySQL / Columnstore version):

CREATE TABLE `wikistat` (
  `date` date DEFAULT NULL,
  `time` datetime DEFAULT NULL,
  `project` varchar(20) DEFAULT NULL,
  `subproject` varchar(2) DEFAULT NULL,
  `path` varchar(1024) DEFAULT NULL,
  `hits` bigint(20) DEFAULT NULL,
  `size` bigint(20) DEFAULT NULL
) ENGINE=Columnstore DEFAULT CHARSET=utf8

Query 1:

select count(*) from wikistat

Query 2a (full scan):

select count(*), month(dt) as mon
from wikistat where year(dt)=2008
and month(dt) between 1 and 10
group by month(dt)
order by month(dt)

Query 2b (for partitioning test)

select count(*), month(date) as mon
from wikistat where
date between '2008-01-01' and '2008-10-31'
group by mon
order by mon;

Query 3:

SELECT
path,
count(*),
sum(hits) AS sum_hits,
round(sum(hits) / count(*), 2) AS hit_ratio
FROM wikistat
WHERE project = 'en'
GROUP BY path
ORDER BY sum_hits DESC
LIMIT 100;

 

by Alexander Rubin at March 17, 2017 06:12 PM

March 16, 2017

Peter Zaitsev

Percona Server for MongoDB: Dashing New LDAP Authentication Plugin


This blog post is another in the series on the Percona Server for MongoDB 3.4 bundle release. In this blog, we’ll look at the new LDAP authentication plugin. 

Hear ye, hear ye, hear ye… With the arrival of version 3.4, Percona has included an LDAP plugin in Percona Server for MongoDB. Authentication is an essential part of client communication with the database backend.

What is LDAP?

LDAP stands for Lightweight Directory Access Protocol. It is a centralized environment containing information on users or services. This information can be objects like passwords, roles or other meta information. Typically, when connecting to the MongoDB server, you simply authenticate against the MongoDB server's local credential list. Using an external authentication method like the one included in Percona Server for MongoDB, you can poll an external service instead. External authentication allows the MongoDB server to verify the client's username and password against a separate service, such as OpenLDAP or Active Directory.


Why should you use it?

Having a centralized LDAP offers you the ability to rely on one single “truth” for authentication and authorization. LDAP is essential in large organizations where maintaining users on a big infrastructure becomes troublesome.

In ideal situations, you would use your LDAP authentication for multiple MongoDB servers, and even other database solutions like MySQL. The idea is that you only need to modify the passwords or accounts centrally, so you can manage entries without having to modify them locally on each MongoDB instance.

Having a centralized authentication repository is often a requirement when managing highly sensitive information (due to compliance requirements). A central repository for user information, like an LDAP server, solves the problem of a central source for authentication. When a user with database-level privileges leaves the organization, simply shutting off the one central LDAP account will prevent access to all databases that use it as a source. If local accounts were created without being tied back to a central user repository, then the likelihood of an access revocation getting missed is far greater. This is why many security standards require accounts to be created with an LDAP/Active Directory type of service.

So what components do we actually use?

If you want to visualize it in a figure:

[Figure: LDAP authentication components]

  • SASL Daemon. Used as a MongoDB server-local proxy for the remote LDAP service. This service is used for MongoDB as an intermediate service. It will translate the request and poll the LDAP server.
  • SASL Library. Used by the MongoDB client and server to create data necessary for the authentication mechanism. This library is used by the MongoDB client and server for making properly formatted requests to the SASL daemon.

So how does it work?

An authentication session uses the following sequence:

  1. A MongoDB client connects to a running mongod instance.
  2. The client creates a PLAIN authentication request using the SASL library.
  3. The client then sends this SASL request to the server as a special MongoDB command.
  4. The mongod server receives this SASL Message, with its authentication request payload.
  5. The server then creates a SASL session scoped to this client, using its own reference to the SASL library.
  6. Then the server passes the authentication payload to the SASL library, which in turn passes it on to the saslauthd daemon.
  7. The saslauthd daemon passes the payload on to the LDAP service to get a YES or NO authentication response (in other words, does this user exist and is the password correct).
  8. The YES/NO response moves back from saslauthd, through the SASL library, to mongod.
  9. The mongod server uses this YES/NO response to authenticate the client or reject the request.
  10. If successful, the client has authenticated and can proceed.
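
Before pointing MongoDB at saslauthd, it is worth checking that saslauthd itself can authenticate against the LDAP service. A quick generic check (not part of the original post; the mux socket path varies between distributions) looks like this:

# verify that saslauthd accepts the LDAP user's credentials
testsaslauthd -u dim0 -p test -f /var/run/saslauthd/mux
# on success this prints: 0: OK "Success."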

Below is a visualisation of the authentication path using the LDAP connector.

An example of the output when using the authentication plugin

The mongod server is running with the added option:

cat /etc/default/mongod
OPTIONS="-f /etc/mongod.conf --auth --setParameter authenticationMechanisms=PLAIN,SCRAM-SHA-1"
STDOUT="/var/log/mongodb/mongod.stdout"
STDERR="/var/log/mongodb/mongod.stderr"

First, let's make a user in MongoDB. I've created an Organisational Unit and an associated user with a password on the LDAP side:

> db.getSiblingDB("$external").createUser({
...   user : 'dim0',
...   roles: [ {role : "read", db: 'percona'} ]
... })
Successfully added user: { "user" : "utest", "roles" : [ { "role" : "read", "db" : "percona" } ] }

At this point the user is correctly added on MongoDB.

Let’s try and perform a read query on the database “percona”:

> db.getSiblingDB("$external").auth({ mechanism: "PLAIN", user: 'dim0', pwd: 'test', digestPassword: false })
1
> use percona
> db.foo.find()
{ "_id" : ObjectId("58b3e4b80322deccc99dc763"), "x" : 1 }
{ "_id" : ObjectId("58b3e8fee48bdc7edeb31cc5"), "x" : 2 }
{ "_id" : ObjectId("58b3e931e48bdc7edeb31cc6"), "x" : 3 }
> exit

Now let’s try and write something in it:

> db.foo.insert({x : 5})
WriteResult({ "writeError" : { "code" : 13, "errmsg" : "not authorized on percona to execute command { insert: "foo", documents: [ { _id: ObjectId('58b3e97f2343c5da97a2256e'), x: 5.0 } ], ordered: true }" } })

This is logical behavior, as we only allowed read interaction on the percona database.

After a correct login, you will find the following in the mongod.log:

2017-02-27T08:55:19.612+0000 I ACCESS   [conn2] Successfully authenticated as principal dim0 on $external

If an incorrect login happens, the following entry will appear in the mongod.log:

2017-02-27T09:10:55.297+0000 I ACCESS   [conn4] PLAIN authentication failed for dim0 on $external from client 127.0.0.1:34812 ; OperationFailed: SASL step did not complete: (authentication failure)

Conclusion

Percona Server for MongoDB has an easy way of integrating correctly with SASL authd. If you are looking for an option of centrally managing the users of your MongoDB environments, look no further. Keep in mind, however, that if you don’t need a centrally managed environment adding this functionality creates additional complexity to your infrastructure. You can find additional information on the LDAP plugin in our documentation at https://www.percona.com/doc/percona-server-for-mongodb/ext_authentication/index.html.

by Dimitri Vanoverbeke at March 16, 2017 10:06 PM

Monitoring Databases: A Product Comparison


In this blog post, I will discuss the solutions for monitoring databases (which includes alerting) I have worked with and recommended in the past to my clients. This survey will mostly focus on MySQL solutions. 

One of the most common issues I come across when working with clients is monitoring and alerting. Many times, companies will fall into one of these categories:

  • No monitoring or alerting. This means they have no idea what’s going on in their environment whatsoever.
  • Inadequate monitoring. Maybe people in this camp are using a platform that just tells them the database is up or connections are happening, but there is no insight into what the database is doing.
  • Too much monitoring and alerting. Companies in this camp have tons of dashboards filled with graphs, and their inbox is full of alerts that get promptly ignored. This type of monitoring is just as useful as the first option. Alert fatigue is a real thing!

With my clients, I like to talk about what monitoring they need and what will work for them.

Before we get started, I do want to point out that I have borrowed some text and/or graphics from the websites and promotional material of some of the products I’m discussing.

Simple Alerting

Percona provides a Nagios plugin for database alerts: https://www.percona.com/downloads/percona-monitoring-plugins/.

I also like to point out to clients what metrics are important to monitor long term to make sure there are no performance issues. I prefer the following approach:

  • On the hardware level:
    • Monitor CPU, IO, network usage and how it trends monthly. If some resource consumption comes to a critical level, this might be a signal that you need more capacity.
  • On the MySQL server level:
    • Monitor connections, active threads, table locks, row locks, InnoDB IO and buffer pool usage
    • For replication, monitor seconds behind master (SBM), binlog size and replication errors. In Percona XtraDB Cluster, you might want to watch wsrep_local_recv_queue.
  • On the query level:
    • Regularly check query execution and response time, and make sure it stays within acceptable levels. When execution time approaches or exceeds established levels, evaluate ways to optimize your queries.
  • On the application side:
    • Monitor that response time is within established SLAs.
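
As a concrete example of checking the replication metrics mentioned above (standard MySQL and Percona XtraDB Cluster statements, independent of any particular monitoring product):

-- replication lag and errors on a slave: look at Seconds_Behind_Master and Last_*_Error
SHOW SLAVE STATUS\G

-- receive queue length on a Percona XtraDB Cluster node
SHOW GLOBAL STATUS LIKE 'wsrep_local_recv_queue';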

High-Level Monitoring Solution Comparison

PMM MonYOG Severalnines VividCortex SelectStar
Databases Supported MySQL, MongoDB and others with custom addons MySQL MySQL, MongoDB, PostgreSQL MySQL, MongoDB, PostgreSQL, Redis MySQL, MongoDB, PostgreSQL, Hadoop, Cassandra, Amazon Dynamo, IBM DB2, SQL Server, Oracle
Open Source x
Cost Free Subscription per node Subscription per node and free Community Edition Subscription per instance Subscription per instance
Cloud or
On Premises
On premises On premises On premises Cloud with on premises collector Cloud with on premises collector
Has Agents x x
Monitoring x x x x x
Alerting Yes, but requires custom setup x x x x
Replication Topology Management x x
Query Analytics x x x x
Configuration Management x x
Backup Management x
OS Metrics x x  x x x
Configuration Advisors x  x x
Failover Management x x
ProxySQL and
HA Proxy Support
Monitors ProxySQL x

 

PMM

http://pmmdemo.percona.com

https://www.percona.com/blog/2016/04/18/percona-monitoring-and-management/

https://www.percona.com/doc/percona-monitoring-and-management/index.html

Percona Monitoring and Management (PMM) is a fully open source solution for managing MySQL platform performance and tuning query performance. It allows DBAs and application developers to optimize the performance of the database layer. PMM is an on-premises solution that keeps all of your performance and query data inside the confines of your environment, with no requirement for data to cross the Internet.

Assembled from a supported package of “best-of-breed” open source tools such as Prometheus, Grafana and Percona’s Query Analytics, PMM delivers results right out of the box.

With PMM, anyone with database maintenance responsibilities can get more visibility for actionable enhancements, realize faster issue resolution times, increase performance through focused optimization and better manage resources. More information allows you to concentrate efforts on the areas that yield the highest value, rather than hunting and pecking for speed.

PMM monitors and provides performance data for Oracle’s MySQL Community and Enterprise Servers, as well as Percona Server for MySQL and MariaDB.

Alerting

In the current version of PMM, custom alerting can be set up. Percona has a guide here: https://www.percona.com/blog/2017/01/23/mysql-and-mongodb-alerting-with-pmm-and-grafana/.

Architecture

The PMM platform is based on a simple client-server model that enables efficient scalability. It includes the following modules:

  • PMM Client is installed on every MySQL host that you want to monitor. It collects MySQL server metrics, general system metrics, and query analytics data for a complete performance overview. Collected data is sent to the PMM Server.
  • PMM Server aggregates collected data and presents it in the form of tables, dashboards and graphs in a web interface.
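
As a rough illustration of this client-server model (commands paraphrased from the PMM 1.x documentation of the time; image tags, volume paths and the server address are examples and may differ in your environment):

# on the monitoring host: create the data container and start PMM Server
docker create -v /opt/prometheus/data -v /opt/consul-data -v /var/lib/mysql -v /var/lib/grafana \
    --name pmm-data percona/pmm-server:latest /bin/true
docker run -d -p 80:80 --volumes-from pmm-data --name pmm-server --restart always percona/pmm-server:latest

# on each MySQL host: point the PMM client at the server and start collecting
pmm-admin config --server 192.168.56.100
pmm-admin add mysql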


MySQL Configuration

Percona recommends certain settings to get the most out of PMM. You can get more information and a guide here: https://www.percona.com/doc/percona-monitoring-and-management/conf-mysql.html.

Advantages

  • Fast setup
  • Fully supported and backed by Percona
  • Impressive roadmap ahead
  • Monitors your database in depth
  • Query analytics
  • Quick setup docker container
  • Free and open source

Disadvantages

  • New, could still have some growing pains
  • Requires agents on database machines

Severalnines

http://severalnines.com/

Severalnines ClusterControl provides access to 100+ key database and host metrics that matter to your operational performance. You can visualize historical performance in custom dashboards to establish operational baselines and capacity planning. It lets you proactively monitor and receive advice to address immediate and potential database and server issues, and ships with over 100 built-in advisors or easily-writeable custom advisors for your specific needs. It is very scriptable and customizable with some effort.

Severalnines has a free community version as well as a commercial offering. The free version includes deployment, monitoring and advisors with a Developer Studio (with which users can create their own advisors).

Severalnines is definitely more sysadmin focused. The best part about it is its ability to deploy and manage deployments of your database with almost no command line work.

The community edition of ClusterControl is “free forever”.

Architecture

ClusterControl is an agentless management and automation software for database clusters. It helps deploy, monitor, manage and scale your database server/cluster directly from ClusterControl user interface.

ClusterControl consists of four components:

Component Package Naming Role
ClusterControl controller (cmon) clustercontrol- controller The brain of ClusterControl. A backend service performing automation, management, monitoring and scheduling tasks. All the collected data will be stored directly inside CMON database
ClusterControl REST API clustercontrol-cmonapi Interprets request and response data between ClusterControl UI and CMON database
ClusterControl UI clustercontrol A modern web user interface to visualize and manage the cluster. It interacts with CMON controller via remote procedure call (RPC) or REST API interface
ClusterControl NodeJS clustercontrol-nodejs This optional package is introduced in ClusterControl version 1.2.12 to provide an interface for notification services and integration with 3rd party tools

 

Advantages

  • Agentless
  • Monitors, deploys and manages:
    • Database
    • Configuration
    • Backups
    • Users
  • Simple web GUI to manage your databases, alerts, users, settings
  • Can create custom monitors or jobs
  • Can off-load and compress backups
  • Great support team
  • Rich feature set and multiple databases supported

Disadvantages

  • Cost per node
  • UI can occasionally be clunky
  • Query tools lack as compared to other solutions here
  • Metrics and Advisors may not be as powerful or easy to use as other products

MONyog

https://www.webyog.com/product/monyog

MONyog MySQL Monitor and Advisor is a “MySQL DBA in a box” that helps MySQL DBAs manage more MySQL servers, tune their current MySQL servers and find and fix problems with their MySQL database applications before they can become serious problems or costly outages.

MONyog proactively monitors enterprise database environments and provides expert advice on how even those new to MySQL can tighten security, optimize performance and reduce downtime of their MySQL powered systems.

MONyog is more DBA focused and focuses on the MySQL configuration and queries.

Architecture

The MONyog web server runs on Linux, monitoring MySQL on all platforms and also collecting OS data on Linux servers. To retrieve OS metrics, MONyog uses SSH. However, with this scenario (MONyog installed on a Linux machine) the MONyog web server/agent cannot collect Windows OS metrics.

Of course, the client where the MONyog output is viewed can be any browser supporting AJAX on any platform. MONyog can be installed on a remote PC as well as the server. It does not require processing, and with agentless monitoring it can collect and retrieve data from the server.

Advantages

  • Setup and startup within two minutes
  • Agentless
  • Good query tools
  • Manages configuration
  • Great advisors for database tuning built-in
  • Most comprehensive and detailed alerting

Disadvantages

  • Cost per node
  • Only supports MySQL

VividCortex

VividCortex is a good cloud-based tool to see what your production databases are doing. It is a modern SaaS database performance monitoring platform that significantly eases the pain of database performance at scale, on distributed and polyglot systems, for the entire engineering team. It’s hosted for you with industry-leading security, and is continuously improved and maintained. VividCortex measures and analyzes the system’s work and resource consumption. The result is an immediate insight into query performance, better performance and quality, faster time-to-market and reduced cost and effort.

Architecture

VividCortex is the combination of agent programs, APIs and a web application. You install the agents on your servers, they send data to their APIs, and you access the results through the web application at https://app.vividcortex.com. VividCortex has a diagram on their site showing how it works:

[Diagram: VividCortex architecture]

The agents are self-supervising, managed by an agent called vc-agent-007. You can read more about the agents in the agent-specific documentation. They primarily send time-series metrics to the APIs, at one-second granularity, and sometimes send additional metadata as well. For example, query digests are required to show which query is responsible for specific query-related metrics.
On the backend, a distributed, fully multi-tenant service stores your data separately from all other customers. VividCortex servers are currently hosted in the Amazon AWS public cloud.

Advantages

  • Great visibility into query-level performance to pinpoint optimization efforts
  • Granularity, with the ability to identify performance fluctuations down to a one-second resolution
  • Smart anomaly detection using advanced statistics and machine learning to reduce false-positives and make alerts meaningful and actionable
  • Unique collaboration tools, enabling developers to answer many of their own questions and freeing DBAs to be more responsive and proactive.

Disadvantages

  • Cloud-based tools may not be desirable in a secure environment
  • Cost
  • Not useful if you lose outside network access during an incident
  • Dependent on AWS availability

SelectStar

https://selectstar.io

SelectStar monitors key metrics for many different database types, and has a comprehensive alerts and recommendations system. SelectStar supports monitoring and alerts on:

  • MySQL, Percona Server for MySQL, MariaDB
  • PostgreSQL
  • Oracle
  • MongoDB
  • Microsoft SQL
  • DB2
  • Amazon RDS and Aurora
  • Hadoop
  • Cassandra

The alerts and recommendations are designed to ensure you have an immediate understanding of key issues — and where they are coming from. You can pinpoint the exact database instance that may be causing the issue, or go further up the chain and see if it’s an issue impacting several database instances at the host level.

Recommendations are often tied to alerts — if you have a red alert, there’s going to be a recommendation tied to it on how you can improve. However, the recommendations pop up even if your database is completely healthy — ensuring that you have visibility into how you can improve your configuration before you actually have an issue impacting performance.

Architecture

Using agentless collectors, SelectStar gathers data from both your on-premises and AWS platforms so that you can have insight into all of your database instances.


The collector is an independent machine within your infrastructure that pulls data from your database. It is designed to be low impact so that it does not affect database performance. This is a different approach from all of the other monitoring tools I have looked at.

Advantages

  • Multiple database technologies (the most out of the tools presented here)
  • Great visibility into query-level performance to pinpoint optimization efforts
  • Agentless
  • Good query tools
  • Great advisors for database tuning built in
  • Good alerting
  • Fast setup
  • Monitors your database in depth
  • Query analytics

Disadvantages

  • Cloud-based tools may not be desirable in a secure environment
  • Cost
  • New, could still have some growing pains
  • Still requires an on-premises collector

So What Do I Recommend?

"It depends." – Peter Z., CEO, Percona

As always, I recommend whatever works best for your workload, in your environment, and within the standards of your company’s practices!

by Manjot Singh at March 16, 2017 08:12 PM

Jean-Jerome Schmidt

Video: ClusterControl Developer Studio Introduction Video

The free ClusterControl Developer Studio provides you with a set of monitoring and performance advisors to use, and lets you create custom advisors to add security and stability to your MySQL, Galera, and MongoDB infrastructures.

ClusterControl’s library of Advisors allows you to extend the features of ClusterControl to add even more database management functionality.

Advisors in ClusterControl are powerful constructs; they provide specific advice on how to address issues in areas such as performance, security, log management, configuration, storage space, etc. They can be anything from simple configuration advice, warning on thresholds or more complex rules for predictions, or even cluster-wide automation tasks based on the state of your servers or databases.

ClusterControl
Single Console for Your Entire Database Infrastructure
Find out what else is new in ClusterControl

Developer Studio Resources

Want to learn more about the Developer Studio in ClusterControl? Check out the information below!

Advisor Highlights

Here is some information on particular advisors that can help you with your instances.

by Severalnines at March 16, 2017 09:57 AM

Shlomi Noach

MySQL Community Awards 2017: Call for Nominations!

The 2017 MySQL Community Awards event will take place, as usual, in Santa Clara, during the Percona Live Data Performance Conference, April 2017.

The MySQL Community Awards is a community-based initiative. The idea is to publicly recognize contributors to the MySQL ecosystem. The entire process of discussing, voting and awarding is controlled by an independent group of community members, typically made up of past winners or their representatives, as well as known contributors.

It is a self-appointed, self-declared, self-making-up-the-rules-as-it-goes committee. It is also very aware of the importance of the community; a no-nonsense, non-political, adhering to tradition, self criticizing committee.

The Call for Nominations is open. We are seeking the community’s assistance in nominating candidates in the following categories:

MySQL Community Awards: Community Contributor of the year 2017

This is a personal award; a winner would be a person who has made a contribution to the MySQL ecosystem. This could be via development, advocating, blogging, speaking, supporting, etc. All things go.

MySQL Community Awards: Application of the year 2017

An application, project, product, etc. which supports the MySQL ecosystem by contributing code, complementing its behavior, supporting its use, etc. This could range from a one-man open source project to a large-scale social service.

MySQL Community Awards: Corporate Contributor of the year 2017

A company that has made a contribution to the MySQL ecosystem. This might be a company which released major open source code; one that advocates for MySQL; one that helps out community members by... anything.

For a list of previous winners, please see MySQL Hall of Fame.

Process of nomination and voting

Anyone can nominate anyone. When nominating, please make sure to provide a brief explanation on why the candidate is eligible to get the award. Make a good case!

The committee will review all nominations and vote; it typically takes two rounds of votes to pick the winners, and a lot of discussion.

There will be up to three winners in each category.

Methods of nomination:

  • Send en email to mysql.community.awards [ at ] gmail.com
  • Comment to this post
  • Assuming you can provide a reasonable description in 140 characters, tweet your nomination at #MySQLAwards.

Please submit your nominations no later than Friday, March 31 2017.

The committee

Members of the committee are:

  • Baron Schwartz, Colin Charles, Daniël van Eeden, Domas Mituzas, Eric Herman, Giuseppe Maxia, Justin Swanhart, Jeremy Cole, Mark Leith, Yoshinori Matsunobu, Morgan Tocker, Santiago Lertora

Emily Slocombe and Shlomi Noach are acting as co-secretaries; we will be non-voting (except for breaking ties).

Update: there have been multiple nominations that I'm related to. In previous years I disqualified such nominations but this year I do not wish to disqualify the ones that have been made. I stepped down from the committee and will have no access to the ongoing discussion/voting. Emily is the chairwoman and the committee is likely to get more members.

There is no conspiracy 🙂

 

The committee communicates throughout the nomination and voting process to exchange views and opinions.

The awards

Awards are traditionally donated by some party whose identity remains secret. A sponsor has already stepped up. Dear anonymous, thank you very much for your contribution!

Support

This is a community effort; we ask for your support in spreading the word and of course in nominating candidates. Thanks!

by shlomi at March 16, 2017 06:34 AM

March 15, 2017

Peter Zaitsev

Percona Live Featured Session with Evan Elias: Automatic MySQL Schema Management with Skeema


Welcome to another post in the series of Percona Live featured session blogs! In these blogs, we’ll highlight some of the session speakers that will be at this year’s Percona Live conference. We’ll also discuss how these sessions can help you improve your database environment. Make sure to read to the end to get a special Percona Live 2017 registration bonus!

In this Percona Live featured session, we’ll meet Evan Elias, Director of Engineering, Tumblr. His session is Automatic MySQL Schema Management with Skeema. Skeema is a new open source CLI tool for managing MySQL schemas and migrations. It allows you to easily track your schemas in a repository, supporting a pull-request-based workflow for schema change submission, review, and execution.

I had a chance to speak with Evan about Skeema:

Percona: How did you get into database technology? What do you love about it?

Evan: I first started using MySQL at a college IT job in 2003, and over the years I eventually began tackling much larger-scale deployments at Tumblr and Facebook. I’ve spent most of the past decade working on social networks, where massive high-volume database technology is fundamental to the product. I love the technical challenges present in that type of environment, as well as the huge potential impact of database automation and tooling. In companies with giant databases and many engineers, a well-designed automation system can provide a truly enormous increase in productivity.

Percona: Your talk is called Automatic MySQL Schema Management with Skeema. What is Skeema, and how is it helpful for engineers and DBAs?

Evan: Skeema is an open source tool for managing MySQL schemas and migrations. It allows users to diff, push or pull schema definitions between the local filesystem and one or more databases. It can be configured to support multiple environments (e.g. development/staging/production), external online schema change tools, sharding, and service discovery. Once configured, an engineer or DBA can use Skeema to execute an online schema change on many shards concurrently simply by editing a CREATE TABLE statement in a file and then running “skeema push”.
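
For readers who have not seen the tool, a minimal sketch of the workflow Evan describes might look like the following, assuming a repository initialized with Skeema's own init command; the host, user, directory and environment names here are placeholders, and exact flags may differ between releases:

# Capture the current schema of the production environment into a repo directory
skeema init production -h db.example.com -u dbadmin -p --dir schemas
cd schemas

# Edit a CREATE TABLE statement in one of the generated .sql files, then
# preview the DDL Skeema would generate and push it (optionally through an
# external online schema change tool, if configured)
skeema diff production
skeema push production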

Percona: What are the benefits of storing schemas in a repository?

Evan: The whole industry is moving towards infrastructure-as-code solutions, providing automated configuration which is reproducible across multiple environments. In extending this concept to database schemas, a file repository stores the desired state of each table, and a schema change is tied to simply changing these files. A few large companies like Facebook have internal closed-source tools to tie MySQL schemas to a git repo, allowing schema changes to be powered by pull requests (without any manual DBA effort). There hasn’t previously been an open source, general-purpose tool for managing schemas and migrations in this way, however. I developed Skeema to fill this gap.

Percona: What do you want attendees to take away from your session? Why should they attend?

Evan: In this session, MySQL DBAs will learn how to automate their schema change workflow to reduce manual operational work, while software engineers will discover how Skeema permits easy online migrations even in frameworks like Rails or Django. Skeema is a brand new tool, and this is the first conference session to introduce it. At this relatively early stage, feedback and feature requests from attendees will greatly influence the direction and prioritization of future development.

Percona: What are you most looking forward to at Percona Live 2017?

Evan: Percona Live is my favorite technical conference. It’s the best place to learn about all of the recent developments in the database world, and meet the top experts in the field. This is my fifth year attending in Santa Clara. I’m looking forward to reconnecting with old friends and making some new ones as well!

Register for Percona Live Data Performance Conference 2017, and see Evan present his session on Automatic MySQL Schema Management with Skeema. Use the code FeaturedTalk and receive $100 off the current registration price!

Percona Live Data Performance Conference 2017 is the premier open source event for the data performance ecosystem. It is the place to be for the open source community as well as businesses that thrive in the MySQL, NoSQL, cloud, big data and Internet of Things (IoT) marketplaces. Attendees include DBAs, sysadmins, developers, architects, CTOs, CEOs, and vendors from around the world.

The Percona Live Data Performance Conference will be April 24-27, 2017 at the Hyatt Regency Santa Clara & The Santa Clara Convention Center.

by Dave Avery at March 15, 2017 10:03 PM

Upgrading to Percona Server for MongoDB 3.4 from Previous Versions


This post is another in the series on the Percona Server for MongoDB 3.4 bundle release. In this blog post, we will walk through upgrading to Percona Server for MongoDB 3.4 from a previous MongoDB version. We will highlight the important details to guarantee the successful completion of this task in your environment.

MongoDB 3.4 was just released, and it has a lot of new features such as views, a better sharding algorithm, facet aggregation, numeric precision data type and more.

The procedure below covers upgrading from a previous version. The process for a fresh installation is slightly different, as we don’t need to enable feature compatibility (explained later).

Before upgrading, please be aware of a few details/requirements:

  • Minimum version: The minimum version required to upgrade to MongoDB 3.4 is 3.2.8 (a quick check is shown after this list).
  • Config servers: Config servers only work as replica sets. The previous method – standalone instances as config servers – is no longer supported.
  • Driver: The driver must be ready for the new features (such as views and the decimal data type). Please make sure that your driver version contains all the required settings.
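
As an optional pre-flight check (the ports are examples; adjust them for your own deployment), you can confirm that every member already runs 3.2.8 or later with the mongo shell:

mongo --port 27019 --eval 'db.version()'   # config server
mongo --port 27017 --eval 'db.version()'   # repeat for each shard member and mongos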

How to upgrade a MongoDB config server instance (the previous config server architecture) to a replica set

  1. Check if the balancer is running. If it is, disable the balancer process:

$ mongos
mongos> sh.isBalancerRunning()
false
mongos> sh.stopBalancer()
Waiting for active hosts...
Waiting for the balancer lock...
Waiting again for active hosts after balancer is off...
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 0 })

  2. Connect to the first MongoDB config server and run the command below. Please replace the _id and members values with your own settings:

configsvr> rs.initiate( {
  _id: "configRS",
  configsvr: true,
  version: 1,
  members: [ { _id: 0, host: "tests:27019" } ]
})
{ "ok" : 1 }

Get the storage engine name that is used by this instance. We will use this information later. If you are using MMAPv1, there are some additional commands to run:

db.serverStatus().storageEngine
{
"name" : "wiredTiger",
"supportsCommittedReads" : true,
"persistent" : true
}

Use the following command to read all the application startup parameters (informational only):

db.runCommand('getCmdLineOpts').parsed
{
 "net" : {
   "port" : 27019
   },
 "processManagement" : {
   "fork" : true
 },
 "sharding" : {
   "clusterRole" : "configsvr"
 },
 "storage" : {
   "dbPath" : "config1"
 },
 "systemLog" : {
   "destination" : "file",
   "path" : "config1/logconfig1.log"
 }
}

  3. Restart the first MongoDB config instance with the --replSet configRS and --configsvrMode=sccc parameters:

mongod --configsvr --replSet configRS --configsvrMode=sccc --storageEngine wiredTiger --port 27019 --dbpath config1 --logpath config1/logconfig1.log --fork

If using a configuration file, you must use the following parameters:

sharding:
  clusterRole: configsvr
  configsvrMode: sccc
replication:
  replSetName: configRS
net:
  port: <port>
storage:
  dbPath: <path>
  engine: <storageEngine>

  4. After the first config server has been restarted, start two empty config servers with the parameter settings chosen before. If you are using MMAPv1, please start three instances instead of two.

./mongod --configsvr --replSet configRS --port 27025 --dbpath rsconfig1 --logpath rsconfig1/rsconfig1 --fork
./mongod --configsvr --replSet configRS --port 27026 --dbpath rsconfig2 --logpath rsconfig2/rsconfig2 --fork
(only if using MMAPv1) ./mongod --configsvr --replSet configRS --port 27027 --dbpath rsconfig3 --logpath rsconfig3/rsconfig3 --fork

  5. Connect to the first config server again and add the new instances to the replica set:

configRS:PRIMARY> rs.add({host : "tests:27025", priority : 0 , votes : 0})
{ "ok" : 1 }
configRS:PRIMARY> rs.add({host : "tests:27026", priority : 0 , votes : 0})
{ "ok" : 1 }
(only if using MMAPv1) rs.add({host : "tests:27027", priority : 0 , votes : 0})
{ "ok" : 1 }

  6. Check the replication status. All the new instances must be in SECONDARY state (part of the output has been omitted from the rs.status() output below):

rs.status()
{
  "set" : "configRS",
  "date" : ISODate("2017-02-07T13:11:12.914Z"),
  "myState" : 1,
  "term" : NumberLong(1),
  "configsvr" : true,
  "heartbeatIntervalMillis" : NumberLong(2000),
  "members" : [
  {
     "_id" : 0,
     "name" : "tests:27019",
     "stateStr" : "PRIMARY",
     (...)
  },
  {
     "_id" : 1,
     "name" : "tests:27025",
     "stateStr" : "SECONDARY",
     (...)
  },
  {
    "_id" : 2,
    "name" : "tests:27026",
    "stateStr" : "SECONDARY",
    (...)
  },
  {
    "_id" : 3, // (will appear only if using MMAP)
    "name" : "tests:27027",
    "stateStr" : "SECONDARY",
    (...)
  },
],
  "ok" : 1
}

  7. Once the replica set is up and running, please stop the old config server instances that are still running outside the replica set. Also, add votes and priority to all members of the replica set. If using MMAPv1, remove votes and priority from cfg.members[0]:

var cfg = rs.conf();
cfg.members[0].priority = 1; // 0 if using MMAP
cfg.members[1].priority = 1;
cfg.members[2].priority = 1;
cfg.members[0].votes = 1; // 0 if using MMAP
cfg.members[1].votes = 1;
cfg.members[2].votes = 1;
(Only if the first config server is using mmap)
cfg.members[3].priority = 1;
cfg.members[3].votes = 1;
rs.reconfig(cfg);

  8. Restart the mongos service, passing the new replica set name and hostnames:port list to the --configdb parameter:

./mongos --configdb configRS/tests:27019,tests:27025,tests:27026 --logpath mongos.log --fork

  9. Perform a stepDown() and shutDown() on the first config server. If this server is using MMAPv1, remove it from the replica set using rs.remove(). If it is using wiredTiger, restart it without configsvrMode=sccc in the parameters.

At this point, you have an instance that has been correctly migrated to a replica set.

  10. Please enable the balancer again (use a 3.4 client, or it will fail):

sh.startBalancer()

How to upgrade config server instances to version 3.4

Please follow these instructions to upgrade a shard to the 3.4 version.

As we already have the config servers correctly running as a replica set, we need to upgrade the binary versions. It is easy to do: stop each secondary and replace its binaries:

  1. Connect to the mongos and stop the balancer.
  2. Stop the secondary, replace the old binary with the new version, and start the service.
  3. When the primary is the only one still running version 3.2, run rs.stepDown() and force this instance to become a secondary.
  4. Replace these (now) secondary binaries with the 3.4 version.

For the shards, add a new parameter before restarting the process in the 3.4 version:

  1. Stop the secondaries, replace the binaries and start them with the --shardsvr parameter.

If using a config file, add (this is a new parameter in this version):

sharding:
    clusterRole: shardsvr

  2. Step down the primary and perform this same process on it.
  3. After all the shard binaries have been changed to the 3.4 version, upgrade the mongos binaries.
  4. Stop mongos and replace its executable with the 3.4 version.

At this point, all the shards are running the 3.4 version, but by default the new features in MongoDB 3.4 are disabled. If we try to use a decimal data type, mongos reports the following:

WriteResult({
  "nInserted" : 0,
  "writeError" : {
  "code" : 22,
  "errmsg" : "Cannot use decimal BSON type when the featureCompatibilityVersion is 3.2. See http://dochub.mongodb.org/core/3.4-feature-compatibility."
 }
})

In order to enable MongoDB 3.4 features, we need to run the following on the mongos:

db.adminCommand( { setFeatureCompatibilityVersion: "3.4" } )
{ "ok" : 1 }
mongos> use testing
switched to db testing
mongos> db.foo.insert({x : NumberDecimal(10.2)})
WriteResult({ "nInserted" : 1 })

After changing the featureCompatibilityVersion to “3.4”, all the new MongoDB 3.4 features will be available.
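
If you want to confirm the change took effect, one way (run against a mongod member, such as the config server on port 27019 in this example) is to read the parameter back:

mongo --port 27019 --eval 'db.adminCommand({ getParameter: 1, featureCompatibilityVersion: 1 })'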

I hope this tutorial was useful in explaining upgrading to Percona Server for MongoDB 3.4. Please ping @adamotonete or @percona on Twitter with any questions or suggestions.

by Adamo Tonete at March 15, 2017 08:58 PM

How to Restore a Single InnoDB Table from a Full Backup After Accidentally Dropping It


In this blog post, we’ll look at how to restore a single InnoDB table from a full backup after dropping the table.

You can also see an earlier blog post about restoring a single table from a full backup here: How to recover a single InnoDB table from a full backup.

The idea behind the actions in that blog is based on the “Transportable Tablespace” concept, which was introduced in MySQL 5.6. So when you have deleted the data from a table, you can quickly restore it as follows:

  • Prepare the backup
  • Discard the tablespace of the original table
  • Copy .ibd from the backup to the original table path
  • Import the tablespace

Of course, you need to test the process before using it in production, even though it is relatively straightforward.

But what about when you drop a table? That is a trickier situation, because you lose both the table structure and the data files.

The actions mentioned in the previous blog will not work here, simply because it is impossible to discard a non-existing tablespace. 🙂

Instead, one solution scenario could be something like:

  • Prepare the backup
  • Extract the original table structure from the backup (i.e., extract the create statement from the backup .frm file)
  • Create a new empty table
  • Apply some locks
  • Discard the newly created tablespace
  • Copy back .ibd from the backup
  • Import the tablespace

And now you can continue to be happy!

Let’s test this scenario. In this test, I am not going to use real world tables. I will instead use good old “t1”.

> select * from dbtest.t1;
+----+
| id |
+----+
|  1 |
|  1 |
|  2 |
|  1 |
|  2 |
|  3 |
|  5 |
|  5 |
|  5 |
+----+
9 rows in set (0.01 sec)

Take a backup:

xtrabackup --defaults-file=/home/shahriyar.rzaev/sandboxes/rsandbox_Percona-Server-5_7_17/master/my.sandbox.cnf
--user=jeffrey --password='msandbox'
 --target-dir=/home/shahriyar.rzaev/backup_dirs/ps_5.7_master/full/2017-03-07_08-34-17
--backup --host=localhost --port=20192

Prepare a backup:

xtrabackup --prepare
--target-dir=/home/shahriyar.rzaev/backup_dirs/ps_5.7_master/full/2017-03-07_08-34-17

Drop the table:

drop table dbtest.t1;
Query OK, 0 rows affected (0.22 sec)

Extract the create statement from the .frm file using the mysqlfrm tool. Please read the #WARNING and #CAUTION sections below, and also review the documentation to figure out how to use this tool:

$ sudo mysqlfrm --diagnostic dbtest/t1.frm
# WARNING: Cannot generate character set or collation names without the --server option.
# CAUTION: The diagnostic mode is a best-effort parse of the .frm file. As such, it may not identify all of the components of the table correctly. This is especially true for damaged files. It will also not read the default values for the columns and the resulting statement may not be syntactically correct.
# Reading .frm file for dbtest/t1.frm:
# The .frm file is a TABLE.
# CREATE TABLE Statement:
CREATE TABLE `dbtest`.`t1` (
  `id` int(11) NOT NULL
) ENGINE=InnoDB;
#...done.

Create an empty table using this create statement:

> CREATE TABLE `dbtest`.`t1` (
    ->   `id` int(11) NOT NULL
    -> ) ENGINE=InnoDB;
Query OK, 0 rows affected (0.27 sec)

Apply a write lock, or take another action to ensure safety:

> lock tables dbtest.t1 write;
Query OK, 0 rows affected (0.00 sec)

Discard the tablespace:

> alter table dbtest.t1 discard tablespace;
Query OK, 0 rows affected (0.09 sec)

Copy back the .ibd file from backup:

$ sudo cp dbtest/t1.ibd /home/shahriyar.rzaev/sandboxes/rsandbox_Percona-Server-5_7_17/master/data/dbtest/

Apply the proper owner:

$ sudo chown shahriyar.rzaev:shahriyar.rzaev
/home/shahriyar.rzaev/sandboxes/rsandbox_Percona-Server-5_7_17/master/data/dbtest/t1.ibd

Import the tablespace:

master [localhost] {jeffrey} ((none)) > alter table dbtest.t1 import tablespace;
Query OK, 0 rows affected, 2 warnings (0.42 sec)
master [localhost] {jeffrey} ((none)) > show warnings;
+---------+------+-------------------------------------------------------------------------------------------------------------------------------------------+
| Level   | Code | Message                                                                                                                                   |
+---------+------+-------------------------------------------------------------------------------------------------------------------------------------------+
| Warning | 1814 | InnoDB: Tablespace has been discarded for table 't1'                                                                                      |
| Warning | 1810 | InnoDB: IO Read error: (2, No such file or directory) Error opening './dbtest/t1.cfg', will attempt to import without schema verification |
+---------+------+-------------------------------------------------------------------------------------------------------------------------------------------+
2 rows in set (0.00 sec)

You can ignore the warnings.

Now check if your table is restored or not:

> select * from dbtest.t1;
+----+
| id |
+----+
|  1 |
|  1 |
|  2 |
|  1 |
|  2 |
|  3 |
|  5 |
|  5 |
|  5 |
+----+
9 rows in set (0.00 sec)

With our simple test, this method worked well. It should theoretically work in a production environment. However, you will need to test this first. Ensure it works for your test environment and test tables.

Thanks! 🙂

by Shahriyar Rzayev at March 15, 2017 08:31 PM

Daniël van Eeden

Network attacks on MySQL, Part 2: SSL stripping with MySQL

Intro

In my previous blog post I told you to use SSL/TLS to secure your MySQL network connections. So I followed my advice and did enable SSL. Great!

So first let's quickly verify that everything is working.

So you enabled SSL with mysql_ssl_rsa_setup, used an OpenSSL-based build, or put ssl-cert, ssl-key and ssl-ca in the mysqld section of your /etc/my.cnf, and now show global variables like 'have_SSL'; returns 'YES'.

And you have configured the client with --ssl-mode=PREFERRED. Now show global status like 'Ssl_cipher'; indicates the session is indeed secured.

You could also dump traffic and it looks 'encrypted' (i.e. not readable)...

With SSL enabled, everything should be safe, shouldn't it?

The handshake which MySQL uses always starts unsecured and is upgraded to secured if both the client and server have the SSL flag set. This is very similar to STARTTLS as used in the SMTP protocol.

To attack this we need an active attack: we need to actually sit in between the client and the server and modify packets.

Then we modify the flags sent from the server to the client to have the SSL flag disabled. This is called SSL stripping.

Because the client thinks the server doesn't support SSL the connection is not upgraded and continues in clear text.

An example can be found in the dolfijn_stripssl.py script.

Once the SSL layer is stripped from the connection an attacker can see your queries and resultsets again as described before.

To protect against this attack:

  1. Set REQUIRE SSL on accounts which should never use unencrypted connections.
  2. On the client use --ssl-mode=REQUIRED to force the use of SSL. This is available since 5.6.30 / 5.7.11.
  3. For older clients: Check the Ssl_cipher status variable and exit if it is empty.
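
As a rough illustration of all three points - the account name, host and password variable are placeholders, and on 5.6 you would use GRANT ... REQUIRE SSL instead of ALTER USER:

# 1. Require TLS for the application account (MySQL 5.7 syntax)
mysql -u root -p -e "ALTER USER 'app'@'%' REQUIRE SSL;"

# 2. Refuse to connect at all if the server does not offer TLS (5.6.30+ / 5.7.11+ clients)
mysql --ssl-mode=REQUIRED -h db.example.com -u app -p

# 3. Older clients: check Ssl_cipher and abort if the session is not encrypted
cipher=$(mysql -h db.example.com -u app -p"$APP_PASSWORD" -N -B \
  -e "SHOW SESSION STATUS LIKE 'Ssl_cipher'" | awk '{print $2}')
[ -n "$cipher" ] || { echo "Connection is not encrypted, aborting" >&2; exit 1; }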

by Daniël van Eeden (noreply@blogger.com) at March 15, 2017 05:22 PM

Jean-Jerome Schmidt

Migrating MySQL database from Amazon RDS to DigitalOcean

In previous blogs (part 1 and part 2), we discussed how to migrate your RDS data into an EC2 instance. In the process, we managed to move our data out of RDS, but we are still running on AWS. If you would like to move your data completely out of Amazon Web Services, there’s a bit more work to do. In today’s blog post, we will show you how it can be done.

Environment introduction

The environment we’ll be working with is pretty similar to what we ended up with on our last post in the series. The only difference is that no cutover happened, as we will use the EC2 instance as an intermediate step in the process of moving out of AWS.

Initial infrastructure setup

The action plan

In the previous blog, we first migrated our data from RDS to an EC2 instance that we have full access to. As we already have MySQL running on our EC2 instance, we have more options to choose from regarding how to copy our data to another cloud. DigitalOcean is only used for demo purposes here; the process we describe below can be used to migrate to any other hosting or cloud provider - you would just need direct access to the VPS instances. In this process, we will use xtrabackup to copy the data (although it is perfectly fine to use any other method of binary transfer). We would need to prepare a safe connection between AWS and DigitalOcean. Once we do that, we will set up replication from the EC2 instance into a DigitalOcean droplet. The next step would be to perform a cutover and move applications, but we won’t cover it here.

Deciding on connectivity method

Amazon Web Services allows you to pick from many different ways to create a connection to external networks. If you have a hardware appliance which supports VPN connections, you can use it to form a VPN connection between your VPC in AWS and your local infrastructure. If your network provider offers you a peering connection with the AWS network and you have a BGP router, you can get a direct VLAN connection between your network and AWS via AWS Direct Connect. If you have multiple, isolated networks you can link them together with Amazon by using AWS VPN CloudHub. Finally, as EC2 instances are yours to manage, you can as well set up a VPN between that EC2 instance and your local network using software solutions like OpenVPN.

As we are talking databases, you can also decide to set up SSL replication between MySQL on EC2 (the master) and the slave running on DigitalOcean. We would still have to figure out how to do an initial data transfer to the slave - one solution could be to tar the output of xtrabackup, encrypt that file and either send it over the WAN (rsync) or upload it to an S3 bucket and then download it from there. You could also rely on SSH encryption and just scp (or even rsync, using SSH) the data to the new location.

There are many options to choose from. We will use another solution though - we are going to establish an SSH tunnel between the EC2 instance and our DigitalOcean droplet to form a secure channel that we will use to replicate data. Initial transfer will be made using rsync over the SSH connection.

Configuring a DigitalOcean droplet

As we decided to use DigitalOcean, we can leverage NinesControl to deploy it. We will deploy a single PXC 5.7 node (to match the MySQL 5.7 version that we use on EC2 - please keep in mind that replication from a newer to an older version of MySQL is not supported and will most likely fail). We will also have to configure an SSH tunnel between the EC2 and DigitalOcean instances.

We won’t cover the setup of NinesControl and the deployment here, but you can check the following blog posts:

Registering an account and deploying on DigitalOcean should not take more than 10-15 minutes. After the deployment completes, you will see your database in the NinesControl UI:

NinesControl screen with cluster deployed

Note that we have deployed a single node of Galera Cluster here. You can find an option to download the SSH key - this is what we need to get access to the host.

NinesControl screen with details of a cluster
Infrastructure that we want to build

Copying data to DigitalOcean

Once we have MySQL 5.7 up and running on the DigitalOcean instance, we need to perform a backup of the EC2 instance and then transfer it to DO. Technically, it should be possible to perform a direct streaming of xtrabackup data between the nodes but we cannot really recommend it. WAN links can be unreliable, and it would be better to take a backup locally and then use rsync with its ability to retry the transfer whenever something is not right.

First, let’s take a backup on our EC2 instance:

root@ip-172-30-4-238:~# innobackupex --user=tpcc --password=tpccpass /tmp/backup

Once it’s ready we need to transfer it to the DigitalOcean network. To do it in a safe way, we will create a new user on the DO droplet, generate an SSH key and use this user to copy the data. Of course, you can also use any of the existing users; it’s not required to create a new one. So, let’s add a new user. There are many ways to do this; we’ll use the ‘adduser’ command.

root@galera1-node-1:~# adduser rdscopy
Adding user `rdscopy' ...
Adding new group `rdscopy' (1001) ...
Adding new user `rdscopy' (1001) with group `rdscopy' ...
Creating home directory `/home/rdscopy' ...
Copying files from `/etc/skel' ...
Enter new UNIX password:
Retype new UNIX password:
passwd: password updated successfully
Changing the user information for rdscopy
Enter the new value, or press ENTER for the default
    Full Name []:
    Room Number []:
    Work Phone []:
    Home Phone []:
    Other []:
Is the information correct? [Y/n] y

Now, it’s time to generate a pair of ssh keys to use for connectivity:

root@galera1-node-1:~# ssh-keygen -C 'rdscopy' -f id_rsa_rdscopy -t rsa -b 4096
Generating public/private rsa key pair.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in id_rsa_rdscopy.
Your public key has been saved in id_rsa_rdscopy.pub.
The key fingerprint is:
3a:b0:d2:80:5b:b8:60:1b:17:58:bd:8e:74:c9:56:b3 rdscopy
The key's randomart image is:
+--[ RSA 4096]----+
|   ..            |
|  o  . o         |
| . .. + o        |
| o ..* E         |
|+o+.*   S        |
|o+++ + .         |
|o.. o o          |
|   .   .         |
|                 |
+-----------------+

Having the SSH key, we need to set it up on our DigitalOcean droplet. We need to create the .ssh directory and an authorized_keys file with the proper access permissions.

root@galera1-node-1:~# mkdir /home/rdscopy/.ssh
root@galera1-node-1:~# cat id_rsa_rdscopy.pub > /home/rdscopy/.ssh/authorized_keys
root@galera1-node-1:~# chown rdscopy.rdscopy /home/rdscopy/.ssh/authorized_keys
root@galera1-node-1:~# chmod 600 /home/rdscopy/.ssh/authorized_keys

Then, we need to copy our private key to the EC2 instance. When we are ready with it, we can copy our data. As we mentioned earlier, we will use rsync for that - it will let us restart the transfer if, for whatever reason, the process is interrupted. Combined with SSH, this gives us a safe and robust method of copying the data over the WAN. Let’s start rsync on the EC2 host:

root@ip-172-30-4-238:~# rsync -avz -e "ssh -i id_rsa_rdscopy -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null" --progress /tmp/backup/2017-02-20_16-34-18/ rdscopy@198.211.97.97:/home/rdscopy

After a while, which will depend on the amount of data and the transfer speed, our backup data should become available on the DigitalOcean droplet. This means that it is time to prepare it by applying the InnoDB redo logs, and then copy it back into the MySQL data directory. For that we need to stop MySQL, remove the current data directory, copy the files back (using either innobackupex or manually) and, finally, verify that the owner and group of the new files are set to mysql:

root@galera1-node-1:~# innobackupex --apply-log /home/rdscopy/
root@galera1-node-1:~# service mysql stop
root@galera1-node-1:~# rm -rf /var/lib/mysql/*
root@galera1-node-1:~# innobackupex --copy-back /home/rdscopy/
root@galera1-node-1:~# chown -R mysql.mysql /var/lib/mysql

Before we start MySQL, we also need to ensure that both the server_id and the server UUID are different. The former can be edited in my.cnf (see the example below); the latter can be ensured by:

root@galera1-node-1:~# rm /var/lib/mysql/auto.cnf
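
A minimal sketch of the server_id change - the config path and the value 200 are assumptions; on a NinesControl-deployed node you would normally just edit the server_id already present in the generated my.cnf:

# Give the droplet a server_id that differs from the EC2 master's
cat >> /etc/mysql/my.cnf <<'EOF'
[mysqld]
server_id = 200
EOF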

Now, we can start MySQL:

root@galera1-node-1:~# service mysql start

Setting up replication

We are ready to set up replication between EC2 and DO, but first we need to set up an SSH tunnel - we’ll create an additional SSH key for the ubuntu user on the EC2 instance and copy it to the DO instance. Then we will use the ubuntu user to create a tunnel that we will use for the replication.

Let’s start by creating the new ssh key:

root@ip-172-30-4-238:~# ssh-keygen -C 'tunnel' -f id_rsa_tunnel -t rsa -b 4096
Generating public/private rsa key pair.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in id_rsa_tunnel.
Your public key has been saved in id_rsa_tunnel.pub.
The key fingerprint is:
c4:44:79:39:9c:c6:ce:45:bb:13:e5:6f:c5:d9:8c:14 tunnel
The key's randomart image is:
+--[ RSA 4096]----+
|       .o+ +. E. |
|       o. O .= +o|
|        o= oo o.=|
|       .  o  o ..|
|        S   o   o|
|             . . |
|                 |
|                 |
|                 |
+-----------------+

Next step - we need to add our public key to the authorized_keys file on the EC2 instance, to which we will connect from DigitalOcean to create a tunnel.

root@ip-172-30-4-238:~# cat id_rsa_tunnel.pub >> /home/ubuntu/.ssh/authorized_keys

We also need the private key to be transferred to the DO droplet. It can be done in many ways, but we’ll use scp with the rdscopy user and the key that we created earlier:

root@ip-172-30-4-238:~# scp -i id_rsa_rdscopy id_rsa_tunnel rdscopy@198.211.97.97:/home/rdscopy
id_rsa_tunnel                                                                                                                                                                    100% 3247     3.2KB/s   00:00

That’s all we need - now we can create the SSH tunnel. We want it to be available all the time, so we will use a screen session for it.

root@galera1-node-1:~# screen -S tunnel
root@galera1-node-1:~# ssh -L 3307:localhost:3306 ubuntu@54.224.107.6 -i /home/rdscopy/id_rsa_tunnel

What we did here was to open an SSH tunnel between localhost, port 3307 and the remote host 54.224.107.6, port 3306, using the “ubuntu” user and a key located in /home/rdscopy/id_rsa_tunnel. Detach the screen session, and the remote host should be available via 127.0.0.1:3307.
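
A quick, optional sanity check of the tunnel from the droplet, using any account that already exists on the EC2 MySQL instance (the user here is a placeholder):

# If the tunnel is up, this returns the EC2 server's hostname and server_id
mysql -h 127.0.0.1 -P 3307 -u someuser -p -e "SELECT @@hostname, @@server_id;"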

To set up replication, we still need to add a user that we will use to connect to MySQL on EC2. We will create it on the EC2 host and we’ll use ‘127.0.0.1’ as the host - connections via the SSH tunnel will look like they come from localhost:

mysql> CREATE USER rds_rpl@127.0.0.1 IDENTIFIED BY 'rds_rpl_pass';
Query OK, 0 rows affected (0.00 sec)
mysql> GRANT REPLICATION SLAVE ON *.* TO rds_rpl@127.0.0.1;
Query OK, 0 rows affected (0.00 sec)

Everything is ready to set up replication; it’s now time to follow the traditional process of creating a slave based on xtrabackup data. We need to use data from xtrabackup_binlog_info to identify the master position at the time of the backup. This position is what we want to use in our CHANGE MASTER TO … command. Let’s take a look at the contents of the xtrabackup_binlog_info file:

root@galera1-node-1:~# cat /home/rdscopy/xtrabackup_binlog_info
binlog.000052    896957365

This is the binary log file and position we’ll use in our CHANGE MASTER TO:

root@galera1-node-1:~# mysql -u root -ppass
mysql> CHANGE MASTER TO MASTER_HOST='127.0.0.1', MASTER_PORT=3307, MASTER_USER='rds_rpl', MASTER_PASSWORD='rds_rpl_pass', MASTER_LOG_FILE='binlog.000052', MASTER_LOG_POS=896957365; START SLAVE;

This is it - replication should now be up and running, and our DigitalOcean slave should be catching up. Once it has caught up (see the check below), our database tier is ready for switchover. Of course, usually it’s more than just a single node - you will most likely have to set up multiple slaves on DO before the infrastructure is ready to handle production traffic.
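
One simple way to watch the slave catch up, assuming the same local root credentials used earlier in this post, is to poll Seconds_Behind_Master until it reaches zero:

while true; do
  lag=$(mysql -u root -ppass -e "SHOW SLAVE STATUS\G" | awk '/Seconds_Behind_Master/ {print $2}')
  echo "Seconds_Behind_Master: ${lag}"
  [ "${lag}" = "0" ] && break
  sleep 10
done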

Switchover itself is a different topic - you will have to devise a plan to minimize downtime. In general, traffic should be moved from old to new location but how it should be done depends mostly on your environment. It can be anything from a simple change in DNS entry, to complex scripts which will pull all triggers in a correct order to redirect the traffic. No matter what, your database is now already in the new location, ready to serve requests.

by krzysztof at March 15, 2017 02:17 PM

March 14, 2017

MariaDB Foundation

MariaDB 10.1.22 now available

The MariaDB project is pleased to announce the immediate availability of MariaDB 10.1.22. This is a Stable (GA) release. See the release notes and changelog for details. Download MariaDB 10.1.22 Release Notes Changelog What is MariaDB 10.1? MariaDB APT and YUM Repository Configuration Generator Thanks, and enjoy MariaDB!

The post MariaDB 10.1.22 now available appeared first on MariaDB.org.

by Daniel Bartholomew at March 14, 2017 03:48 PM

MariaDB AB

ColumnStore: Storage Architecture Choices

ColumnStore: Storage Architecture Choices david_thompson_g Tue, 03/14/2017 - 11:02

Join MariaDB ColumnStore Use Cases Webinar on March 23, 2017 10 a.m. PT

Introduction

MariaDB ColumnStore provides a complete solution for automated high availability of a cluster. On the data processing side:

  • Multiple front-end or User Module (UM) servers can be deployed to provide failover on SQL processing.

  • Multiple back-end or Performance Module (PM) servers can be deployed to provide failover on distributed data processing.

Due to the shared-nothing data processing architecture, ColumnStore requires the storage tier to deliver high availability for a complete solution. In this blog, I intend to provide some clarity and direction on the architecture and choices available.

 

Architecture

As a recap, here is an outline of MariaDB ColumnStore’s architecture:


ColumnStore provides data processing redundancy by individually scaling out the front-end MariaDB Server UM nodes and back-end Distributed Query Engine PM nodes.

 

All ColumnStore data is accessed and managed by the PM nodes. Each PM manages one or more DBRoots, which contain Extents that hold the data values for a single column. A DBRoot belongs to exactly one PM server (at a point in time). The complete data for a given column is spread across all DBRoots with a given column value being stored exactly once. To learn more about the storage architecture, please refer to my previous blog.

 

Storage Choices

During installation, a choice must be made between utilizing:

  • Local Storage: The DBRoot is created on the local disk for the PM node, specifically under /usr/local/mariadb/columnstore/data.  

  • External Storage: The DBRoot will be stored on storage external to the PM server, then will be mounted for access. An entry for each DBRoot mount must exist in the /etc/fstab file on each PM server.
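
As an illustration only - the device name, mount point and filesystem options below are assumptions that depend entirely on your storage layer and DBRoot numbering - an /etc/fstab entry for the volume backing DBRoot1 on the PM server that currently owns it could look like this:

# Example mount entry for DBRoot1 (adjust device, filesystem and options)
echo '/dev/xvdf  /usr/local/mariadb/columnstore/data1  ext4  defaults,noatime,nofail  0 0' >> /etc/fstab
mount /usr/local/mariadb/columnstore/data1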

 

To provide data redundancy, ColumnStore relies on external storage to provide resilient storage and enable a particular DBRoot volume to be remounted on another PM server. This generally implies a remote networked storage solution, although filesystems such as GlusterFS can allow deployment without additional servers.

 

When local storage is utilized, journaling filesystems and RAID deployments provide resilient storage. However, since the storage is only available within a given PM server, it cannot be remounted on another PM server should one fail. In this case, the failed server must be recovered before ColumnStore can support additional queries.

 

With external storage, ColumnStore can provide automated failover and continuity in the event a PM server fails. This is because a given DBRoot storage is external to the failed PM server and can be remounted on another PM server. The following diagram illustrates how this works:

 


In this case:

 

  1. Server PM4 crashes.

  2. The ProcMgr process on PM1 detects that PM4 is no longer reachable and instructs PM3 to mount DBRoot4 and process reads and writes in addition to DBRoot3.

  3. When PM4 recovers, ProcMgr instructs PM3 to unmount DBRoot4, and then PM4 to mount DBRoot4, thereby returning the system to a steady state.

  4. If the PM1 server crashes, which contains the active ProcMgr process, the system will promote another PM server to act as the active ProcMgr, which will then initiate the above process.

 

Storage choices that support this model include:

 

  • AWS EBS (if deployed on AWS) – The ColumnStore AMI image utilizes this and provides automation around storage management.

  • Other Cloud Platforms – Are available and provide similar capabilities such as Persistent Disks for Google Cloud.

  • SAN / NAS – Many vendor choices exist and may include capabilities such as snapshotting to simplify backup and asynchronous replication to DR storage.

  • GlusterFS – Open source software filesystem (with support available from RedHat). On a small cluster this can be co-deployed with the PM servers for a simplified topology.

  • CEPH – Open source storage cluster (with support available from RedHat). This enables deployment of a lower cost software-defined storage cluster.

  • DRBD – A community member has been working with us on testing this as a storage layer, though it is not certified yet.

  • Other Options – Are available and will work as long as they present a compliant file system layer to the Linux operating system and ensure that a data volume can be remounted onto a different PM server.

Disaster Recovery

Warm Standby

To enable a warm standby DR setup, a second MariaDB ColumnStore cluster should be installed and configured identically in a secondary location. Available storage cluster replication should be configured to provide replication between the locations. Should the cluster in Data Center 1 fail, the cluster in Data Center 2 can be initiated to provide continuity.

 


Active-Active

To enable an active-active DR setup, a second MariaDB ColumnStore cluster should be installed, configured and run in a secondary location. Potentially this could have a different topology of PMs from the primary data center. The data feed output of the ETL process must be duplicated and run on both clusters in order to provide a secondary active cluster.


Conclusion

MariaDB ColumnStore offers fully automated high availability. To provide this at the storage level it relies upon the storage layer to provide both:

  1. Resiliency

  2. The ability to remount a DBRoot on a different PM server.

In the majority of cloud providers, such networked storage capabilities are a baseline offering. Private clouds such as OpenShift also come with similar capabilities such as the CEPH storage cluster. Finally, for a bare metal installation, storage appliances and software-defined storage offerings can provide this. Storage-based replication provides tried and trusted replication that works and handles the many edge cases that arise in real-world replication scenarios.

Not needing to focus on this enables my engineering team to support more customer use cases to broaden the reach and impact of ColumnStore. For example, in response to community and customer feedback, we are working on improving the text data type as well as providing a bulk write API to better support streaming use cases (i.e., Kafka integration). These improvements, plus other roadmap items, will be addressed in our 1.1 release later this year.

 


by david_thompson_g at March 14, 2017 03:02 PM

Open Query

TEXT and VARCHAR inefficiencies in your db schema

The TEXT and VARCHAR definitions in many db schemas are based on old information – that is, they appear to be presuming restrictions and behaviour from MySQL versions long ago. This has consequences for performance. To us, the use of, for instance, VARCHAR(255) is a key indicator of this. Yep, an anti-pattern.

VARCHAR

In MySQL 4.0, VARCHAR used to be restricted to 255 max. In MySQL 4.1 character sets such as UTF8 were introduced and MySQL 5.1 supports VARCHARs up to 64K-1 in byte length. Thus, any occurrence of VARCHAR(255) indicates some old style logic that needs to be reviewed.

Why not just set the maximum length possible? Well…

A VARCHAR is subject to the character set it’s in, for UTF8 this means either 3 or 4 (utf8mb4) bytes per character can be used. So if one specifies VARCHAR(50) CHARSET utf8mb4, the actual byte length of the stored string can be up to 200 bytes. In stored row format, MySQL uses 1 byte for VARCHAR length when possible (depending on the column definition), and up to 2 bytes if necessary. So, specifying VARCHAR(255) unnecessarily means that the server has to use a 2 byte length in the stored row.

This may be viewed as nitpicking, however storage efficiency affects the number of rows that can fit on a data page and thus the amount of I/O required to manage a certain amount of rows. It all adds up, so having little unnecessary inefficiencies will cost – particularly for larger sites.

VARCHAR best practice

Best practice is to set VARCHAR to the maximum necessary, not the maximum possible – otherwise, as per the above, the maximum possible is about 16000 for utf8mb4, not 255 – and nobody would propose setting it to 16000, would they? But it’s not much different, in stored row space a VARCHAR(255) requires a 2 byte length indicator just like VARCHAR(16000) would.

So please review VARCHAR columns and set their definition to the maximum actually necessary, this is very unlikely to come out as 255. If 255, why not 300? Or rather 200? Or 60? Setting a proper number indicates that thought and data analysis has gone into the design. 255 looks sloppy.
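
One practical way to find candidates for that review is to query information_schema for VARCHAR(255) columns; the schema name below is a placeholder:

mysql -u root -p -e "
  SELECT TABLE_NAME, COLUMN_NAME, CHARACTER_MAXIMUM_LENGTH
  FROM information_schema.COLUMNS
  WHERE TABLE_SCHEMA = 'myapp'
    AND DATA_TYPE = 'varchar'
    AND CHARACTER_MAXIMUM_LENGTH = 255;"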

TEXT

TEXT (and LONGTEXT) columns are handled different in MySQL/MariaDB. First, a recap of some facts related to TEXT columns.

The db server often needs to create a temporary table while processing a query. MEMORY tables cannot contain TEXT type columns, thus the temporary table created will be a disk-based one. Admittedly this will likely remain in the disk cache and never actually touch a disk, however it goes through file I/O functions and thus causes overhead – unnecessarily. Queries will be slower.

InnoDB can store a TEXT column on a separate page, and only retrieve it when necessary (this also means that using SELECT * is needlessly inefficient – it’s almost always better to specify only the columns that are required – this also makes code maintenance easier: you can scan the source code for referenced column names and actually find all relevant code and queries).

TEXT best practice

A TEXT column can contain up to 64k-1 in byte length (4G for LONGTEXT). So essentially a TEXT column can store the same amount of data as a VARCHAR column (since MySQL 5.0), and we know that VARCHAR offers us benefits in terms of server behaviour. Thus, any instance of TEXT should be carefully reviewed and generally the outcome is to change to an appropriate VARCHAR.

Using LONGTEXT is ok, if necessary. If the amount of data is not going to exceed say 16KB character length, using LONGTEXT is not warranted and again VARCHAR (not TEXT) is the most suitable column type.
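
A hypothetical example of such a change, with made-up table and column names - check the actual maximum length of the existing data before picking the new size:

# Verify the longest stored value, then convert the TEXT column to VARCHAR
mysql -u root -p myapp -e "SELECT MAX(CHAR_LENGTH(comment)) AS max_len FROM reviews;"
mysql -u root -p myapp -e "ALTER TABLE reviews MODIFY comment VARCHAR(500);"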

Summary

Particularly when combined with the best practice of not using SELECT *, using appropriately defined VARCHAR columns (rather than VARCHAR(255) or TEXT) can have a measurable and even significant performance impact on application environments.

Applications don’t need to care, so the db definition can be altered without any application impact.

It is a worthwhile effort.

by Arjen Lentz at March 14, 2017 12:59 AM

March 13, 2017

Peter Zaitsev

Webinar Thursday, March 16, 2017: Moving to Amazon Web Services (AWS)


Join Percona’s Solutions Engineer Dimitri Vanoverbeke on Thursday, March 16, 2017 at 7:00 a.m. PDT / 10:00 a.m. EDT (UTC-7) for a webinar on Moving to Amazon Web Services (AWS).


This webinar covers the many challenges faced when migrating applications from on-premises into Amazon Web Services (AWS). It will specifically look at moving MySQL to AWS’s Relational Database Service (RDS) platform.
AWS is a great platform for hosting your infrastructure in the cloud. This webinar will go over the particulars of moving towards the DBaaS solution inside Amazon’s web services, and covers the different levels of service and options available.

The webinar will also discuss RDS specifics and possible migration techniques for pushing your information into RDS:

  • Moving to an RDS instance specifics
  • Optimizing configuration on RDS
  • Choosing between EC2 with MySQL and RDS
  • Moving to Amazon Aurora
  • Selecting between availability options in Amazon RDS
  • Using backups with Amazon RDS

This webinar provides a good overview on what migrating to RDS can bring to your organization, and many of the essential configuration options.

Register for the webinar here.

Dimitri Vanoverbeke, Solutions Engineer

At the age of 7, Dimitri received his first computer. Since then, he has been addicted to anything with a digital pulse. Dimitri has been active in IT professionally since 2003. He took various roles from internal system engineering to consulting.

Prior to joining Percona, Dimitri worked as an open source consultant for a leading open source software consulting firm in Belgium. During his career, Dimitri became familiar with a broad range of open source solutions and with the devops philosophy. Whenever he’s not glued to his computer screen, he enjoys traveling, cultural activities, basketball and the great outdoors. Dimitri is living with his girlfriend in the beautiful city of Ghent, Belgium.

by Dave Avery at March 13, 2017 07:36 PM

Percona XtraDB Cluster 5.6.35-26.20 is now available


Percona announces the release of Percona XtraDB Cluster 5.6.35-26.20 on March 10, 2017. Binaries are available from the downloads section or our software repositories.

Percona XtraDB Cluster 5.6.35-26.20 is now the current release, based on the following:

All Percona software is open-source and free. Details of this release can be found in the 5.6.35-26.20 milestone on Launchpad.

There are no new features or bug fixes to the main components, besides upstream changes and the following fixes related to packaging:

  • BLD-593: Limited the use of rm and chown by mysqld_safe to avoid exploits of the CVE-2016-5617 vulnerability. For more information, see 1660265.
    Credit to Dawid Golunski (https://legalhackers.com).
  • BLD-610: Added version number to the dependency requirements of the full RPM package.
  • BLD-645: Fixed mysqld_safe to support options with a forward slash (/). For more information, see 1652838.

Help us improve our software quality by reporting any bugs you encounter using our bug tracking system. As always, thanks for your continued support of Percona!

by Alexey Zhebel at March 13, 2017 05:32 PM

Jean-Jerome Schmidt

MySQL in the Cloud - Online Migration from Amazon RDS to your own server (part 2)

As we saw earlier, it might be challenging for companies to move their data out of RDS for MySQL. In the first part of this blog, we showed you how to set up your target environment on EC2 and insert a proxy layer (ProxySQL) between your applications and RDS. In this second part, we will show you how to do the actual migration of data to your own server, and then redirect your applications to the new database instance without downtime.

Copying data out of RDS

Once we have our database traffic running through ProxySQL, we can start preparations to copy our data out of RDS. We need to do this in order to set up replication between RDS and our MySQL instance running on EC2. Once this is done, we will configure ProxySQL to redirect traffic from RDS to our MySQL/EC2.

As we discussed in the first blog post in this series, the only way you can get data out of the RDS is via logical dump. Without access to the instance, we cannot use any hot, physical backup tools like xtrabackup. We cannot use snapshots either as there is no way to build anything else other than a new RDS instance from the snapshot.

We are limited to logical dump tools, therefore the logical option would be to use mydumper/myloader to process the data. Luckily, mydumper can create consistent backups, so we can rely on it to provide binlog coordinates for our new slave to connect to. The main issue while building RDS replicas is the binlog rotation policy - a logical dump and load may take days on larger (hundreds of gigabytes) datasets, and you need to keep binlogs on the RDS instance for the duration of this whole process. Sure, you can increase binlog retention on RDS (call mysql.rds_set_configuration('binlog retention hours', 24); as shown below - you can keep them for up to 7 days), but it’s much safer to do it differently.
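
For reference, this is what that retention bump would look like, using the RDS endpoint and account from the examples in this post (values are in hours, with a 7-day maximum):

mysql -h rds2.cvsw8xpajw2b.us-east-1.rds.amazonaws.com -u tpcc -p -e \
  "CALL mysql.rds_set_configuration('binlog retention hours', 24);
   CALL mysql.rds_show_configuration;"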

Before we proceed with taking a dump, we will add a replica to our RDS instance.

Amazon RDS Dashboard
Create Replica DB in RDS

Once we click on the “Create Read Replica” button, a snapshot will be started on the “master” RDS instance. It will be used to provision the new slave. The process may take hours - it all depends on the volume size, when the last snapshot was taken, and the performance of the volume (io1/gp2? Magnetic? How many pIOPS does the volume have?).

Master RDS Replica

When the slave is ready (its status has changed to “available”), we can log into it using its RDS endpoint.

RDS Slave

Once logged in, we will stop replication on our slave - this will ensure the RDS master won’t purge binary logs and they will still be available for our EC2 slave once we complete our dump/reload process.

mysql> CALL mysql.rds_stop_replication;
+---------------------------+
| Message                   |
+---------------------------+
| Slave is down or disabled |
+---------------------------+
1 row in set (1.02 sec)

Query OK, 0 rows affected (1.02 sec)

Now, it’s finally time to copy data to EC2. First, we need to install mydumper. You can get it from github: https://github.com/maxbube/mydumper. The installation process is fairly simple and nicely described in the readme file, so we won’t cover it here. Most likely you will have to install a couple of packages (listed in the readme) and the harder part is to identify which package contains mysql_config - it depends on the MySQL flavor (and sometimes also MySQL version).

Once you have mydumper compiled and ready to go, you can execute it:

root@ip-172-30-4-228:~/mydumper# mkdir /tmp/rdsdump
root@ip-172-30-4-228:~/mydumper# ./mydumper -h rds2.cvsw8xpajw2b.us-east-1.rds.amazonaws.com -p tpccpass -u tpcc  -o /tmp/rdsdump  --lock-all-tables --chunk-filesize 100 --events --routines --triggers

Please note --lock-all-tables, which ensures that the snapshot of the data will be consistent and it will be possible to use it to create a slave. Now, we have to wait until mydumper completes its task.

One more step is required - we don’t want to restore the mysql schema but we need to copy users and their grants. We can use pt-show-grants for that:

root@ip-172-30-4-228:~# wget http://percona.com/get/pt-show-grants
root@ip-172-30-4-228:~# chmod u+x ./pt-show-grants
root@ip-172-30-4-228:~# ./pt-show-grants -h rds2.cvsw8xpajw2b.us-east-1.rds.amazonaws.com -u tpcc -p tpccpass > grants.sql

Sample output of pt-show-grants may look like this:

-- Grants for 'sbtest'@'%'
CREATE USER IF NOT EXISTS 'sbtest'@'%';
ALTER USER 'sbtest'@'%' IDENTIFIED WITH 'mysql_native_password' AS '*2AFD99E79E4AA23DE141540F4179F64FFB3AC521' REQUIRE NONE PASSWORD EXPIRE DEFAULT ACCOUNT UNLOCK;
GRANT ALTER, ALTER ROUTINE, CREATE, CREATE ROUTINE, CREATE TEMPORARY TABLES, CREATE USER, CREATE VIEW, DELETE, DROP, EVENT, EXECUTE, INDEX, INSERT, LOCK TABLES, PROCESS, REFERENCES, RELOAD, REPLICATION CLIENT, REPLICATION SLAVE, SELECT, SHOW DATABASES, SHOW VIEW, TRIGGER, UPDATE ON *.* TO 'sbtest'@'%';

It is up to you to pick which users need to be copied onto your MySQL/EC2 instance. It doesn’t make sense to do it for all of them. For example, root users don’t have the ‘SUPER’ privilege on RDS, so it’s better to recreate them from scratch. What you need to copy are the grants for your application user. We also need to copy the users used by ProxySQL (proxysql-monitor in our case).
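
After trimming grants.sql down to just those accounts, loading it on the MySQL/EC2 instance is a one-liner (the host and credentials match the examples used elsewhere in this post):

mysql -h 172.30.4.238 -u root -ppass < grants.sql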


Inserting data into your MySQL/EC2 instance

As stated above, we don’t want to restore system schemas. Therefore we will move files related to those schemas out of our mydumper directory:

root@ip-172-30-4-228:~# mkdir /tmp/rdsdump_sys/
root@ip-172-30-4-228:~# mv /tmp/rdsdump/mysql* /tmp/rdsdump_sys/
root@ip-172-30-4-228:~# mv /tmp/rdsdump/sys* /tmp/rdsdump_sys/

When we are done with it, it’s time to start to load data into the MySQL/EC2 instance:

root@ip-172-30-4-228:~/mydumper# ./myloader -d /tmp/rdsdump/ -u tpcc -p tpccpass -t 4 --overwrite-tables -h 172.30.4.238

Please note that we used four threads (-t 4) - make sure you set this to whatever makes sense in your environment. It’s all about saturating the target MySQL instance - either CPU or I/O, depending on the bottleneck. We want to squeeze as much out of it as possible to ensure we used all available resources for loading the data.

After the main data is loaded, there are two more steps to take, both related to RDS internals, and both may break our replication. First, RDS contains a couple of rds_* tables in the mysql schema. We want to load them in case some of them are used by RDS - replication will break if our slave doesn’t have them. We can do it in the following way:

root@ip-172-30-4-228:~/mydumper# for i in $(ls -alh /tmp/rdsdump_sys/ | grep rds | awk '{print $9}') ; do echo $i ;  mysql -ppass -uroot  mysql < /tmp/rdsdump_sys/$i ; done
mysql.rds_configuration-schema.sql
mysql.rds_configuration.sql
mysql.rds_global_status_history_old-schema.sql
mysql.rds_global_status_history-schema.sql
mysql.rds_heartbeat2-schema.sql
mysql.rds_heartbeat2.sql
mysql.rds_history-schema.sql
mysql.rds_history.sql
mysql.rds_replication_status-schema.sql
mysql.rds_replication_status.sql
mysql.rds_sysinfo-schema.sql

A similar problem exists with the timezone tables; we need to load them using data from the RDS instance:

root@ip-172-30-4-228:~/mydumper# for i in $(ls -alh /tmp/rdsdump_sys/ | grep time_zone | grep -v schema | awk '{print $9}') ; do echo $i ;  mysql -ppass -uroot  mysql < /tmp/rdsdump_sys/$i ; done
mysql.time_zone_name.sql
mysql.time_zone.sql
mysql.time_zone_transition.sql
mysql.time_zone_transition_type.sql

When all this is ready, we can set up replication between RDS (master) and our MySQL/EC2 instance (slave).

Setting up replication

Mydumper, when performing a consistent dump, writes down a binary log position. We can find this data in a file called metadata in the dump directory. Let’s take a look at it; we will then use the position to set up replication.

root@ip-172-30-4-228:~/mydumper# cat /tmp/rdsdump/metadata
Started dump at: 2017-02-03 16:17:29
SHOW SLAVE STATUS:
    Host: 10.1.4.180
    Log: mysql-bin-changelog.007079
    Pos: 10537102
    GTID:

Finished dump at: 2017-02-03 16:44:46

One last thing we lack is a user that we can use to set up our slave. Let’s create one on the RDS instance:

root@ip-172-30-4-228:~# mysql -ppassword -h rds2.cvsw8xpajw2b.us-east-1.rds.amazonaws.com
mysql> CREATE USER IF NOT EXISTS 'rds_rpl'@'%' IDENTIFIED BY 'rds_rpl_pass';
Query OK, 0 rows affected (0.04 sec)
mysql> GRANT REPLICATION SLAVE ON *.* TO 'rds_rpl'@'%';
Query OK, 0 rows affected (0.01 sec)

Now it’s time to slave our MySQL/EC2 server off the RDS instance:

mysql> CHANGE MASTER TO MASTER_HOST='rds2.cvsw8xpajw2b.us-east-1.rds.amazonaws.com', MASTER_USER='rds_rpl', MASTER_PASSWORD='rds_rpl_pass', MASTER_LOG_FILE='mysql-bin-changelog.007079', MASTER_LOG_POS=10537102;
Query OK, 0 rows affected, 2 warnings (0.03 sec)
mysql> START SLAVE;
Query OK, 0 rows affected (0.02 sec)
mysql> SHOW SLAVE STATUS\G
*************************** 1. row ***************************
               Slave_IO_State: Queueing master event to the relay log
                  Master_Host: rds2.cvsw8xpajw2b.us-east-1.rds.amazonaws.com
                  Master_User: rds_rpl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin-changelog.007080
          Read_Master_Log_Pos: 13842678
               Relay_Log_File: relay-bin.000002
                Relay_Log_Pos: 20448
        Relay_Master_Log_File: mysql-bin-changelog.007079
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 10557220
              Relay_Log_Space: 29071382
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 258726
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 1237547456
                  Master_UUID: b5337d20-d815-11e6-abf1-120217bb3ac2
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: System lock
           Master_Retry_Count: 86400
                  Master_Bind:
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp:
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set:
            Executed_Gtid_Set:
                Auto_Position: 0
         Replicate_Rewrite_DB:
                 Channel_Name:
           Master_TLS_Version:
1 row in set (0.01 sec)

The last step will be to switch our traffic from the RDS instance to MySQL/EC2, but we need to let the slave catch up first.
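
A minimal sketch of how that wait could be automated (the host and credentials are placeholders for our EC2 slave):

# Poll the slave until it reports zero lag behind the RDS master
while true ; do
  LAG=$(mysql -uroot -ppass -h 172.30.4.238 -e "SHOW SLAVE STATUS\G" | awk '/Seconds_Behind_Master/ {print $2}')
  [ "$LAG" = "0" ] && break
  echo "Slave is ${LAG}s behind, waiting..."
  sleep 10
done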

When the slave has caught up, we need to perform a cutover. To automate it, we prepared a short bash script which connects to ProxySQL and performs the switchover.

# At first, we define old and new masters
OldMaster=rds2.cvsw8xpajw2b.us-east-1.rds.amazonaws.com
NewMaster=172.30.4.238

(
# We remove entries from mysql_replication_hostgroups so ProxySQL logic won't interfere
# with our script

echo "DELETE FROM mysql_replication_hostgroups;"

# Then we set current master to OFFLINE_SOFT - this will allow current transactions to
# complete while not accepting any more transactions - they will wait (by default for 
# 10 seconds) for a master to become available again.

echo "UPDATE mysql_servers SET STATUS='OFFLINE_SOFT' WHERE hostname=\"$OldMaster\";"
echo "LOAD MYSQL SERVERS TO RUNTIME;"
) | mysql -u admin -padmin -h 127.0.0.1 -P6032


# Here we are going to check for connections in the pool which are still used by 
# transactions which haven’t closed so far. If we see that neither hostgroup 10 nor
# hostgroup 20 has open transactions, we can perform a switchover.

CONNUSED=`mysql -h 127.0.0.1 -P6032 -uadmin -padmin -e 'SELECT IFNULL(SUM(ConnUsed),0) FROM stats_mysql_connection_pool WHERE status="OFFLINE_SOFT" AND (hostgroup=10 OR hostgroup=20)' -B -N 2> /dev/null`
TRIES=0
while [ $CONNUSED -ne 0 -a $TRIES -ne 20 ]
do
  CONNUSED=`mysql -h 127.0.0.1 -P6032 -uadmin -padmin -e 'SELECT IFNULL(SUM(ConnUsed),0) FROM stats_mysql_connection_pool WHERE status="OFFLINE_SOFT" AND (hostgroup=10 OR hostgroup=20)' -B -N 2> /dev/null`
  TRIES=$(($TRIES+1))
  if [ $CONNUSED -ne "0" ]; then
    sleep 0.05
  fi
done

# Here is our switchover logic - we basically exchange hostgroups between the RDS and EC2
# instances. We also restore the mysql_replication_hostgroups table.

(
echo "UPDATE mysql_servers SET STATUS='ONLINE', hostgroup_id=110 WHERE hostname=\"$OldMaster\" AND hostgroup_id=10;"
echo "UPDATE mysql_servers SET STATUS='ONLINE', hostgroup_id=120 WHERE hostname=\"$OldMaster\" AND hostgroup_id=20;"
echo "UPDATE mysql_servers SET hostgroup_id=10 WHERE hostname=\"$NewMaster\" AND hostgroup_id=110;"
echo "UPDATE mysql_servers SET hostgroup_id=20 WHERE hostname=\"$NewMaster\" AND hostgroup_id=120;"
echo "INSERT INTO mysql_replication_hostgroups VALUES (10, 20, 'hostgroups');"
echo "LOAD MYSQL SERVERS TO RUNTIME;"
) | mysql -u admin -padmin -h 127.0.0.1 -P6032
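
Note that the script above only changes ProxySQL's runtime configuration. If you want the new topology to survive a ProxySQL restart, you can persist it with the same command we used when setting up ProxySQL in the first part of this series:

echo "SAVE MYSQL SERVERS TO DISK;" | mysql -u admin -padmin -h 127.0.0.1 -P6032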

When all is done, you should see the following contents in the mysql_servers table:

mysql> select * from mysql_servers;
+--------------+-----------------------------------------------+------+--------+--------+-------------+-----------------+---------------------+---------+----------------+-------------+
| hostgroup_id | hostname                                      | port | status | weight | compression | max_connections | max_replication_lag | use_ssl | max_latency_ms | comment     |
+--------------+-----------------------------------------------+------+--------+--------+-------------+-----------------+---------------------+---------+----------------+-------------+
| 20           | 172.30.4.238                                  | 3306 | ONLINE | 1      | 0           | 100             | 10                  | 0       | 0              | read server |
| 10           | 172.30.4.238                                  | 3306 | ONLINE | 1      | 0           | 100             | 10                  | 0       | 0              | read server |
| 120          | rds2.cvsw8xpajw2b.us-east-1.rds.amazonaws.com | 3306 | ONLINE | 1      | 0           | 100             | 10                  | 0       | 0              |             |
| 110          | rds2.cvsw8xpajw2b.us-east-1.rds.amazonaws.com | 3306 | ONLINE | 1      | 0           | 100             | 10                  | 0       | 0              |             |
+--------------+-----------------------------------------------+------+--------+--------+-------------+-----------------+---------------------+---------+----------------+-------------+

On the application side, you should not see much of an impact, thanks to the ability of ProxySQL to queue queries for some time.

With this, we completed the process of moving your database from RDS to EC2. The last step is to remove the RDS instance - it did its job and it can be deleted.
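
Before deleting it, you may want to make sure the EC2 server no longer tries to replicate from it - a minimal sketch (the credentials are placeholders):

# Stop replication and discard the old master coordinates on the EC2 instance
mysql -uroot -ppass -h 172.30.4.238 -e "STOP SLAVE; RESET SLAVE ALL;"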

In our next blog post, we will build upon that. We will walk through a scenario in which we will move our database out of AWS/EC2 into a separate hosting provider.

by krzysztof at March 13, 2017 12:00 PM

MySQL in the Cloud - Online Migration from Amazon RDS to EC2 instance (part 1)

In our previous blog, we saw how easy it is to get started with RDS for MySQL. It is a convenient way to deploy and use MySQL, without worrying about operational overhead. The tradeoff though is reduced control, as users are entirely reliant on Amazon staff in case of poor performance or operational anomalies. No access to the data directory or physical backups makes it hard to move data out of RDS. This can be a major problem if your database outgrows RDS, and you decide to migrate to another platform. This two-part blog shows you how to do an online migration from RDS to your own MySQL server.

We’ll be using EC2 to run our own MySQL server. It can be a first step for more complex migrations to your own private datacenters. EC2 gives you access to your data, so xtrabackup can be used. EC2 also allows you to set up SSH tunnels, and it removes the requirement of setting up hardware VPN connections between your on-premises infrastructure and the VPC.

Assumptions

Before we start, we need to make a couple of assumptions - especially around security. First and foremost, we assume that the RDS instance is not accessible from outside of AWS. We also assume that you have an application in EC2. This implies that either the RDS instance and the rest of your infrastructure share a VPC, or there is access configured between them, one way or the other. In short, we assume that you can create a new EC2 instance and it will have access (or it can be configured to have access) to your MySQL RDS instance.

We have configured ClusterControl on the application host. We’ll use it to manage our EC2 MySQL instance.

Initial setup

In our case, the RDS instance shares the same VPC with our “application” (an EC2 instance with IP 172.30.4.228) and the host which will be the target for the migration process (an EC2 instance with IP 172.30.4.238). As the application, we are going to use the tpcc-MySQL benchmark, executed in the following way:

./tpcc_start -h rds2.cvsw8xpajw2b.us-east-1.rds.amazonaws.com -d tpcc1000 -u tpcc -p tpccpass -w 20 -r 60 -l 600 -i 10 -c 4

Initial plan

We are going to perform a migration using the following steps:

  1. Set up our target environment using ClusterControl - install MySQL on 172.30.4.238
  2. Then, install ProxySQL, which we will use to manage our traffic at the time of failover
  3. Dump the data from the RDS instance
  4. Load the data into our target host
  5. Set up replication between RDS instance and target host
  6. Switch over traffic from RDS to the target host

Prepare environment using ClusterControl

Assuming we have ClusterControl installed (if you don’t, you can grab it from: https://severalnines.com/download-clustercontrol-database-management-system), we need to set up our target host. We will use the deployment wizard from ClusterControl for that:

Deploying a Database Cluster in ClusterControl

Once this is done, you will see a new cluster (in this case, just your single server) in the cluster list:

Database Cluster in ClusterControl

The next step will be to install ProxySQL - starting from ClusterControl 1.4, you can do it easily from the UI. We covered this process in detail in this blog post. When installing it, we picked our application host (172.30.4.228) as the host to install ProxySQL on. When installing, you also have to pick a host to route your traffic to. As we have only our “destination” host in the cluster, you can include it, but then a couple of changes are needed to redirect traffic to the RDS instance.

If you have chosen to include the destination host (in our case, 172.30.4.238) in the ProxySQL setup, you’ll see the following entries in the mysql_servers table:

mysql> select * from mysql_servers\G
*************************** 1. row ***************************
       hostgroup_id: 20
           hostname: 172.30.4.238
               port: 3306
             status: ONLINE
             weight: 1
        compression: 0
    max_connections: 100
max_replication_lag: 10
            use_ssl: 0
     max_latency_ms: 0
            comment: read server
*************************** 2. row ***************************
       hostgroup_id: 10
           hostname: 172.30.4.238
               port: 3306
             status: ONLINE
             weight: 1
        compression: 0
    max_connections: 100
max_replication_lag: 10
            use_ssl: 0
     max_latency_ms: 0
            comment: read and write server
2 rows in set (0.00 sec)

ClusterControl configured ProxySQL to use hostgroups 10 and 20 to route writes and reads to the backend servers. We will have to remove the currently configured host from those hostgroups and add the RDS instance there. First, though, we have to ensure that ProxySQL’s monitor user can access the RDS instance.

mysql> SHOW VARIABLES LIKE 'mysql-monitor_username';
+------------------------+------------------+
| Variable_name          | Value            |
+------------------------+------------------+
| mysql-monitor_username | proxysql-monitor |
+------------------------+------------------+
1 row in set (0.00 sec)
mysql> SHOW VARIABLES LIKE 'mysql-monitor_password';
+------------------------+---------+
| Variable_name          | Value   |
+------------------------+---------+
| mysql-monitor_password | monpass |
+------------------------+---------+
1 row in set (0.00 sec)

We need to grant this user access to RDS. If we needed it to track replication lag, the user would have to have the ‘REPLICATION CLIENT’ privilege. In our case it is not needed, as we don’t have a slave RDS instance - ‘USAGE’ will be enough.

root@ip-172-30-4-228:~# mysql -ppassword -h rds2.cvsw8xpajw2b.us-east-1.rds.amazonaws.com
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 210
Server version: 5.7.16-log MySQL Community Server (GPL)

Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> CREATE USER 'proxysql-monitor'@172.30.4.228 IDENTIFIED BY 'monpass';
Query OK, 0 rows affected (0.06 sec)
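
To confirm that the monitor user can actually reach the RDS instance, you can check ProxySQL's monitor tables - a minimal sketch using the standard monitor.mysql_server_connect_log table:

mysql -uadmin -padmin -h 127.0.0.1 -P6032 -e "SELECT hostname, port, connect_error FROM monitor.mysql_server_connect_log ORDER BY time_start_us DESC LIMIT 5;"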

Now it’s time to reconfigure ProxySQL. We are going to add the RDS instance to both the writer (10) and reader (20) hostgroups. We will also remove 172.30.4.238 from those hostgroups - we’ll just edit its entries and add 100 to each hostgroup id, moving it to hostgroups 110 and 120.

mysql> INSERT INTO mysql_servers (hostgroup_id, hostname, max_connections, max_replication_lag) VALUES (10, 'rds2.cvsw8xpajw2b.us-east-1.rds.amazonaws.com', 100, 10);
Query OK, 1 row affected (0.00 sec)
mysql> INSERT INTO mysql_servers (hostgroup_id, hostname, max_connections, max_replication_lag) VALUES (20, 'rds2.cvsw8xpajw2b.us-east-1.rds.amazonaws.com', 100, 10);
Query OK, 1 row affected (0.00 sec)
mysql> UPDATE mysql_servers SET hostgroup_id=110 WHERE hostname='172.30.4.238' AND hostgroup_id=10;
Query OK, 1 row affected (0.00 sec)
mysql> UPDATE mysql_servers SET hostgroup_id=120 WHERE hostname='172.30.4.238' AND hostgroup_id=20;
Query OK, 1 row affected (0.00 sec)
mysql> LOAD MYSQL SERVERS TO RUNTIME;
Query OK, 0 rows affected (0.01 sec)
mysql> SAVE MYSQL SERVERS TO DISK;
Query OK, 0 rows affected (0.07 sec)

The last step required before we can use ProxySQL to redirect our traffic is to add our application user to ProxySQL.

mysql> INSERT INTO mysql_users (username, password, active, default_hostgroup) VALUES ('tpcc', 'tpccpass', 1, 10);
Query OK, 1 row affected (0.00 sec)
mysql> LOAD MYSQL USERS TO RUNTIME; SAVE MYSQL USERS TO DISK; SAVE MYSQL USERS TO MEMORY;
Query OK, 0 rows affected (0.00 sec)

Query OK, 0 rows affected (0.05 sec)

Query OK, 0 rows affected (0.00 sec)
mysql> SELECT username, password FROM mysql_users WHERE username='tpcc';
+----------+-------------------------------------------+
| username | password                                  |
+----------+-------------------------------------------+
| tpcc     | *8C446904FFE784865DF49B29DABEF3B2A6D232FC |
+----------+-------------------------------------------+
1 row in set (0.00 sec)

Quick note - we executed “SAVE MYSQL USERS TO MEMORY;” only to have the password hashed not only in RUNTIME but also in the working memory buffer. You can find more details about ProxySQL’s password hashing mechanism in their documentation.

We can now redirect our traffic to ProxySQL. How to do it depends on your setup; we just restarted tpcc and pointed it to ProxySQL.

Redirecting Traffic with ProxySQL

At this point, we have built a target environment to which we will migrate. We also prepared ProxySQL and configured it for our application to use. We now have a good foundation for the next step, which is the actual data migration. In the next post, we will show you how to copy the data out of RDS into our own MySQL instance (running on EC2). We will also show you how to switch traffic to your own instance while applications continue to serve users, without downtime.

by krzysztof at March 13, 2017 11:00 AM

Daniël van Eeden

Network attacks on MySQL, Part 1: Unencrypted connections

Intro

In a set of blog posts I will explain what different attacks on the network traffic of MySQL look like and what you can do to secure your systems against these kinds of attacks.

How to gain access

To gain access to MySQL network traffic you can use tcpdump, dumpcap, snoop or whatever the tool for capturing network packets on your OS is. This can be done on any device which is part of the connection: the server, the client, routers, switches, etc.

Besides application-to-database traffic this attack can also be done on replication traffic.
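
As a minimal sketch (the interface name is an assumption - adjust it to your environment), capturing MySQL traffic for later inspection could look like this:

# Capture full packets to and from the default MySQL port and write them to a file
tcpdump -i eth0 -s 0 -w /tmp/mysql.pcap 'tcp port 3306'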

Results

This allows you to extract queries and result sets.

The default password hash type mysql_native_password uses a nonce to protect against password sniffing. But when you change a password, it will be sent across the wire by default. Note that MySQL 5.6 and newer have some protection which ensures passwords are not sent to the log files, but this feature won't secure your network traffic.

In the replication stream, however, there are not as many places where passwords are exposed. This is especially true for row-based replication, but even for statement-based replication it can be true.

Some examples:

SET PASSWORD FOR 'myuser'@'%' = PASSWORD('foo'); -- deprecated syntax
UPDATE secrets SET secret_value = AES_ENCRYPT('foo', 'secret') WHERE id=5;

For both the password and the encryption key this can be seen in plain text for application-to-server traffic, but not for RBR replication traffic.

There is a trick to make this somewhat more secure, especially on 5.5 and older:

SELECT PASSWORD('foo') INTO @pwd;
SET PASSWORD FOR 'myuser'@'%' = @pwd;

If your application stores passwords in MySQL: you're doing it wrong. If your application stores hashed passwords (with salt, etc.) and the hashing is done in your application: this is OK. But note that a man-in-the-middle might send a slightly altered result set to your application and with this gain access to your application - that, however, requires an active attack.

The attacks at this level are mostly passive, which makes them hard to detect. An attacker might sniff password hashes for your application, brute force them and then log in to your application. The only thing you will see in your logs is a successful login...

To protect against this attack:

  1. Use SSL/TLS
  2. Encrypt/Decrypt values in the application before inserting it in the database.
  3. Use an SSH tunnel (Workbench has built-in support for this)
  4. Use a local TCP or UNIX domain socket when changing passwords.[1]
  5. Don't use the MySQL protocol over the internet w/o encryption. Use a VPN or SSH.

For sensitive data you should preferably combine 1 and 2. Items 3 and 4 are mostly for ad-hoc DBA access (see the sketch below).
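
A minimal sketch of item 3 (host names, ports and user names are placeholders):

# Forward local port 3307 to the MySQL port on the database server, over SSH
ssh -N -L 3307:127.0.0.1:3306 dba@db-server.example.com &
# Then connect through the encrypted tunnel
mysql -h 127.0.0.1 -P 3307 -u myuser -p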

Keep in mind that there might be some cron jobs, backups, etc. which also need to use a secure connection. Of course, you should also protect your data files and backup files, but that's not what this post is about.

[1] It is possible to snoop on UNIX domain socket traffic, but an attacker who has that access probably has full system access and might more easily use an active attack.

by Daniël van Eeden (noreply@blogger.com) at March 13, 2017 07:37 AM

March 11, 2017

Valeriy Kravchuk

Fun with Bugs #49 - Applying PMP to MySQL

As you may have noticed, in several recent posts I've provided some additional details for slides used during my FOSDEM talk on profiling MySQL. The only part not covered yet is related to using Poor Man's Profiler (and the pt-pmp version of it). I see no reason to explain what it does and how to use it once again, but I would like to show several recent enough MySQL bug reports where this tool was essential to find, explain or demonstrate the problem.
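
For reference, here is a minimal sketch of the usual workflow (the output path is a placeholder; pt-pmp ships with Percona Toolkit):

# Grab raw backtraces of all mysqld threads - do this before restarting a hung server
gdb -ex "set pagination 0" -ex "thread apply all bt" -batch -p $(pidof mysqld) > /tmp/mysqld_stacks.txt
# Aggregate the stacks to see where most threads spend their time
pt-pmp /tmp/mysqld_stacks.txt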

A quick search for active bugs with "pt-pmp" in the MySQL bugs database produces 8 hits at the moment:
  • Bug #85304 - "Reduce mutex contention of fil_system->mutex". It was reported by Zhai Weixiang a few days ago, and pt-pmp output was used as a starting point for the analysis that ended up with a patch suggested.
  • Bug #85191 - "performance regression with HANDLER READ syntax", from the same reporter. In this report pt-pmp was used to prove the point and show what exactly threads were doing.
  • Bug #80979 - "Doublewrite buffer event waits should be annotated for Performance Schema", by Laurynas Biveinis. One more case where PMP shows where the time is spent by threads in some specific case, while there is no instrumentation (yet) for the related wait in Performance Schema.
  • Bug #77827 - "reduce cpu time costs of creating dummy index while change buffer is enabled", again by Zhai Weixiang. In this bug report he had used both perf to show that some notable time was spent on the operation, and pt-pmp to show the related backtraces.
  • Bug #73803 - "don't acquire global_sid_lock->rdlock if gtid is disabled". Once again, Zhai Weixiang used pt-pmp output as a starting point for further code analysis. I wonder why this bug is still "Open", by the way...
  • Bug #70237 - "the mysqladmin shutdown hangs". Guess who reported it after applying PMP when something hung. As I stated in all my 3 recent FOSDEM talks, this is exactly what you have to do before killing and restarting a MySQL server in production - get backtraces of all threads, raw or at least aggregated with pt-pmp... I am not sure why the bug was not noted in time; there are even ideas of patches shared. Time for somebody to process it formally.
  • Bug #69812 - "the server stalls at function row_vers_build_for_consistent_read". Same reporter, same tool used, same result - the bug report is still "Open". Looks like I know what my next post(s) in this "Fun with Bugs" series will be devoted to...
  • Bug #62018 - "innodb adaptive hash index mutex contention". It was reported by Mark Callaghan and PMP outputs were used as a part of the evidence. The bug is "Verified" and even got a patch suggested for 5.7.x by Percona engineers, but it still hasn't gotten any proper attention from Oracle. I may have some more results related to the "cost" and "benefits" of adaptive hash indexing to share soon, so stay tuned...
Surely, there are way more bugs where PMP was used. Let me share one more that I noted while working on my recent talk on profiling (but had not found time to put on slides and comment on):

  • Bug #78277 - "InnoDB deadlock, thread stuck on kernel calls from transparent page compression", by Mark Callaghan. Again, PMP outputs were provided to prove the point and show where threads are stuck. The bug is "Open".

For many performance-related cases, applying pt-pmp and sharing the results becomes a de facto community requirement, as you can see, for example, in Bug #84025. Note that Umesh Shastry, who verified the bug, provided pt-pmp outputs in his testing results. I'd suggest having gdb and pt-pmp installed and ready to use on any production system running any version and fork of MySQL. Even if your bug is ignored by Oracle, these outputs are useful for other community members who may hit similar cases or who are not too lazy to check and work on the code to provide a patch.
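
A minimal sketch of getting the tools in place (pt-pmp is part of percona-toolkit, which may require the Percona repository to be configured on your distribution):

# Debian/Ubuntu
apt-get install gdb percona-toolkit
# RHEL/CentOS
yum install gdb percona-toolkit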

by Valeriy Kravchuk (noreply@blogger.com) at March 11, 2017 05:30 PM

Peter Zaitsev

Troubleshooting MySQL access privileges issues: Q & A

In this blog, I will provide answers to the Q & A for the Troubleshooting MySQL Access Privileges Issues webinar.

First, I want to thank everybody for attending the February 23 webinar. The recording and slides for the webinar are available here. Below is the list of your questions that I wasn’t able to answer during the webinar, with responses:

Q: Should the root@localhost user be given ALL privileges or Super privileges? Does All include Super privileges also?

A: Yes, you should have a user with all privileges. Better if this user has access from localhost only.

ALL includes SUPER.
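
A minimal sketch of such an account (the user name and password are placeholders):

mysql -uroot -p -e "CREATE USER 'admin'@'localhost' IDENTIFIED BY 'strong_password'; GRANT ALL PRIVILEGES ON *.* TO 'admin'@'localhost' WITH GRANT OPTION;"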

Q: We have users who connect via a laptop that get dynamic IP addresses, so granting access with a server name is an easier way to manage these users. Can I grant access to a MySQL database with a hostname as opposed to an ipaddress? For example “myname@mymachine.mydomain.com” as opposed to “myname@10.10.10.10”?  Is the host cache/performance_schema required for this?

A: Yes, you can.

But it looks like I was not clear about the host cache. The host cache is an internal structure that is always available and contains answers from the DNS server. You cannot enable or disable it. Until version 5.6, you also could not control it: for example, if the cache got corrupted, the only thing you could do was restart the server. In version 5.6 the HOST_CACHE table was introduced in Performance Schema. With this table you can examine the content of the host cache and truncate it if needed.

Q: If there are multiple entries in the user table that match the connecting user (e.g., with wildcards, hostname, and IP), what rules does MySQL use to select which is used for authentication?  Does it try multiple ones until it gets a password match?

A: No, mysqld does not try to hack your passwords. Instead it sorts the user table by name and host in descending order, as I showed on slide #37 (page 110), and then takes the first matching row. So if you created the users foo@somehost, foo@some% and foo@1.2.3.4, and you connect as foo from somehost, mysqld first checks the user name and then chooses the first matching row, foo@somehost. If you instead connect as foo from someotherhost, mysqld chooses foo@some%. An IP-based host is chosen if either mysqld was started with the option skip-name-resolve or if 1.2.3.4 points to a host whose name does not start with “some”.
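
To see which rows could match a given account, you can simply list them - a minimal sketch ('foo' is the example user from above; note that mysqld orders the candidates by host specificity, not by this query's output order):

mysql -uroot -p -e "SELECT user, host FROM mysql.user WHERE user='foo';"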

Mixing IP-based and name-based hosts is dangerous in situations where the same host can be resolved as somehost or 1.2.3.4. In this case, if something goes wrong with the host cache or DNS server, the wrong entry from the user table can be chosen. For example, say you initially had three hosts: uniquehost (which resolves as 1.2.3.4), somehost (which resolves as 4.3.2.1) and someotherhost (which resolves as 4.3.2.2). Now you decide to re-locate uniquehost to a machine with IP 1.2.3.5 and to use IP 1.2.3.4 for a host with the name someyetanotherhost. In this case, clients from the machine with IP 1.2.3.4 will be treated as foo@some%, which isn't what you want.

To demonstrate this issue, I created two users and granted two different privileges to them:

mysql> create user sveta@Thinkie;
Query OK, 0 rows affected (0,01 sec)
mysql> create user sveta@'192.168.0.4';
Query OK, 0 rows affected (0,00 sec)
mysql> grant all on *.* to 'sveta'@'Thinkie';
Query OK, 0 rows affected (0,00 sec)
mysql> grant all on db1.* to 'sveta'@'192.168.0.4';
Query OK, 0 rows affected (0,00 sec)

Now I modified my /etc/hosts file and pointed the address 192.168.0.4 to the name Thinkie:

127.0.0.1   localhost
# 127.0.1.1   Thinkie
192.168.0.4 Thinkie

Now, if I connect as sveta, both Thinkie and 192.168.0.4 resolve to the same host:

sveta@Thinkie:$ mysql -hThinkie -usveta
...
mysql> select user(), current_user();
+---------------+----------------+
| user()        | current_user() |
+---------------+----------------+
| sveta@Thinkie | sveta@thinkie  |
+---------------+----------------+
1 row in set (0,00 sec)
mysql> show grants;
+--------------------------------------------------+
| Grants for sveta@thinkie                         |
+--------------------------------------------------+
| GRANT ALL PRIVILEGES ON *.* TO 'sveta'@'thinkie' |
+--------------------------------------------------+
1 row in set (0,00 sec)
mysql> \q
Bye
sveta@Thinkie:$ mysql -h192.168.0.4 -usveta
...
mysql> show grants;
+--------------------------------------------------+
| Grants for sveta@thinkie                         |
+--------------------------------------------------+
| GRANT ALL PRIVILEGES ON *.* TO 'sveta'@'thinkie' |
+--------------------------------------------------+
1 row in set (0,00 sec)
mysql> select user(), current_user();
+---------------+----------------+
| user()        | current_user() |
+---------------+----------------+
| sveta@Thinkie | sveta@thinkie  |
+---------------+----------------+
1 row in set (0,00 sec)
mysql> \q
Bye

Now I modified the /etc/hosts file and pointed Thinkie back to 127.0.0.1 (localhost):

127.0.0.1   localhost
127.0.1.1   Thinkie
# 192.168.0.4 Thinkie

But the host 192.168.0.4 still resolves to Thinkie:

sveta@Thinkie:$ mysql -h192.168.0.4 -usveta
...
mysql> select user(), current_user();
+---------------+----------------+
| user()        | current_user() |
+---------------+----------------+
| sveta@Thinkie | sveta@thinkie  |
+---------------+----------------+
1 row in set (0,00 sec)
mysql> show grants;
+--------------------------------------------------+
| Grants for sveta@thinkie                         |
+--------------------------------------------------+
| GRANT ALL PRIVILEGES ON *.* TO 'sveta'@'thinkie' |
+--------------------------------------------------+
1 row in set (0,00 sec)
mysql> \q
Bye

The reason for this is a stale host cache, which can easily be observed with Performance Schema:

sveta@Thinkie:$ mysql -uroot
...
mysql> select * from performance_schema.host_cache\G
*************************** 1. row ***************************
                                        IP: 192.168.0.4
                                      HOST: Thinkie
                            HOST_VALIDATED: YES
                        SUM_CONNECT_ERRORS: 0
                 COUNT_HOST_BLOCKED_ERRORS: 0
           COUNT_NAMEINFO_TRANSIENT_ERRORS: 0
           COUNT_NAMEINFO_PERMANENT_ERRORS: 0
                       COUNT_FORMAT_ERRORS: 0
           COUNT_ADDRINFO_TRANSIENT_ERRORS: 0
           COUNT_ADDRINFO_PERMANENT_ERRORS: 0
                       COUNT_FCRDNS_ERRORS: 0
                     COUNT_HOST_ACL_ERRORS: 0
               COUNT_NO_AUTH_PLUGIN_ERRORS: 0
                  COUNT_AUTH_PLUGIN_ERRORS: 0
                    COUNT_HANDSHAKE_ERRORS: 0
                   COUNT_PROXY_USER_ERRORS: 0
               COUNT_PROXY_USER_ACL_ERRORS: 0
               COUNT_AUTHENTICATION_ERRORS: 0
                          COUNT_SSL_ERRORS: 0
         COUNT_MAX_USER_CONNECTIONS_ERRORS: 0
COUNT_MAX_USER_CONNECTIONS_PER_HOUR_ERRORS: 0
             COUNT_DEFAULT_DATABASE_ERRORS: 0
                 COUNT_INIT_CONNECT_ERRORS: 0
                        COUNT_LOCAL_ERRORS: 0
                      COUNT_UNKNOWN_ERRORS: 0
                                FIRST_SEEN: 2017-03-02 23:19:32
                                 LAST_SEEN: 2017-03-02 23:20:31
                          FIRST_ERROR_SEEN: NULL
                           LAST_ERROR_SEEN: NULL
1 row in set (0,00 sec)
mysql> truncate performance_schema.host_cache;
Query OK, 0 rows affected (0,00 sec)
mysql> \q
Bye

After I truncated the host_cache table, the numeric host resolves as I expect:

sveta@Thinkie:$ mysql -h192.168.0.4 -usveta
...
mysql> show grants;
+----------------------------------------------------------+
| Grants for sveta@192.168.0.4                             |
+----------------------------------------------------------+
| GRANT USAGE ON *.* TO 'sveta'@'192.168.0.4'              |
| GRANT ALL PRIVILEGES ON `db1`.* TO 'sveta'@'192.168.0.4' |
+----------------------------------------------------------+
2 rows in set (0,00 sec)
mysql> select user(), current_user();
+-------------------+-------------------+
| user()            | current_user()    |
+-------------------+-------------------+
| sveta@192.168.0.4 | sveta@192.168.0.4 |
+-------------------+-------------------+
1 row in set (0,00 sec)
mysql> \q
Bye

Q: What privileges are required for a non-root or non-super user to be able to use mysqldump to dump the database and then restore it on a different server?

A: Generally you should have the SELECT privilege on all objects you are going to dump. If you are dumping views, you should also have the SHOW VIEW privilege in order to run SHOW CREATE TABLE. If you want to dump stored routines/events, you need access to them as well. If you use the option --lock-tables or --lock-all-tables, you should have the LOCK TABLES privilege.
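
A minimal sketch of a dedicated dump user (the name and password are placeholders; adjust the grants to what you actually dump):

mysql -uroot -p -e "CREATE USER 'backup'@'localhost' IDENTIFIED BY 'backup_password'; GRANT SELECT, SHOW VIEW, LOCK TABLES, EVENT, TRIGGER ON *.* TO 'backup'@'localhost';"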

Q: If the max_connections value is reached in MySQL, can root@localhost with the ALL privilege still log in, or can a user with the SUPER privilege log in?

A: ALL includes SUPER, so a user with the ALL privilege can log in. Just note that there can be only one such connection, so do not grant the SUPER or ALL privilege to the application user.

Q: Is it possible to remove a priv at a lower level? In other words, grant select and delete at the database level, but remove delete for a specific table?  Or can privs only be added to?

A: No, MySQL will reject such a statement:

mysql> show grants for sveta@'192.168.0.4';
+----------------------------------------------------------+
| Grants for sveta@192.168.0.4                             |
+----------------------------------------------------------+
| GRANT USAGE ON *.* TO 'sveta'@'192.168.0.4'              |
| GRANT ALL PRIVILEGES ON `db1`.* TO 'sveta'@'192.168.0.4' |
+----------------------------------------------------------+
2 rows in set (0,00 sec)
mysql> revoke update on db1.t1 from sveta@'192.168.0.4';
ERROR 1147 (42000): There is no such grant defined for user 'sveta' on host '192.168.0.4' on table 't1'

Q: How can we have DB user roles… like a group of grants for a particular role?

A: You have several options.

  1. Use MariaDB 10.0.5 or newer. You can read about roles support in MariaDB here
  2. Use MySQL 8.0. You can read about roles in MySQL 8.0 here
  3. With MySQL 5.7: imitate roles as I showed on slide 19 (pages 53 – 60)
  4. With MySQL 5.5 and 5.6: use the same method as shown on slides, but use the custom authentication plugin that supports proxy users.
  5. Always: create a template with privileges, assign privileges to each user manually.

Q: How would you migrate role simulation with proxy users to actual roles in MySQL 8.x?

A: I would drop the proxied user and create a role with the same privileges instead, then grant the proxy user the newly created role instead of PROXY.

Q: Is there a plugin to integrate Active Directory and MySQL in order to use Active Directory groups?

A: There is a commercial Windows Authentication Plugin, available in versions 5.5 and newer. You can also use the open source Percona PAM authentication plugin and connect it to Active Directory the same way as can be done for LDAP. There is a blog post describing how to do so, but I’ve never used this method myself.

Q: Can we use central auth with MySQL?

A: Yes, with the help of the PAM Plugin. There are tutorials for LDAP and Active Directory. You may use similar methods to set up other kinds of authentication, such as Kerberos.

by Sveta Smirnova at March 11, 2017 12:36 AM

Percona Monitoring and Management (PMM) Graphs Explained: Custom MongoDB Graphs and Metrics

This blog post is another in the series on the Percona Server for MongoDB 3.4 bundle release. In this blog post, we will cover how to add custom MongoDB graphs to Percona Monitoring and Management (PMM) and (for the daring Golang developers out there) how to add custom metrics to PMM’s percona/mongodb_exporter metric exporter.

To get to adding new graphs and metrics, we first need to go over how PMM gets metrics from your database nodes and how they become graphs.

Percona Monitoring and Management (PMM)

Percona Monitoring and Management (PMM) is an open-source platform for managing and monitoring MySQL and MongoDB. It was developed by Percona on top of open-source technology. Behind the scenes, the graphing features this article covers use Prometheus (a popular time-series data store), Grafana (a popular visualisation tool), mongodb_exporter (our MongoDB database metric exporter) plus other technologies to provide database and operating system metric graphs for your database instances.

Prometheus

As mentioned, Percona Monitoring and Management uses Prometheus to gather and store database and operating system metrics. Prometheus works on an HTTP(s) pull-based architecture, where Prometheus “pulls” metrics from “exporters” on a schedule.

To provide a detailed view of your database hosts, you must enable two PMM monitoring services for MongoDB graphing capabilities:

  1. linux:metrics
  2. mongodb:metrics

See this link for more details on adding monitoring services to Percona Monitoring and Management: https://www.percona.com/doc/percona-monitoring-and-management/pmm-admin.html#adding-monitoring-services
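
As a minimal sketch, assuming the PMM client is already installed and connected to your PMM server, enabling both services looks like this:

# Enable OS-level and MongoDB metric collection on the monitored host
pmm-admin add linux:metrics
pmm-admin add mongodb:metrics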

It is important to note that not all metrics gathered by Percona Monitoring and Management are graphed. This is by design. Storing more metrics than what is graphed is very useful when more advanced insight is necessary. We also aim for PMM to be simple and straightforward to use, which is why we don’t graph all of the nearly 1,000 metrics we collect per MongoDB node on each polling.

My personal monitoring philosophy is “monitor until it hurts and then take one step back.” In other words, try to get as much data as you can without impacting the database or adding monitoring resources/cost. Also, Prometheus stores large volumes of metrics efficiently due to compression and highly optimized data structures on disk. This offsets a lot of the cost of collecting extra metrics.

Later in this blog, I will show how to add a graph for an example metric that is currently gathered by PMM but is not graphed in PMM as of today. To see what metrics are available on PMM’s Prometheus instance, visit “http://<pmm-server>/prometheus/graph”.

prometheus/node_exporter (linux:metrics)

PMM’s OS-level metrics are provided to Prometheus via the 3rd-party exporter: prometheus/node_exporter.

This exporter provides 712 metrics per “pull” on my test CentOS 7 host that has:

  • 2 x CPUs
  • 3 x disks with 6 x LVMs
  • 2 x network interfaces

Note: more physical or virtual devices add to the number of node_exporter metrics.

The inner workings and addition of metrics to this exporter will not be covered in this blog post. Generally, the current metrics offered by node_exporter are more than enough.

Below is a full example of a single “pull” of metrics from node_exporter on my test host:

$ curl -sk https://192.168.99.10:42000/metrics|grep node
# HELP node_boot_time Node boot time, in unixtime.
# TYPE node_boot_time gauge
node_boot_time 1.488904144e+09
# HELP node_context_switches Total number of context switches.
# TYPE node_context_switches counter
node_context_switches 1.0407839e+07
# HELP node_cpu Seconds the cpus spent in each mode.
# TYPE node_cpu counter
node_cpu{cpu="cpu0",mode="guest"} 0
node_cpu{cpu="cpu0",mode="idle"} 6437.71
node_cpu{cpu="cpu0",mode="iowait"} 24
node_cpu{cpu="cpu0",mode="irq"} 0
node_cpu{cpu="cpu0",mode="nice"} 0
node_cpu{cpu="cpu0",mode="softirq"} 1.38
node_cpu{cpu="cpu0",mode="steal"} 0
node_cpu{cpu="cpu0",mode="system"} 117.65
node_cpu{cpu="cpu0",mode="user"} 33.6
node_cpu{cpu="cpu1",mode="guest"} 0
node_cpu{cpu="cpu1",mode="idle"} 6443.68
node_cpu{cpu="cpu1",mode="iowait"} 15.03
node_cpu{cpu="cpu1",mode="irq"} 0
node_cpu{cpu="cpu1",mode="nice"} 0
node_cpu{cpu="cpu1",mode="softirq"} 7.15
node_cpu{cpu="cpu1",mode="steal"} 0
node_cpu{cpu="cpu1",mode="system"} 118.08
node_cpu{cpu="cpu1",mode="user"} 28.84
# HELP node_disk_bytes_read The total number of bytes read successfully.
# TYPE node_disk_bytes_read counter
node_disk_bytes_read{device="dm-0"} 2.09581056e+08
node_disk_bytes_read{device="dm-1"} 1.114112e+06
node_disk_bytes_read{device="dm-2"} 5.500416e+06
node_disk_bytes_read{device="dm-3"} 2.98752e+06
node_disk_bytes_read{device="dm-4"} 3.480576e+06
node_disk_bytes_read{device="dm-5"} 4.0983552e+07
node_disk_bytes_read{device="dm-6"} 303104
node_disk_bytes_read{device="sda"} 2.40276992e+08
node_disk_bytes_read{device="sdb"} 4.5549568e+07
node_disk_bytes_read{device="sdc"} 8.967168e+06
node_disk_bytes_read{device="sr0"} 49152
# HELP node_disk_bytes_written The total number of bytes written successfully.
# TYPE node_disk_bytes_written counter
node_disk_bytes_written{device="dm-0"} 4.1160192e+07
node_disk_bytes_written{device="dm-1"} 0
node_disk_bytes_written{device="dm-2"} 2.413568e+06
node_disk_bytes_written{device="dm-3"} 9.0589184e+07
node_disk_bytes_written{device="dm-4"} 4.631552e+06
node_disk_bytes_written{device="dm-5"} 2.140672e+06
node_disk_bytes_written{device="dm-6"} 0
node_disk_bytes_written{device="sda"} 4.3257344e+07
node_disk_bytes_written{device="sdb"} 6.772224e+06
node_disk_bytes_written{device="sdc"} 9.3002752e+07
node_disk_bytes_written{device="sr0"} 0
# HELP node_disk_io_now The number of I/Os currently in progress.
# TYPE node_disk_io_now gauge
node_disk_io_now{device="dm-0"} 0
node_disk_io_now{device="dm-1"} 0
node_disk_io_now{device="dm-2"} 0
node_disk_io_now{device="dm-3"} 0
node_disk_io_now{device="dm-4"} 0
node_disk_io_now{device="dm-5"} 0
node_disk_io_now{device="dm-6"} 0
node_disk_io_now{device="sda"} 0
node_disk_io_now{device="sdb"} 0
node_disk_io_now{device="sdc"} 0
node_disk_io_now{device="sr0"} 0
# HELP node_disk_io_time_ms Milliseconds spent doing I/Os.
# TYPE node_disk_io_time_ms counter
node_disk_io_time_ms{device="dm-0"} 6443
node_disk_io_time_ms{device="dm-1"} 41
node_disk_io_time_ms{device="dm-2"} 319
node_disk_io_time_ms{device="dm-3"} 61024
node_disk_io_time_ms{device="dm-4"} 1159
node_disk_io_time_ms{device="dm-5"} 772
node_disk_io_time_ms{device="dm-6"} 1
node_disk_io_time_ms{device="sda"} 6965
node_disk_io_time_ms{device="sdb"} 2004
node_disk_io_time_ms{device="sdc"} 60718
node_disk_io_time_ms{device="sr0"} 5
# HELP node_disk_io_time_weighted The weighted # of milliseconds spent doing I/Os. See https://www.kernel.org/doc/Documentation/iostats.txt.
# TYPE node_disk_io_time_weighted counter
node_disk_io_time_weighted{device="dm-0"} 9972
node_disk_io_time_weighted{device="dm-1"} 46
node_disk_io_time_weighted{device="dm-2"} 369
node_disk_io_time_weighted{device="dm-3"} 147704
node_disk_io_time_weighted{device="dm-4"} 1618
node_disk_io_time_weighted{device="dm-5"} 968
node_disk_io_time_weighted{device="dm-6"} 1
node_disk_io_time_weighted{device="sda"} 10365
node_disk_io_time_weighted{device="sdb"} 3213
node_disk_io_time_weighted{device="sdc"} 143822
node_disk_io_time_weighted{device="sr0"} 5
# HELP node_disk_read_time_ms The total number of milliseconds spent by all reads.
# TYPE node_disk_read_time_ms counter
node_disk_read_time_ms{device="dm-0"} 7037
node_disk_read_time_ms{device="dm-1"} 46
node_disk_read_time_ms{device="dm-2"} 312
node_disk_read_time_ms{device="dm-3"} 56
node_disk_read_time_ms{device="dm-4"} 115
node_disk_read_time_ms{device="dm-5"} 906
node_disk_read_time_ms{device="dm-6"} 1
node_disk_read_time_ms{device="sda"} 7646
node_disk_read_time_ms{device="sdb"} 1192
node_disk_read_time_ms{device="sdc"} 412
node_disk_read_time_ms{device="sr0"} 5
# HELP node_disk_reads_completed The total number of reads completed successfully.
# TYPE node_disk_reads_completed counter
node_disk_reads_completed{device="dm-0"} 11521
node_disk_reads_completed{device="dm-1"} 133
node_disk_reads_completed{device="dm-2"} 1139
node_disk_reads_completed{device="dm-3"} 130
node_disk_reads_completed{device="dm-4"} 199
node_disk_reads_completed{device="dm-5"} 2077
node_disk_reads_completed{device="dm-6"} 42
node_disk_reads_completed{device="sda"} 13429
node_disk_reads_completed{device="sdb"} 2483
node_disk_reads_completed{device="sdc"} 1376
node_disk_reads_completed{device="sr0"} 12
# HELP node_disk_reads_merged The number of reads merged. See https://www.kernel.org/doc/Documentation/iostats.txt.
# TYPE node_disk_reads_merged counter
node_disk_reads_merged{device="dm-0"} 0
node_disk_reads_merged{device="dm-1"} 0
node_disk_reads_merged{device="dm-2"} 0
node_disk_reads_merged{device="dm-3"} 0
node_disk_reads_merged{device="dm-4"} 0
node_disk_reads_merged{device="dm-5"} 0
node_disk_reads_merged{device="dm-6"} 0
node_disk_reads_merged{device="sda"} 12
node_disk_reads_merged{device="sdb"} 0
node_disk_reads_merged{device="sdc"} 0
node_disk_reads_merged{device="sr0"} 0
# HELP node_disk_sectors_read The total number of sectors read successfully.
# TYPE node_disk_sectors_read counter
node_disk_sectors_read{device="dm-0"} 409338
node_disk_sectors_read{device="dm-1"} 2176
node_disk_sectors_read{device="dm-2"} 10743
node_disk_sectors_read{device="dm-3"} 5835
node_disk_sectors_read{device="dm-4"} 6798
node_disk_sectors_read{device="dm-5"} 80046
node_disk_sectors_read{device="dm-6"} 592
node_disk_sectors_read{device="sda"} 469291
node_disk_sectors_read{device="sdb"} 88964
node_disk_sectors_read{device="sdc"} 17514
node_disk_sectors_read{device="sr0"} 96
# HELP node_disk_sectors_written The total number of sectors written successfully.
# TYPE node_disk_sectors_written counter
node_disk_sectors_written{device="dm-0"} 80391
node_disk_sectors_written{device="dm-1"} 0
node_disk_sectors_written{device="dm-2"} 4714
node_disk_sectors_written{device="dm-3"} 176932
node_disk_sectors_written{device="dm-4"} 9046
node_disk_sectors_written{device="dm-5"} 4181
node_disk_sectors_written{device="dm-6"} 0
node_disk_sectors_written{device="sda"} 84487
node_disk_sectors_written{device="sdb"} 13227
node_disk_sectors_written{device="sdc"} 181646
node_disk_sectors_written{device="sr0"} 0
# HELP node_disk_write_time_ms This is the total number of milliseconds spent by all writes.
# TYPE node_disk_write_time_ms counter
node_disk_write_time_ms{device="dm-0"} 2935
node_disk_write_time_ms{device="dm-1"} 0
node_disk_write_time_ms{device="dm-2"} 57
node_disk_write_time_ms{device="dm-3"} 146868
node_disk_write_time_ms{device="dm-4"} 1503
node_disk_write_time_ms{device="dm-5"} 62
node_disk_write_time_ms{device="dm-6"} 0
node_disk_write_time_ms{device="sda"} 2797
node_disk_write_time_ms{device="sdb"} 2096
node_disk_write_time_ms{device="sdc"} 143569
node_disk_write_time_ms{device="sr0"} 0
# HELP node_disk_writes_completed The total number of writes completed successfully.
# TYPE node_disk_writes_completed counter
node_disk_writes_completed{device="dm-0"} 2701
node_disk_writes_completed{device="dm-1"} 0
node_disk_writes_completed{device="dm-2"} 70
node_disk_writes_completed{device="dm-3"} 415913
node_disk_writes_completed{device="dm-4"} 2737
node_disk_writes_completed{device="dm-5"} 17
node_disk_writes_completed{device="dm-6"} 0
node_disk_writes_completed{device="sda"} 3733
node_disk_writes_completed{device="sdb"} 3189
node_disk_writes_completed{device="sdc"} 421341
node_disk_writes_completed{device="sr0"} 0
# HELP node_disk_writes_merged The number of writes merged. See https://www.kernel.org/doc/Documentation/iostats.txt.
# TYPE node_disk_writes_merged counter
node_disk_writes_merged{device="dm-0"} 0
node_disk_writes_merged{device="dm-1"} 0
node_disk_writes_merged{device="dm-2"} 0
node_disk_writes_merged{device="dm-3"} 0
node_disk_writes_merged{device="dm-4"} 0
node_disk_writes_merged{device="dm-5"} 0
node_disk_writes_merged{device="dm-6"} 0
node_disk_writes_merged{device="sda"} 147
node_disk_writes_merged{device="sdb"} 19
node_disk_writes_merged{device="sdc"} 2182
node_disk_writes_merged{device="sr0"} 0
# HELP node_exporter_build_info A metric with a constant '1' value labeled by version, revision, branch, and goversion from which node_exporter was built.
# TYPE node_exporter_build_info gauge
node_exporter_build_info{branch="master",goversion="go1.7.4",revision="2d78e22000779d63c714011e4fb30c65623b9c77",version="1.1.1"} 1
# HELP node_exporter_scrape_duration_seconds node_exporter: Duration of a scrape job.
# TYPE node_exporter_scrape_duration_seconds summary
node_exporter_scrape_duration_seconds{collector="diskstats",result="success",quantile="0.5"} 0.00018768700000000002
node_exporter_scrape_duration_seconds{collector="diskstats",result="success",quantile="0.9"} 0.000199063
node_exporter_scrape_duration_seconds{collector="diskstats",result="success",quantile="0.99"} 0.000199063
node_exporter_scrape_duration_seconds_sum{collector="diskstats",result="success"} 0.0008093360000000001
node_exporter_scrape_duration_seconds_count{collector="diskstats",result="success"} 3
node_exporter_scrape_duration_seconds{collector="filefd",result="success",quantile="0.5"} 2.6501000000000003e-05
node_exporter_scrape_duration_seconds{collector="filefd",result="success",quantile="0.9"} 0.000149764
node_exporter_scrape_duration_seconds{collector="filefd",result="success",quantile="0.99"} 0.000149764
node_exporter_scrape_duration_seconds_sum{collector="filefd",result="success"} 0.0004463360000000001
node_exporter_scrape_duration_seconds_count{collector="filefd",result="success"} 3
node_exporter_scrape_duration_seconds{collector="filesystem",result="success",quantile="0.5"} 0.000350556
node_exporter_scrape_duration_seconds{collector="filesystem",result="success",quantile="0.9"} 0.000661542
node_exporter_scrape_duration_seconds{collector="filesystem",result="success",quantile="0.99"} 0.000661542
node_exporter_scrape_duration_seconds_sum{collector="filesystem",result="success"} 0.006669826000000001
node_exporter_scrape_duration_seconds_count{collector="filesystem",result="success"} 3
node_exporter_scrape_duration_seconds{collector="loadavg",result="success",quantile="0.5"} 4.1976e-05
node_exporter_scrape_duration_seconds{collector="loadavg",result="success",quantile="0.9"} 4.2433000000000005e-05
node_exporter_scrape_duration_seconds{collector="loadavg",result="success",quantile="0.99"} 4.2433000000000005e-05
node_exporter_scrape_duration_seconds_sum{collector="loadavg",result="success"} 0.0006603080000000001
node_exporter_scrape_duration_seconds_count{collector="loadavg",result="success"} 3
node_exporter_scrape_duration_seconds{collector="meminfo",result="success",quantile="0.5"} 0.000244919
node_exporter_scrape_duration_seconds{collector="meminfo",result="success",quantile="0.9"} 0.00038740000000000004
node_exporter_scrape_duration_seconds{collector="meminfo",result="success",quantile="0.99"} 0.00038740000000000004
node_exporter_scrape_duration_seconds_sum{collector="meminfo",result="success"} 0.003157154
node_exporter_scrape_duration_seconds_count{collector="meminfo",result="success"} 3
node_exporter_scrape_duration_seconds{collector="netdev",result="success",quantile="0.5"} 0.000152886
node_exporter_scrape_duration_seconds{collector="netdev",result="success",quantile="0.9"} 0.00048569400000000006
node_exporter_scrape_duration_seconds{collector="netdev",result="success",quantile="0.99"} 0.00048569400000000006
node_exporter_scrape_duration_seconds_sum{collector="netdev",result="success"} 0.0033770220000000004
node_exporter_scrape_duration_seconds_count{collector="netdev",result="success"} 3
node_exporter_scrape_duration_seconds{collector="netstat",result="success",quantile="0.5"} 0.0007874710000000001
node_exporter_scrape_duration_seconds{collector="netstat",result="success",quantile="0.9"} 0.001612906
node_exporter_scrape_duration_seconds{collector="netstat",result="success",quantile="0.99"} 0.001612906
node_exporter_scrape_duration_seconds_sum{collector="netstat",result="success"} 0.0055138800000000005
node_exporter_scrape_duration_seconds_count{collector="netstat",result="success"} 3
node_exporter_scrape_duration_seconds{collector="stat",result="success",quantile="0.5"} 9.3368e-05
node_exporter_scrape_duration_seconds{collector="stat",result="success",quantile="0.9"} 0.00014552100000000002
node_exporter_scrape_duration_seconds{collector="stat",result="success",quantile="0.99"} 0.00014552100000000002
node_exporter_scrape_duration_seconds_sum{collector="stat",result="success"} 0.00039008900000000004
node_exporter_scrape_duration_seconds_count{collector="stat",result="success"} 3
node_exporter_scrape_duration_seconds{collector="time",result="success",quantile="0.5"} 1.0485e-05
node_exporter_scrape_duration_seconds{collector="time",result="success",quantile="0.9"} 2.462e-05
node_exporter_scrape_duration_seconds{collector="time",result="success",quantile="0.99"} 2.462e-05
node_exporter_scrape_duration_seconds_sum{collector="time",result="success"} 6.4423e-05
node_exporter_scrape_duration_seconds_count{collector="time",result="success"} 3
node_exporter_scrape_duration_seconds{collector="uname",result="success",quantile="0.5"} 1.811e-05
node_exporter_scrape_duration_seconds{collector="uname",result="success",quantile="0.9"} 6.731300000000001e-05
node_exporter_scrape_duration_seconds{collector="uname",result="success",quantile="0.99"} 6.731300000000001e-05
node_exporter_scrape_duration_seconds_sum{collector="uname",result="success"} 0.00030852200000000004
node_exporter_scrape_duration_seconds_count{collector="uname",result="success"} 3
node_exporter_scrape_duration_seconds{collector="vmstat",result="success",quantile="0.5"} 0.00035561100000000003
node_exporter_scrape_duration_seconds{collector="vmstat",result="success",quantile="0.9"} 0.00046660900000000004
node_exporter_scrape_duration_seconds{collector="vmstat",result="success",quantile="0.99"} 0.00046660900000000004
node_exporter_scrape_duration_seconds_sum{collector="vmstat",result="success"} 0.002186003
node_exporter_scrape_duration_seconds_count{collector="vmstat",result="success"} 3
# HELP node_filefd_allocated File descriptor statistics: allocated.
# TYPE node_filefd_allocated gauge
node_filefd_allocated 1696
# HELP node_filefd_maximum File descriptor statistics: maximum.
# TYPE node_filefd_maximum gauge
node_filefd_maximum 382428
# HELP node_filesystem_avail Filesystem space available to non-root users in bytes.
# TYPE node_filesystem_avail gauge
node_filesystem_avail{device="/dev/mapper/centos-root",fstype="xfs",mountpoint="/"} 2.977058816e+09
node_filesystem_avail{device="/dev/mapper/data-home",fstype="xfs",mountpoint="/home"} 2.3885000704e+10
node_filesystem_avail{device="/dev/mapper/data-opt",fstype="xfs",mountpoint="/opt"} 6.520070144e+09
node_filesystem_avail{device="/dev/mapper/tmpdata-docker",fstype="xfs",mountpoint="/var/lib/docker"} 8.374427648e+09
node_filesystem_avail{device="/dev/mapper/tmpdata-mongo_data",fstype="xfs",mountpoint="/home/tim/mongo/data"} 6.386853888e+10
node_filesystem_avail{device="/dev/sda1",fstype="xfs",mountpoint="/boot"} 1.37248768e+08
node_filesystem_avail{device="rootfs",fstype="rootfs",mountpoint="/"} 2.977058816e+09
node_filesystem_avail{device="tmpfs",fstype="tmpfs",mountpoint="/run"} 1.978773504e+09
node_filesystem_avail{device="tmpfs",fstype="tmpfs",mountpoint="/run/user/0"} 3.975168e+08
node_filesystem_avail{device="tmpfs",fstype="tmpfs",mountpoint="/run/user/1000"} 3.975168e+08
# HELP node_filesystem_files Filesystem total file nodes.
# TYPE node_filesystem_files gauge
node_filesystem_files{device="/dev/mapper/centos-root",fstype="xfs",mountpoint="/"} 6.991872e+06
node_filesystem_files{device="/dev/mapper/data-home",fstype="xfs",mountpoint="/home"} 2.9360128e+07
node_filesystem_files{device="/dev/mapper/data-opt",fstype="xfs",mountpoint="/opt"} 8.388608e+06
node_filesystem_files{device="/dev/mapper/tmpdata-docker",fstype="xfs",mountpoint="/var/lib/docker"} 4.194304e+06
node_filesystem_files{device="/dev/mapper/tmpdata-mongo_data",fstype="xfs",mountpoint="/home/tim/mongo/data"} 3.145728e+07
node_filesystem_files{device="/dev/sda1",fstype="xfs",mountpoint="/boot"} 512000
node_filesystem_files{device="rootfs",fstype="rootfs",mountpoint="/"} 6.991872e+06
node_filesystem_files{device="tmpfs",fstype="tmpfs",mountpoint="/run"} 485249
node_filesystem_files{device="tmpfs",fstype="tmpfs",mountpoint="/run/user/0"} 485249
node_filesystem_files{device="tmpfs",fstype="tmpfs",mountpoint="/run/user/1000"} 485249
# HELP node_filesystem_files_free Filesystem total free file nodes.
# TYPE node_filesystem_files_free gauge
node_filesystem_files_free{device="/dev/mapper/centos-root",fstype="xfs",mountpoint="/"} 6.839777e+06
node_filesystem_files_free{device="/dev/mapper/data-home",fstype="xfs",mountpoint="/home"} 2.9273287e+07
node_filesystem_files_free{device="/dev/mapper/data-opt",fstype="xfs",mountpoint="/opt"} 8.376436e+06
node_filesystem_files_free{device="/dev/mapper/tmpdata-docker",fstype="xfs",mountpoint="/var/lib/docker"} 4.194188e+06
node_filesystem_files_free{device="/dev/mapper/tmpdata-mongo_data",fstype="xfs",mountpoint="/home/tim/mongo/data"} 3.1457181e+07
node_filesystem_files_free{device="/dev/sda1",fstype="xfs",mountpoint="/boot"} 511638
node_filesystem_files_free{device="rootfs",fstype="rootfs",mountpoint="/"} 6.839777e+06
node_filesystem_files_free{device="tmpfs",fstype="tmpfs",mountpoint="/run"} 484725
node_filesystem_files_free{device="tmpfs",fstype="tmpfs",mountpoint="/run/user/0"} 485248
node_filesystem_files_free{device="tmpfs",fstype="tmpfs",mountpoint="/run/user/1000"} 485248
# HELP node_filesystem_free Filesystem free space in bytes.
# TYPE node_filesystem_free gauge
node_filesystem_free{device="/dev/mapper/centos-root",fstype="xfs",mountpoint="/"} 2.977058816e+09
node_filesystem_free{device="/dev/mapper/data-home",fstype="xfs",mountpoint="/home"} 2.3885000704e+10
node_filesystem_free{device="/dev/mapper/data-opt",fstype="xfs",mountpoint="/opt"} 6.520070144e+09
node_filesystem_free{device="/dev/mapper/tmpdata-docker",fstype="xfs",mountpoint="/var/lib/docker"} 8.374427648e+09
node_filesystem_free{device="/dev/mapper/tmpdata-mongo_data",fstype="xfs",mountpoint="/home/tim/mongo/data"} 6.386853888e+10
node_filesystem_free{device="/dev/sda1",fstype="xfs",mountpoint="/boot"} 1.37248768e+08
node_filesystem_free{device="rootfs",fstype="rootfs",mountpoint="/"} 2.977058816e+09
node_filesystem_free{device="tmpfs",fstype="tmpfs",mountpoint="/run"} 1.978773504e+09
node_filesystem_free{device="tmpfs",fstype="tmpfs",mountpoint="/run/user/0"} 3.975168e+08
node_filesystem_free{device="tmpfs",fstype="tmpfs",mountpoint="/run/user/1000"} 3.975168e+08
# HELP node_filesystem_readonly Filesystem read-only status.
# TYPE node_filesystem_readonly gauge
node_filesystem_readonly{device="/dev/mapper/centos-root",fstype="xfs",mountpoint="/"} 0
node_filesystem_readonly{device="/dev/mapper/data-home",fstype="xfs",mountpoint="/home"} 0
node_filesystem_readonly{device="/dev/mapper/data-opt",fstype="xfs",mountpoint="/opt"} 0
node_filesystem_readonly{device="/dev/mapper/tmpdata-docker",fstype="xfs",mountpoint="/var/lib/docker"} 0
node_filesystem_readonly{device="/dev/mapper/tmpdata-mongo_data",fstype="xfs",mountpoint="/home/tim/mongo/data"} 0
node_filesystem_readonly{device="/dev/sda1",fstype="xfs",mountpoint="/boot"} 0
node_filesystem_readonly{device="rootfs",fstype="rootfs",mountpoint="/"} 0
node_filesystem_readonly{device="tmpfs",fstype="tmpfs",mountpoint="/run"} 0
node_filesystem_readonly{device="tmpfs",fstype="tmpfs",mountpoint="/run/user/0"} 0
node_filesystem_readonly{device="tmpfs",fstype="tmpfs",mountpoint="/run/user/1000"} 0
# HELP node_filesystem_size Filesystem size in bytes.
# TYPE node_filesystem_size gauge
node_filesystem_size{device="/dev/mapper/centos-root",fstype="xfs",mountpoint="/"} 7.149191168e+09
node_filesystem_size{device="/dev/mapper/data-home",fstype="xfs",mountpoint="/home"} 3.0054285312e+10
node_filesystem_size{device="/dev/mapper/data-opt",fstype="xfs",mountpoint="/opt"} 8.579448832e+09
node_filesystem_size{device="/dev/mapper/tmpdata-docker",fstype="xfs",mountpoint="/var/lib/docker"} 8.579448832e+09
node_filesystem_size{device="/dev/mapper/tmpdata-mongo_data",fstype="xfs",mountpoint="/home/tim/mongo/data"} 6.439305216e+10
node_filesystem_size{device="/dev/sda1",fstype="xfs",mountpoint="/boot"} 5.20794112e+08
node_filesystem_size{device="rootfs",fstype="rootfs",mountpoint="/"} 7.149191168e+09
node_filesystem_size{device="tmpfs",fstype="tmpfs",mountpoint="/run"} 1.987579904e+09
node_filesystem_size{device="tmpfs",fstype="tmpfs",mountpoint="/run/user/0"} 3.975168e+08
node_filesystem_size{device="tmpfs",fstype="tmpfs",mountpoint="/run/user/1000"} 3.975168e+08
# HELP node_forks Total number of forks.
# TYPE node_forks counter
node_forks 3928
# HELP node_intr Total number of interrupts serviced.
# TYPE node_intr counter
node_intr 4.629898e+06
# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.2
# HELP node_load15 15m load average.
# TYPE node_load15 gauge
node_load15 0.21
# HELP node_load5 5m load average.
# TYPE node_load5 gauge
node_load5 0.22
# HELP node_memory_Active Memory information field Active.
# TYPE node_memory_Active gauge
node_memory_Active 4.80772096e+08
# HELP node_memory_Active_anon Memory information field Active_anon.
# TYPE node_memory_Active_anon gauge
node_memory_Active_anon 3.62610688e+08
# HELP node_memory_Active_file Memory information field Active_file.
# TYPE node_memory_Active_file gauge
node_memory_Active_file 1.18161408e+08
# HELP node_memory_AnonHugePages Memory information field AnonHugePages.
# TYPE node_memory_AnonHugePages gauge
node_memory_AnonHugePages 1.2582912e+07
# HELP node_memory_AnonPages Memory information field AnonPages.
# TYPE node_memory_AnonPages gauge
node_memory_AnonPages 3.62438656e+08
# HELP node_memory_Bounce Memory information field Bounce.
# TYPE node_memory_Bounce gauge
node_memory_Bounce 0
# HELP node_memory_Buffers Memory information field Buffers.
# TYPE node_memory_Buffers gauge
node_memory_Buffers 1.90464e+06
# HELP node_memory_Cached Memory information field Cached.
# TYPE node_memory_Cached gauge
node_memory_Cached 3.12205312e+08
# HELP node_memory_CommitLimit Memory information field CommitLimit.
# TYPE node_memory_CommitLimit gauge
node_memory_CommitLimit 2.847408128e+09
# HELP node_memory_Committed_AS Memory information field Committed_AS.
# TYPE node_memory_Committed_AS gauge
node_memory_Committed_AS 5.08229632e+09
# HELP node_memory_DirectMap2M Memory information field DirectMap2M.
# TYPE node_memory_DirectMap2M gauge
node_memory_DirectMap2M 4.22576128e+09
# HELP node_memory_DirectMap4k Memory information field DirectMap4k.
# TYPE node_memory_DirectMap4k gauge
node_memory_DirectMap4k 6.914048e+07
# HELP node_memory_Dirty Memory information field Dirty.
# TYPE node_memory_Dirty gauge
node_memory_Dirty 69632
# HELP node_memory_HardwareCorrupted Memory information field HardwareCorrupted.
# TYPE node_memory_HardwareCorrupted gauge
node_memory_HardwareCorrupted 0
# HELP node_memory_HugePages_Free Memory information field HugePages_Free.
# TYPE node_memory_HugePages_Free gauge
node_memory_HugePages_Free 0
# HELP node_memory_HugePages_Rsvd Memory information field HugePages_Rsvd.
# TYPE node_memory_HugePages_Rsvd gauge
node_memory_HugePages_Rsvd 0
# HELP node_memory_HugePages_Surp Memory information field HugePages_Surp.
# TYPE node_memory_HugePages_Surp gauge
node_memory_HugePages_Surp 0
# HELP node_memory_HugePages_Total Memory information field HugePages_Total.
# TYPE node_memory_HugePages_Total gauge
node_memory_HugePages_Total 0
# HELP node_memory_Hugepagesize Memory information field Hugepagesize.
# TYPE node_memory_Hugepagesize gauge
node_memory_Hugepagesize 2.097152e+06
# HELP node_memory_Inactive Memory information field Inactive.
# TYPE node_memory_Inactive gauge
node_memory_Inactive 1.95702784e+08
# HELP node_memory_Inactive_anon Memory information field Inactive_anon.
# TYPE node_memory_Inactive_anon gauge
node_memory_Inactive_anon 8.564736e+06
# HELP node_memory_Inactive_file Memory information field Inactive_file.
# TYPE node_memory_Inactive_file gauge
node_memory_Inactive_file 1.87138048e+08
# HELP node_memory_KernelStack Memory information field KernelStack.
# TYPE node_memory_KernelStack gauge
node_memory_KernelStack 1.1583488e+07
# HELP node_memory_Mapped Memory information field Mapped.
# TYPE node_memory_Mapped gauge
node_memory_Mapped 6.7530752e+07
# HELP node_memory_MemAvailable Memory information field MemAvailable.
# TYPE node_memory_MemAvailable gauge
node_memory_MemAvailable 3.273515008e+09
# HELP node_memory_MemFree Memory information field MemFree.
# TYPE node_memory_MemFree gauge
node_memory_MemFree 3.15346944e+09
# HELP node_memory_MemTotal Memory information field MemTotal.
# TYPE node_memory_MemTotal gauge
node_memory_MemTotal 3.975163904e+09
# HELP node_memory_Mlocked Memory information field Mlocked.
# TYPE node_memory_Mlocked gauge
node_memory_Mlocked 0
# HELP node_memory_NFS_Unstable Memory information field NFS_Unstable.
# TYPE node_memory_NFS_Unstable gauge
node_memory_NFS_Unstable 0
# HELP node_memory_PageTables Memory information field PageTables.
# TYPE node_memory_PageTables gauge
node_memory_PageTables 9.105408e+06
# HELP node_memory_SReclaimable Memory information field SReclaimable.
# TYPE node_memory_SReclaimable gauge
node_memory_SReclaimable 4.4638208e+07
# HELP node_memory_SUnreclaim Memory information field SUnreclaim.
# TYPE node_memory_SUnreclaim gauge
node_memory_SUnreclaim 3.2059392e+07
# HELP node_memory_Shmem Memory information field Shmem.
# TYPE node_memory_Shmem gauge
node_memory_Shmem 8.810496e+06
# HELP node_memory_Slab Memory information field Slab.
# TYPE node_memory_Slab gauge
node_memory_Slab 7.66976e+07
# HELP node_memory_SwapCached Memory information field SwapCached.
# TYPE node_memory_SwapCached gauge
node_memory_SwapCached 0
# HELP node_memory_SwapFree Memory information field SwapFree.
# TYPE node_memory_SwapFree gauge
node_memory_SwapFree 8.59828224e+08
# HELP node_memory_SwapTotal Memory information field SwapTotal.
# TYPE node_memory_SwapTotal gauge
node_memory_SwapTotal 8.59828224e+08
# HELP node_memory_Unevictable Memory information field Unevictable.
# TYPE node_memory_Unevictable gauge
node_memory_Unevictable 0
# HELP node_memory_VmallocChunk Memory information field VmallocChunk.
# TYPE node_memory_VmallocChunk gauge
node_memory_VmallocChunk 3.5184346918912e+13
# HELP node_memory_VmallocTotal Memory information field VmallocTotal.
# TYPE node_memory_VmallocTotal gauge
node_memory_VmallocTotal 3.5184372087808e+13
# HELP node_memory_VmallocUsed Memory information field VmallocUsed.
# TYPE node_memory_VmallocUsed gauge
node_memory_VmallocUsed 1.6748544e+07
# HELP node_memory_Writeback Memory information field Writeback.
# TYPE node_memory_Writeback gauge
node_memory_Writeback 0
# HELP node_memory_WritebackTmp Memory information field WritebackTmp.
# TYPE node_memory_WritebackTmp gauge
node_memory_WritebackTmp 0
# HELP node_netstat_IcmpMsg_InType3 Protocol IcmpMsg statistic InType3.
# TYPE node_netstat_IcmpMsg_InType3 untyped
node_netstat_IcmpMsg_InType3 32
# HELP node_netstat_IcmpMsg_OutType3 Protocol IcmpMsg statistic OutType3.
# TYPE node_netstat_IcmpMsg_OutType3 untyped
node_netstat_IcmpMsg_OutType3 32
# HELP node_netstat_Icmp_InAddrMaskReps Protocol Icmp statistic InAddrMaskReps.
# TYPE node_netstat_Icmp_InAddrMaskReps untyped
node_netstat_Icmp_InAddrMaskReps 0
# HELP node_netstat_Icmp_InAddrMasks Protocol Icmp statistic InAddrMasks.
# TYPE node_netstat_Icmp_InAddrMasks untyped
node_netstat_Icmp_InAddrMasks 0
# HELP node_netstat_Icmp_InCsumErrors Protocol Icmp statistic InCsumErrors.
# TYPE node_netstat_Icmp_InCsumErrors untyped
node_netstat_Icmp_InCsumErrors 0
# HELP node_netstat_Icmp_InDestUnreachs Protocol Icmp statistic InDestUnreachs.
# TYPE node_netstat_Icmp_InDestUnreachs untyped
node_netstat_Icmp_InDestUnreachs 32
# HELP node_netstat_Icmp_InEchoReps Protocol Icmp statistic InEchoReps.
# TYPE node_netstat_Icmp_InEchoReps untyped
node_netstat_Icmp_InEchoReps 0
# HELP node_netstat_Icmp_InEchos Protocol Icmp statistic InEchos.
# TYPE node_netstat_Icmp_InEchos untyped
node_netstat_Icmp_InEchos 0
# HELP node_netstat_Icmp_InErrors Protocol Icmp statistic InErrors.
# TYPE node_netstat_Icmp_InErrors untyped
node_netstat_Icmp_InErrors 0
# HELP node_netstat_Icmp_InMsgs Protocol Icmp statistic InMsgs.
# TYPE node_netstat_Icmp_InMsgs untyped
node_netstat_Icmp_InMsgs 32
# HELP node_netstat_Icmp_InParmProbs Protocol Icmp statistic InParmProbs.
# TYPE node_netstat_Icmp_InParmProbs untyped
node_netstat_Icmp_InParmProbs 0
# HELP node_netstat_Icmp_InRedirects Protocol Icmp statistic InRedirects.
# TYPE node_netstat_Icmp_InRedirects untyped
node_netstat_Icmp_InRedirects 0
# HELP node_netstat_Icmp_InSrcQuenchs Protocol Icmp statistic InSrcQuenchs.
# TYPE node_netstat_Icmp_InSrcQuenchs untyped
node_netstat_Icmp_InSrcQuenchs 0
# HELP node_netstat_Icmp_InTimeExcds Protocol Icmp statistic InTimeExcds.
# TYPE node_netstat_Icmp_InTimeExcds untyped
node_netstat_Icmp_InTimeExcds 0
# HELP node_netstat_Icmp_InTimestampReps Protocol Icmp statistic InTimestampReps.
# TYPE node_netstat_Icmp_InTimestampReps untyped
node_netstat_Icmp_InTimestampReps 0
# HELP node_netstat_Icmp_InTimestamps Protocol Icmp statistic InTimestamps.
# TYPE node_netstat_Icmp_InTimestamps untyped
node_netstat_Icmp_InTimestamps 0
# HELP node_netstat_Icmp_OutAddrMaskReps Protocol Icmp statistic OutAddrMaskReps.
# TYPE node_netstat_Icmp_OutAddrMaskReps untyped
node_netstat_Icmp_OutAddrMaskReps 0
# HELP node_netstat_Icmp_OutAddrMasks Protocol Icmp statistic OutAddrMasks.
# TYPE node_netstat_Icmp_OutAddrMasks untyped
node_netstat_Icmp_OutAddrMasks 0
# HELP node_netstat_Icmp_OutDestUnreachs Protocol Icmp statistic OutDestUnreachs.
# TYPE node_netstat_Icmp_OutDestUnreachs untyped
node_netstat_Icmp_OutDestUnreachs 32
# HELP node_netstat_Icmp_OutEchoReps Protocol Icmp statistic OutEchoReps.
# TYPE node_netstat_Icmp_OutEchoReps untyped
node_netstat_Icmp_OutEchoReps 0
# HELP node_netstat_Icmp_OutEchos Protocol Icmp statistic OutEchos.
# TYPE node_netstat_Icmp_OutEchos untyped
node_netstat_Icmp_OutEchos 0
# HELP node_netstat_Icmp_OutErrors Protocol Icmp statistic OutErrors.
# TYPE node_netstat_Icmp_OutErrors untyped
node_netstat_Icmp_OutErrors 0
# HELP node_netstat_Icmp_OutMsgs Protocol Icmp statistic OutMsgs.
# TYPE node_netstat_Icmp_OutMsgs untyped
node_netstat_Icmp_OutMsgs 32
# HELP node_netstat_Icmp_OutParmProbs Protocol Icmp statistic OutParmProbs.
# TYPE node_netstat_Icmp_OutParmProbs untyped
node_netstat_Icmp_OutParmProbs 0
# HELP node_netstat_Icmp_OutRedirects Protocol Icmp statistic OutRedirects.
# TYPE node_netstat_Icmp_OutRedirects untyped
node_netstat_Icmp_OutRedirects 0
# HELP node_netstat_Icmp_OutSrcQuenchs Protocol Icmp statistic OutSrcQuenchs.
# TYPE node_netstat_Icmp_OutSrcQuenchs untyped
node_netstat_Icmp_OutSrcQuenchs 0
# HELP node_netstat_Icmp_OutTimeExcds Protocol Icmp statistic OutTimeExcds.
# TYPE node_netstat_Icmp_OutTimeExcds untyped
node_netstat_Icmp_OutTimeExcds 0
# HELP node_netstat_Icmp_OutTimestampReps Protocol Icmp statistic OutTimestampReps.
# TYPE node_netstat_Icmp_OutTimestampReps untyped
node_netstat_Icmp_OutTimestampReps 0
# HELP node_netstat_Icmp_OutTimestamps Protocol Icmp statistic OutTimestamps.
# TYPE node_netstat_Icmp_OutTimestamps untyped
node_netstat_Icmp_OutTimestamps 0
# HELP node_netstat_IpExt_InBcastOctets Protocol IpExt statistic InBcastOctets.
# TYPE node_netstat_IpExt_InBcastOctets untyped
node_netstat_IpExt_InBcastOctets 16200
# HELP node_netstat_IpExt_InBcastPkts Protocol IpExt statistic InBcastPkts.
# TYPE node_netstat_IpExt_InBcastPkts untyped
node_netstat_IpExt_InBcastPkts 225
# HELP node_netstat_IpExt_InCEPkts Protocol IpExt statistic InCEPkts.
# TYPE node_netstat_IpExt_InCEPkts untyped
node_netstat_IpExt_InCEPkts 0
# HELP node_netstat_IpExt_InCsumErrors Protocol IpExt statistic InCsumErrors.
# TYPE node_netstat_IpExt_InCsumErrors untyped
node_netstat_IpExt_InCsumErrors 0
# HELP node_netstat_IpExt_InECT0Pkts Protocol IpExt statistic InECT0Pkts.
# TYPE node_netstat_IpExt_InECT0Pkts untyped
node_netstat_IpExt_InECT0Pkts 0
# HELP node_netstat_IpExt_InECT1Pkts Protocol IpExt statistic InECT1Pkts.
# TYPE node_netstat_IpExt_InECT1Pkts untyped
node_netstat_IpExt_InECT1Pkts 0
# HELP node_netstat_IpExt_InMcastOctets Protocol IpExt statistic InMcastOctets.
# TYPE node_netstat_IpExt_InMcastOctets untyped
node_netstat_IpExt_InMcastOctets 0
# HELP node_netstat_IpExt_InMcastPkts Protocol IpExt statistic InMcastPkts.
# TYPE node_netstat_IpExt_InMcastPkts untyped
node_netstat_IpExt_InMcastPkts 0
# HELP node_netstat_IpExt_InNoECTPkts Protocol IpExt statistic InNoECTPkts.
# TYPE node_netstat_IpExt_InNoECTPkts untyped
node_netstat_IpExt_InNoECTPkts 139536
# HELP node_netstat_IpExt_InNoRoutes Protocol IpExt statistic InNoRoutes.
# TYPE node_netstat_IpExt_InNoRoutes untyped
node_netstat_IpExt_InNoRoutes 1
# HELP node_netstat_IpExt_InOctets Protocol IpExt statistic InOctets.
# TYPE node_netstat_IpExt_InOctets untyped
node_netstat_IpExt_InOctets 3.4550209e+07
# HELP node_netstat_IpExt_InTruncatedPkts Protocol IpExt statistic InTruncatedPkts.
# TYPE node_netstat_IpExt_InTruncatedPkts untyped
node_netstat_IpExt_InTruncatedPkts 0
# HELP node_netstat_IpExt_OutBcastOctets Protocol IpExt statistic OutBcastOctets.
# TYPE node_netstat_IpExt_OutBcastOctets untyped
node_netstat_IpExt_OutBcastOctets 0
# HELP node_netstat_IpExt_OutBcastPkts Protocol IpExt statistic OutBcastPkts.
# TYPE node_netstat_IpExt_OutBcastPkts untyped
node_netstat_IpExt_OutBcastPkts 0
# HELP node_netstat_IpExt_OutMcastOctets Protocol IpExt statistic OutMcastOctets.
# TYPE node_netstat_IpExt_OutMcastOctets untyped
node_netstat_IpExt_OutMcastOctets 0
# HELP node_netstat_IpExt_OutMcastPkts Protocol IpExt statistic OutMcastPkts.
# TYPE node_netstat_IpExt_OutMcastPkts untyped
node_netstat_IpExt_OutMcastPkts 0
# HELP node_netstat_IpExt_OutOctets Protocol IpExt statistic OutOctets.
# TYPE node_netstat_IpExt_OutOctets untyped
node_netstat_IpExt_OutOctets 3.469735e+07
# HELP node_netstat_Ip_DefaultTTL Protocol Ip statistic DefaultTTL.
# TYPE node_netstat_Ip_DefaultTTL untyped
node_netstat_Ip_DefaultTTL 64
# HELP node_netstat_Ip_ForwDatagrams Protocol Ip statistic ForwDatagrams.
# TYPE node_netstat_Ip_ForwDatagrams untyped
node_netstat_Ip_ForwDatagrams 0
# HELP node_netstat_Ip_Forwarding Protocol Ip statistic Forwarding.
# TYPE node_netstat_Ip_Forwarding untyped
node_netstat_Ip_Forwarding 1
# HELP node_netstat_Ip_FragCreates Protocol Ip statistic FragCreates.
# TYPE node_netstat_Ip_FragCreates untyped
node_netstat_Ip_FragCreates 0
# HELP node_netstat_Ip_FragFails Protocol Ip statistic FragFails.
# TYPE node_netstat_Ip_FragFails untyped
node_netstat_Ip_FragFails 0
# HELP node_netstat_Ip_FragOKs Protocol Ip statistic FragOKs.
# TYPE node_netstat_Ip_FragOKs untyped
node_netstat_Ip_FragOKs 0
# HELP node_netstat_Ip_InAddrErrors Protocol Ip statistic InAddrErrors.
# TYPE node_netstat_Ip_InAddrErrors untyped
node_netstat_Ip_InAddrErrors 0
# HELP node_netstat_Ip_InDelivers Protocol Ip statistic InDelivers.
# TYPE node_netstat_Ip_InDelivers untyped
node_netstat_Ip_InDelivers 139535
# HELP node_netstat_Ip_InDiscards Protocol Ip statistic InDiscards.
# TYPE node_netstat_Ip_InDiscards untyped
node_netstat_Ip_InDiscards 0
# HELP node_netstat_Ip_InHdrErrors Protocol Ip statistic InHdrErrors.
# TYPE node_netstat_Ip_InHdrErrors untyped
node_netstat_Ip_InHdrErrors 0
# HELP node_netstat_Ip_InReceives Protocol Ip statistic InReceives.
# TYPE node_netstat_Ip_InReceives untyped
node_netstat_Ip_InReceives 139536
# HELP node_netstat_Ip_InUnknownProtos Protocol Ip statistic InUnknownProtos.
# TYPE node_netstat_Ip_InUnknownProtos untyped
node_netstat_Ip_InUnknownProtos 0
# HELP node_netstat_Ip_OutDiscards Protocol Ip statistic OutDiscards.
# TYPE node_netstat_Ip_OutDiscards untyped
node_netstat_Ip_OutDiscards 16
# HELP node_netstat_Ip_OutNoRoutes Protocol Ip statistic OutNoRoutes.
# TYPE node_netstat_Ip_OutNoRoutes untyped
node_netstat_Ip_OutNoRoutes 0
# HELP node_netstat_Ip_OutRequests Protocol Ip statistic OutRequests.
# TYPE node_netstat_Ip_OutRequests untyped
node_netstat_Ip_OutRequests 139089
# HELP node_netstat_Ip_ReasmFails Protocol Ip statistic ReasmFails.
# TYPE node_netstat_Ip_ReasmFails untyped
node_netstat_Ip_ReasmFails 0
# HELP node_netstat_Ip_ReasmOKs Protocol Ip statistic ReasmOKs.
# TYPE node_netstat_Ip_ReasmOKs untyped
node_netstat_Ip_ReasmOKs 0
# HELP node_netstat_Ip_ReasmReqds Protocol Ip statistic ReasmReqds.
# TYPE node_netstat_Ip_ReasmReqds untyped
node_netstat_Ip_ReasmReqds 0
# HELP node_netstat_Ip_ReasmTimeout Protocol Ip statistic ReasmTimeout.
# TYPE node_netstat_Ip_ReasmTimeout untyped
node_netstat_Ip_ReasmTimeout 0
# HELP node_netstat_TcpExt_ArpFilter Protocol TcpExt statistic ArpFilter.
# TYPE node_netstat_TcpExt_ArpFilter untyped
node_netstat_TcpExt_ArpFilter 0
# HELP node_netstat_TcpExt_BusyPollRxPackets Protocol TcpExt statistic BusyPollRxPackets.
# TYPE node_netstat_TcpExt_BusyPollRxPackets untyped
node_netstat_TcpExt_BusyPollRxPackets 0
# HELP node_netstat_TcpExt_DelayedACKLocked Protocol TcpExt statistic DelayedACKLocked.
# TYPE node_netstat_TcpExt_DelayedACKLocked untyped
node_netstat_TcpExt_DelayedACKLocked 0
# HELP node_netstat_TcpExt_DelayedACKLost Protocol TcpExt statistic DelayedACKLost.
# TYPE node_netstat_TcpExt_DelayedACKLost untyped
node_netstat_TcpExt_DelayedACKLost 0
# HELP node_netstat_TcpExt_DelayedACKs Protocol TcpExt statistic DelayedACKs.
# TYPE node_netstat_TcpExt_DelayedACKs untyped
node_netstat_TcpExt_DelayedACKs 3374
# HELP node_netstat_TcpExt_EmbryonicRsts Protocol TcpExt statistic EmbryonicRsts.
# TYPE node_netstat_TcpExt_EmbryonicRsts untyped
node_netstat_TcpExt_EmbryonicRsts 0
# HELP node_netstat_TcpExt_IPReversePathFilter Protocol TcpExt statistic IPReversePathFilter.
# TYPE node_netstat_TcpExt_IPReversePathFilter untyped
node_netstat_TcpExt_IPReversePathFilter 0
# HELP node_netstat_TcpExt_ListenDrops Protocol TcpExt statistic ListenDrops.
# TYPE node_netstat_TcpExt_ListenDrops untyped
node_netstat_TcpExt_ListenDrops 0
# HELP node_netstat_TcpExt_ListenOverflows Protocol TcpExt statistic ListenOverflows.
# TYPE node_netstat_TcpExt_ListenOverflows untyped
node_netstat_TcpExt_ListenOverflows 0
# HELP node_netstat_TcpExt_LockDroppedIcmps Protocol TcpExt statistic LockDroppedIcmps.
# TYPE node_netstat_TcpExt_LockDroppedIcmps untyped
node_netstat_TcpExt_LockDroppedIcmps 0
# HELP node_netstat_TcpExt_OfoPruned Protocol TcpExt statistic OfoPruned.
# TYPE node_netstat_TcpExt_OfoPruned untyped
node_netstat_TcpExt_OfoPruned 0
# HELP node_netstat_TcpExt_OutOfWindowIcmps Protocol TcpExt statistic OutOfWindowIcmps.
# TYPE node_netstat_TcpExt_OutOfWindowIcmps untyped
node_netstat_TcpExt_OutOfWindowIcmps 0
# HELP node_netstat_TcpExt_PAWSActive Protocol TcpExt statistic PAWSActive.
# TYPE node_netstat_TcpExt_PAWSActive untyped
node_netstat_TcpExt_PAWSActive 0
# HELP node_netstat_TcpExt_PAWSEstab Protocol TcpExt statistic PAWSEstab.
# TYPE node_netstat_TcpExt_PAWSEstab untyped
node_netstat_TcpExt_PAWSEstab 0
# HELP node_netstat_TcpExt_PAWSPassive Protocol TcpExt statistic PAWSPassive.
# TYPE node_netstat_TcpExt_PAWSPassive untyped
node_netstat_TcpExt_PAWSPassive 0
# HELP node_netstat_TcpExt_PruneCalled Protocol TcpExt statistic PruneCalled.
# TYPE node_netstat_TcpExt_PruneCalled untyped
node_netstat_TcpExt_PruneCalled 0
# HELP node_netstat_TcpExt_RcvPruned Protocol TcpExt statistic RcvPruned.
# TYPE node_netstat_TcpExt_RcvPruned untyped
node_netstat_TcpExt_RcvPruned 0
# HELP node_netstat_TcpExt_SyncookiesFailed Protocol TcpExt statistic SyncookiesFailed.
# TYPE node_netstat_TcpExt_SyncookiesFailed untyped
node_netstat_TcpExt_SyncookiesFailed 0
# HELP node_netstat_TcpExt_SyncookiesRecv Protocol TcpExt statistic SyncookiesRecv.
# TYPE node_netstat_TcpExt_SyncookiesRecv untyped
node_netstat_TcpExt_SyncookiesRecv 0
# HELP node_netstat_TcpExt_SyncookiesSent Protocol TcpExt statistic SyncookiesSent.
# TYPE node_netstat_TcpExt_SyncookiesSent untyped
node_netstat_TcpExt_SyncookiesSent 0
# HELP node_netstat_TcpExt_TCPACKSkippedChallenge Protocol TcpExt statistic TCPACKSkippedChallenge.
# TYPE node_netstat_TcpExt_TCPACKSkippedChallenge untyped
node_netstat_TcpExt_TCPACKSkippedChallenge 0
# HELP node_netstat_TcpExt_TCPACKSkippedFinWait2 Protocol TcpExt statistic TCPACKSkippedFinWait2.
# TYPE node_netstat_TcpExt_TCPACKSkippedFinWait2 untyped
node_netstat_TcpExt_TCPACKSkippedFinWait2 0
# HELP node_netstat_TcpExt_TCPACKSkippedPAWS Protocol TcpExt statistic TCPACKSkippedPAWS.
# TYPE node_netstat_TcpExt_TCPACKSkippedPAWS untyped
node_netstat_TcpExt_TCPACKSkippedPAWS 0
# HELP node_netstat_TcpExt_TCPACKSkippedSeq Protocol TcpExt statistic TCPACKSkippedSeq.
# TYPE node_netstat_TcpExt_TCPACKSkippedSeq untyped
node_netstat_TcpExt_TCPACKSkippedSeq 0
# HELP node_netstat_TcpExt_TCPACKSkippedSynRecv Protocol TcpExt statistic TCPACKSkippedSynRecv.
# TYPE node_netstat_TcpExt_TCPACKSkippedSynRecv untyped
node_netstat_TcpExt_TCPACKSkippedSynRecv 0
# HELP node_netstat_TcpExt_TCPACKSkippedTimeWait Protocol TcpExt statistic TCPACKSkippedTimeWait.
# TYPE node_netstat_TcpExt_TCPACKSkippedTimeWait untyped
node_netstat_TcpExt_TCPACKSkippedTimeWait 0
# HELP node_netstat_TcpExt_TCPAbortFailed Protocol TcpExt statistic TCPAbortFailed.
# TYPE node_netstat_TcpExt_TCPAbortFailed untyped
node_netstat_TcpExt_TCPAbortFailed 0
# HELP node_netstat_TcpExt_TCPAbortOnClose Protocol TcpExt statistic TCPAbortOnClose.
# TYPE node_netstat_TcpExt_TCPAbortOnClose untyped
node_netstat_TcpExt_TCPAbortOnClose 1
# HELP node_netstat_TcpExt_TCPAbortOnData Protocol TcpExt statistic TCPAbortOnData.
# TYPE node_netstat_TcpExt_TCPAbortOnData untyped
node_netstat_TcpExt_TCPAbortOnData 2
# HELP node_netstat_TcpExt_TCPAbortOnLinger Protocol TcpExt statistic TCPAbortOnLinger.
# TYPE node_netstat_TcpExt_TCPAbortOnLinger untyped
node_netstat_TcpExt_TCPAbortOnLinger 0
# HELP node_netstat_TcpExt_TCPAbortOnMemory Protocol TcpExt statistic TCPAbortOnMemory.
# TYPE node_netstat_TcpExt_TCPAbortOnMemory untyped
node_netstat_TcpExt_TCPAbortOnMemory 0
# HELP node_netstat_TcpExt_TCPAbortOnTimeout Protocol TcpExt statistic TCPAbortOnTimeout.
# TYPE node_netstat_TcpExt_TCPAbortOnTimeout untyped
node_netstat_TcpExt_TCPAbortOnTimeout 0
# HELP node_netstat_TcpExt_TCPAutoCorking Protocol TcpExt statistic TCPAutoCorking.
# TYPE node_netstat_TcpExt_TCPAutoCorking untyped
node_netstat_TcpExt_TCPAutoCorking 4
# HELP node_netstat_TcpExt_TCPBacklogDrop Protocol TcpExt statistic TCPBacklogDrop.
# TYPE node_netstat_TcpExt_TCPBacklogDrop untyped
node_netstat_TcpExt_TCPBacklogDrop 0
# HELP node_netstat_TcpExt_TCPChallengeACK Protocol TcpExt statistic TCPChallengeACK.
# TYPE node_netstat_TcpExt_TCPChallengeACK untyped
node_netstat_TcpExt_TCPChallengeACK 0
# HELP node_netstat_TcpExt_TCPDSACKIgnoredNoUndo Protocol TcpExt statistic TCPDSACKIgnoredNoUndo.
# TYPE node_netstat_TcpExt_TCPDSACKIgnoredNoUndo untyped
node_netstat_TcpExt_TCPDSACKIgnoredNoUndo 0
# HELP node_netstat_TcpExt_TCPDSACKIgnoredOld Protocol TcpExt statistic TCPDSACKIgnoredOld.
# TYPE node_netstat_TcpExt_TCPDSACKIgnoredOld untyped
node_netstat_TcpExt_TCPDSACKIgnoredOld 0
# HELP node_netstat_TcpExt_TCPDSACKOfoRecv Protocol TcpExt statistic TCPDSACKOfoRecv.
# TYPE node_netstat_TcpExt_TCPDSACKOfoRecv untyped
node_netstat_TcpExt_TCPDSACKOfoRecv 0
# HELP node_netstat_TcpExt_TCPDSACKOfoSent Protocol TcpExt statistic TCPDSACKOfoSent.
# TYPE node_netstat_TcpExt_TCPDSACKOfoSent untyped
node_netstat_TcpExt_TCPDSACKOfoSent 0
# HELP node_netstat_TcpExt_TCPDSACKOldSent Protocol TcpExt statistic TCPDSACKOldSent.
# TYPE node_netstat_TcpExt_TCPDSACKOldSent untyped
node_netstat_TcpExt_TCPDSACKOldSent 0
# HELP node_netstat_TcpExt_TCPDSACKRecv Protocol TcpExt statistic TCPDSACKRecv.
# TYPE node_netstat_TcpExt_TCPDSACKRecv untyped
node_netstat_TcpExt_TCPDSACKRecv 0
# HELP node_netstat_TcpExt_TCPDSACKUndo Protocol TcpExt statistic TCPDSACKUndo.
# TYPE node_netstat_TcpExt_TCPDSACKUndo untyped
node_netstat_TcpExt_TCPDSACKUndo 0
# HELP node_netstat_TcpExt_TCPDeferAcceptDrop Protocol TcpExt statistic TCPDeferAcceptDrop.
# TYPE node_netstat_TcpExt_TCPDeferAcceptDrop untyped
node_netstat_TcpExt_TCPDeferAcceptDrop 0
# HELP node_netstat_TcpExt_TCPDirectCopyFromBacklog Protocol TcpExt statistic TCPDirectCopyFromBacklog.
# TYPE node_netstat_TcpExt_TCPDirectCopyFromBacklog untyped
node_netstat_TcpExt_TCPDirectCopyFromBacklog 0
# HELP node_netstat_TcpExt_TCPDirectCopyFromPrequeue Protocol TcpExt statistic TCPDirectCopyFromPrequeue.
# TYPE node_netstat_TcpExt_TCPDirectCopyFromPrequeue untyped
node_netstat_TcpExt_TCPDirectCopyFromPrequeue 1.062e+06
# HELP node_netstat_TcpExt_TCPFACKReorder Protocol TcpExt statistic TCPFACKReorder.
# TYPE node_netstat_TcpExt_TCPFACKReorder untyped
node_netstat_TcpExt_TCPFACKReorder 0
# HELP node_netstat_TcpExt_TCPFastOpenActive Protocol TcpExt statistic TCPFastOpenActive.
# TYPE node_netstat_TcpExt_TCPFastOpenActive untyped
node_netstat_TcpExt_TCPFastOpenActive 0
# HELP node_netstat_TcpExt_TCPFastOpenActiveFail Protocol TcpExt statistic TCPFastOpenActiveFail.
# TYPE node_netstat_TcpExt_TCPFastOpenActiveFail untyped
node_netstat_TcpExt_TCPFastOpenActiveFail 0
# HELP node_netstat_TcpExt_TCPFastOpenCookieReqd Protocol TcpExt statistic TCPFastOpenCookieReqd.
# TYPE node_netstat_TcpExt_TCPFastOpenCookieReqd untyped
node_netstat_TcpExt_TCPFastOpenCookieReqd 0
# HELP node_netstat_TcpExt_TCPFastOpenListenOverflow Protocol TcpExt statistic TCPFastOpenListenOverflow.
# TYPE node_netstat_TcpExt_TCPFastOpenListenOverflow untyped
node_netstat_TcpExt_TCPFastOpenListenOverflow 0
# HELP node_netstat_TcpExt_TCPFastOpenPassive Protocol TcpExt statistic TCPFastOpenPassive.
# TYPE node_netstat_TcpExt_TCPFastOpenPassive untyped
node_netstat_TcpExt_TCPFastOpenPassive 0
# HELP node_netstat_TcpExt_TCPFastOpenPassiveFail Protocol TcpExt statistic TCPFastOpenPassiveFail.
# TYPE node_netstat_TcpExt_TCPFastOpenPassiveFail untyped
node_netstat_TcpExt_TCPFastOpenPassiveFail 0
# HELP node_netstat_TcpExt_TCPFastRetrans Protocol TcpExt statistic TCPFastRetrans.
# TYPE node_netstat_TcpExt_TCPFastRetrans untyped
node_netstat_TcpExt_TCPFastRetrans 0
# HELP node_netstat_TcpExt_TCPForwardRetrans Protocol TcpExt statistic TCPForwardRetrans.
# TYPE node_netstat_TcpExt_TCPForwardRetrans untyped
node_netstat_TcpExt_TCPForwardRetrans 0
# HELP node_netstat_TcpExt_TCPFromZeroWindowAdv Protocol TcpExt statistic TCPFromZeroWindowAdv.
# TYPE node_netstat_TcpExt_TCPFromZeroWindowAdv untyped
node_netstat_TcpExt_TCPFromZeroWindowAdv 0
# HELP node_netstat_TcpExt_TCPFullUndo Protocol TcpExt statistic TCPFullUndo.
# TYPE node_netstat_TcpExt_TCPFullUndo untyped
node_netstat_TcpExt_TCPFullUndo 0
# HELP node_netstat_TcpExt_TCPHPAcks Protocol TcpExt statistic TCPHPAcks.
# TYPE node_netstat_TcpExt_TCPHPAcks untyped
node_netstat_TcpExt_TCPHPAcks 21314
# HELP node_netstat_TcpExt_TCPHPHits Protocol TcpExt statistic TCPHPHits.
# TYPE node_netstat_TcpExt_TCPHPHits untyped
node_netstat_TcpExt_TCPHPHits 3607
# HELP node_netstat_TcpExt_TCPHPHitsToUser Protocol TcpExt statistic TCPHPHitsToUser.
# TYPE node_netstat_TcpExt_TCPHPHitsToUser untyped
node_netstat_TcpExt_TCPHPHitsToUser 0
# HELP node_netstat_TcpExt_TCPHystartDelayCwnd Protocol TcpExt statistic TCPHystartDelayCwnd.
# TYPE node_netstat_TcpExt_TCPHystartDelayCwnd untyped
node_netstat_TcpExt_TCPHystartDelayCwnd 0
# HELP node_netstat_TcpExt_TCPHystartDelayDetect Protocol TcpExt statistic TCPHystartDelayDetect.
# TYPE node_netstat_TcpExt_TCPHystartDelayDetect untyped
node_netstat_TcpExt_TCPHystartDelayDetect 0
# HELP node_netstat_TcpExt_TCPHystartTrainCwnd Protocol TcpExt statistic TCPHystartTrainCwnd.
# TYPE node_netstat_TcpExt_TCPHystartTrainCwnd untyped
node_netstat_TcpExt_TCPHystartTrainCwnd 0
# HELP node_netstat_TcpExt_TCPHystartTrainDetect Protocol TcpExt statistic TCPHystartTrainDetect.
# TYPE node_netstat_TcpExt_TCPHystartTrainDetect untyped
node_netstat_TcpExt_TCPHystartTrainDetect 0
# HELP node_netstat_TcpExt_TCPLossFailures Protocol TcpExt statistic TCPLossFailures.
# TYPE node_netstat_TcpExt_TCPLossFailures untyped
node_netstat_TcpExt_TCPLossFailures 0
# HELP node_netstat_TcpExt_TCPLossProbeRecovery Protocol TcpExt statistic TCPLossProbeRecovery.
# TYPE node_netstat_TcpExt_TCPLossProbeRecovery untyped
node_netstat_TcpExt_TCPLossProbeRecovery 0
# HELP node_netstat_TcpExt_TCPLossProbes Protocol TcpExt statistic TCPLossProbes.
# TYPE node_netstat_TcpExt_TCPLossProbes untyped
node_netstat_TcpExt_TCPLossProbes 0
# HELP node_netstat_TcpExt_TCPLossUndo Protocol TcpExt statistic TCPLossUndo.
# TYPE node_netstat_TcpExt_TCPLossUndo untyped
node_netstat_TcpExt_TCPLossUndo 0
# HELP node_netstat_TcpExt_TCPLostRetransmit Protocol TcpExt statistic TCPLostRetransmit.
# TYPE node_netstat_TcpExt_TCPLostRetransmit untyped
node_netstat_TcpExt_TCPLostRetransmit 0
# HELP node_netstat_TcpExt_TCPMD5NotFound Protocol TcpExt statistic TCPMD5NotFound.
# TYPE node_netstat_TcpExt_TCPMD5NotFound untyped
node_netstat_TcpExt_TCPMD5NotFound 0
# HELP node_netstat_TcpExt_TCPMD5Unexpected Protocol TcpExt statistic TCPMD5Unexpected.
# TYPE node_netstat_TcpExt_TCPMD5Unexpected untyped
node_netstat_TcpExt_TCPMD5Unexpected 0
# HELP node_netstat_TcpExt_TCPMemoryPressures Protocol TcpExt statistic TCPMemoryPressures.
# TYPE node_netstat_TcpExt_TCPMemoryPressures untyped
node_netstat_TcpExt_TCPMemoryPressures 0
# HELP node_netstat_TcpExt_TCPMinTTLDrop Protocol TcpExt statistic TCPMinTTLDrop.
# TYPE node_netstat_TcpExt_TCPMinTTLDrop untyped
node_netstat_TcpExt_TCPMinTTLDrop 0
# HELP node_netstat_TcpExt_TCPOFODrop Protocol TcpExt statistic TCPOFODrop.
# TYPE node_netstat_TcpExt_TCPOFODrop untyped
node_netstat_TcpExt_TCPOFODrop 0
# HELP node_netstat_TcpExt_TCPOFOMerge Protocol TcpExt statistic TCPOFOMerge.
# TYPE node_netstat_TcpExt_TCPOFOMerge untyped
node_netstat_TcpExt_TCPOFOMerge 0
# HELP node_netstat_TcpExt_TCPOFOQueue Protocol TcpExt statistic TCPOFOQueue.
# TYPE node_netstat_TcpExt_TCPOFOQueue untyped
node_netstat_TcpExt_TCPOFOQueue 0
# HELP node_netstat_TcpExt_TCPOrigDataSent Protocol TcpExt statistic TCPOrigDataSent.
# TYPE node_netstat_TcpExt_TCPOrigDataSent untyped
node_netstat_TcpExt_TCPOrigDataSent 99807
# HELP node_netstat_TcpExt_TCPPartialUndo Protocol TcpExt statistic TCPPartialUndo.
# TYPE node_netstat_TcpExt_TCPPartialUndo untyped
node_netstat_TcpExt_TCPPartialUndo 0
# HELP node_netstat_TcpExt_TCPPrequeueDropped Protocol TcpExt statistic TCPPrequeueDropped.
# TYPE node_netstat_TcpExt_TCPPrequeueDropped untyped
node_netstat_TcpExt_TCPPrequeueDropped 0
# HELP node_netstat_TcpExt_TCPPrequeued Protocol TcpExt statistic TCPPrequeued.
# TYPE node_netstat_TcpExt_TCPPrequeued untyped
node_netstat_TcpExt_TCPPrequeued 68753
# HELP node_netstat_TcpExt_TCPPureAcks Protocol TcpExt statistic TCPPureAcks.
# TYPE node_netstat_TcpExt_TCPPureAcks untyped
node_netstat_TcpExt_TCPPureAcks 17834
# HELP node_netstat_TcpExt_TCPRcvCoalesce Protocol TcpExt statistic TCPRcvCoalesce.
# TYPE node_netstat_TcpExt_TCPRcvCoalesce untyped
node_netstat_TcpExt_TCPRcvCoalesce 2
# HELP node_netstat_TcpExt_TCPRcvCollapsed Protocol TcpExt statistic TCPRcvCollapsed.
# TYPE node_netstat_TcpExt_TCPRcvCollapsed untyped
node_netstat_TcpExt_TCPRcvCollapsed 0
# HELP node_netstat_TcpExt_TCPRenoFailures Protocol TcpExt statistic TCPRenoFailures.
# TYPE node_netstat_TcpExt_TCPRenoFailures untyped
node_netstat_TcpExt_TCPRenoFailures 0
# HELP node_netstat_TcpExt_TCPRenoRecovery Protocol TcpExt statistic TCPRenoRecovery.
# TYPE node_netstat_TcpExt_TCPRenoRecovery untyped
node_netstat_TcpExt_TCPRenoRecovery 0
# HELP node_netstat_TcpExt_TCPRenoRecoveryFail Protocol TcpExt statistic TCPRenoRecoveryFail.
# TYPE node_netstat_TcpExt_TCPRenoRecoveryFail untyped
node_netstat_TcpExt_TCPRenoRecoveryFail 0
# HELP node_netstat_TcpExt_TCPRenoReorder Protocol TcpExt statistic TCPRenoReorder.
# TYPE node_netstat_TcpExt_TCPRenoReorder untyped
node_netstat_TcpExt_TCPRenoReorder 0
# HELP node_netstat_TcpExt_TCPReqQFullDoCookies Protocol TcpExt statistic TCPReqQFullDoCookies.
# TYPE node_netstat_TcpExt_TCPReqQFullDoCookies untyped
node_netstat_TcpExt_TCPReqQFullDoCookies 0
# HELP node_netstat_TcpExt_TCPReqQFullDrop Protocol TcpExt statistic TCPReqQFullDrop.
# TYPE node_netstat_TcpExt_TCPReqQFullDrop untyped
node_netstat_TcpExt_TCPReqQFullDrop 0
# HELP node_netstat_TcpExt_TCPRetransFail Protocol TcpExt statistic TCPRetransFail.
# TYPE node_netstat_TcpExt_TCPRetransFail untyped
node_netstat_TcpExt_TCPRetransFail 0
# HELP node_netstat_TcpExt_TCPSACKDiscard Protocol TcpExt statistic TCPSACKDiscard.
# TYPE node_netstat_TcpExt_TCPSACKDiscard untyped
node_netstat_TcpExt_TCPSACKDiscard 0
# HELP node_netstat_TcpExt_TCPSACKReneging Protocol TcpExt statistic TCPSACKReneging.
# TYPE node_netstat_TcpExt_TCPSACKReneging untyped
node_netstat_TcpExt_TCPSACKReneging 0
# HELP node_netstat_TcpExt_TCPSACKReorder Protocol TcpExt statistic TCPSACKReorder.
# TYPE node_netstat_TcpExt_TCPSACKReorder untyped
node_netstat_TcpExt_TCPSACKReorder 0
# HELP node_netstat_TcpExt_TCPSYNChallenge Protocol TcpExt statistic TCPSYNChallenge.
# TYPE node_netstat_TcpExt_TCPSYNChallenge untyped
node_netstat_TcpExt_TCPSYNChallenge 0
# HELP node_netstat_TcpExt_TCPSackFailures Protocol TcpExt statistic TCPSackFailures.
# TYPE node_netstat_TcpExt_TCPSackFailures untyped
node_netstat_TcpExt_TCPSackFailures 0
# HELP node_netstat_TcpExt_TCPSackMerged Protocol TcpExt statistic TCPSackMerged.
# TYPE node_netstat_TcpExt_TCPSackMerged untyped
node_netstat_TcpExt_TCPSackMerged 0
# HELP node_netstat_TcpExt_TCPSackRecovery Protocol TcpExt statistic TCPSackRecovery.
# TYPE node_netstat_TcpExt_TCPSackRecovery untyped
node_netstat_TcpExt_TCPSackRecovery 0
# HELP node_netstat_TcpExt_TCPSackRecoveryFail Protocol TcpExt statistic TCPSackRecoveryFail.
# TYPE node_netstat_TcpExt_TCPSackRecoveryFail untyped
node_netstat_TcpExt_TCPSackRecoveryFail 0
# HELP node_netstat_TcpExt_TCPSackShiftFallback Protocol TcpExt statistic TCPSackShiftFallback.
# TYPE node_netstat_TcpExt_TCPSackShiftFallback untyped
node_netstat_TcpExt_TCPSackShiftFallback 0
# HELP node_netstat_TcpExt_TCPSackShifted Protocol TcpExt statistic TCPSackShifted.
# TYPE node_netstat_TcpExt_TCPSackShifted untyped
node_netstat_TcpExt_TCPSackShifted 0
# HELP node_netstat_TcpExt_TCPSchedulerFailed Protocol TcpExt statistic TCPSchedulerFailed.
# TYPE node_netstat_TcpExt_TCPSchedulerFailed untyped
node_netstat_TcpExt_TCPSchedulerFailed 0
# HELP node_netstat_TcpExt_TCPSlowStartRetrans Protocol TcpExt statistic TCPSlowStartRetrans.
# TYPE node_netstat_TcpExt_TCPSlowStartRetrans untyped
node_netstat_TcpExt_TCPSlowStartRetrans 0
# HELP node_netstat_TcpExt_TCPSpuriousRTOs Protocol TcpExt statistic TCPSpuriousRTOs.
# TYPE node_netstat_TcpExt_TCPSpuriousRTOs untyped
node_netstat_TcpExt_TCPSpuriousRTOs 0
# HELP node_netstat_TcpExt_TCPSpuriousRtxHostQueues Protocol TcpExt statistic TCPSpuriousRtxHostQueues.
# TYPE node_netstat_TcpExt_TCPSpuriousRtxHostQueues untyped
node_netstat_TcpExt_TCPSpuriousRtxHostQueues 0
# HELP node_netstat_TcpExt_TCPSynRetrans Protocol TcpExt statistic TCPSynRetrans.
# TYPE node_netstat_TcpExt_TCPSynRetrans untyped
node_netstat_TcpExt_TCPSynRetrans 0
# HELP node_netstat_TcpExt_TCPTSReorder Protocol TcpExt statistic TCPTSReorder.
# TYPE node_netstat_TcpExt_TCPTSReorder untyped
node_netstat_TcpExt_TCPTSReorder 0
# HELP node_netstat_TcpExt_TCPTimeWaitOverflow Protocol TcpExt statistic TCPTimeWaitOverflow.
# TYPE node_netstat_TcpExt_TCPTimeWaitOverflow untyped
node_netstat_TcpExt_TCPTimeWaitOverflow 0
# HELP node_netstat_TcpExt_TCPTimeouts Protocol TcpExt statistic TCPTimeouts.
# TYPE node_netstat_TcpExt_TCPTimeouts untyped
node_netstat_TcpExt_TCPTimeouts 0
# HELP node_netstat_TcpExt_TCPToZeroWindowAdv Protocol TcpExt statistic TCPToZeroWindowAdv.
# TYPE node_netstat_TcpExt_TCPToZeroWindowAdv untyped
node_netstat_TcpExt_TCPToZeroWindowAdv 0
# HELP node_netstat_TcpExt_TCPWantZeroWindowAdv Protocol TcpExt statistic TCPWantZeroWindowAdv.
# TYPE node_netstat_TcpExt_TCPWantZeroWindowAdv untyped
node_netstat_TcpExt_TCPWantZeroWindowAdv 0
# HELP node_netstat_TcpExt_TW Protocol TcpExt statistic TW.
# TYPE node_netstat_TcpExt_TW untyped
node_netstat_TcpExt_TW 35
# HELP node_netstat_TcpExt_TWKilled Protocol TcpExt statistic TWKilled.
# TYPE node_netstat_TcpExt_TWKilled untyped
node_netstat_TcpExt_TWKilled 0
# HELP node_netstat_TcpExt_TWRecycled Protocol TcpExt statistic TWRecycled.
# TYPE node_netstat_TcpExt_TWRecycled untyped
node_netstat_TcpExt_TWRecycled 0
# HELP node_netstat_Tcp_ActiveOpens Protocol Tcp statistic ActiveOpens.
# TYPE node_netstat_Tcp_ActiveOpens untyped
node_netstat_Tcp_ActiveOpens 122
# HELP node_netstat_Tcp_AttemptFails Protocol Tcp statistic AttemptFails.
# TYPE node_netstat_Tcp_AttemptFails untyped
node_netstat_Tcp_AttemptFails 48
# HELP node_netstat_Tcp_CurrEstab Protocol Tcp statistic CurrEstab.
# TYPE node_netstat_Tcp_CurrEstab untyped
node_netstat_Tcp_CurrEstab 67
# HELP node_netstat_Tcp_EstabResets Protocol Tcp statistic EstabResets.
# TYPE node_netstat_Tcp_EstabResets untyped
node_netstat_Tcp_EstabResets 3
# HELP node_netstat_Tcp_InCsumErrors Protocol Tcp statistic InCsumErrors.
# TYPE node_netstat_Tcp_InCsumErrors untyped
node_netstat_Tcp_InCsumErrors 0
# HELP node_netstat_Tcp_InErrs Protocol Tcp statistic InErrs.
# TYPE node_netstat_Tcp_InErrs untyped
node_netstat_Tcp_InErrs 0
# HELP node_netstat_Tcp_InSegs Protocol Tcp statistic InSegs.
# TYPE node_netstat_Tcp_InSegs untyped
node_netstat_Tcp_InSegs 138959
# HELP node_netstat_Tcp_MaxConn Protocol Tcp statistic MaxConn.
# TYPE node_netstat_Tcp_MaxConn untyped
node_netstat_Tcp_MaxConn -1
# HELP node_netstat_Tcp_OutRsts Protocol Tcp statistic OutRsts.
# TYPE node_netstat_Tcp_OutRsts untyped
node_netstat_Tcp_OutRsts 51
# HELP node_netstat_Tcp_OutSegs Protocol Tcp statistic OutSegs.
# TYPE node_netstat_Tcp_OutSegs untyped
node_netstat_Tcp_OutSegs 138800
# HELP node_netstat_Tcp_PassiveOpens Protocol Tcp statistic PassiveOpens.
# TYPE node_netstat_Tcp_PassiveOpens untyped
node_netstat_Tcp_PassiveOpens 75
# HELP node_netstat_Tcp_RetransSegs Protocol Tcp statistic RetransSegs.
# TYPE node_netstat_Tcp_RetransSegs untyped
node_netstat_Tcp_RetransSegs 0
# HELP node_netstat_Tcp_RtoAlgorithm Protocol Tcp statistic RtoAlgorithm.
# TYPE node_netstat_Tcp_RtoAlgorithm untyped
node_netstat_Tcp_RtoAlgorithm 1
# HELP node_netstat_Tcp_RtoMax Protocol Tcp statistic RtoMax.
# TYPE node_netstat_Tcp_RtoMax untyped
node_netstat_Tcp_RtoMax 120000
# HELP node_netstat_Tcp_RtoMin Protocol Tcp statistic RtoMin.
# TYPE node_netstat_Tcp_RtoMin untyped
node_netstat_Tcp_RtoMin 200
# HELP node_netstat_UdpLite_InCsumErrors Protocol UdpLite statistic InCsumErrors.
# TYPE node_netstat_UdpLite_InCsumErrors untyped
node_netstat_UdpLite_InCsumErrors 0
# HELP node_netstat_UdpLite_InDatagrams Protocol UdpLite statistic InDatagrams.
# TYPE node_netstat_UdpLite_InDatagrams untyped
node_netstat_UdpLite_InDatagrams 0
# HELP node_netstat_UdpLite_InErrors Protocol UdpLite statistic InErrors.
# TYPE node_netstat_UdpLite_InErrors untyped
node_netstat_UdpLite_InErrors 0
# HELP node_netstat_UdpLite_NoPorts Protocol UdpLite statistic NoPorts.
# TYPE node_netstat_UdpLite_NoPorts untyped
node_netstat_UdpLite_NoPorts 0
# HELP node_netstat_UdpLite_OutDatagrams Protocol UdpLite statistic OutDatagrams.
# TYPE node_netstat_UdpLite_OutDatagrams untyped
node_netstat_UdpLite_OutDatagrams 0
# HELP node_netstat_UdpLite_RcvbufErrors Protocol UdpLite statistic RcvbufErrors.
# TYPE node_netstat_UdpLite_RcvbufErrors untyped
node_netstat_UdpLite_RcvbufErrors 0
# HELP node_netstat_UdpLite_SndbufErrors Protocol UdpLite statistic SndbufErrors.
# TYPE node_netstat_UdpLite_SndbufErrors untyped
node_netstat_UdpLite_SndbufErrors 0
# HELP node_netstat_Udp_InCsumErrors Protocol Udp statistic InCsumErrors.
# TYPE node_netstat_Udp_InCsumErrors untyped
node_netstat_Udp_InCsumErrors 0
# HELP node_netstat_Udp_InDatagrams Protocol Udp statistic InDatagrams.
# TYPE node_netstat_Udp_InDatagrams untyped
node_netstat_Udp_InDatagrams 347
# HELP node_netstat_Udp_InErrors Protocol Udp statistic InErrors.
# TYPE node_netstat_Udp_InErrors untyped
node_netstat_Udp_InErrors 0
# HELP node_netstat_Udp_NoPorts Protocol Udp statistic NoPorts.
# TYPE node_netstat_Udp_NoPorts untyped
node_netstat_Udp_NoPorts 32
# HELP node_netstat_Udp_OutDatagrams Protocol Udp statistic OutDatagrams.
# TYPE node_netstat_Udp_OutDatagrams untyped
node_netstat_Udp_OutDatagrams 388
# HELP node_netstat_Udp_RcvbufErrors Protocol Udp statistic RcvbufErrors.
# TYPE node_netstat_Udp_RcvbufErrors untyped
node_netstat_Udp_RcvbufErrors 0
# HELP node_netstat_Udp_SndbufErrors Protocol Udp statistic SndbufErrors.
# TYPE node_netstat_Udp_SndbufErrors untyped
node_netstat_Udp_SndbufErrors 0
# HELP node_network_receive_bytes Network device statistic receive_bytes.
# TYPE node_network_receive_bytes gauge
node_network_receive_bytes{device="docker0"} 0
node_network_receive_bytes{device="enp0s3"} 36421
node_network_receive_bytes{device="enp0s8"} 80451
node_network_receive_bytes{device="lo"} 3.4463599e+07
# HELP node_network_receive_compressed Network device statistic receive_compressed.
# TYPE node_network_receive_compressed gauge
node_network_receive_compressed{device="docker0"} 0
node_network_receive_compressed{device="enp0s3"} 0
node_network_receive_compressed{device="enp0s8"} 0
node_network_receive_compressed{device="lo"} 0
# HELP node_network_receive_drop Network device statistic receive_drop.
# TYPE node_network_receive_drop gauge
node_network_receive_drop{device="docker0"} 0
node_network_receive_drop{device="enp0s3"} 0
node_network_receive_drop{device="enp0s8"} 0
node_network_receive_drop{device="lo"} 0
# HELP node_network_receive_errs Network device statistic receive_errs.
# TYPE node_network_receive_errs gauge
node_network_receive_errs{device="docker0"} 0
node_network_receive_errs{device="enp0s3"} 0
node_network_receive_errs{device="enp0s8"} 0
node_network_receive_errs{device="lo"} 0
# HELP node_network_receive_fifo Network device statistic receive_fifo.
# TYPE node_network_receive_fifo gauge
node_network_receive_fifo{device="docker0"} 0
node_network_receive_fifo{device="enp0s3"} 0
node_network_receive_fifo{device="enp0s8"} 0
node_network_receive_fifo{device="lo"} 0
# HELP node_network_receive_frame Network device statistic receive_frame.
# TYPE node_network_receive_frame gauge
node_network_receive_frame{device="docker0"} 0
node_network_receive_frame{device="enp0s3"} 0
node_network_receive_frame{device="enp0s8"} 0
node_network_receive_frame{device="lo"} 0
# HELP node_network_receive_multicast Network device statistic receive_multicast.
# TYPE node_network_receive_multicast gauge
node_network_receive_multicast{device="docker0"} 0
node_network_receive_multicast{device="enp0s3"} 0
node_network_receive_multicast{device="enp0s8"} 0
node_network_receive_multicast{device="lo"} 0
# HELP node_network_receive_packets Network device statistic receive_packets.
# TYPE node_network_receive_packets gauge
node_network_receive_packets{device="docker0"} 0
node_network_receive_packets{device="enp0s3"} 421
node_network_receive_packets{device="enp0s8"} 997
node_network_receive_packets{device="lo"} 138313
# HELP node_network_transmit_bytes Network device statistic transmit_bytes.
# TYPE node_network_transmit_bytes gauge
node_network_transmit_bytes{device="docker0"} 0
node_network_transmit_bytes{device="enp0s3"} 37074
node_network_transmit_bytes{device="enp0s8"} 330717
node_network_transmit_bytes{device="lo"} 3.4463599e+07
# HELP node_network_transmit_compressed Network device statistic transmit_compressed.
# TYPE node_network_transmit_compressed gauge
node_network_transmit_compressed{device="docker0"} 0
node_network_transmit_compressed{device="enp0s3"} 0
node_network_transmit_compressed{device="enp0s8"} 0
node_network_transmit_compressed{device="lo"} 0
# HELP node_network_transmit_drop Network device statistic transmit_drop.
# TYPE node_network_transmit_drop gauge
node_network_transmit_drop{device="docker0"} 0
node_network_transmit_drop{device="enp0s3"} 0
node_network_transmit_drop{device="enp0s8"} 0
node_network_transmit_drop{device="lo"} 0
# HELP node_network_transmit_errs Network device statistic transmit_errs.
# TYPE node_network_transmit_errs gauge
node_network_transmit_errs{device="docker0"} 0
node_network_transmit_errs{device="enp0s3"} 0
node_network_transmit_errs{device="enp0s8"} 0
node_network_transmit_errs{device="lo"} 0
# HELP node_network_transmit_fifo Network device statistic transmit_fifo.
# TYPE node_network_transmit_fifo gauge
node_network_transmit_fifo{device="docker0"} 0
node_network_transmit_fifo{device="enp0s3"} 0
node_network_transmit_fifo{device="enp0s8"} 0
node_network_transmit_fifo{device="lo"} 0
# HELP node_network_transmit_frame Network device statistic transmit_frame.
# TYPE node_network_transmit_frame gauge
node_network_transmit_frame{device="docker0"} 0
node_network_transmit_frame{device="enp0s3"} 0
node_network_transmit_frame{device="enp0s8"} 0
node_network_transmit_frame{device="lo"} 0
# HELP node_network_transmit_multicast Network device statistic transmit_multicast.
# TYPE node_network_transmit_multicast gauge
node_network_transmit_multicast{device="docker0"} 0
node_network_transmit_multicast{device="enp0s3"} 0
node_network_transmit_multicast{device="enp0s8"} 0
node_network_transmit_multicast{device="lo"} 0
# HELP node_network_transmit_packets Network device statistic transmit_packets.
# TYPE node_network_transmit_packets gauge
node_network_transmit_packets{device="docker0"} 0
node_network_transmit_packets{device="enp0s3"} 438
node_network_transmit_packets{device="enp0s8"} 627
node_network_transmit_packets{device="lo"} 138313
# HELP node_procs_blocked Number of processes blocked waiting for I/O to complete.
# TYPE node_procs_blocked gauge
node_procs_blocked 0
# HELP node_procs_running Number of processes in runnable state.
# TYPE node_procs_running gauge
node_procs_running 2
# HELP node_time System time in seconds since epoch (1970).
# TYPE node_time counter
node_time 1.48891092e+09
# HELP node_uname_info Labeled system information as provided by the uname system call.
# TYPE node_uname_info gauge
node_uname_info{domainname="(none)",machine="x86_64",nodename="centos7",release="3.10.0-514.10.2.el7.x86_64",sysname="Linux",version="#1 SMP Fri Mar 3 00:04:05 UTC 2017"} 1
# HELP node_vmstat_allocstall /proc/vmstat information field allocstall.
# TYPE node_vmstat_allocstall untyped
node_vmstat_allocstall 0
# HELP node_vmstat_balloon_deflate /proc/vmstat information field balloon_deflate.
# TYPE node_vmstat_balloon_deflate untyped
node_vmstat_balloon_deflate 0
# HELP node_vmstat_balloon_inflate /proc/vmstat information field balloon_inflate.
# TYPE node_vmstat_balloon_inflate untyped
node_vmstat_balloon_inflate 0
# HELP node_vmstat_balloon_migrate /proc/vmstat information field balloon_migrate.
# TYPE node_vmstat_balloon_migrate untyped
node_vmstat_balloon_migrate 0
# HELP node_vmstat_compact_fail /proc/vmstat information field compact_fail.
# TYPE node_vmstat_compact_fail untyped
node_vmstat_compact_fail 0
# HELP node_vmstat_compact_free_scanned /proc/vmstat information field compact_free_scanned.
# TYPE node_vmstat_compact_free_scanned untyped
node_vmstat_compact_free_scanned 0
# HELP node_vmstat_compact_isolated /proc/vmstat information field compact_isolated.
# TYPE node_vmstat_compact_isolated untyped
node_vmstat_compact_isolated 0
# HELP node_vmstat_compact_migrate_scanned /proc/vmstat information field compact_migrate_scanned.
# TYPE node_vmstat_compact_migrate_scanned untyped
node_vmstat_compact_migrate_scanned 0
# HELP node_vmstat_compact_stall /proc/vmstat information field compact_stall.
# TYPE node_vmstat_compact_stall untyped
node_vmstat_compact_stall 0
# HELP node_vmstat_compact_success /proc/vmstat information field compact_success.
# TYPE node_vmstat_compact_success untyped
node_vmstat_compact_success 0
# HELP node_vmstat_drop_pagecache /proc/vmstat information field drop_pagecache.
# TYPE node_vmstat_drop_pagecache untyped
node_vmstat_drop_pagecache 0
# HELP node_vmstat_drop_slab /proc/vmstat information field drop_slab.
# TYPE node_vmstat_drop_slab untyped
node_vmstat_drop_slab 0
# HELP node_vmstat_htlb_buddy_alloc_fail /proc/vmstat information field htlb_buddy_alloc_fail.
# TYPE node_vmstat_htlb_buddy_alloc_fail untyped
node_vmstat_htlb_buddy_alloc_fail 0
# HELP node_vmstat_htlb_buddy_alloc_success /proc/vmstat information field htlb_buddy_alloc_success.
# TYPE node_vmstat_htlb_buddy_alloc_success untyped
node_vmstat_htlb_buddy_alloc_success 0
# HELP node_vmstat_kswapd_high_wmark_hit_quickly /proc/vmstat information field kswapd_high_wmark_hit_quickly.
# TYPE node_vmstat_kswapd_high_wmark_hit_quickly untyped
node_vmstat_kswapd_high_wmark_hit_quickly 0
# HELP node_vmstat_kswapd_inodesteal /proc/vmstat information field kswapd_inodesteal.
# TYPE node_vmstat_kswapd_inodesteal untyped
node_vmstat_kswapd_inodesteal 0
# HELP node_vmstat_kswapd_low_wmark_hit_quickly /proc/vmstat information field kswapd_low_wmark_hit_quickly.
# TYPE node_vmstat_kswapd_low_wmark_hit_quickly untyped
node_vmstat_kswapd_low_wmark_hit_quickly 0
# HELP node_vmstat_nr_active_anon /proc/vmstat information field nr_active_anon.
# TYPE node_vmstat_nr_active_anon untyped
node_vmstat_nr_active_anon 88528
# HELP node_vmstat_nr_active_file /proc/vmstat information field nr_active_file.
# TYPE node_vmstat_nr_active_file untyped
node_vmstat_nr_active_file 28848
# HELP node_vmstat_nr_alloc_batch /proc/vmstat information field nr_alloc_batch.
# TYPE node_vmstat_nr_alloc_batch untyped
node_vmstat_nr_alloc_batch 2642
# HELP node_vmstat_nr_anon_pages /proc/vmstat information field nr_anon_pages.
# TYPE node_vmstat_nr_anon_pages untyped
node_vmstat_nr_anon_pages 85414
# HELP node_vmstat_nr_anon_transparent_hugepages /proc/vmstat information field nr_anon_transparent_hugepages.
# TYPE node_vmstat_nr_anon_transparent_hugepages untyped
node_vmstat_nr_anon_transparent_hugepages 6
# HELP node_vmstat_nr_bounce /proc/vmstat information field nr_bounce.
# TYPE node_vmstat_nr_bounce untyped
node_vmstat_nr_bounce 0
# HELP node_vmstat_nr_dirtied /proc/vmstat information field nr_dirtied.
# TYPE node_vmstat_nr_dirtied untyped
node_vmstat_nr_dirtied 29852
# HELP node_vmstat_nr_dirty /proc/vmstat information field nr_dirty.
# TYPE node_vmstat_nr_dirty untyped
node_vmstat_nr_dirty 17
# HELP node_vmstat_nr_dirty_background_threshold /proc/vmstat information field nr_dirty_background_threshold.
# TYPE node_vmstat_nr_dirty_background_threshold untyped
node_vmstat_nr_dirty_background_threshold 39900
# HELP node_vmstat_nr_dirty_threshold /proc/vmstat information field nr_dirty_threshold.
# TYPE node_vmstat_nr_dirty_threshold untyped
node_vmstat_nr_dirty_threshold 119701
# HELP node_vmstat_nr_file_pages /proc/vmstat information field nr_file_pages.
# TYPE node_vmstat_nr_file_pages untyped
node_vmstat_nr_file_pages 76687
# HELP node_vmstat_nr_free_cma /proc/vmstat information field nr_free_cma.
# TYPE node_vmstat_nr_free_cma untyped
node_vmstat_nr_free_cma 0
# HELP node_vmstat_nr_free_pages /proc/vmstat information field nr_free_pages.
# TYPE node_vmstat_nr_free_pages untyped
node_vmstat_nr_free_pages 769929
# HELP node_vmstat_nr_inactive_anon /proc/vmstat information field nr_inactive_anon.
# TYPE node_vmstat_nr_inactive_anon untyped
node_vmstat_nr_inactive_anon 2091
# HELP node_vmstat_nr_inactive_file /proc/vmstat information field nr_inactive_file.
# TYPE node_vmstat_nr_inactive_file untyped
node_vmstat_nr_inactive_file 45688
# HELP node_vmstat_nr_isolated_anon /proc/vmstat information field nr_isolated_anon.
# TYPE node_vmstat_nr_isolated_anon untyped
node_vmstat_nr_isolated_anon 0
# HELP node_vmstat_nr_isolated_file /proc/vmstat information field nr_isolated_file.
# TYPE node_vmstat_nr_isolated_file untyped
node_vmstat_nr_isolated_file 0
# HELP node_vmstat_nr_kernel_stack /proc/vmstat information field nr_kernel_stack.
# TYPE node_vmstat_nr_kernel_stack untyped
node_vmstat_nr_kernel_stack 707
# HELP node_vmstat_nr_mapped /proc/vmstat information field nr_mapped.
# TYPE node_vmstat_nr_mapped untyped
node_vmstat_nr_mapped 16487
# HELP node_vmstat_nr_mlock /proc/vmstat information field nr_mlock.
# TYPE node_vmstat_nr_mlock untyped
node_vmstat_nr_mlock 0
# HELP node_vmstat_nr_page_table_pages /proc/vmstat information field nr_page_table_pages.
# TYPE node_vmstat_nr_page_table_pages untyped
node_vmstat_nr_page_table_pages 2223
# HELP node_vmstat_nr_shmem /proc/vmstat information field nr_shmem.
# TYPE node_vmstat_nr_shmem untyped
node_vmstat_nr_shmem 2151
# HELP node_vmstat_nr_slab_reclaimable /proc/vmstat information field nr_slab_reclaimable.
# TYPE node_vmstat_nr_slab_reclaimable untyped
node_vmstat_nr_slab_reclaimable 10898
# HELP node_vmstat_nr_slab_unreclaimable /proc/vmstat information field nr_slab_unreclaimable.
# TYPE node_vmstat_nr_slab_unreclaimable untyped
node_vmstat_nr_slab_unreclaimable 7827
# HELP node_vmstat_nr_unevictable /proc/vmstat information field nr_unevictable.
# TYPE node_vmstat_nr_unevictable untyped
node_vmstat_nr_unevictable 0
# HELP node_vmstat_nr_unstable /proc/vmstat information field nr_unstable.
# TYPE node_vmstat_nr_unstable untyped
node_vmstat_nr_unstable 0
# HELP node_vmstat_nr_vmscan_immediate_reclaim /proc/vmstat information field nr_vmscan_immediate_reclaim.
# TYPE node_vmstat_nr_vmscan_immediate_reclaim untyped
node_vmstat_nr_vmscan_immediate_reclaim 0
# HELP node_vmstat_nr_vmscan_write /proc/vmstat information field nr_vmscan_write.
# TYPE node_vmstat_nr_vmscan_write untyped
node_vmstat_nr_vmscan_write 0
# HELP node_vmstat_nr_writeback /proc/vmstat information field nr_writeback.
# TYPE node_vmstat_nr_writeback untyped
node_vmstat_nr_writeback 0
# HELP node_vmstat_nr_writeback_temp /proc/vmstat information field nr_writeback_temp.
# TYPE node_vmstat_nr_writeback_temp untyped
node_vmstat_nr_writeback_temp 0
# HELP node_vmstat_nr_written /proc/vmstat information field nr_written.
# TYPE node_vmstat_nr_written untyped
node_vmstat_nr_written 20344
# HELP node_vmstat_numa_foreign /proc/vmstat information field numa_foreign.
# TYPE node_vmstat_numa_foreign untyped
node_vmstat_numa_foreign 0
# HELP node_vmstat_numa_hint_faults /proc/vmstat information field numa_hint_faults.
# TYPE node_vmstat_numa_hint_faults untyped
node_vmstat_numa_hint_faults 0
# HELP node_vmstat_numa_hint_faults_local /proc/vmstat information field numa_hint_faults_local.
# TYPE node_vmstat_numa_hint_faults_local untyped
node_vmstat_numa_hint_faults_local 0
# HELP node_vmstat_numa_hit /proc/vmstat information field numa_hit.
# TYPE node_vmstat_numa_hit untyped
node_vmstat_numa_hit 824961
# HELP node_vmstat_numa_huge_pte_updates /proc/vmstat information field numa_huge_pte_updates.
# TYPE node_vmstat_numa_huge_pte_updates untyped
node_vmstat_numa_huge_pte_updates 0
# HELP node_vmstat_numa_interleave /proc/vmstat information field numa_interleave.
# TYPE node_vmstat_numa_interleave untyped
node_vmstat_numa_interleave 14563
# HELP node_vmstat_numa_local /proc/vmstat information field numa_local.
# TYPE node_vmstat_numa_local untyped
node_vmstat_numa_local 824961
# HELP node_vmstat_numa_miss /proc/vmstat information field numa_miss.
# TYPE node_vmstat_numa_miss untyped
node_vmstat_numa_miss 0
# HELP node_vmstat_numa_other /proc/vmstat information field numa_other.
# TYPE node_vmstat_numa_other untyped
node_vmstat_numa_other 0
# HELP node_vmstat_numa_pages_migrated /proc/vmstat information field numa_pages_migrated.
# TYPE node_vmstat_numa_pages_migrated untyped
node_vmstat_numa_pages_migrated 0
# HELP node_vmstat_numa_pte_updates /proc/vmstat information field numa_pte_updates.
# TYPE node_vmstat_numa_pte_updates untyped
node_vmstat_numa_pte_updates 0
# HELP node_vmstat_pageoutrun /proc/vmstat information field pageoutrun.
# TYPE node_vmstat_pageoutrun untyped
node_vmstat_pageoutrun 1
# HELP node_vmstat_pgactivate /proc/vmstat information field pgactivate.
# TYPE node_vmstat_pgactivate untyped
node_vmstat_pgactivate 30452
# HELP node_vmstat_pgalloc_dma /proc/vmstat information field pgalloc_dma.
# TYPE node_vmstat_pgalloc_dma untyped
node_vmstat_pgalloc_dma 487
# HELP node_vmstat_pgalloc_dma32 /proc/vmstat information field pgalloc_dma32.
# TYPE node_vmstat_pgalloc_dma32 untyped
node_vmstat_pgalloc_dma32 750526
# HELP node_vmstat_pgalloc_movable /proc/vmstat information field pgalloc_movable.
# TYPE node_vmstat_pgalloc_movable untyped
node_vmstat_pgalloc_movable 0
# HELP node_vmstat_pgalloc_normal /proc/vmstat information field pgalloc_normal.
# TYPE node_vmstat_pgalloc_normal untyped
node_vmstat_pgalloc_normal 124283
# HELP node_vmstat_pgdeactivate /proc/vmstat information field pgdeactivate.
# TYPE node_vmstat_pgdeactivate untyped
node_vmstat_pgdeactivate 0
# HELP node_vmstat_pgfault /proc/vmstat information field pgfault.
# TYPE node_vmstat_pgfault untyped
node_vmstat_pgfault 1.37842e+06
# HELP node_vmstat_pgfree /proc/vmstat information field pgfree.
# TYPE node_vmstat_pgfree untyped
node_vmstat_pgfree 1.645789e+06
# HELP node_vmstat_pginodesteal /proc/vmstat information field pginodesteal.
# TYPE node_vmstat_pginodesteal untyped
node_vmstat_pginodesteal 0
# HELP node_vmstat_pgmajfault /proc/vmstat information field pgmajfault.
# TYPE node_vmstat_pgmajfault untyped
node_vmstat_pgmajfault 4286
# HELP node_vmstat_pgmigrate_fail /proc/vmstat information field pgmigrate_fail.
# TYPE node_vmstat_pgmigrate_fail untyped
node_vmstat_pgmigrate_fail 0
# HELP node_vmstat_pgmigrate_success /proc/vmstat information field pgmigrate_success.
# TYPE node_vmstat_pgmigrate_success untyped
node_vmstat_pgmigrate_success 0
# HELP node_vmstat_pgpgin /proc/vmstat information field pgpgin.
# TYPE node_vmstat_pgpgin untyped
node_vmstat_pgpgin 289776
# HELP node_vmstat_pgpgout /proc/vmstat information field pgpgout.
# TYPE node_vmstat_pgpgout untyped
node_vmstat_pgpgout 139684
# HELP node_vmstat_pgrefill_dma /proc/vmstat information field pgrefill_dma.
# TYPE node_vmstat_pgrefill_dma untyped
node_vmstat_pgrefill_dma 0
# HELP node_vmstat_pgrefill_dma32 /proc/vmstat information field pgrefill_dma32.
# TYPE node_vmstat_pgrefill_dma32 untyped
node_vmstat_pgrefill_dma32 0
# HELP node_vmstat_pgrefill_movable /proc/vmstat information field pgrefill_movable.
# TYPE node_vmstat_pgrefill_movable untyped
node_vmstat_pgrefill_movable 0
# HELP node_vmstat_pgrefill_normal /proc/vmstat information field pgrefill_normal.
# TYPE node_vmstat_pgrefill_normal untyped
node_vmstat_pgrefill_normal 0
# HELP node_vmstat_pgrotated /proc/vmstat information field pgrotated.
# TYPE node_vmstat_pgrotated untyped
node_vmstat_pgrotated 0
# HELP node_vmstat_pgscan_direct_dma /proc/vmstat information field pgscan_direct_dma.
# TYPE node_vmstat_pgscan_direct_dma untyped
node_vmstat_pgscan_direct_dma 0
# HELP node_vmstat_pgscan_direct_dma32 /proc/vmstat information field pgscan_direct_dma32.
# TYPE node_vmstat_pgscan_direct_dma32 untyped
node_vmstat_pgscan_direct_dma32 0
# HELP node_vmstat_pgscan_direct_movable /proc/vmstat information field pgscan_direct_movable.
# TYPE node_vmstat_pgscan_direct_movable untyped
node_vmstat_pgscan_direct_movable 0
# HELP node_vmstat_pgscan_direct_normal /proc/vmstat information field pgscan_direct_normal.
# TYPE node_vmstat_pgscan_direct_normal untyped
node_vmstat_pgscan_direct_normal 0
# HELP node_vmstat_pgscan_direct_throttle /proc/vmstat information field pgscan_direct_throttle.
# TYPE node_vmstat_pgscan_direct_throttle untyped
node_vmstat_pgscan_direct_throttle 0
# HELP node_vmstat_pgscan_kswapd_dma /proc/vmstat information field pgscan_kswapd_dma.
# TYPE node_vmstat_pgscan_kswapd_dma untyped
node_vmstat_pgscan_kswapd_dma 0
# HELP node_vmstat_pgscan_kswapd_dma32 /proc/vmstat information field pgscan_kswapd_dma32.
# TYPE node_vmstat_pgscan_kswapd_dma32 untyped
node_vmstat_pgscan_kswapd_dma32 0
# HELP node_vmstat_pgscan_kswapd_movable /proc/vmstat information field pgscan_kswapd_movable.
# TYPE node_vmstat_pgscan_kswapd_movable untyped
node_vmstat_pgscan_kswapd_movable 0
# HELP node_vmstat_pgscan_kswapd_normal /proc/vmstat information field pgscan_kswapd_normal.
# TYPE node_vmstat_pgscan_kswapd_normal untyped
node_vmstat_pgscan_kswapd_normal 0
# HELP node_vmstat_pgsteal_direct_dma /proc/vmstat information field pgsteal_direct_dma.
# TYPE node_vmstat_pgsteal_direct_dma untyped
node_vmstat_pgsteal_direct_dma 0
# HELP node_vmstat_pgsteal_direct_dma32 /proc/vmstat information field pgsteal_direct_dma32.
# TYPE node_vmstat_pgsteal_direct_dma32 untyped
node_vmstat_pgsteal_direct_dma32 0
# HELP node_vmstat_pgsteal_direct_movable /proc/vmstat information field pgsteal_direct_movable.
# TYPE node_vmstat_pgsteal_direct_movable untyped
node_vmstat_pgsteal_direct_movable 0
# HELP node_vmstat_pgsteal_direct_normal /proc/vmstat information field pgsteal_direct_normal.
# TYPE node_vmstat_pgsteal_direct_normal untyped
node_vmstat_pgsteal_direct_normal 0
# HELP node_vmstat_pgsteal_kswapd_dma /proc/vmstat information field pgsteal_kswapd_dma.
# TYPE node_vmstat_pgsteal_kswapd_dma untyped
node_vmstat_pgsteal_kswapd_dma 0
# HELP node_vmstat_pgsteal_kswapd_dma32 /proc/vmstat information field pgsteal_kswapd_dma32.
# TYPE node_vmstat_pgsteal_kswapd_dma32 untyped
node_vmstat_pgsteal_kswapd_dma32 0
# HELP node_vmstat_pgsteal_kswapd_movable /proc/vmstat information field pgsteal_kswapd_movable.
# TYPE node_vmstat_pgsteal_kswapd_movable untyped
node_vmstat_pgsteal_kswapd_movable 0
# HELP node_vmstat_pgsteal_kswapd_normal /proc/vmstat information field pgsteal_kswapd_normal.
# TYPE node_vmstat_pgsteal_kswapd_normal untyped
node_vmstat_pgsteal_kswapd_normal 0
# HELP node_vmstat_pswpin /proc/vmstat information field pswpin.
# TYPE node_vmstat_pswpin untyped
node_vmstat_pswpin 0
# HELP node_vmstat_pswpout /proc/vmstat information field pswpout.
# TYPE node_vmstat_pswpout untyped
node_vmstat_pswpout 0
# HELP node_vmstat_slabs_scanned /proc/vmstat information field slabs_scanned.
# TYPE node_vmstat_slabs_scanned untyped
node_vmstat_slabs_scanned 0
# HELP node_vmstat_thp_collapse_alloc /proc/vmstat information field thp_collapse_alloc.
# TYPE node_vmstat_thp_collapse_alloc untyped
node_vmstat_thp_collapse_alloc 0
# HELP node_vmstat_thp_collapse_alloc_failed /proc/vmstat information field thp_collapse_alloc_failed.
# TYPE node_vmstat_thp_collapse_alloc_failed untyped
node_vmstat_thp_collapse_alloc_failed 0
# HELP node_vmstat_thp_fault_alloc /proc/vmstat information field thp_fault_alloc.
# TYPE node_vmstat_thp_fault_alloc untyped
node_vmstat_thp_fault_alloc 53
# HELP node_vmstat_thp_fault_fallback /proc/vmstat information field thp_fault_fallback.
# TYPE node_vmstat_thp_fault_fallback untyped
node_vmstat_thp_fault_fallback 0
# HELP node_vmstat_thp_split /proc/vmstat information field thp_split.
# TYPE node_vmstat_thp_split untyped
node_vmstat_thp_split 1
# HELP node_vmstat_thp_zero_page_alloc /proc/vmstat information field thp_zero_page_alloc.
# TYPE node_vmstat_thp_zero_page_alloc untyped
node_vmstat_thp_zero_page_alloc 0
# HELP node_vmstat_thp_zero_page_alloc_failed /proc/vmstat information field thp_zero_page_alloc_failed.
# TYPE node_vmstat_thp_zero_page_alloc_failed untyped
node_vmstat_thp_zero_page_alloc_failed 0
# HELP node_vmstat_unevictable_pgs_cleared /proc/vmstat information field unevictable_pgs_cleared.
# TYPE node_vmstat_unevictable_pgs_cleared untyped
node_vmstat_unevictable_pgs_cleared 0
# HELP node_vmstat_unevictable_pgs_culled /proc/vmstat information field unevictable_pgs_culled.
# TYPE node_vmstat_unevictable_pgs_culled untyped
node_vmstat_unevictable_pgs_culled 8348
# HELP node_vmstat_unevictable_pgs_mlocked /proc/vmstat information field unevictable_pgs_mlocked.
# TYPE node_vmstat_unevictable_pgs_mlocked untyped
node_vmstat_unevictable_pgs_mlocked 11226
# HELP node_vmstat_unevictable_pgs_munlocked /proc/vmstat information field unevictable_pgs_munlocked.
# TYPE node_vmstat_unevictable_pgs_munlocked untyped
node_vmstat_unevictable_pgs_munlocked 9966
# HELP node_vmstat_unevictable_pgs_rescued /proc/vmstat information field unevictable_pgs_rescued.
# TYPE node_vmstat_unevictable_pgs_rescued untyped
node_vmstat_unevictable_pgs_rescued 7446
# HELP node_vmstat_unevictable_pgs_scanned /proc/vmstat information field unevictable_pgs_scanned.
# TYPE node_vmstat_unevictable_pgs_scanned untyped
node_vmstat_unevictable_pgs_scanned 0
# HELP node_vmstat_unevictable_pgs_stranded /proc/vmstat information field unevictable_pgs_stranded.
# TYPE node_vmstat_unevictable_pgs_stranded untyped
node_vmstat_unevictable_pgs_stranded 0
# HELP node_vmstat_workingset_activate /proc/vmstat information field workingset_activate.
# TYPE node_vmstat_workingset_activate untyped
node_vmstat_workingset_activate 0
# HELP node_vmstat_workingset_nodereclaim /proc/vmstat information field workingset_nodereclaim.
# TYPE node_vmstat_workingset_nodereclaim untyped
node_vmstat_workingset_nodereclaim 0
# HELP node_vmstat_workingset_refault /proc/vmstat information field workingset_refault.
# TYPE node_vmstat_workingset_refault untyped
node_vmstat_workingset_refault 0
# HELP node_vmstat_zone_reclaim_failed /proc/vmstat information field zone_reclaim_failed.
# TYPE node_vmstat_zone_reclaim_failed untyped
node_vmstat_zone_reclaim_failed 0

percona/mongodb_exporter (mongodb:metrics)

Behind the scenes of the “mongodb:metrics” Percona Monitoring and Management service, percona/mongodb_exporter is the Prometheus exporter that provides detailed database metrics for graphing on the Percona Monitoring and Management server. percona/mongodb_exporter is a Percona fork of the dcu/mongodb_exporter project, with valuable additional metrics for MongoDB sharding, storage engines and more.

The mongodb_exporter process is designed to automatically detect the MongoDB node type, storage engine, etc., without any special configuration aside from the connection string.
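
As a rough illustration of that idea (a sketch only, not the exporter’s actual code), the snippet below reads the storage engine name from db.serverStatus() using the mgo driver; as with the exporter, a connection string is the only input it needs:

package main

import (
    "fmt"
    "log"

    "gopkg.in/mgo.v2"
    "gopkg.in/mgo.v2/bson"
)

// storageEngineName returns the engine reported by serverStatus
// (e.g. "wiredTiger", "rocksdb" or "mmapv1"). Illustration only; the
// exporter's real detection code is more involved.
func storageEngineName(session *mgo.Session) (string, error) {
    var status struct {
        StorageEngine struct {
            Name string `bson:"name"`
        } `bson:"storageEngine"`
    }
    if err := session.DB("admin").Run(bson.D{{Name: "serverStatus", Value: 1}}, &status); err != nil {
        return "", err
    }
    return status.StorageEngine.Name, nil
}

func main() {
    // The connection string is the only configuration needed.
    session, err := mgo.Dial("mongodb://localhost:27017")
    if err != nil {
        log.Fatal(err)
    }
    defer session.Close()

    engine, err := storageEngineName(session)
    if err != nil {
        log.Fatal(err)
    }
    fmt.Println("detected storage engine:", engine)
}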

As of Percona Monitoring and Management 1.1.1, here is the number of metrics collected from a single replication-enabled MongoDB instance in a single ‘pull’ from Prometheus:

  • Percona Server for MongoDB 3.2 w/WiredTiger: 173 metrics per “pull”
  • Percona Server for MongoDB 3.2 w/RocksDB (with 3 x levels): 239 metrics per “pull”
  • Percona Server for MongoDB 3.2 w/MMAPv1: 172 metrics per “pull”

Note: each additional replica set member adds seven metrics to the list.

On the sharding side of things, a “mongos” process within a cluster with one shard reports 58 metrics, with one extra metric added for each additional cluster shard and 3-4 extra metrics added for each additional “mongos” instance.

Below is a full example of a single “pull” of metrics from one RocksDB instance. Prometheus exporters provide metrics at the HTTP URL “/metrics”; this is the exact payload Prometheus polls from the exporter:

$ curl -sk https://192.168.99.10:42003/metrics | grep mongodb
# HELP mongodb_mongod_asserts_total The asserts document reports the number of asserts on the database. While assert errors are typically uncommon, if there are non-zero values for the asserts, you should check the log file for the mongod process for more information. In many cases these errors are trivial, but are worth investigating.
# TYPE mongodb_mongod_asserts_total counter
mongodb_mongod_asserts_total{type="msg"} 0
mongodb_mongod_asserts_total{type="regular"} 0
mongodb_mongod_asserts_total{type="rollovers"} 0
mongodb_mongod_asserts_total{type="user"} 1
mongodb_mongod_asserts_total{type="warning"} 0
# HELP mongodb_mongod_connections The connections sub document data regarding the current status of incoming connections and availability of the database server. Use these values to assess the current load and capacity requirements of the server
# TYPE mongodb_mongod_connections gauge
mongodb_mongod_connections{state="available"} 811
mongodb_mongod_connections{state="current"} 8
# HELP mongodb_mongod_connections_metrics_created_total totalCreated provides a count of all incoming connections created to the server. This number includes connections that have since closed
# TYPE mongodb_mongod_connections_metrics_created_total counter
mongodb_mongod_connections_metrics_created_total 1977
# HELP mongodb_mongod_extra_info_heap_usage_bytes The heap_usage_bytes field is only available on Unix/Linux systems, and reports the total size in bytes of heap space used by the database process
# TYPE mongodb_mongod_extra_info_heap_usage_bytes gauge
mongodb_mongod_extra_info_heap_usage_bytes 7.77654e+07
# HELP mongodb_mongod_extra_info_page_faults_total The page_faults Reports the total number of page faults that require disk operations. Page faults refer to operations that require the database server to access data which isn’t available in active memory. The page_faults counter may increase dramatically during moments of poor performance and may correlate with limited memory environments and larger data sets. Limited and sporadic page faults do not necessarily indicate an issue
# TYPE mongodb_mongod_extra_info_page_faults_total gauge
mongodb_mongod_extra_info_page_faults_total 47
# HELP mongodb_mongod_global_lock_client The activeClients data structure provides more granular information about the number of connected clients and the operation types (e.g. read or write) performed by these clients
# TYPE mongodb_mongod_global_lock_client gauge
mongodb_mongod_global_lock_client{type="reader"} 0
mongodb_mongod_global_lock_client{type="writer"} 0
# HELP mongodb_mongod_global_lock_current_queue The currentQueue data structure value provides more granular information concerning the number of operations queued because of a lock
# TYPE mongodb_mongod_global_lock_current_queue gauge
mongodb_mongod_global_lock_current_queue{type="reader"} 0
mongodb_mongod_global_lock_current_queue{type="writer"} 0
# HELP mongodb_mongod_global_lock_ratio The value of ratio displays the relationship between lockTime and totalTime. Low values indicate that operations have held the globalLock frequently for shorter periods of time. High values indicate that operations have held globalLock infrequently for longer periods of time
# TYPE mongodb_mongod_global_lock_ratio gauge
mongodb_mongod_global_lock_ratio 0
# HELP mongodb_mongod_global_lock_total The value of totalTime represents the time, in microseconds, since the database last started and creation of the globalLock. This is roughly equivalent to total server uptime
# TYPE mongodb_mongod_global_lock_total counter
mongodb_mongod_global_lock_total 0
# HELP mongodb_mongod_instance_local_time The localTime value is the current time, according to the server, in UTC specified in an ISODate format.
# TYPE mongodb_mongod_instance_local_time counter
mongodb_mongod_instance_local_time 1.488834066e+09
# HELP mongodb_mongod_instance_uptime_estimate_seconds uptimeEstimate provides the uptime as calculated from MongoDB's internal course-grained time keeping system.
# TYPE mongodb_mongod_instance_uptime_estimate_seconds counter
mongodb_mongod_instance_uptime_estimate_seconds 15881
# HELP mongodb_mongod_instance_uptime_seconds The value of the uptime field corresponds to the number of seconds that the mongos or mongod process has been active.
# TYPE mongodb_mongod_instance_uptime_seconds counter
mongodb_mongod_instance_uptime_seconds 15881
# HELP mongodb_mongod_locks_time_acquiring_global_microseconds_total amount of time in microseconds that any database has spent waiting for the global lock
# TYPE mongodb_mongod_locks_time_acquiring_global_microseconds_total counter
mongodb_mongod_locks_time_acquiring_global_microseconds_total{database="Collection",type="read"} 0
mongodb_mongod_locks_time_acquiring_global_microseconds_total{database="Collection",type="write"} 0
mongodb_mongod_locks_time_acquiring_global_microseconds_total{database="Database",type="read"} 0
mongodb_mongod_locks_time_acquiring_global_microseconds_total{database="Database",type="write"} 0
mongodb_mongod_locks_time_acquiring_global_microseconds_total{database="Global",type="read"} 0
mongodb_mongod_locks_time_acquiring_global_microseconds_total{database="Global",type="write"} 5005
mongodb_mongod_locks_time_acquiring_global_microseconds_total{database="Metadata",type="read"} 0
mongodb_mongod_locks_time_acquiring_global_microseconds_total{database="Metadata",type="write"} 0
mongodb_mongod_locks_time_acquiring_global_microseconds_total{database="oplog",type="read"} 0
mongodb_mongod_locks_time_acquiring_global_microseconds_total{database="oplog",type="write"} 0
# HELP mongodb_mongod_locks_time_locked_global_microseconds_total amount of time in microseconds that any database has held the global lock
# TYPE mongodb_mongod_locks_time_locked_global_microseconds_total counter
mongodb_mongod_locks_time_locked_global_microseconds_total{database="Collection",type="read"} 0
mongodb_mongod_locks_time_locked_global_microseconds_total{database="Collection",type="write"} 0
mongodb_mongod_locks_time_locked_global_microseconds_total{database="Database",type="read"} 0
mongodb_mongod_locks_time_locked_global_microseconds_total{database="Database",type="write"} 0
mongodb_mongod_locks_time_locked_global_microseconds_total{database="Global",type="read"} 0
mongodb_mongod_locks_time_locked_global_microseconds_total{database="Global",type="write"} 0
mongodb_mongod_locks_time_locked_global_microseconds_total{database="Metadata",type="read"} 0
mongodb_mongod_locks_time_locked_global_microseconds_total{database="Metadata",type="write"} 0
mongodb_mongod_locks_time_locked_global_microseconds_total{database="oplog",type="read"} 0
mongodb_mongod_locks_time_locked_global_microseconds_total{database="oplog",type="write"} 0
# HELP mongodb_mongod_locks_time_locked_local_microseconds_total amount of time in microseconds that any database has held the local lock
# TYPE mongodb_mongod_locks_time_locked_local_microseconds_total counter
mongodb_mongod_locks_time_locked_local_microseconds_total{database="Collection",type="read"} 0
mongodb_mongod_locks_time_locked_local_microseconds_total{database="Collection",type="write"} 0
mongodb_mongod_locks_time_locked_local_microseconds_total{database="Database",type="read"} 0
mongodb_mongod_locks_time_locked_local_microseconds_total{database="Database",type="write"} 0
mongodb_mongod_locks_time_locked_local_microseconds_total{database="Global",type="read"} 0
mongodb_mongod_locks_time_locked_local_microseconds_total{database="Global",type="write"} 0
mongodb_mongod_locks_time_locked_local_microseconds_total{database="Metadata",type="read"} 0
mongodb_mongod_locks_time_locked_local_microseconds_total{database="Metadata",type="write"} 0
mongodb_mongod_locks_time_locked_local_microseconds_total{database="oplog",type="read"} 0
mongodb_mongod_locks_time_locked_local_microseconds_total{database="oplog",type="write"} 0
# HELP mongodb_mongod_memory The mem data structure holds information regarding the target system architecture of mongod and current memory use
# TYPE mongodb_mongod_memory gauge
mongodb_mongod_memory{type="mapped"} 0
mongodb_mongod_memory{type="mapped_with_journal"} 0
mongodb_mongod_memory{type="resident"} 205
mongodb_mongod_memory{type="virtual"} 1064
# HELP mongodb_mongod_metrics_cursor_open The open is an embedded document that contains data regarding open cursors
# TYPE mongodb_mongod_metrics_cursor_open gauge
mongodb_mongod_metrics_cursor_open{state="noTimeout"} 0
mongodb_mongod_metrics_cursor_open{state="pinned"} 1
mongodb_mongod_metrics_cursor_open{state="total"} 1
# HELP mongodb_mongod_metrics_cursor_timed_out_total timedOut provides the total number of cursors that have timed out since the server process started. If this number is large or growing at a regular rate, this may indicate an application error
# TYPE mongodb_mongod_metrics_cursor_timed_out_total counter
mongodb_mongod_metrics_cursor_timed_out_total 0
# HELP mongodb_mongod_metrics_document_total The document holds a document of that reflect document access and modification patterns and data use. Compare these values to the data in the opcounters document, which track total number of operations
# TYPE mongodb_mongod_metrics_document_total counter
mongodb_mongod_metrics_document_total{state="deleted"} 0
mongodb_mongod_metrics_document_total{state="inserted"} 140396
mongodb_mongod_metrics_document_total{state="returned"} 74105
mongodb_mongod_metrics_document_total{state="updated"} 0
# HELP mongodb_mongod_metrics_get_last_error_wtime_num_total num reports the total number of getLastError operations with a specified write concern (i.e. w) that wait for one or more members of a replica set to acknowledge the write operation (i.e. a w value greater than 1.)
# TYPE mongodb_mongod_metrics_get_last_error_wtime_num_total gauge
mongodb_mongod_metrics_get_last_error_wtime_num_total 0
# HELP mongodb_mongod_metrics_get_last_error_wtime_total_milliseconds total_millis reports the total amount of time in milliseconds that the mongod has spent performing getLastError operations with write concern (i.e. w) that wait for one or more members of a replica set to acknowledge the write operation (i.e. a w value greater than 1.)
# TYPE mongodb_mongod_metrics_get_last_error_wtime_total_milliseconds counter
mongodb_mongod_metrics_get_last_error_wtime_total_milliseconds 0
# HELP mongodb_mongod_metrics_get_last_error_wtimeouts_total wtimeouts reports the number of times that write concern operations have timed out as a result of the wtimeout threshold to getLastError.
# TYPE mongodb_mongod_metrics_get_last_error_wtimeouts_total counter
mongodb_mongod_metrics_get_last_error_wtimeouts_total 0
# HELP mongodb_mongod_metrics_operation_total operation is a sub-document that holds counters for several types of update and query operations that MongoDB handles using special operation types
# TYPE mongodb_mongod_metrics_operation_total counter
mongodb_mongod_metrics_operation_total{type="fastmod"} 0
mongodb_mongod_metrics_operation_total{type="idhack"} 0
mongodb_mongod_metrics_operation_total{type="scan_and_order"} 0
# HELP mongodb_mongod_metrics_query_executor_total queryExecutor is a document that reports data from the query execution system
# TYPE mongodb_mongod_metrics_query_executor_total counter
mongodb_mongod_metrics_query_executor_total{state="scanned"} 0
mongodb_mongod_metrics_query_executor_total{state="scanned_objects"} 74105
# HELP mongodb_mongod_metrics_record_moves_total moves reports the total number of times documents move within the on-disk representation of the MongoDB data set. Documents move as a result of operations that increase the size of the document beyond their allocated record size
# TYPE mongodb_mongod_metrics_record_moves_total counter
mongodb_mongod_metrics_record_moves_total 0
# HELP mongodb_mongod_metrics_repl_apply_batches_num_total num reports the total number of batches applied across all databases
# TYPE mongodb_mongod_metrics_repl_apply_batches_num_total counter
mongodb_mongod_metrics_repl_apply_batches_num_total 0
# HELP mongodb_mongod_metrics_repl_apply_batches_total_milliseconds total_millis reports the total amount of time the mongod has spent applying operations from the oplog
# TYPE mongodb_mongod_metrics_repl_apply_batches_total_milliseconds counter
mongodb_mongod_metrics_repl_apply_batches_total_milliseconds 0
# HELP mongodb_mongod_metrics_repl_apply_ops_total ops reports the total number of oplog operations applied
# TYPE mongodb_mongod_metrics_repl_apply_ops_total counter
mongodb_mongod_metrics_repl_apply_ops_total 0
# HELP mongodb_mongod_metrics_repl_buffer_count count reports the current number of operations in the oplog buffer
# TYPE mongodb_mongod_metrics_repl_buffer_count gauge
mongodb_mongod_metrics_repl_buffer_count 0
# HELP mongodb_mongod_metrics_repl_buffer_max_size_bytes maxSizeBytes reports the maximum size of the buffer. This value is a constant setting in the mongod, and is not configurable
# TYPE mongodb_mongod_metrics_repl_buffer_max_size_bytes counter
mongodb_mongod_metrics_repl_buffer_max_size_bytes 2.68435456e+08
# HELP mongodb_mongod_metrics_repl_buffer_size_bytes sizeBytes reports the current size of the contents of the oplog buffer
# TYPE mongodb_mongod_metrics_repl_buffer_size_bytes gauge
mongodb_mongod_metrics_repl_buffer_size_bytes 0
# HELP mongodb_mongod_metrics_repl_network_bytes_total bytes reports the total amount of data read from the replication sync source
# TYPE mongodb_mongod_metrics_repl_network_bytes_total counter
mongodb_mongod_metrics_repl_network_bytes_total 0
# HELP mongodb_mongod_metrics_repl_network_getmores_num_total num reports the total number of getmore operations, which are operations that request an additional set of operations from the replication sync source.
# TYPE mongodb_mongod_metrics_repl_network_getmores_num_total counter
mongodb_mongod_metrics_repl_network_getmores_num_total 0
# HELP mongodb_mongod_metrics_repl_network_getmores_total_milliseconds total_millis reports the total amount of time required to collect data from getmore operations
# TYPE mongodb_mongod_metrics_repl_network_getmores_total_milliseconds counter
mongodb_mongod_metrics_repl_network_getmores_total_milliseconds 0
# HELP mongodb_mongod_metrics_repl_network_ops_total ops reports the total number of operations read from the replication source.
# TYPE mongodb_mongod_metrics_repl_network_ops_total counter
mongodb_mongod_metrics_repl_network_ops_total 0
# HELP mongodb_mongod_metrics_repl_network_readers_created_total readersCreated reports the total number of oplog query processes created. MongoDB will create a new oplog query any time an error occurs in the connection, including a timeout, or a network operation. Furthermore, readersCreated will increment every time MongoDB selects a new source fore replication.
# TYPE mongodb_mongod_metrics_repl_network_readers_created_total counter
mongodb_mongod_metrics_repl_network_readers_created_total 1
# HELP mongodb_mongod_metrics_repl_oplog_insert_bytes_total insertBytes the total size of documents inserted into the oplog.
# TYPE mongodb_mongod_metrics_repl_oplog_insert_bytes_total counter
mongodb_mongod_metrics_repl_oplog_insert_bytes_total 0
# HELP mongodb_mongod_metrics_repl_oplog_insert_num_total num reports the total number of items inserted into the oplog.
# TYPE mongodb_mongod_metrics_repl_oplog_insert_num_total counter
mongodb_mongod_metrics_repl_oplog_insert_num_total 0
# HELP mongodb_mongod_metrics_repl_oplog_insert_total_milliseconds total_millis reports the total amount of time spent for the mongod to insert data into the oplog.
# TYPE mongodb_mongod_metrics_repl_oplog_insert_total_milliseconds counter
mongodb_mongod_metrics_repl_oplog_insert_total_milliseconds 0
# HELP mongodb_mongod_metrics_repl_preload_docs_num_total num reports the total number of documents loaded during the pre-fetch stage of replication
# TYPE mongodb_mongod_metrics_repl_preload_docs_num_total counter
mongodb_mongod_metrics_repl_preload_docs_num_total 0
# HELP mongodb_mongod_metrics_repl_preload_docs_total_milliseconds total_millis reports the total amount of time spent loading documents as part of the pre-fetch stage of replication
# TYPE mongodb_mongod_metrics_repl_preload_docs_total_milliseconds counter
mongodb_mongod_metrics_repl_preload_docs_total_milliseconds 0
# HELP mongodb_mongod_metrics_repl_preload_indexes_num_total num reports the total number of index entries loaded by members before updating documents as part of the pre-fetch stage of replication
# TYPE mongodb_mongod_metrics_repl_preload_indexes_num_total counter
mongodb_mongod_metrics_repl_preload_indexes_num_total 0
# HELP mongodb_mongod_metrics_repl_preload_indexes_total_milliseconds total_millis reports the total amount of time spent loading index entries as part of the pre-fetch stage of replication
# TYPE mongodb_mongod_metrics_repl_preload_indexes_total_milliseconds counter
mongodb_mongod_metrics_repl_preload_indexes_total_milliseconds 0
# HELP mongodb_mongod_metrics_storage_freelist_search_total metrics about searching records in the database.
# TYPE mongodb_mongod_metrics_storage_freelist_search_total counter
mongodb_mongod_metrics_storage_freelist_search_total{type="bucket_exhausted"} 0
mongodb_mongod_metrics_storage_freelist_search_total{type="requests"} 0
mongodb_mongod_metrics_storage_freelist_search_total{type="scanned"} 0
# HELP mongodb_mongod_metrics_ttl_deleted_documents_total deletedDocuments reports the total number of documents deleted from collections with a ttl index.
# TYPE mongodb_mongod_metrics_ttl_deleted_documents_total counter
mongodb_mongod_metrics_ttl_deleted_documents_total 0
# HELP mongodb_mongod_metrics_ttl_passes_total passes reports the number of times the background process removes documents from collections with a ttl index
# TYPE mongodb_mongod_metrics_ttl_passes_total counter
mongodb_mongod_metrics_ttl_passes_total 0
# HELP mongodb_mongod_network_bytes_total The network data structure contains data regarding MongoDB’s network use
# TYPE mongodb_mongod_network_bytes_total counter
mongodb_mongod_network_bytes_total{state="in_bytes"} 1.323188842e+09
mongodb_mongod_network_bytes_total{state="out_bytes"} 1.401963024e+09
# HELP mongodb_mongod_network_metrics_num_requests_total The numRequests field is a counter of the total number of distinct requests that the server has received. Use this value to provide context for the bytesIn and bytesOut values to ensure that MongoDB’s network utilization is consistent with expectations and application use
# TYPE mongodb_mongod_network_metrics_num_requests_total counter
mongodb_mongod_network_metrics_num_requests_total 313023
# HELP mongodb_mongod_op_counters_repl_total The opcountersRepl data structure, similar to the opcounters data structure, provides an overview of database replication operations by type and makes it possible to analyze the load on the replica in more granular manner. These values only appear when the current host has replication enabled
# TYPE mongodb_mongod_op_counters_repl_total counter
mongodb_mongod_op_counters_repl_total{type="command"} 0
mongodb_mongod_op_counters_repl_total{type="delete"} 0
mongodb_mongod_op_counters_repl_total{type="getmore"} 0
mongodb_mongod_op_counters_repl_total{type="insert"} 0
mongodb_mongod_op_counters_repl_total{type="query"} 0
mongodb_mongod_op_counters_repl_total{type="update"} 0
# HELP mongodb_mongod_op_counters_total The opcounters data structure provides an overview of database operations by type and makes it possible to analyze the load on the database in more granular manner. These numbers will grow over time and in response to database use. Analyze these values over time to track database utilization
# TYPE mongodb_mongod_op_counters_total counter
mongodb_mongod_op_counters_total{type="command"} 164008
mongodb_mongod_op_counters_total{type="delete"} 0
mongodb_mongod_op_counters_total{type="getmore"} 74912
mongodb_mongod_op_counters_total{type="insert"} 70198
mongodb_mongod_op_counters_total{type="query"} 3907
mongodb_mongod_op_counters_total{type="update"} 0
# HELP mongodb_mongod_replset_heatbeat_interval_millis The frequency in milliseconds of the heartbeats
# TYPE mongodb_mongod_replset_heatbeat_interval_millis gauge
mongodb_mongod_replset_heatbeat_interval_millis{set="test1"} 2000
# HELP mongodb_mongod_replset_member_config_version The configVersion value is the replica set configuration version.
# TYPE mongodb_mongod_replset_member_config_version gauge
mongodb_mongod_replset_member_config_version{name="localhost:27017",set="test1",state="PRIMARY"} 2
mongodb_mongod_replset_member_config_version{name="localhost:27027",set="test1",state="SECONDARY"} 2
# HELP mongodb_mongod_replset_member_election_date The timestamp the node was elected as replica leader
# TYPE mongodb_mongod_replset_member_election_date gauge
mongodb_mongod_replset_member_election_date{name="localhost:27017",set="test1",state="PRIMARY"} 1.488818198e+09
# HELP mongodb_mongod_replset_member_health This field conveys if the member is up (1) or down (0).
# TYPE mongodb_mongod_replset_member_health gauge
mongodb_mongod_replset_member_health{name="localhost:27017",set="test1",state="PRIMARY"} 1
mongodb_mongod_replset_member_health{name="localhost:27027",set="test1",state="SECONDARY"} 1
# HELP mongodb_mongod_replset_member_last_heartbeat The lastHeartbeat value provides an ISODate formatted date and time of the transmission time of last heartbeat received from this member
# TYPE mongodb_mongod_replset_member_last_heartbeat gauge
mongodb_mongod_replset_member_last_heartbeat{name="localhost:27027",set="test1",state="SECONDARY"} 1.488834065e+09
# HELP mongodb_mongod_replset_member_last_heartbeat_recv The lastHeartbeatRecv value provides an ISODate formatted date and time that the last heartbeat was received from this member
# TYPE mongodb_mongod_replset_member_last_heartbeat_recv gauge
mongodb_mongod_replset_member_last_heartbeat_recv{name="localhost:27027",set="test1",state="SECONDARY"} 1.488834066e+09
# HELP mongodb_mongod_replset_member_optime_date The timestamp of the last oplog entry that this member applied.
# TYPE mongodb_mongod_replset_member_optime_date gauge
mongodb_mongod_replset_member_optime_date{name="localhost:27017",set="test1",state="PRIMARY"} 1.488833953e+09
mongodb_mongod_replset_member_optime_date{name="localhost:27027",set="test1",state="SECONDARY"} 1.488833953e+09
# HELP mongodb_mongod_replset_member_ping_ms The pingMs represents the number of milliseconds (ms) that a round-trip packet takes to travel between the remote member and the local instance.
# TYPE mongodb_mongod_replset_member_ping_ms gauge
mongodb_mongod_replset_member_ping_ms{name="localhost:27027",set="test1",state="SECONDARY"} 0
# HELP mongodb_mongod_replset_member_state The value of state is an integer between 0 and 10 that represents the replica state of the member.
# TYPE mongodb_mongod_replset_member_state gauge
mongodb_mongod_replset_member_state{name="localhost:27017",set="test1",state="PRIMARY"} 1
mongodb_mongod_replset_member_state{name="localhost:27027",set="test1",state="SECONDARY"} 2
# HELP mongodb_mongod_replset_member_uptime The uptime field holds a value that reflects the number of seconds that this member has been online.
# TYPE mongodb_mongod_replset_member_uptime counter
mongodb_mongod_replset_member_uptime{name="localhost:27017",set="test1",state="PRIMARY"} 15881
mongodb_mongod_replset_member_uptime{name="localhost:27027",set="test1",state="SECONDARY"} 15855
# HELP mongodb_mongod_replset_my_name The replica state name of the current member
# TYPE mongodb_mongod_replset_my_name gauge
mongodb_mongod_replset_my_name{name="localhost:27017",set="test1"} 1
# HELP mongodb_mongod_replset_my_state An integer between 0 and 10 that represents the replica state of the current member
# TYPE mongodb_mongod_replset_my_state gauge
mongodb_mongod_replset_my_state{set="test1"} 1
# HELP mongodb_mongod_replset_number_of_members The number of replica set mebers
# TYPE mongodb_mongod_replset_number_of_members gauge
mongodb_mongod_replset_number_of_members{set="test1"} 2
# HELP mongodb_mongod_replset_oplog_head_timestamp The timestamp of the newest change in the oplog
# TYPE mongodb_mongod_replset_oplog_head_timestamp gauge
mongodb_mongod_replset_oplog_head_timestamp 1.488833953e+09
# HELP mongodb_mongod_replset_oplog_items_total The total number of changes in the oplog
# TYPE mongodb_mongod_replset_oplog_items_total gauge
mongodb_mongod_replset_oplog_items_total 70202
# HELP mongodb_mongod_replset_oplog_size_bytes Size of oplog in bytes
# TYPE mongodb_mongod_replset_oplog_size_bytes gauge
mongodb_mongod_replset_oplog_size_bytes{type="current"} 1.273140034e+09
mongodb_mongod_replset_oplog_size_bytes{type="storage"} 1.273139968e+09
# HELP mongodb_mongod_replset_oplog_tail_timestamp The timestamp of the oldest change in the oplog
# TYPE mongodb_mongod_replset_oplog_tail_timestamp gauge
mongodb_mongod_replset_oplog_tail_timestamp 1.488818198e+09
# HELP mongodb_mongod_replset_term The election count for the replica set, as known to this replica set member
# TYPE mongodb_mongod_replset_term gauge
mongodb_mongod_replset_term{set="test1"} 1
# HELP mongodb_mongod_rocksdb_background_errors The total number of background errors in RocksDB
# TYPE mongodb_mongod_rocksdb_background_errors gauge
mongodb_mongod_rocksdb_background_errors 0
# HELP mongodb_mongod_rocksdb_block_cache_bytes The current bytes used in the RocksDB Block Cache
# TYPE mongodb_mongod_rocksdb_block_cache_bytes gauge
mongodb_mongod_rocksdb_block_cache_bytes 2.5165824e+07
# HELP mongodb_mongod_rocksdb_block_cache_hits_total The total number of hits to the RocksDB Block Cache
# TYPE mongodb_mongod_rocksdb_block_cache_hits_total counter
mongodb_mongod_rocksdb_block_cache_hits_total 519483
# HELP mongodb_mongod_rocksdb_block_cache_misses_total The total number of misses to the RocksDB Block Cache
# TYPE mongodb_mongod_rocksdb_block_cache_misses_total counter
mongodb_mongod_rocksdb_block_cache_misses_total 219214
# HELP mongodb_mongod_rocksdb_bloom_filter_useful_total The total number of times the RocksDB Bloom Filter was useful
# TYPE mongodb_mongod_rocksdb_bloom_filter_useful_total counter
mongodb_mongod_rocksdb_bloom_filter_useful_total 130251
# HELP mongodb_mongod_rocksdb_bytes_read_total The total number of bytes read by RocksDB
# TYPE mongodb_mongod_rocksdb_bytes_read_total counter
mongodb_mongod_rocksdb_bytes_read_total{type="compation"} 7.917332498e+09
mongodb_mongod_rocksdb_bytes_read_total{type="iteration"} 3.94838428e+09
mongodb_mongod_rocksdb_bytes_read_total{type="point_lookup"} 2.655612381e+09
# HELP mongodb_mongod_rocksdb_bytes_written_total The total number of bytes written by RocksDB
# TYPE mongodb_mongod_rocksdb_bytes_written_total counter
mongodb_mongod_rocksdb_bytes_written_total{type="compaction"} 9.531638929e+09
mongodb_mongod_rocksdb_bytes_written_total{type="flush"} 2.525874187e+09
mongodb_mongod_rocksdb_bytes_written_total{type="total"} 2.548843861e+09
# HELP mongodb_mongod_rocksdb_compaction_average_seconds The average time per compaction between levels N and N+1 in RocksDB
# TYPE mongodb_mongod_rocksdb_compaction_average_seconds gauge
mongodb_mongod_rocksdb_compaction_average_seconds{level="L0"} 0.505
mongodb_mongod_rocksdb_compaction_average_seconds{level="L5"} 3.606
mongodb_mongod_rocksdb_compaction_average_seconds{level="L6"} 14.579
mongodb_mongod_rocksdb_compaction_average_seconds{level="total"} 4.583
# HELP mongodb_mongod_rocksdb_compaction_bytes_per_second The rate at which data is processed during compaction between levels N and N+1 in RocksDB
# TYPE mongodb_mongod_rocksdb_compaction_bytes_per_second gauge
mongodb_mongod_rocksdb_compaction_bytes_per_second{level="L0",type="write"} 1.248854016e+08
mongodb_mongod_rocksdb_compaction_bytes_per_second{level="L5",type="read"} 7.45537536e+07
mongodb_mongod_rocksdb_compaction_bytes_per_second{level="L5",type="write"} 7.45537536e+07
mongodb_mongod_rocksdb_compaction_bytes_per_second{level="L6",type="read"} 2.44318208e+07
mongodb_mongod_rocksdb_compaction_bytes_per_second{level="L6",type="write"} 2.06569472e+07
mongodb_mongod_rocksdb_compaction_bytes_per_second{level="total",type="read"} 2.70532608e+07
mongodb_mongod_rocksdb_compaction_bytes_per_second{level="total",type="write"} 3.2505856e+07
# HELP mongodb_mongod_rocksdb_compaction_file_threads The number of threads currently doing compaction for levels in RocksDB
# TYPE mongodb_mongod_rocksdb_compaction_file_threads gauge
mongodb_mongod_rocksdb_compaction_file_threads{level="L0"} 0
mongodb_mongod_rocksdb_compaction_file_threads{level="L5"} 0
mongodb_mongod_rocksdb_compaction_file_threads{level="L6"} 0
mongodb_mongod_rocksdb_compaction_file_threads{level="total"} 0
# HELP mongodb_mongod_rocksdb_compaction_score The compaction score of RocksDB levels
# TYPE mongodb_mongod_rocksdb_compaction_score gauge
mongodb_mongod_rocksdb_compaction_score{level="L0"} 0
mongodb_mongod_rocksdb_compaction_score{level="L5"} 0.9
mongodb_mongod_rocksdb_compaction_score{level="L6"} 0
mongodb_mongod_rocksdb_compaction_score{level="total"} 0
# HELP mongodb_mongod_rocksdb_compaction_seconds_total The time spent doing compactions between levels N and N+1 in RocksDB
# TYPE mongodb_mongod_rocksdb_compaction_seconds_total counter
mongodb_mongod_rocksdb_compaction_seconds_total{level="L0"} 20
mongodb_mongod_rocksdb_compaction_seconds_total{level="L5"} 25
mongodb_mongod_rocksdb_compaction_seconds_total{level="L6"} 248
mongodb_mongod_rocksdb_compaction_seconds_total{level="total"} 293
# HELP mongodb_mongod_rocksdb_compaction_write_amplification The write amplification factor from compaction between levels N and N+1 in RocksDB
# TYPE mongodb_mongod_rocksdb_compaction_write_amplification gauge
mongodb_mongod_rocksdb_compaction_write_amplification{level="L5"} 1
mongodb_mongod_rocksdb_compaction_write_amplification{level="L6"} 1.7
mongodb_mongod_rocksdb_compaction_write_amplification{level="total"} 3.8
# HELP mongodb_mongod_rocksdb_compactions_total The total number of compactions between levels N and N+1 in RocksDB
# TYPE mongodb_mongod_rocksdb_compactions_total counter
mongodb_mongod_rocksdb_compactions_total{level="L0"} 40
mongodb_mongod_rocksdb_compactions_total{level="L5"} 7
mongodb_mongod_rocksdb_compactions_total{level="L6"} 17
mongodb_mongod_rocksdb_compactions_total{level="total"} 64
# HELP mongodb_mongod_rocksdb_estimate_table_readers_memory_bytes The estimate RocksDB table-reader memory bytes
# TYPE mongodb_mongod_rocksdb_estimate_table_readers_memory_bytes gauge
mongodb_mongod_rocksdb_estimate_table_readers_memory_bytes 2.10944e+06
# HELP mongodb_mongod_rocksdb_files The number of files in a RocksDB level
# TYPE mongodb_mongod_rocksdb_files gauge
mongodb_mongod_rocksdb_files{level="L0"} 0
mongodb_mongod_rocksdb_files{level="L5"} 2
mongodb_mongod_rocksdb_files{level="L6"} 23
mongodb_mongod_rocksdb_files{level="total"} 25
# HELP mongodb_mongod_rocksdb_immutable_memtables The total number of immutable MemTables in RocksDB
# TYPE mongodb_mongod_rocksdb_immutable_memtables gauge
mongodb_mongod_rocksdb_immutable_memtables 0
# HELP mongodb_mongod_rocksdb_iterations_total The total number of iterations performed by RocksDB
# TYPE mongodb_mongod_rocksdb_iterations_total counter
mongodb_mongod_rocksdb_iterations_total{type="backward"} 1955
mongodb_mongod_rocksdb_iterations_total{type="forward"} 206563
# HELP mongodb_mongod_rocksdb_keys_total The total number of RocksDB key operations
# TYPE mongodb_mongod_rocksdb_keys_total counter
mongodb_mongod_rocksdb_keys_total{type="read"} 209546
mongodb_mongod_rocksdb_keys_total{type="written"} 281332
# HELP mongodb_mongod_rocksdb_live_versions The current number of live versions in RocksDB
# TYPE mongodb_mongod_rocksdb_live_versions gauge
mongodb_mongod_rocksdb_live_versions 1
# HELP mongodb_mongod_rocksdb_memtable_bytes The current number of MemTable bytes in RocksDB
# TYPE mongodb_mongod_rocksdb_memtable_bytes gauge
mongodb_mongod_rocksdb_memtable_bytes{type="active"} 2.5165824e+07
mongodb_mongod_rocksdb_memtable_bytes{type="total"} 2.5165824e+07
# HELP mongodb_mongod_rocksdb_memtable_entries The current number of Memtable entries in RocksDB
# TYPE mongodb_mongod_rocksdb_memtable_entries gauge
mongodb_mongod_rocksdb_memtable_entries{type="active"} 4553
mongodb_mongod_rocksdb_memtable_entries{type="immutable"} 0
# HELP mongodb_mongod_rocksdb_oldest_snapshot_timestamp The timestamp of the oldest snapshot in RocksDB
# TYPE mongodb_mongod_rocksdb_oldest_snapshot_timestamp gauge
mongodb_mongod_rocksdb_oldest_snapshot_timestamp 0
# HELP mongodb_mongod_rocksdb_pending_compactions The total number of compactions pending in RocksDB
# TYPE mongodb_mongod_rocksdb_pending_compactions gauge
mongodb_mongod_rocksdb_pending_compactions 0
# HELP mongodb_mongod_rocksdb_pending_memtable_flushes The total number of MemTable flushes pending in RocksDB
# TYPE mongodb_mongod_rocksdb_pending_memtable_flushes gauge
mongodb_mongod_rocksdb_pending_memtable_flushes 0
# HELP mongodb_mongod_rocksdb_read_latency_microseconds The read latency in RocksDB in microseconds by level
# TYPE mongodb_mongod_rocksdb_read_latency_microseconds gauge
mongodb_mongod_rocksdb_read_latency_microseconds{level="L0",type="P50"} 9.33
mongodb_mongod_rocksdb_read_latency_microseconds{level="L0",type="P75"} 17.03
mongodb_mongod_rocksdb_read_latency_microseconds{level="L0",type="P99"} 1972.31
mongodb_mongod_rocksdb_read_latency_microseconds{level="L0",type="P99.9"} 9061.44
mongodb_mongod_rocksdb_read_latency_microseconds{level="L0",type="P99.99"} 18098.3
mongodb_mongod_rocksdb_read_latency_microseconds{level="L0",type="avg"} 87.3861
mongodb_mongod_rocksdb_read_latency_microseconds{level="L0",type="max"} 26434
mongodb_mongod_rocksdb_read_latency_microseconds{level="L0",type="median"} 9.3328
mongodb_mongod_rocksdb_read_latency_microseconds{level="L0",type="min"} 0
mongodb_mongod_rocksdb_read_latency_microseconds{level="L0",type="stddev"} 607.16
mongodb_mongod_rocksdb_read_latency_microseconds{level="L5",type="P50"} 10.85
mongodb_mongod_rocksdb_read_latency_microseconds{level="L5",type="P75"} 28.46
mongodb_mongod_rocksdb_read_latency_microseconds{level="L5",type="P99"} 4208.33
mongodb_mongod_rocksdb_read_latency_microseconds{level="L5",type="P99.9"} 13079.55
mongodb_mongod_rocksdb_read_latency_microseconds{level="L5",type="P99.99"} 23156.25
mongodb_mongod_rocksdb_read_latency_microseconds{level="L5",type="avg"} 216.3736
mongodb_mongod_rocksdb_read_latency_microseconds{level="L5",type="max"} 34585
mongodb_mongod_rocksdb_read_latency_microseconds{level="L5",type="median"} 10.8461
mongodb_mongod_rocksdb_read_latency_microseconds{level="L5",type="min"} 0
mongodb_mongod_rocksdb_read_latency_microseconds{level="L5",type="stddev"} 978.48
mongodb_mongod_rocksdb_read_latency_microseconds{level="L6",type="P50"} 9.42
mongodb_mongod_rocksdb_read_latency_microseconds{level="L6",type="P75"} 30.69
mongodb_mongod_rocksdb_read_latency_microseconds{level="L6",type="P99"} 6037.87
mongodb_mongod_rocksdb_read_latency_microseconds{level="L6",type="P99.9"} 15482.63
mongodb_mongod_rocksdb_read_latency_microseconds{level="L6",type="P99.99"} 30125.5
mongodb_mongod_rocksdb_read_latency_microseconds{level="L6",type="avg"} 294.9574
mongodb_mongod_rocksdb_read_latency_microseconds{level="L6",type="max"} 61051
mongodb_mongod_rocksdb_read_latency_microseconds{level="L6",type="median"} 9.4245
mongodb_mongod_rocksdb_read_latency_microseconds{level="L6",type="min"} 0
mongodb_mongod_rocksdb_read_latency_microseconds{level="L6",type="stddev"} 1264.12
# HELP mongodb_mongod_rocksdb_reads_total The total number of read operations in RocksDB
# TYPE mongodb_mongod_rocksdb_reads_total counter
mongodb_mongod_rocksdb_reads_total{level="L0"} 59017
mongodb_mongod_rocksdb_reads_total{level="L5"} 42125
mongodb_mongod_rocksdb_reads_total{level="L6"} 118745
# HELP mongodb_mongod_rocksdb_seeks_total The total number of seeks performed by RocksDB
# TYPE mongodb_mongod_rocksdb_seeks_total counter
mongodb_mongod_rocksdb_seeks_total 156184
# HELP mongodb_mongod_rocksdb_size_bytes The total byte size of levels in RocksDB
# TYPE mongodb_mongod_rocksdb_size_bytes gauge
mongodb_mongod_rocksdb_size_bytes{level="L0"} 0
mongodb_mongod_rocksdb_size_bytes{level="L5"} 1.2969836544e+08
mongodb_mongod_rocksdb_size_bytes{level="L6"} 1.46175688704e+09
mongodb_mongod_rocksdb_size_bytes{level="total"} 1.59145525248e+09
# HELP mongodb_mongod_rocksdb_snapshots The current number of snapshots in RocksDB
# TYPE mongodb_mongod_rocksdb_snapshots gauge
mongodb_mongod_rocksdb_snapshots 0
# HELP mongodb_mongod_rocksdb_stall_percent The percentage of time RocksDB has been stalled
# TYPE mongodb_mongod_rocksdb_stall_percent gauge
mongodb_mongod_rocksdb_stall_percent 0
# HELP mongodb_mongod_rocksdb_stalled_seconds_total The total number of seconds RocksDB has spent stalled
# TYPE mongodb_mongod_rocksdb_stalled_seconds_total counter
mongodb_mongod_rocksdb_stalled_seconds_total 0
# HELP mongodb_mongod_rocksdb_stalls_total The total number of stalls in RocksDB
# TYPE mongodb_mongod_rocksdb_stalls_total counter
mongodb_mongod_rocksdb_stalls_total{type="level0_numfiles"} 0
mongodb_mongod_rocksdb_stalls_total{type="level0_numfiles_with_compaction"} 0
mongodb_mongod_rocksdb_stalls_total{type="level0_slowdown"} 0
mongodb_mongod_rocksdb_stalls_total{type="level0_slowdown_with_compaction"} 0
mongodb_mongod_rocksdb_stalls_total{type="memtable_compaction"} 0
mongodb_mongod_rocksdb_stalls_total{type="memtable_slowdown"} 0
# HELP mongodb_mongod_rocksdb_total_live_recovery_units The total number of live recovery units in RocksDB
# TYPE mongodb_mongod_rocksdb_total_live_recovery_units gauge
mongodb_mongod_rocksdb_total_live_recovery_units 5
# HELP mongodb_mongod_rocksdb_transaction_engine_keys The current number of transaction engine keys in RocksDB
# TYPE mongodb_mongod_rocksdb_transaction_engine_keys gauge
mongodb_mongod_rocksdb_transaction_engine_keys 0
# HELP mongodb_mongod_rocksdb_transaction_engine_snapshots The current number of transaction engine snapshots in RocksDB
# TYPE mongodb_mongod_rocksdb_transaction_engine_snapshots gauge
mongodb_mongod_rocksdb_transaction_engine_snapshots 0
# HELP mongodb_mongod_rocksdb_write_ahead_log_bytes_per_second The number of bytes written per second by the Write-Ahead-Log in RocksDB
# TYPE mongodb_mongod_rocksdb_write_ahead_log_bytes_per_second gauge
mongodb_mongod_rocksdb_write_ahead_log_bytes_per_second 157286.4
# HELP mongodb_mongod_rocksdb_write_ahead_log_writes_per_sync The number of writes per Write-Ahead-Log sync in RocksDB
# TYPE mongodb_mongod_rocksdb_write_ahead_log_writes_per_sync gauge
mongodb_mongod_rocksdb_write_ahead_log_writes_per_sync 70538
# HELP mongodb_mongod_rocksdb_writes_per_batch The number of writes per batch in RocksDB
# TYPE mongodb_mongod_rocksdb_writes_per_batch gauge
mongodb_mongod_rocksdb_writes_per_batch 2.54476812288e+09
# HELP mongodb_mongod_rocksdb_writes_per_second The number of writes per second in RocksDB
# TYPE mongodb_mongod_rocksdb_writes_per_second gauge
mongodb_mongod_rocksdb_writes_per_second 157286.4
# HELP mongodb_mongod_storage_engine The storage engine used by the MongoDB instance
# TYPE mongodb_mongod_storage_engine counter
mongodb_mongod_storage_engine{engine="rocksdb"} 1

Grafana

PMM uses Grafana to visualize metrics stored in Prometheus. Grafana uses a concept of “dashboards” to store the definitions of what it should visualize. PMM’s dashboards are hosted under the GitHub project: percona/grafana-dashboards.

Grafana supports multiple “data sources” for metrics. As Percona Monitoring and Management uses Prometheus for metric storage, Prometheus is the “data source” used in Grafana.

Adding Custom Graphs to Grafana

In this section I will create an example graph that indicates the “efficiency” of queries on a mongod instance over the last five minutes. This is a good example to use because we plan to add this exact metric in an upcoming version of PMM.

This graph relies on two metrics that percona/mongodb_exporter has provided to PMM since at least version 1.0.0 (“$host” is the hostname of a given node, explained later):

  1. mongodb_mongod_metrics_query_executor_total{instance="$host", state="scanned_objects"} – A metric representing the total number of documents scanned by the server.
  2. mongodb_mongod_metrics_document_total{instance="$host", state="returned"} – A metric representing the total number of documents returned by the server.

The graph will compute the change in these two metrics over five minutes and create a percentage/ratio of scanned vs. returned documents. A host with ten scanned documents and one document returned would have 10% efficiency, and a host that scanned 100 documents to return 100 documents would have 100% efficiency (a 1:1 ratio).

Often you will encounter Prometheus metrics that are “total” counters, incrementing from the time the server is (re)started. Both of the metrics our new graph requires are counters of this kind, so they need to be limited to the last five minutes of change (in this example) rather than the total since the server was (re)started.

Prometheus offers a very useful query function for these incrementing counters called “increase()”: https://prometheus.io/docs/querying/functions/#increase(). The increase() function returns the amount a counter has increased over a given time period, which makes this trivial to do. It is also unaffected by counter “resets” caused by server restarts, as increase() only considers increases in the counter.

The increase() syntax requires a time range to be specified before the closing round-bracket. In our case we will ask increase() to consider the last five minutes, which is expressed with “[5m]” at the end of the increase() function, as seen in the following example.

The full Prometheus query I will use to create the query efficiency graph is:

sum(
    increase(mongodb_mongod_metrics_query_executor_total{instance="$host", state="scanned_objects"}[5m])
)
/
sum(
    increase(mongodb_mongod_metrics_document_total{instance="$host", state="returned"}[5m])
)

Note: wrapping each increase() in sum() aggregates away the labels so the two results have matching (empty) label sets and can be divided in a Prometheus query.
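
If you would like to sanity-check the query before putting it into a graph, you can run it against Prometheus’ HTTP query API (“/api/v1/query”). The sketch below uses only Go’s standard library; the Prometheus address and the instance value are assumptions to replace with your own PMM server and host:

package main

import (
    "fmt"
    "io/ioutil"
    "log"
    "net/http"
    "net/url"
)

func main() {
    // Assumed address; point this at wherever your PMM server exposes the
    // Prometheus API.
    promAddr := "http://192.168.99.10:9090"

    // The "query efficiency" expression from above, with a concrete (made-up)
    // instance substituted for the $host template variable.
    query := `sum(increase(mongodb_mongod_metrics_query_executor_total{instance="mongo1", state="scanned_objects"}[5m]))` +
        ` / ` +
        `sum(increase(mongodb_mongod_metrics_document_total{instance="mongo1", state="returned"}[5m]))`

    // Prometheus answers instant queries at /api/v1/query?query=<expression>.
    resp, err := http.Get(promAddr + "/api/v1/query?query=" + url.QueryEscape(query))
    if err != nil {
        log.Fatal(err)
    }
    defer resp.Body.Close()

    body, err := ioutil.ReadAll(resp.Body)
    if err != nil {
        log.Fatal(err)
    }
    // The JSON response contains the current value of the ratio.
    fmt.Println(string(body))
}

The instant value returned in the JSON reply should match what the graph will show for the same host.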

Now, let’s make this a new graph! To do this you can create a new dashboard in PMM’s Grafana or add the graph to an existing dashboard. In this example I’ll create a new dashboard with a single graph.

To add a new dashboard, press the Dashboard selector in PMM’s Grafana and select “Create New”:

This will create a new dashboard named “New Dashboard”.

Most of PMM’s graphing uses a variable named “$host” in place of a hostname/IP. You’ll notice “$host” was used in the “query efficiency” Prometheus query earlier. The variable is set using a Grafana feature called Templating.

Let’s add a “$host” variable to our new dashboard so we can change what host we graph without modifying our queries. First, press the gear icon at the top of the dashboard and select “Templating”:

Then press “New” to create a new Templating variable.

Set “Name” to be host, set “Data Source” to Prometheus and set “Query” to label_values(instance). Leave all other settings default:

Press “Add” to add the template variable, then save and reload the dashboard.

This will add a drop-down of unique hosts in Prometheus like this:

On the first dashboard row, let’s add a new graph by opening the row menu on the far left of the row and selecting “Add Panel”:

Select “Graph” as the type. Click the title of the blank graph to open a menu, then press “Edit”:

This opens Grafana’s graph editor with an empty graph and some input boxes seen below.

Next, let’s add our “query efficiency” Prometheus query (earlier in this article) to the “Query” input field and add a legend name to “Legend format”:

Now we have some graph data, but the Y-axis and title don’t explain very much about the metric. What does “1.0K” on the Y-Axis mean?

As our metric is a ratio, let’s display the Y-axis as a percentage by switching to the “Axes” tab, then selecting “percent (0.0-1.0)” as the “Unit” selection for the “Left Y” axis, like so:

Next let’s set a graph title. To do this, go to the “General” tab of the graph editor and set the “Title” field:

And now we have a “Query Efficiency” graph with an accurate Y-axis and Title(!):

“Back to Dashboard” on the top-right will take you back to the dashboard view. Always remember to save your dashboards after making changes!

Adding Custom Metrics to percona/mongodb_exporter

For those familiar with the Go programming language, adding custom metrics to percona/mongodb_exporter is fairly straightforward. The percona/mongodb_exporter uses the Prometheus Go client to export metrics gathered from queries to MongoDB.

Adding a completely new metric to the exporter is unfortunately too open-ended to explain in a blog. Instead, I will cover how an existing metric is exported by percona/mongodb_exporter. The process for a new metric will be similar.

To follow our previous example, here is a simplified example of how the metric "mongodb_mongod_metrics_query_executor_total" is exported via percona/mongodb_exporter. The source of this metric is "db.serverStatus().metrics.queryExecutor" from the MongoDB shell perspective.

First a new Prometheus metric is defined as a go ‘var’:

  1. var (
    	metricsQueryExecutorTotal = prometheus.NewCounterVec(prometheus.CounterOpts{
    		Namespace: Namespace,
    		Name:      "metrics_query_executor_total",
    		Help:      "queryExecutor is a document that reports data from the query execution system",
    	}, []string{"state"})
    )

    https://github.com/percona/mongodb_exporter/blob/master/collector/mongod/metrics.go#L58-L64
  2. A struct for marshaling the BSON response from MongoDB and an “.Export()” function is defined for the struct (.Export() is called on the metric structs):
    // QueryExecutorStats are the stats associated with a query execution.
    type QueryExecutorStats struct {
    	Scanned        float64 `bson:"scanned"`
    	ScannedObjects float64 `bson:"scannedObjects"`
    }
    // Export exports the query executor stats.
    func (queryExecutorStats *QueryExecutorStats) Export(ch chan<- prometheus.Metric) {
    	metricsQueryExecutorTotal.WithLabelValues("scanned").Set(queryExecutorStats.Scanned)
    	metricsQueryExecutorTotal.WithLabelValues("scanned_objects").Set(queryExecutorStats.ScannedObjects)
    }

    Notice that the float64 values unmarshaled from the BSON are used in the .Set() for the Prometheus metric. All Prometheus values must be float64.
    https://github.com/percona/mongodb_exporter/blob/master/collector/mongod/metrics.go#L271-L281
  3. In this case the "QueryExecutorStats" is a sub-struct of a larger "MetricsStats" struct above it, also with its own ".Export()" function:
    // MetricsStats are all stats associated with metrics of the system
    type MetricsStats struct {
    	Document      *DocumentStats      `bson:"document"`
    	GetLastError  *GetLastErrorStats  `bson:"getLastError"`
    	Operation     *OperationStats     `bson:"operation"`
    	QueryExecutor *QueryExecutorStats `bson:"queryExecutor"`
    	Record        *RecordStats        `bson:"record"`
    	Repl          *ReplStats          `bson:"repl"`
    	Storage       *StorageStats       `bson:"storage"`
    	Cursor        *CursorStats        `bson:"cursor"`
    }
    // Export exports the metrics stats.
    func (metricsStats *MetricsStats) Export(ch chan<- prometheus.Metric) {
    	if metricsStats.Document != nil {
    		metricsStats.Document.Export(ch)
    	}
    	if metricsStats.GetLastError != nil {
    		metricsStats.GetLastError.Export(ch)
    	}
    	if metricsStats.Operation != nil {
    		metricsStats.Operation.Export(ch)
    	}
    	if metricsStats.QueryExecutor != nil {
    		metricsStats.QueryExecutor.Export(ch)
    	}
    ...

    https://github.com/percona/mongodb_exporter/blob/master/collector/mongod/metrics.go#L405-L430
  4. Finally, .Collect() and .Describe() (also required functions) are called on the metric to collect and describe it:
    https://github.com/percona/mongodb_exporter/blob/master/collector/mongod/metrics.go#L451
    https://github.com/percona/mongodb_exporter/blob/master/collector/mongod/metrics.go#L485
  5. Later on in this code, "MetricsStats" is passed the result of the query "db.serverStatus().metrics". This can be seen at:
    https://github.com/percona/mongodb_exporter/blob/master/collector/mongod/server_status.go#L60
    https://github.com/percona/mongodb_exporter/blob/master/collector/mongod/server_status.go#L177-L179
    and
    https://github.com/percona/mongodb_exporter/blob/master/collector/mongod/server_status.go#L197-L207

For those unfamiliar with Go and/or unable to contribute new metrics to the project, we suggest you open a JIRA ticket for any feature requests for new metrics here: https://jira.percona.com/projects/PMM.

Conclusion

With the flexibility of the monitoring components of Percona Monitoring and Management, the sky is the limit on what can be done with database monitoring! Hopefully this blog gives you a taste of what is possible if you need to add a new graph, a new metric or both to Percona Monitoring and Management. Also, it is worth repeating that a large number of metrics gathered in Percona Monitoring and Management are not graphed. Perhaps what you’re looking for is already collected. See “http://<pmm-server>/prometheus” for more details on what metrics are stored in Prometheus.

We are always open to improving our dashboards, and we would love to hear about any custom graphs you create and how they help solve problems!

by Tim Vaillancourt at March 11, 2017 12:10 AM

March 09, 2017

Peter Zaitsev

Percona Server for MongoDB 3.2.12-3.2 is now available

Percona Server for MongoDB

Percona announces the release of Percona Server for MongoDB 3.2.12-3.2 on March 9, 2017. Download the latest version from the Percona web site or the Percona Software Repositories.

Percona Server for MongoDB 3.2.12-3.2 is an enhanced, open-source, fully compatible, highly scalable, zero-maintenance downtime database supporting the MongoDB v3.2 protocol and drivers. It extends MongoDB with MongoRocks, Percona Memory Engine, and PerconaFT storage engine, as well as enterprise-grade features like External Authentication, Audit Logging, Profiling Rate Limiting, and Hot Backup at no extra cost. Percona Server for MongoDB requires no changes to MongoDB applications or code.

NOTE: We deprecated the PerconaFT storage engine. It will not be available in future releases.

This release is based on MongoDB 3.2.12 and includes the following additional changes:

  • PSMDB-17: Changed welcome message in the shell to mention Percona Server for MongoDB instead of MongoDB
  • PSMDB-90: Added error message for storage engines that do not support Hot Backup
  • PSMDB-91: Deprecated audit configuration section and added auditLog instead
  • PSMDB-95: Fixed version dependencies for sub packages so that all corresponding packages get updated accordingly
  • PSMDB-96: Excluded diagnostic.data directory when using TokuBackup with PerconaFT
  • PSMDB-98: Improved Hot Backup to create destination folder if it does not exist
  • PSMDB-101: Implemented the auditAuthorizationSuccess parameter to enable auditing of authorization success
  • PSMDB-104: Updated links in client shell output to point to Percona’s documentation and forum
  • PSMDB-107: Fixed behavior when creating the audit log file
  • PSMDB-111: Refactored external_auth tests
  • PSMDB-123: Fixed the creation of proper subdirectories inside the backup destination directory
  • PSMDB-126: Added index and collection name to duplicate key error message
  • Fixed startup scripts for Ubuntu 14.04.5 LTS (Trusty Tahr)
  • Fixed a number of other small issues and bugs

Percona Server for MongoDB 3.2.12-3.2 release notes are available in the official documentation.

by Alexey Zhebel at March 09, 2017 06:52 PM

Services Monitoring with Probabilistic Fault Detection

Services Monitoring

In this blog post, we’ll discuss services monitoring using probabilistic fault detection.

Let's admit it, the task of monitoring services is one of the most difficult. It is time-consuming, error-prone and difficult to automate. The usual monitoring approach has been pretty straightforward in the last few years: set up a service like Nagios, or pay money to get a cloud-based monitoring tool. Then choose the metrics you are interested in and set the thresholds. This is a manual process that works when you have a small number of services and servers, and you know exactly how they behave and what you should monitor. These days, we have hundreds of servers with thousands of services sending us millions of metrics. That is the first problem: the manual approach to configuration doesn't work.

That is not the only problem. We know that no two servers perform the same because no two servers have exactly the same workload. The thresholds that you setup for one server might not be the correct one for all of the other thousand. There are some approaches to the problem that will make it even worse (like taking averages and setting the thresholds based on those, for example, hoping it will work). Let me tell you a secret: it won’t work. Here we have a second problem: instances of the same type can demonstrate very different behaviors.

The last problem is that shiny new services your company may want to use are announced every week. It is impossible, because of time constraints, to know all of those services well enough to create a perfect monitoring template. In other words: sometimes we are asked to monitor software we don't completely understand.

In summary, you have thousands of services, some of which you don't even know how they work, sending you millions of metrics that mean nothing to you. Now, set the thresholds and enable the pager alert. The nightmare has started. Is there a different approach?

Machine Learning

We have to stop thinking that monitoring is a bunch of config files with thresholds that we copy from one server to another. There are no magic templates that will work. We need to use a different technique that removes us from the process. That technique is machine learning. As stated in Wikipedia, it is a subfield of computer science that gives computers the ability to learn without being explicitly programmed. In its most basic form, it can be used to solve classification problems. For example, open pet photos and identify whether each one is a cat or a dog. This is a classification problem that both humans and computers can solve, but we are much, much slower. The computer has to take the time to learn the patterns, but at some point it will do the classification in no time.

I hope you are starting to see a pattern here. Why do we need to care about monitoring and its configuration if we have computers that can learn patterns and classify things for us?

There are two main ways of doing probabilistic fault detection: Novelty Detection and Outlier Detection.

Novelty Detection

Novelty Detection is easy to visualize and understand. It takes a series of inputs and tries to find anomalies, something that hasn’t been seen before. For example, our credit card company has a function that takes “category, expense, date, hour, country” as arguments and returns an integer so that they can classify and identify all the purchases. Your monthly use of the credit card looks like this:

[0,4,4,5,5,5,4,3]

That is the normal model that defines your use of the credit card. Now, it can be used to detect anomalies.

  • [0] – OK
  • [4] – OK
  • [4] – OK
  • [1] – Anomaly! Operation canceled.

Easy and straightforward. It is simple and very useful in a lot of areas to generate alerts when something anomalous happens. One of the machine learning models that can do this is One-Class Support Vector Machines, but since this is not the kind of fault detection we are looking for, I won't go into details here.
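
As a rough illustration (a minimal sketch that is not part of the original article, with arbitrary nu and gamma values), a One-Class SVM from scikit-learn could be trained on the "normal" purchase codes above and then asked whether new codes look novel:

import numpy as np
from sklearn.svm import OneClassSVM

# The "normal" purchase classification codes from the example above
normal = np.array([0, 4, 4, 5, 5, 5, 4, 3]).reshape(-1, 1)

# nu and gamma are illustrative choices, not tuned values
clf = OneClassSVM(nu=0.1, kernel="rbf", gamma=0.5)
clf.fit(normal)

# +1 means "looks like the training data", -1 means novelty;
# the exact output depends on the nu/gamma choices above
print(clf.predict(np.array([[0], [4], [4], [1]])))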

Outlier Detection

Let’s say we have this data:

[0, 3, 5, 1, -2, 19, 2, 10, -9, 2, 1, 8, 3, 21, -1, 3]

Now we know how to find anomalies, but how do we find outliers? Looking at the numbers above, it seems 21, 19 and -9 could be outliers. But a more exact definition is needed (not just intuition). The simplest and most common way of doing it is the following:

We divide our data into three pieces. One cut is made at 25%, the second cut at 75%. The value at the 25% cut is called the First Quartile, and the value at the 75% cut is called the Third Quartile. The IQR, or Interquartile Range, is the Third Quartile minus the First Quartile. Now, an outlier is any number that falls into one of these two categories:

  • If the value is below: (First Quartile) – (1.5 × IQR)
  • If the value is above: (Third Quartile) + (1.5 × IQR)

Using Python:

import numpy as np

inputs = [0, 3, 5, 1, -2, 19, 2, 10, -9, 2, 1, 8, 3, 21, -1, 3]
Q1 = np.percentile(inputs, 25)   # 0.75
Q3 = np.percentile(inputs, 75)   # 5.75
step = (Q3 - Q1) * 1.5           # 7.5
outliers = [x for x in inputs if x < Q1 - step or x > Q3 + step]
outliers                         # [19, -9, 21]

This looks more like what we need. If we are monitoring a metric, and outliers are detected, then something is happening there that requires investigation. Some of the most used outlier detection models in scikit-learn are:

  • Elliptic Envelope: a robust covariance estimate that assumes our data is Gaussian distributed. It will define the shape of the data we have, creating a frontier that delimits the contour. As you probably guessed, it will be elliptical in shape. Don't worry about the assumption of a Gaussian distribution: the data can be standardized. More about this later on.

 

  • Isolation Forest: this is the well-known "forest of random trees," but applied to outlier detection. It is more suitable when we have many different input metrics (a short sketch of that multi-metric case follows below). In the example I use later, I only use a single metric, so this model would not work that well.

Therefore, Elliptic Envelope looks like the best option for our proof-of-concept.
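
For completeness, here is an illustrative sketch of how an Isolation Forest could be swapped in when several metrics are monitored together (the article itself proceeds with Elliptic Envelope, and the toy data and contamination value here are assumptions):

import numpy as np
from sklearn.ensemble import IsolationForest

# Toy data: two metrics per sample (e.g., threads running, queries per second)
X = np.array([[5, 100], [6, 110], [4, 95], [5, 105], [90, 2000]])

# contamination is an assumed outlier ratio, not a tuned value
clf = IsolationForest(contamination=0.2, random_state=42)
clf.fit(X)
print(clf.predict(X))  # 1 = inlier, -1 = outlier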

For visual reference, this is how the three models look when they try to shape two data inputs:

Services Monitoring
Source: scikit-learn.org

 

Proof-of-Concept

I haven’t explained the model in detail, but a high level explanation should be enough to understand the problem and the possible solution. Let’s start building a proof-of-concept.

For this test, I got data from our Prometheus setup, where all the time-series monitoring data from our customers is stored. In this particular example, I got numbers from the "Threads Running" metric. Those will be used to train our Elliptic Envelope. It is important to take the following into account:

    • We need to collect enough data so that it captures the correct shape of our baseline performance. For example, usually nighttime hours have less of a workload than during the day (same with weekend days, in some cases).
    • As explained before, it assumes a Gaussian distribution, which means that the data needs to be scaled. I am going to standardize the data so that it has 0 mean and 1 variance. The same standardization needs to be applied to the data we test after the training process, when the monitoring is already in place. That standardization also needs to be applied to each metric individually. This is the formula:
z = (x − μ) / σ
Source: dataminingblog.com

With μ as the mean and σ as the standard deviation.
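
As a small illustrative check (not from the original post), standardizing by hand with NumPy should match what scikit-learn's StandardScaler does later in this example:

import numpy as np

x = np.array([1.0, 1.0, 2.0, 3.0, 5.0])
z = (x - x.mean()) / x.std()        # apply (x - mu) / sigma element-wise
print(round(z.mean(), 6), z.std())  # roughly 0.0 and 1.0 after standardization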

This is the summary of what our proof-of-concept will do:

  • Read Prometheus JSON dump.
  • Separate some data for training, standardizing it first.
  • Separate some data for testing, standardizing it first as well.
  • Make predictions on test data.
  • For those rows identified as outliers, get the original non-standardized data to see the number of threads running.

So, let’s start:

Import the Libraries

import pandas as pd
import numpy as np
import json
from datetime import datetime
from sklearn.preprocessing import StandardScaler
from sklearn.covariance import EllipticEnvelope

Load the Data

All the information is in a JSON output from Prometheus that has the “threads_running” of a particular server. It has one second granularity for the last four weeks. I also converted “timestamps” to a normal “datetime” object so that it is easier to read:

with open('query.json') as data_file:
    data = json.load(data_file)
data = pd.DataFrame(data["data"]["result"][0]["values"])
data[0] = data[0].astype(int)
data[0] = data[0].apply(lambda x: datetime.fromtimestamp(x))

The data looks like this:

DateTime Threads Running
2017-01-19 20:32:44 1
2017-01-19 20:32:45 1
2017-01-19 20:32:46 2

 

Create the Training and Testing Dataset

First, separate some of the data for use as training:

train_data = data[(data[0] >= "2017-01-22") & (data[0] <= "2017-01-28" )]

Ignore the date column, and just store the metrics:

train_data = train_data.iloc[:,[1]]

Standardize it:

escaler = StandardScaler()
train_data = escaler.fit_transform(train_data)

Now the data looks like this:

Standardized Threads Running
-0.4072634
-0.4072634
0.47153585

To create the test dataset we need to follow the exact same procedure, only select a different timeframe:

test_original_data = data[(data[0] >= "2017-02-2") & (data[0] <= "2017-02-17" )]
test_data = test_original_data.iloc[:,[1]]
test_data = escaler.transform(test_data)

Train the Model

Let’s create our model with the training data! I am using two parameters here:

  • assume_centered: to specify that our data is already centered around zero (we standardized it to zero mean earlier).
  • contamination: to specify the ratio of outliers our training data has.

clf = EllipticEnvelope(assume_centered=True,contamination=0)
clf.fit(train_data)

Search for Outliers

Now that we've trained the model and we have our test data, we can ask the model if it finds any outliers. It will return 1 or -1 for each row: "1" means that the value of threads running is normal and within the boundaries, while "-1" means that the value is an outlier:

predictions = clf.predict(test_data)
outliers = np.where(predictions==-1)

The array “outliers” stores the row numbers where -1 was predicted.

At this point we have three important variables:

  • test_data: standardized testing data.
  • test_original_data: the original test data without modification.
  • outliers: the row numbers where an outlier was detected (-1).

Investigate the Outliers

Since we have the row numbers where an outlier was detected, we can now just query test_original_data and search for those rows. In this example, I show some random ones:

for indice in outliers[0]:
    if np.random.randn() > 2.5:
        print("{} - {} threads running".format(test_original_data.iloc[indice][0], test_original_data.iloc[indice][1]))
2017-02-03 11:26:03 - 41 threads running
2017-02-03 11:26:40 - 43 threads running
2017-02-03 11:27:50 - 48 threads running
2017-02-03 11:32:07 - 78 threads running
2017-02-03 11:33:25 - 90 threads running
2017-02-12 10:06:58 - 36 threads running
2017-02-12 10:12:11 - 60 threads running
2017-02-12 10:12:30 - 64 threads running

And there we have it! Dates and hours when something really out of the ordinary happened. No need to create a config file for each service, guess thresholds, adjust them … nothing. Just let the model learn, and you get alerts when something unexpected happens. Push all the metrics from your services to these models, and let them do the hard work.

Summary

Most companies have similar situations. Companies add new services on hundreds of servers, and monitoring is an essential part of the infrastructure. The old method of monolithic config files with some thresholds doesn't scale, because it needs a lot of manual work with a trial/error approach. The types of techniques explained in this blog post can help us deploy monitoring on hundreds of servers, without really caring about the different nuances of each service or workload. It is even possible to start monitoring a service without even knowing anything about it: just let the probabilistic model take care of it.

It is important to clarify that, in my opinion, these fault detection models are not going to be a substitute for software like Nagios. In those areas where a binary test is needed (service is up/down for example), Nagios and other similar services do a good job. Actually, a Nagios check can use the procedure explained here. When there are many metrics to analyze, probabilistic methods can save us from a nightmare.
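
To make that idea concrete, here is a minimal sketch of such a check (not a production plugin; the pickled model file and the way the latest metric value is passed in are assumptions). It loads a scaler and model trained as shown above and returns standard Nagios exit codes:

import sys
import pickle

import numpy as np

# Assumed artifact: a pickled (scaler, model) pair trained as in this post
with open("/var/lib/monitoring/threads_running_model.pkl", "rb") as f:
    scaler, model = pickle.load(f)

latest_value = float(sys.argv[1])              # e.g., current threads_running
scaled = scaler.transform(np.array([[latest_value]]))

if model.predict(scaled)[0] == -1:
    print("CRITICAL - threads_running outlier: {}".format(latest_value))
    sys.exit(2)                                # Nagios CRITICAL
print("OK - threads_running within the learned range: {}".format(latest_value))
sys.exit(0)                                    # Nagios OK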

by Miguel Angel Nieto at March 09, 2017 06:19 PM

March 08, 2017

Peter Zaitsev

Migrating MongoDB Away from MMAPv1

MMAPv1

This is another post in the series of blogs on the Percona Server for MongoDB 3.4 bundle release. In this blog post, we'll discuss moving away from the MMAPv1 storage engine.

Introduction

With the MongoDB v3.0 release in February of 2015, the long-awaited ability to choose storage engines became a reality. As of version 3.0, you could choose two engines in MongoDB Community Server and, if you use Percona Server for MongoDB, you could choose from four. Here's a table for ease of consumption:

Storage Engine | Percona Server for MongoDB | MongoDB Community Server | MongoDB Enterprise Server (licensed)
MMAPv1 | Yes | Yes | Yes
WiredTiger | Yes | Yes | Yes
MongoRocks | Yes | No | No
In-memory | Yes | No | Yes
Encrypted | No | No | Yes

 

Why change engines?

With increased possibilities comes an increase in the difficulty of the decision-making process (a concept that gets reinforced every time I take my mother out to a restaurant with a large menu – ordering is never quick). In all seriousness, the introduction of the storage engine API to MongoDB is possibly the single greatest feature MongoDB, Inc. has released to date.

One of the biggest gripes from the pre-v3.0 days was MongoDB's lack of scale. This was mostly due to the MMAPv1 storage engine, which suffered from a very primitive locking scheme. If you would like an illustration of the problem, think of the world's biggest supermarket with one checkout line – you might be able to fit in lots of shoppers, but they're not going to accomplish their goal quickly. So, the ability to increase performance and concurrency with a simple switch is huge! Additionally, modern storage engines support compression. This should reduce your space utilization when switching by at least 50%.

All the way up to MongoDB v3.2, the default storage engine was MMAPv1. If you didn’t make a conscious decision about what storage engine to choose when you started using MongoDB, there is a good chance that MMAPv1 is what you’re on. If you’d like to find out for sure what engine you’re using, simply run the command below. The output will be the name of the storage engine. As you can see, I was running the MMAPv1 storage engine on this machine. Now that we understand where we’re at, let’s get into where we can be in the future.

db.serverStatus().storageEngine.name
mmapv1

Public Service Announcement

Before we get into what storage engine(s) to evaluate, we need to talk about testing. In my experience, a majority of the MySQL and MongoDB community members are rolling out changes to production without planning or testing. If you’re in the same boat, you’re in very good company (or at least in a great deal of company). However, you should stop this practice. It’s basic “sample size” in statistics – when engaged in risk-laden behavior, the optimal time to stop increasing the sample size is prior to the probability of failure reaching “1”. In other words, start your testing and planning process today!

At Percona, we recommend that you thoroughly test any database changes in a testing or development environment before you decide to roll them into production. Additionally, prior to rolling the changes into production (with a well thought out plan, of course), you’ll need to have a roll-back plan in case of unintended consequences. Luckily, with MongoDB’s built-in replication and election protocols, both are fairly easy. The key here is to plan. This is doubly true if you are undertaking a major version upgrade, or are jumping over major versions. With major version upgrades (or version jumps) comes the increased likelihood of a change in database behavior as it relates to your application’s response time (or even stability).

What should I think about?

In the table above, we listed the pre-packaged storage engine options that are available to us and other distributions. We also took a look at why you should consider moving off of MMAPv1 in the preceding section. To be clear, in my opinion a vast majority of MongoDB users that are on MMAPv1 can benefit from a switch. Which engine to switch to is the pressing question. Your first decision should be to evaluate whether or not your workload fits into the sweet spot for MMAPv1 by reading the section below. If that section doesn’t describe your application, then the additional sections should help you narrow down your choices.

Now, let’s take a look at what workloads match up with what storage engines.

MMAPv1

Believe it or not, there are some use cases where MMAPv1 is likely to give you as good (or better) performance as any other engine. If you're not worried about the size of your database on disk, then you may not want to bother changing engines. Users that are likely to see no benefit from changing have read-heavy (or 100% read) applications. Also, certain update-heavy use cases, where you're updating small amounts of data or performing $set operations, are likely to be faster on MMAPv1.

WiredTiger

WiredTiger is the new default storage engine for MongoDB. It is a good option for general workloads that are currently running on MMAPv1. WiredTiger will give you good performance for most workloads and will reduce your storage utilization with compression that's enabled by default. If you have a write-heavy workload, or are approaching high I/O utilization (>55%) with plans for it to rise, then you might benefit from a migration to WiredTiger.

MongoRocks (RocksDB from Facebook)

This is Facebook’s baby, which was forged in the fires of the former Parse business unit. MongoRocks, which uses LSM indexing, is advertised as “designed to work with fast storage.” Don’t let this claim fool you. For workloads that are heavy on writes, highly concurrent or approaching disk bound, MongoRocks could give you great benefits. In terms of compression, MongoRocks has the ability to efficiently handle deeper compression algorithms, which should further decrease your storage requirements.

In-Memory

The in-memory engine, whether we’re speaking about the MongoDB or Percona implementation, should be used for workloads where extreme low latency is the most important requirement. The types of applications that I’m talking about are usually low-latency, “real-time” apps — like decision making or user session tracking. The in-memory engine is not persistent, so it operates strictly out of the cache allocated to MongoDB. Consequently, the data may (and likely will) be lost if the server crashes.

Encrypted

This is for applications in highly secure environments where on-disk encryption is necessary for compliance. This engine will protect the MongoDB data in the case that a disk or server is stolen. On the flip side, this engine will not protect you from a hacker that has access to the server (MongoDB shell), or can intercept your application traffic. Another way to achieve the same level of encryption for compliance is using volume level encryption like LUKS. An additional benefit of volume level encryption, since it works outside the database, is re-use on all compliant servers (not just MongoDB).

Getting to your new engine

Switching to the new engine is actually pretty easy, especially if you’re running a replica set. One important caveat is that unlike MySQL, the storage engine can only be defined per mongod process (not per database or collection). This means that it’s an all or nothing operation on a single MongoDB process. You’ll need to reload your data on that server. This is necessary because the data files from one engine are not compatible with another engine. Thus reloading the data to transform from one engine format to another is necessary. Here are the high-level steps (assuming you’re running a replica set):

  1. Make sure you’re not in your production environment
  2. Backup your data (it can’t hurt)
  3. Remove a replica set member
  4. Rename (or delete) the old data directory. The member will re-sync with the replica set
    • Make sure you have enough disk space if you’re going to keep a copy of the old data directory
  5. Update the mongo.conf file to use a new storage engine. Here’s an example for RocksDB from our documentation:
    storage:
     engine: rocksdb
     rocksdb:
       cacheSizeGB: 4
       compression: snappy
  6. Start the MongoDB process again
  7. Join the member to the replica set (initial sync will happen)
  8. When the updated member is all caught up, pick another member and repeat the process.
  9. Continue until the primary is the only server left. At this point, you should step down the primary, but hold off switching storage engines until you are certain that the new storage engine meets your needs.

The Wrap Up

At this point I’ve explained how you can understand your options, where you can gain additional performance and what engines to evaluate. Please don’t forget to test your application with the new setup before launching into production. Please drop a comment below if you found this helpful or, on the other hand, if there’s something that would make it more useful to you. Chances are, if you’d find something helpful, the rest of the community will as well.

by Jon Tobin at March 08, 2017 11:16 PM

Percona Live Featured Session with Bogdan Munteanu: Edgestore Multi-Tenancy and Isolation

Percona Live Featured Session

Welcome to another post in the series of Percona Live featured talk blogs! In these blogs, we'll highlight some of the session speakers that will be at this year's Percona Live conference. We'll also discuss how these sessions can help you improve your database environment. Make sure to read to the end to get a special Percona Live 2017 registration bonus!

In this Percona Live featured session, we’ll meet Bogdan Munteanu, Software Engineer at Dropbox. His session is Edgestore Multi-tenancy & Isolation. Edgestore is Dropbox’s distributed metadata store, used by hundreds of products, services and features (both internal and external). Dropbox shares a single Edgestore deployment for all workloads, which has many benefits. At the same time it also poses challenges around multi-tenancy and isolation.

I had a chance to speak with Bogdan about Edgestore:

Percona: How did you get into database technology? What do you love about it?

Bogdan: I am very passionate about large-scale distributed systems, as well as storage in general. After joining Dropbox and learning about the scale, growth and technical challenges of Dropbox’s metadata store, I decided to jump in.

One thing I love about database and database services is that for every company and deployment, they are critical, highly impactful systems. You’re always in the thick of the action! 🙂

Percona: Your talk is called Edgestore Multi-tenancy & Isolation. What does Edgestore do in Dropbox’s environment?

Bogdan: Edgestore is the metadata store that powers most of Dropbox’s products and features. Built on top of thousands of MySQL shards, it currently serves over six million RPS and stores three-trillion-plus objects.

Percona: What are the challenges that you faced at Dropbox around multi-tenancy and isolation?

Bogdan: As I mentioned earlier, most of Dropbox's products and features use Edgestore. In order to achieve isolation, many companies end up allocating dedicated database clusters for each product, or enforce hard limits on how much RPS each product can send.

We took a different approach: Edgestore is a single cluster serving all traffic and workloads. As you can probably guess, sometimes different products generate loads that negatively impact others. The challenge is to isolate the different types of workloads and traffic so as to prevent this from happening.

Due to our growth, new use cases come up all the time. It’s critical to have a mechanism that quickly detects who is causing a performance-impacting load, and then throttles back that specific source.

Percona: What do you want attendees to take away from your session? Why should they attend?

Bogdan: I think it would be interesting for folks to learn about the Edgestore architecture and workloads, some of our internals and the mechanisms we use to scale a one-size-fits-all metadata store while maintaining 99.99% availability.

Percona: What are you most looking forward to at Percona Live 2017?

Bogdan: I am really excited about several of the scheduled talks and workshops. There is a really good line up this year. It is also a great opportunity to meet other folks with similar interests in database services.

Register for Percona Live Data Performance Conference 2017, and see Bogdan present his session on Edgestore Multi-tenancy & Isolation. Use the code FeaturedTalk and receive $100 off the current registration price!

Percona Live Data Performance Conference 2017 is the premier open source event for the data performance ecosystem. It is the place to be for the open source community as well as businesses that thrive in the MySQL, NoSQL, cloud, big data and Internet of Things (IoT) marketplaces. Attendees include DBAs, sysadmins, developers, architects, CTOs, CEOs, and vendors from around the world.

The Percona Live Data Performance Conference will be April 24-27, 2017 at the Hyatt Regency Santa Clara & The Santa Clara Convention Center.

by Dave Avery at March 08, 2017 08:17 PM

Jean-Jerome Schmidt

New NinesControl Features Simplify Running MySQL & MongoDB in the Cloud

We were thrilled to announce the latest version of NinesControl early this week. Let’s have a closer look at some of the new features that were introduced.

Web console for accessing SSH, SQL and MongoDB command line

It is now possible to have SSH access from the NinesControl interface to any of your database servers, via a web based SSH proxy.

WebSSH and WebSQL - you can easily access your database’s shell and MySQL CLI directly from the NinesControl web page:

Just check the drop down menu for any of the database nodes, and you’ll see both WebSSH and WebSQL. If you are running MongoDB, you will have a MongoDB shell. Clicking on them opens a new window, with a connection to your database. It can be either access to the command line:

NinesControl WebSSH interface
NinesControl WebSSH interface

or access to the MySQL CLI:

NinesControl SQL CLI
NinesControl SQL CLI

Or access to the MongoDB CLI:

NinesControl MongoDB shell
NinesControl MongoDB shell

We use dedicated users to make such access possible - for the shell, it is the 'ninescontrol' user, and for the MySQL CLI, 'ninescontroldb@localhost'. For MongoDB, it is the admin user you define when creating the MongoDB cluster. Make sure you don't make changes to those users if you want to keep this method of access available.

Add node to the cluster

With the new release of NinesControl, you have the tool to scale your cluster. If you ever find yourself in a position where you need one more database node to handle the load, you can easily add it through the “Add Node” action:

Adding a node to a cluster in NinesControl
Adding a node to a cluster in NinesControl

You will be presented with a screen where you need to pick the size of the new node:

Adding a node to a cluster in NinesControl
Adding a node to a cluster in NinesControl

After you click “Add Node”, the deployment process begins:

Adding a node to a cluster in NinesControl
Adding a node to a cluster in NinesControl

After a while, the node should be up and running.

Nodes running in NinesControl
Nodes running in NinesControl

Of course, whenever you feel like you don’t utilize all of your nodes, you can remove some of them:

Removing a node in NinesControl
Removing a node in NinesControl

While scaling your cluster up and down, please keep in mind that it is recommended to have an odd number of nodes in both Galera and MongoDB clusters. You also should not reduce the number of nodes below three - this is a requirement if you want your cluster to be fault-tolerant.

Disable autorecovery for the cluster

NinesControl works in the background to make sure your cluster is up and running and your application can reach it and issue queries. Failed nodes are automatically recovered and restarted. Still, it may happen that you don’t want NinesControl to bring a node back up. It could be that you are performing some maintenance which requires the database instance to stay down. Maybe you want to restore an external binary backup and then bootstrap the rest of the cluster from that node? Right now it is extremely easy to disable automated recovery - all you need to do is to click on the Autorecovery switch in the UI:

Disabling Auto Recovery in NinesControl
Disabling Auto Recovery in NinesControl

It will change to:

Auto Recovery Disabled in NinesControl
Auto Recovery Disabled in NinesControl

Right now NinesControl will not attempt to restore nodes which failed.

Google Compute Engine Support

Last but definitely not least, NinesControl now also supports Google Compute Engine. You can learn more on how to setup access credentials and deploy MySQL or MongoDB on this new cloud provider.

We hope that this blog post helped you better understand the new features available in NinesControl. Please give them a try and let us know what you think.

by krzysztof at March 08, 2017 07:24 PM

MariaDB Foundation

MariaDB 10.0.30 now available

The MariaDB project is pleased to announce the immediate availability of MariaDB 10.0.30. This is a Stable (GA) release. See the release notes and changelog for details. Download MariaDB 10.0.30 Release Notes Changelog What is MariaDB 10.0? MariaDB APT and YUM Repository Configuration Generator Thanks, and enjoy MariaDB!

The post MariaDB 10.0.30 now available appeared first on MariaDB.org.

by Daniel Bartholomew at March 08, 2017 05:22 PM

MariaDB AB

Webyog – M|17 Sponsor

Webyog – M|17 Sponsor guest Wed, 03/08/2017 - 11:10

Authored by Shree Nair, Product Manager, Webyog

We are very excited to sponsor MariaDB’s inaugural user conference, M|17! It’s a great opportunity to introduce Webyog to the MariaDB community, and demonstrate the close collaboration between Webyog and MariaDB – we work together to provide powerful administrative tools for the fastest growing open source database.

By partnering with MariaDB, one of the most popular databases in the world, we have been able to help thousands of organizations. The products and services from Webyog and MariaDB form a joint solution for meeting database challenges as the market matures.

We develop best-in-class management and monitoring tools (SQLyog and Monyog) for MariaDB. In fact, they are included in a MariaDB Enterprise subscription. We offer quality products at an affordable price – our core competitive advantage. We consider word of mouth to be the key to our success and a strong barometer of quality. Providing the open source community with the most cost-effective management and monitoring tools is our mission, which is why we partnered with MariaDB.

While we intend to compete with enterprise solutions providers in the future, we will continue contributing to the open source community. We have received requests from other partners to better align Webyog in the interests of supporting their customers; however, we believe the best way to serve the open source community is with MariaDB.

While we have seen a growing interest from large MariaDB deployments in enterprises, we will continue to provide our world-class tools to MariaDB customers at no additional cost. Right now, we are working on a version of Monyog, optimized for MariaDB, that will include an unrivaled performance monitoring and trending system for key metrics.

by guest at March 08, 2017 04:10 PM

MariaDB Foundation

Alibaba Cloud becomes a Platinum Sponsor of the MariaDB Foundation

MariaDB Foundation today announced that Alibaba Cloud, the cloud computing arm of Alibaba Group, has become its platinum sponsor. The sponsorship will help the Foundation in its goals to ensure continuity and open collaboration in the MariaDB ecosystem, and to drive adoption, serving an ever growing community of users and developers.   “Alibaba Cloud is […]

The post Alibaba Cloud becomes a Platinum Sponsor of the MariaDB Foundation appeared first on MariaDB.org.

by Ian Gilfillan at March 08, 2017 11:28 AM

Peter Zaitsev

How to Change MongoDB Storage Engines Without Downtime

MongoDB Storage Engines

This blog is another in the series for the Percona Server for MongoDB 3.4 bundle release. Today's blog post is about how to migrate between Percona Server for MongoDB storage engines without downtime.

Today, the default storage engine for MongoDB is WiredTiger. In previous versions (before 3.2), it was MMAPv1.

Percona Server for MongoDB features some additional storage engines, giving a DBA the freedom to choose the best storage based on the application workload. Our storage engines include MMAPv1, WiredTiger, MongoRocks and the Percona Memory Engine.

By design, each storage engine has its own algorithms and disk usage patterns. To switch, we simply stop and start Percona Server for MongoDB using a different storage engine, and reload the data in the new engine's format (as described below).

There are two common methods to change storage engines. One requires downtime, and the second doesn’t.

All the database operations are the same, even if it is using a different storage engine. From the database perspective, it doesn’t matter what storage engine gets used. The database layer asks the persistence API to save or retrieve data regardless.

For a single database instance, the best storage engine migration method is to start replication and add a secondary node with a different storage engine. Then run stepDown() on the primary, making the secondary the new primary (and killing the old primary).

However, this isn’t always an option. In this case, create a backup and use the backup to restore the database.

In the following set of steps, we’ll explain how to migrate a replica set storage engine from WiredTiger to RocksDB without downtime. I’m assuming that the replica set is already configured and doesn’t have any replication lag.

Please follow the instructions below:

  1. Check replica set status and identify the primary and secondaries. (Part of the output has been hidden to make it easier to read.):
    foo:PRIMARY> rs.status()
    {
    	"set" : "foo",
    	"date" : ISODate("2017-02-18T18:47:54.349Z"),
    	"myState" : 2,
    	"term" : NumberLong(2),
    	"syncingTo" : "adamo-percona:27019",
    	"heartbeatIntervalMillis" : NumberLong(2000),
    	"members" : [
    		{
    			"_id" : 0,
    			"name" : "test:27017",
    			"stateStr" : "PRIMARY" (...)
    		},
    		{
    			"_id" : 1,
    			"name" : "test:27018",
    			"stateStr" : "SECONDARY" (...)
    		},
    		{
    			"_id" : 2,
    			"name" : "test:27019",
    			"stateStr" : "SECONDARY" (...)
    		} { ... }
    	],
    	"ok" : 1
    }
  2. Choose the secondary for the new storage engine, and change its priority to 0:

    foo:PRIMARY> cfg = rs.config()

    We are going to work with test:27018 and test:27019. They are, respectively, indexes 1 and 2 in the members array.
  3. Make the last secondary the first instance to get the new storage engine:

    foo:PRIMARY> cfg.members[2].name
    test:27019
    foo:PRIMARY> cfg.members[2].priority = 0
    0
    foo:PRIMARY> cfg.members[2].hidden = true
    true
    foo:PRIMARY> rs.reconfig(cfg)
    { "ok" : 1 }
  4. Check if the configuration is in place:
    foo:PRIMARY>rs.config()
    {
    	"_id" : "foo",
    	"version" : 4,
    	"protocolVersion" : NumberLong(1),
    	"members" : [
    		{
    			"_id" : 0,
    			"host" : "test:27017",
    			"arbiterOnly" : false,
    			"buildIndexes" : true,
    			"hidden" : false,
    			"priority" : 1,
    			"votes" : 1
    		},
    		{
    			"_id" : 1,
    			"host" : "test:27018",
    			"arbiterOnly" : false,
    			"buildIndexes" : true,
    			"hidden" : false,
    			"priority" : 1,
    			"slaveDelay" : NumberLong(0),
    			"votes" : 1
    		},
    		{
    			"_id" : 2,
    			"host" : "test:27019",
    			"arbiterOnly" : false,
    			"buildIndexes" : true,
    			"hidden" : true, <--
    			"priority" : 0, <--
    			"slaveDelay" : NumberLong(0),
    			"votes" : 1
    		}
    	],
    	"settings" : {...}
    }
  5. Then stop the desired secondary and wipe the database folder. As we are running the replica set on a testing box, I'm going to kill the process running on port 27019. If using services, please run sudo service mongod stop on the secondary box. Before starting the mongod process again, add the --storageEngine parameter to the config file or to the startup parameters:
    ps -ef | grep mongod | grep 27019
    kill <mongod pid>
    rm -rf /data3/*
    ./mongod --dbpath data3 --logpath data3/log3.log --fork --port 27019 --storageEngine=rocksdb --replSet foo

    <config file>
    storage:
      engine: rocksdb
  6. This instance is now using the RocksDB storage engine, and it will perform an initial sync to get the data from the primary node. When it finishes, set the hidden flag back to false and let the application query this box:
    foo:PRIMARY> cfg = rs.config()
    foo:PRIMARY> cfg.members[2].hidden = false
    false
    foo:PRIMARY> rs.reconfig(cfg)
    { "ok" : 1 }
  7. Repeat steps 5 and 6 for box test:27018, using the following commands in place of the ones in step 6. This allows one of the secondaries to become the primary. Please be sure all secondaries are healthy before proceeding:
    foo:PRIMARY> cfg = rs.config()
    foo:PRIMARY> cfg.members[2].hidden = false
    false
    foo:PRIMARY> cfg.members[2].priority = 1
    foo:PRIMARY> cfg.members[1].priority = 1
    foo:PRIMARY> rs.reconfig(cfg)
  8. When both secondaries are available for reading and in sync with the primary, we need to change the primary's storage engine. To do so, please perform a stepDown() on the primary, making this instance a secondary. An election is triggered (and may take a few seconds to complete):
    foo:PRIMARY> rs.stepDown()
    2017-02-20T16:34:53.814-0300 E QUERY [thread1] Error: error doing query: failed: network error while attempting to run command 'replSetStepDown' on host '127.0.0.1:27019' :
    DB.prototype.runCommand@src/mongo/shell/db.js:135:1
    DB.prototype.adminCommand@src/mongo/shell/db.js:153:16
    rs.stepDown@src/mongo/shell/utils.js:1182:12
    @(shell):1:1
    2017-02-20T16:34:53.815-0300 I NETWORK [thread1] trying reconnect to 127.0.0.1:27019 (127.0.0.1) failed
    2017-02-20T16:34:53.816-0300 I NETWORK [thread1] reconnect 127.0.0.1:27019 (127.0.0.1) ok
    foo:SECONDARY> rs.status()
  9. Please identify the new primary with rs.status(), and repeat steps 5 and 7 with the old primary.

After this process, the instances will run RocksDB without experiencing downtime (just an election to change the primary).

Please feel free to ping us on Twitter @percona with any questions and suggestions for this blog post.

by Adamo Tonete at March 08, 2017 01:30 AM

March 07, 2017

MariaDB AB

Replication Manager is Ready for Flashback and Much More!

Replication Manager is Ready for Flashback and Much More! svaroqui_g Tue, 03/07/2017 - 13:51

MariaDB 10.2.4 has fantastic new features that perfectly match Replication Manager's ultimate goals: transparent automated failover on MariaDB master slave architecture (with as little as possible lost in transaction:)).  We are going to explore those new features and how Replication Manager uses them for your benefit! 

The first feature is constant binlog fetching from remote master via mysqlbinlog.

Replication Manager will use this feature when your old master comes back to life. It will take a snapshot of the transaction event differences between the position where the newly elected master was introduced and the current position of the rejoining old master.

Those events are saved in a crash directory in the replication-manager working directory for later use.  

Another exciting feature is the binlog flashback.  

Replication Manager will use this feature to reintroduce the old master in the active topology in multiple cases.

The first case is when semi-sync replication was in a synced state during the crash: this is good, as it saves you from recovering the dead master from a backup of the new master.

The picture looks like this: In semi-sync the transactional state of the dead master can be ahead of the new leader, as the sync part in the feature name refers to the state of the client, not the state of the database.

I'll try a metaphor:

The GOAL of regular MariaDB replication is to make sure you mostly never lose a transaction under HA, while accepting the nature of an unpredictable future. So if this were a performance of a show, you could enter even if you are a criminal. If you disturb the show, the show can recover: in this example, if the show gets disturbed, Replication Manager will transfer you and the others to the same show at a later time, or in another place, or begin again at the same position in the show. Semi-sync is the speed-of-light delay, as if the event has already happened on the active stage but never made it to your eyes. We will transfer you before or after that point in time, which is under your control, and make sure that the show stays closely synchronized!

So in semi-sync, when the state is SYNC, the show stops in a state that is ahead of where others would be stuck with a "delayed show". Since clients' connections have never seen those differences, you can flashback by rewinding the show to when the disturbance occurred and continue it from the same point in another location.

This is exactly the same concept as a delayed broadcast. If going to the bathroom takes more time than the broadcast delay, you may have lost some important parts of the story when you press resume.

The second case is when the semi-sync delay has passed, or you have not been running semi-sync replication: we can resume the show, but you have possibly lost events. Replication Manager can either flashback or use a dump to recover.

Let's examine the options available to make this happen.

      # MARIADB >= 10.2
      # ---------------
      mariadb-binary-path = "/usr/sbin"
      # REJOIN
      # --------
      autorejoin = true
      autorejoin-semisync = true
      autorejoin-flashback = true
      autorejoin-mysqldump = false

Don't forget to set in the cluster configuration that you want to auto resume:

      interactive = false

The default of Replication Manager is to alert on failure, not to do the job of failover.

Another exciting feature is the “no slaves behind” availability.

Can you use your slaves and transparently load balance reads with the replication-manager topology? The answer used to be "maybe", with MaxScale read-write splitting, but only if you didn't care about reading from a delayed slave in an autocommit workload.

For example, an insert followed by closing the connection and passing the ball to another microservice that reads that same data would be unsafe.

Now there is the possibility to configure the read-write splitter to fail back to the master under a replication delay lower than the one set up via "no slaves behind".

This brings a solution: slow down the master commit workload to stay under that delay, so that reads on slaves can become committed reads!

Extra features:

The new Replication Manager release also addresses the requirement to manage multiple clusters within the same replication-manager instance (note the change in the configuration file).

      [cluster1]
      specific options for this cluster 
      [cluster3]  
      specific options for this cluster 
      [default] 

If you have a single cluster just use default.

In console mode one can switch clusters using Ctrl-P & Ctrl-N, and in HTTP mode a drop-down box is available to switch the active cluster view.

Some respected members of the community have raised possible issues with the choice to separate the failover logic into Replication Manager instead of putting it directly in the MariaDB MaxScale proxy. This new release addresses such concerns.

Let's look at the core new features of Replication Manager when it comes to MaxScale Proxy.

      failover-falsepositive-heartbeat = true
      failover-falsepositive-heartbeat-timeout = 3
      failover-falsepositive-maxscale = true
      failover-falsepositive-maxscale-timeout = 14

One can get the idea just from the names. Having separate pieces makes better false-positive detection of leader death possible. Here, all your slaves take part in leader failure detection, and MaxScale does as well. This is on top of all the previous checks and condition checks.

      failcount = 5
      failover-max-slave-delay = 30
      failover-limit = 3
      failover-at-sync = true
      failover-time-limit = 10 

Stay tuned: as more time passes, more failover-falsepositive methods will be added, as they are already in the roadmap. I guess this task addresses some of our fellow ACE Director's musings. Also, etcd is already in the roadmap, will be worked on in the future, and will surely receive contributions!

While failover and MaxScale monitoring can be tricky (as noted by Johan and Shlomi), Replication Manager addresses the issue of the last available slave being elected as the new master.

In this case, MaxScale is lost without a topology, and this is similar to having a single slave for HA. The solution to address this issue is to let Replication Manager fully drive the MaxScale server states.

      maxscale-monitor = false
      # maxinfo|maxadmin
      maxscale-get-info-method = "maxinfo"
      maxscale-maxinfo-port = 4002
      maxscale-host = "192.168.0.201"
      maxscale-port = 4003
      maxscale-user = "admin"
      maxscale-pass = "mariadb"

By setting maxscale-monitor = false, replication-manager tells MaxScale to disable its own monitoring, and it will then impose the server states on MaxScale.

Don't forget to simply activate MaxScale usage:

      maxscale = true 

Last but not least of Replication Manager's new features is the tracking of metrics via an embedded Carbon/Graphite server. This internal server can act as a relay to Graphite for custom reporting, and is also used by Replication Manager's HTTP server.

 


 

      graphite-metrics = true
      graphite-carbon-host = "127.0.0.1"
      graphite-carbon-port = 2003
      graphite-embedded = true
      graphite-carbon-api-port = 10002
      graphite-carbon-server-port = 10003
      graphite-carbon-link-port = 7002
      graphite-carbon-pickle-port = 2004
      graphite-carbon-pprof-port = 7007

All those features pass the new non-regression test cases and can be found in the dev branch of Replication Manager.

New binaries will be available this week for Linux as RPM and DEB packages or as a tar.gz. Contact me or Guillaume for any other OS; we will be pleased to provide custom-built images that match your setup.

Are you thinking, "Hey, this is good technical content, but my team does not know much about the technical details of replication"? No worries! We DO have helpers in Replication Manager to enforce best practices, and it's always best to plan HA before starting any new serious DB project.

      force-slave-heartbeat= true
      force-slave-gtid-mode = true
      force-slave-semisync = true
      force-slave-readonly = true
      force-binlog-row = true
      force-binlog-annotate = true
      force-binlog-slowqueries = true
      force-inmemory-binlog-cache-size = true
      force-disk-relaylog-size-limit = true
      force-sync-binlog = true
      force-sync-innodb = true
      force-binlog-checksum = true

*Note that some of the enforcements above are not yet covered by test cases, and we would welcome any contributors.

 

Happy usage from the Replication Manager team!

MariaDB 10.2.4 has fantastic new features that perfectly match Replication Manager's ultimate goals: transparent automated failover on MariaDB master slave architecture (with as little as possible lost in transaction:)).  We are going to explore those new features and how Replication Manager uses them for your benefit!  


by svaroqui_g at March 07, 2017 06:51 PM

Peter Zaitsev

Improving TokuDB Hot Backup Usage with the autotokubackup Command Line Tool

In this blog post, we'll look at how the command line tool autotokubackup can make TokuDB hot backups easier.

I would like to share an experimental tool named autotokubackup, for TokuBackup. This tool is aimed at helping simplify the life of TokuDB system administrators. autotokubackup is written in the Python language.

General information:

So why would you need this tool? Let’s clarify a bit what you might face while using tokubackup. You have a backup solution that you can use from the MySQL shell:

mysql > set tokudb_backup_dir='/var/lib/tokubackupdir';

Now you want to automate this process. The first problem is that the second backup will fail, because the backup directory must be empty before starting a backup process. One solution is to create time-stamped directories for the backups.
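
As an illustration only, here is a rough shell sketch of that manual workaround before the tool automates it (the credentials are assumptions, not part of the tool):

# Create a fresh, time-stamped directory so that tokudb_backup_dir is always empty
BACKUP_ROOT=/var/lib/tokubackupdir
BACKUP_DIR="${BACKUP_ROOT}/$(date +%Y-%m-%d_%H-%M-%S)"
mkdir -p "${BACKUP_DIR}"
# Point TokuBackup at the new directory; the statement blocks until the backup finishes
mysql -uroot -p -e "SET tokudb_backup_dir='${BACKUP_DIR}';"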

Further, your backup policy may require that some other necessary files are copied as part of the backup process. You need to write a script to put those files into a separate folder under the backup directory.

Another issue you will face is the lack of any clear output on backup progress. The shell just pauses until the backup completes. One possible way to obtain information about the backup process is to display the MySQL processlist in a separate MySQL shell. But it isn't the best way, and there are some issues, as reported here: Unclear status information of backup state while taking backups using TokuBackup.

Generally, we need to know which files are backed up during the backup process. There should also be a clear message indicating the end of the backup process.

To make your life easier, the autotokubackup tool:

  • Automates the TokuDB database backup procedures
  • Creates timestamped backups inside the backup directory, overcoming the need to copy/remove old backups to empty the backup directory
  • Copies all necessary files for your backup policy (you can specify up to ten supplementary files to be included in the backup directory as part of the backup process)
  • Clearly describes what is going to be in the backup directory, by showing the newly created files inside the backup directory
  • Clearly shows the end of backup process

To start, we only need two things:

  • Installed Percona Server with TokuDB engine + TokuBackup plugin
  • Installed Python3

To install the tool, you can use the following methods:

* From source:

cd /home
git clone https://github.com/Percona-Lab/autotokubackup.git
cd autotokubackup
python3 setup.py install

* or via pip3:

pip3 install autotokubackup

The result will be something like:

Collecting autotokubackup
  Downloading autotokubackup-1.1-py3-none-any.whl
Collecting watchdog>=0.8.3 (from autotokubackup)
  Downloading watchdog-0.8.3.tar.gz (83kB)
    100% |████████████████████████████████| 92kB 8.2MB/s
Collecting click>=3.3 (from autotokubackup)
  Downloading click-6.7-py2.py3-none-any.whl (71kB)
    100% |████████████████████████████████| 71kB 10.6MB/s
Collecting mysql-connector>=2.0.2 (from autotokubackup)
  Downloading mysql-connector-2.1.4.zip (355kB)
    100% |████████████████████████████████| 358kB 4.7MB/s
Collecting PyYAML>=3.10 (from watchdog>=0.8.3->autotokubackup)
  Downloading PyYAML-3.12.tar.gz (253kB)
    100% |████████████████████████████████| 256kB 6.5MB/s
Collecting argh>=0.24.1 (from watchdog>=0.8.3->autotokubackup)
  Downloading argh-0.26.2-py2.py3-none-any.whl
Collecting pathtools>=0.1.1 (from watchdog>=0.8.3->autotokubackup)
  Downloading pathtools-0.1.2.tar.gz

After that, there should be a configuration file for this tool located at /etc/tokubackup.conf.

The structure of the config file is:

[MySQL]
mysql=/usr/bin/mysql
user=root
password=12345
port=3306
socket=/var/run/mysqld/mysqld.sock
host=localhost
datadir=/var/lib/mysql
[Backup]
backupdir=/var/lib/tokubackupdir
[Copy]
# The following copy_file_x options allow you to copy various files together with your backup
# Highly recommended; a copy of your my.cnf file (usually /etc/my.cnf) and any cnf files referenced from it (i.e. includedir etc.)
# You can also include other files you would like to take a copy of, like for example a text report or the mysqld error log
# copy_file_1=
# copy_file_2=
# copy_file_...=
# copy_file_10=
#copy_file_1=/etc/my.cnf
#copy_file_2=/var/log/messages
#copy_file_3=
#copy_file_4=
#copy_file_5=
#copy_file_6=
#copy_file_7=
#copy_file_8=
#copy_file_9=
#copy_file_10=

You can change the options to reflect your environment and start using the tool. The available command line options can be displayed using --help:

tokubackup --help
Usage: tokubackup [OPTIONS]
Options:
--backup              Take full backup using TokuBackup.
--version             Version information.
--defaults_file TEXT  Read options from the given file
--help                Show this message and exit.

You can prepare different config files, for example, one for the slave. Specify it using the --defaults_file option, and the overall result of the run should be something like the below:

tokubackup --backup --defaults_file=/etc/tokubackup_node2.conf
Backup will be stored in  /var/lib/tokubackupdir/2017-02-09_20-25-40
Running backup command => /home/sh/percona-server/5.7.17/bin/mysql -uroot --password=msandbox --host=localhost --socket=/tmp/mysql_sandbox20194.sock -e set tokudb_backup_dir='/var/lib/tokubackupdir/2017-02-09_20-25-40'
mysql: [Warning] Using a password on the command line interface can be insecure.
Created file in backup directory -> /var/lib/tokubackupdir/2017-01-31_14-15-46/mysql_data_dir/__tokudb_lock_dont_delete_me_data
Created file in backup directory -> /var/lib/tokubackupdir/2017-01-31_14-15-46/mysql_data_dir/__tokudb_lock_dont_delete_me_logs
Created file in backup directory -> /var/lib/tokubackupdir/2017-01-31_14-15-46/mysql_data_dir/__tokudb_lock_dont_delete_me_temp
Created file in backup directory -> /var/lib/tokubackupdir/2017-01-31_14-15-46/mysql_data_dir/log000000000006.tokulog29
Created file in backup directory -> /var/lib/tokubackupdir/2017-01-31_14-15-46/mysql_data_dir/tokudb.rollback
Created file in backup directory -> /var/lib/tokubackupdir/2017-01-31_14-15-46/mysql_data_dir/tokudb.environment
Created file in backup directory -> /var/lib/tokubackupdir/2017-01-31_14-15-46/mysql_data_dir/tokudb.directory
Created file in backup directory -> /var/lib/tokubackupdir/2017-01-31_14-15-46/mysql_data_dir/tc.log
Created file in backup directory -> /var/lib/tokubackupdir/2017-01-31_14-15-46/mysql_data_dir/client-key.pem
Created file in backup directory -> /var/lib/tokubackupdir/2017-01-31_14-15-46/mysql_data_dir/server-cert.pem
Created file in backup directory -> /var/lib/tokubackupdir/2017-01-31_14-15-46/mysql_data_dir/server-key.pem
Created file in backup directory -> /var/lib/tokubackupdir/2017-01-31_14-15-46/mysql_data_dir/ca.pem
Created file in backup directory -> /var/lib/tokubackupdir/2017-01-31_14-15-46/mysql_data_dir/ca-key.pem
Created file in backup directory -> /var/lib/tokubackupdir/2017-01-31_14-15-46/mysql_data_dir/auto.cnf
Completed - OK

The backup directory will store the following:

ls -l 2017-02-09_20-25-40/
  copied_files            - Directory for copied files.
  global_variables        - File for MySQL global variables.
  mysql_data_dir          - Directory for copied MySQL datadir.
  session_variables       - File for MySQL session variables.
  tokubackup_binlog_info  - File for storing binary log position.(The new feature for TokuBackup) [Not released yet]
  tokubackup_slave_info   - File for storing slave info.(The new feature for TokuBackup) [Not released yet]

That's it. If you test it and find bugs or have ideas, file a bug report or feature request to further improve our "helper." Thanks! 🙂

by Shahriyar Rzayev at March 07, 2017 12:44 AM

March 06, 2017

Peter Zaitsev

Webinar Thursday, March 9, 2017: Troubleshooting Issues with MySQL Character Sets

Please join Percona's Principal Technical Services Engineer, Sveta Smirnova, as she presents "Troubleshooting Issues with MySQL Character Sets" on March 9, 2017, at 11:00 am PST / 2:00 pm EST (UTC-8).


Many MySQL novices find MySQL character sets support puzzling. But after you understand how it is designed, you will find it much more powerful than many other competing database solutions.

MySQL allows you to specify a character set for every object, and to change it online. For years this has helped to create fast applications that can work with audiences all around the world. However, it also requires any DBA troubleshooting character set issues to have a deep understanding of how they work. Different sort rules and collations can complicate the process.
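
As a quick, illustrative sketch (the table and column names are made up, not taken from the webinar), character sets can be declared per table or per column and converted later:

-- Declare character sets explicitly at the table and column level
CREATE TABLE greetings (
  id INT PRIMARY KEY,
  message VARCHAR(100) CHARACTER SET latin1
) DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;

-- Later, convert the table (data and column definitions) to another character set
ALTER TABLE greetings CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

-- Check which character set variables are in effect for the current session
SHOW VARIABLES LIKE 'character_set%';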

In the webinar we will discuss:

  • Which character sets and collations MySQL supports
  • How they can be set
  • How to understand error messages
  • How to solve character sets/collations compatibility issues
  • What server, application, command line and graphical tool options are available
  • What to check first and how to continue troubleshooting
  • What the various compatibility issues are
  • How to convert data, created in earlier versions
  • What the best practices are

Register for the webinar here.

Sveta Smirnova, Principal Technical Services Engineer

Sveta joined Percona in 2015. Her main professional interests are problem-solving, working with tricky issues, bugs, finding patterns that can solve typical issues quicker and teaching others how to deal with MySQL issues, bugs and gotchas effectively. Before joining Percona, Sveta worked as Support Engineer in MySQL Bugs Analysis Support Group in MySQL AB-Sun-Oracle. She is the author of the book “MySQL Troubleshooting” and JSON UDF functions for MySQL.

 

by Dave Avery at March 06, 2017 11:52 PM

MySQL, –i-am-a-dummy!

In this blog post, we'll look at how "operator error" can cause serious problems (like the one we saw last week with AWS), and how to avoid them in MySQL using --i-am-a-dummy.

Recently, AWS had some serious downtime in their East region, which they explained as the consequence of a bad deployment. It seems like most of the Internet was affected in one way or another. Some on Twitter dubbed it “S3 Dependency Awareness Day.”

Since the outage, many companies (especially Amazon!) are reviewing their production access and deployment procedures. It would be a lie if I claimed I’ve never made a mistake in production. In fact, I would be afraid of working with someone who claims to have never made a mistake in a production environment.

Making a mistake or two is how you learn to have a full sense of fear when you start typing:

UPDATE t1 SET c1='x' ...

I think many of us have experienced forehead sweats and hand shaking in these cases – they save us from major mistakes!

The good news is that MySQL can help you with this. All you have to do is admit that you are human, and use the following command (you can also set this in your user directory .my.cnf):

mysql --i-am-a-dummy

Using this command (also known as safe-updates) sets the following SQL mode when logging into the server:

SET sql_safe_updates=1, sql_select_limit=1000, max_join_size=1000000;

The safe-updates and i-am-a-dummy flags were introduced together in MySQL 3.23.11, and according to some sites from around the time of release, it's "for users that once may have done a DELETE FROM table_name but forgot the WHERE clause."

What this does is ensure you can’t perform an UPDATE or DELETE without a WHERE clause. This is great because it forces you to think through what you are doing. If you still want to update the whole table, you need to do something like WHERE ID > 0. Interestingly, safe-updates also blocks the use of WHERE 1, which means “where true” (or basically everything).

The other safety you get with this option is that SELECT is automatically limited to 1000 rows, and JOIN is limited to examining 1 million rows. You can override these latter limits with extra flags, such as:

--select_limit=500 --max_join_size=10000

I have added this to the .my.cnf on my own servers, and definitely use this with my clients.
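
For reference, a minimal sketch of what that .my.cnf entry might look like (the limit values below are just examples):

[mysql]
# Same effect as starting the client with --i-am-a-dummy / --safe-updates
safe-updates
# Optional: override the limits that safe-updates applies by default
select_limit  = 500
max_join_size = 10000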

by Manjot Singh at March 06, 2017 07:54 PM

Jean-Jerome Schmidt

Announcing NinesControl 3.0 with added support for Google Cloud and more

This week we're happy to announce a new release of NinesControl, our cloud service for open source database management, with no need to install anything. NinesControl 3.0 offers extended scaling support with an improved user experience and security, all while adding a new cloud provider, Google Cloud.

Built on the capabilities of ClusterControl, NinesControl enables users to uniformly and transparently deploy and manage secure mixed database environments on any cloud, with no vendor lock-in. It offers quick, easy, point-and-click deployment of a standalone or a clustered SQL and NoSQL database on a cloud provider of your choice; and each provisioned database is automatic, repeatable and completes in minutes.

NinesControl is for developers and admins of all skill levels and removes the complexity and learning curve that typically come with highly-available database clusters. Users of Amazon AWS, DigitalOcean, and now Google Cloud can spin up MySQL or MongoDB instances in under a minute, with more cloud providers and datastores planned. It is currently a free-to-use service with an additional paid-for version in the pipeline. You can sign up for free on ninescontrol.com.

Release Highlights

  • Addition of Google Cloud support
  • New management features around scaling, automated recovery & introduction of new web-based consoles
  • Improved user experience
  • Improved security

Features Highlights

  1. Google Cloud - NinesControl continues to increase the number of supported cloud providers. You can now scale and manage your database on Google Cloud, as well as AWS and Digital Ocean.
  2. SSH Web Terminal - NinesControl expands your ability to easily SSH into your database nodes.
  3. Database Client - NinesControl now provides direct access to the database client interface from within the service; allowing you to connect directly to your nodes via your internet browser by opening up a SQL or MongoDB console.
  4. Stability - Based on feedback from our users and through our continued improvements of the service, NinesControl provides you with the stability you need to manage your open source databases easily and efficiently.
  5. Increased Security - NinesControl continues to improve the security of your database environments with new user management features and security optimizations.
  6. Scaling - In addition to supporting MariaDB or MySQL Galera Cluster and MongoDB, NinesControl now allows you to easily add and remove nodes to meet your growing data needs.
  7. Automatic Recovery - NinesControl now provides automated node and cluster recovery to ensure uptime with your high availability applications.
  8. New Technologies - Support for Percona XtraDB Cluster 5.7
  9. Extend NinesControl to Any Cloud - You can now use NinesControl with any cloud provider of your choosing using our CloudLink Framework with these two new public Github repositories
    1. https://github.com/severalnines/ninescontrol-cloudlink-api
    2. https://github.com/severalnines/ninescontrol-cloudlink

NB: CloudLink is a wrapper framework for cloud providers' APIs. The main goal is to have standardized input and output data structures. The framework is written in Node.js and easy to extend. It uses the native SDKs provided by the cloud vendors. The framework does not require a database and does not store any credentials. In short: it is a smart proxy to the cloud APIs.

Sign up for Ninescontrol (FREE)

Whether you’re a developer who wants an easy way to securely deploy and manage high-availability database setups in any cloud; or you want the flexibility to utilize or migrate to different cloud vendors, and avoid being locked into using a specific vendor; or you simply want full control over your database instances, with the ability to connect via SSH if required … NinesControl has the answers for you. Check our service out today and let us know your feedback.

Happy clustering in the cloud!

by jj at March 06, 2017 02:10 PM

March 04, 2017

Daniël van Eeden

Improving MySQL out of disk space behaviour

Running out of disk space is something which, of course, should never happen, as we all set up monitoring and alerting and only run well-behaved applications. But when it does happen, we want things to fail gracefully.

So what happens when mysqld runs out of disk space?
The answer is: It depends
  1. It might start to wait until disk space becomes available.
  2. It might crash intentionally after a 'long semaphore wait'
  3. It might return an error to the client (e.g. 'table full')
  4. It might skip writing to the binlog (see binlog_error_action )
What actually happens might depend on the filesystem and OS.

Fixing the disk space issue can be done by adding more space or cleaning up some space. The latter can often be done without the help of the administrator of the system.

So I wanted to change the behaviour so that MySQL wouldn't crash or stop responding to read queries, and to make it possible for a user of the system to clean up data and get back to a normal state.

So I wrote an audit plugin which does this:
  1. The DBA sets the maxdiskusage_minfree variable to a threshold for the minimum amount of MB free.
  2. If the amount of free disk space goes under this threshold:
    1. Allow everything for users with the SUPER privilege
    2. Allow SELECT and DELETE
    3. Disallow INSERT
  3. If the amount of free space goes back to normal: Allow everything again
This works, but only if you delete data and then run optimize table to actually make the free space available for the OS.
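
As a purely hypothetical sketch of the resulting workflow (the variable name comes from this post, but the exact syntax and the table name are assumptions, not taken from the plugin's documentation):

-- Hypothetical sketch only: exact variable handling may differ from the plugin on GitHub
SET GLOBAL maxdiskusage_minfree = 2048;  -- threshold in MB of free disk space
-- Below the threshold, a non-SUPER user can still free space...
DELETE FROM app_log WHERE created_at < NOW() - INTERVAL 90 DAY;
-- ...while INSERTs are rejected until free space recovers (and OPTIMIZE TABLE reclaims it)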

Note that DELETE can actually increase disk usage because of binlogs, undo, etc.

The code is available on github: https://github.com/dveeden/mysql_maxdiskusage

by Daniël van Eeden (noreply@blogger.com) at March 04, 2017 05:10 PM

March 03, 2017

Peter Zaitsev

MongoDB Audit Log: Why and How

This blog post is another in the series on the Percona Server for MongoDB 3.4 bundle release. In this blog post, we'll talk about the MongoDB audit log.

Percona's development team has always made investing in the open-source community a priority, especially for MongoDB. As part of this commitment, Percona continues to build MongoDB Enterprise Server features into our free, alternative, open-source Percona Server for MongoDB. One of the key features that we have added to Percona Server for MongoDB is audit logging. Auditing your MongoDB environment strengthens your security and helps you keep track of who did what in your database.

In this blog post, we will show how to enable this functionality, what general actions can be logged, and how you can filter only the information that is important for your use-case.

Enable Audit Log

Audit messages can be logged into syslog, console or file (JSON or BSON format). In most cases, it’s preferable to log to the file in BSON format (the performance impact is smaller than JSON). In the last section, you can find some simple examples of how to further query this type of file.

Enable the audit log in the command line or the config file with:

mongod --dbpath /var/lib/mongodb --auditDestination file --auditFormat BSON --auditPath /var/lib/mongodb/auditLog.bson

auditLog:
   destination: file
   format: BSON
   path: /var/lib/mongodb/auditLog.bson

Just note that until this bug is fixed and released, if you're using Percona Server for MongoDB and the --fork option while starting the mongod instance, you'll have to provide an absolute path for the audit log file instead of a relative path.

Actions logged

Generally speaking, the following actions can be logged:

  • Authentication and authorization
  • Cluster operations
  • Read and write operations (logged under authCheck event and require auditAuthorizationSuccess parameter to be enabled)
  • Schema operations
  • Custom application messages (logged under the applicationMessage event if the client/app issues a logApplicationMessage command; the user needs to have the clusterAdmin role, or one that inherits from it, to issue this command; see the sketch after this list)
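
A minimal mongo shell sketch of such a custom message (assuming a user with the clusterAdmin role; the message text is illustrative):

// Writes a custom entry into the audit log under the applicationMessage event
db.runCommand( { logApplicationMessage: "nightly batch import started" } )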

You can see the whole list of actions logged here.

By default, MongoDB doesn’t log all the read and write operations. So if you want to track those, you’ll have to enable the auditAuthorizationSuccess parameter. They then will be logged under the authCheck event. Note that this can have a serious performance impact.

Also, this parameter can be enabled dynamically on an already running instance with the audit log set up, while some other settings can't be changed once configured.

Enable logging of CRUD operations in the command line or config file:

mongod --dbpath /var/lib/mongodb --setParameter auditAuthorizationSuccess=true --auditDestination file --auditFormat BSON --auditPath /var/lib/mongodb/auditLog.bson

auditLog:
  destination: file
  format: BSON
  path: /var/lib/mongodb/auditLog.bson
setParameter: { auditAuthorizationSuccess: true }

Or to enable it on the running instance, issue this command in the client:

db.adminCommand( { setParameter: 1, auditAuthorizationSuccess: true } )

Filtering

If you don’t want to track all the events MongoDB is logging by default, you can specify filters in the command line or the config file. Filters need to be valid JSON queries on the audit log message (format available here). In the filters, you can use standard query selectors ($eq, $in, $gt, $lt, $ne, …) as well as regex. Note that you can’t change the filters dynamically after the start.

Also, Percona Server for MongoDB 3.2 and 3.4 have slightly different message formats. 3.2 uses a “params” field, and 3.4 uses “param” just like MongoDB. When filtering on those fields, you might want to check for the difference.

Filter only events from one user:

mongod --dbpath /var/lib/mongodb --auditDestination file --auditFormat BSON --auditPath /var/lib/mongodb/auditLog.bson --auditFilter '{ "users.user": "prod_app" }'

auditLog:
  destination: file
  format: BSON
  path: /var/lib/mongodb/auditLog.bson
  filter: '{ "users.user": "prod_app" }'

Filter events from several users based on username prefix (using regex):

mongod --dbpath /var/lib/mongodb --auditDestination file --auditFormat BSON --auditPath /var/lib/mongodb/auditLog.bson --auditFilter '{ "users.user": /^prod_app/ }'

auditLog:
  destination: file
  format: BSON
  path: /var/lib/mongodb/auditLog.bson
  filter: '{ "users.user": /^prod_app/ }'

Filtering multiple event types by using standard query selectors:

mongod --dbpath /var/lib/mongodb --auditDestination file --auditFormat BSON --auditPath /var/lib/mongodb/auditLog.bson --auditFilter '{ atype: { $in: [ "dropCollection", "dropDatabase" ] } }'

auditLog:
  destination: file
  format: BSON
  path: /var/lib/mongodb/auditLog.bson
  filter: '{ atype: { $in: [ "dropCollection", "dropDatabase" ] } }'

Filter read and write operations on all the collections in the test database (notice the double escape of dot in regex):

mongod --dbpath /var/lib/mongodb --auditDestination file --auditFormat BSON --auditPath /var/lib/mongodb/auditLog.bson --setParameter auditAuthorizationSuccess=true --auditFilter '{ atype: "authCheck", "param.command": { $in: [ "find", "insert", "delete", "update", "findandmodify" ] }, "param.ns": /^test\\./ } }'

auditLog:
  destination: file
  format: BSON
  path: /var/lib/mongodb/auditLog.bson
  filter: '{ atype: "authCheck", "param.command": { $in: [ "find", "insert", "delete", "update", "findandmodify" ] }, "param.ns": /^test\\./ } }'
setParameter: { auditAuthorizationSuccess: true }

Example messages

Here are two example messages from an audit log file. The first one is from a failed client authentication, and the second one is where the user tried to insert a document into a collection for which he has no write authorization.

> bsondump auditLog.bson
{"atype":"authenticate","ts":{"$date":"2017-02-14T14:11:29.975+0100"},"local":{"ip":"127.0.1.1","port":27017},"remote":{"ip":"127.0.0.1","port":42634},"users":[],"roles":[],"param":{"user":"root","db":"admin","mechanism":"SCRAM-SHA-1"},"result":18}

> bsondump auditLog.bson
{"atype":"authCheck","ts":{"$date":"2017-02-14T14:15:49.161+0100"},"local":{"ip":"127.0.1.1","port":27017},"remote":{"ip":"127.0.0.1","port":42636},"users":[{"user":"antun","db":"admin"}],"roles":[{"role":"read","db":"admin"}],"param":{"command":"insert","ns":"test.orders","args":{"insert":"orders","documents":[{"_id":{"$oid":"58a3030507bd5e3486b1220d"},"id":1.0,"item":"paper clips"}],"ordered":true}},"result":13}

Querying the audit log for a specific event

The audit log feature is now working, and we have some data in the BSON binary file. How do I query it to find some specific event that interests me? Obviously there are many simple or more complex ways to do that using different tools (Apache Drill or Elasticsearch come to mind), but for the purpose of this blog post, we’ll show two simple ways to do that.

The first way without exporting data anywhere is using the bsondump tool to convert BSON to JSON and pipe it into the jq tool (command-line JSON processor) to query JSON data. Install the jq tool in Ubuntu/Debian with:

sudo apt-get install jq

Or in Centos with:

sudo yum install epel-release
sudo yum install jq

Then, if we want to know who created a database with the name “prod” for example, we can use something like this (I’m sure you’ll find better ways to use the jq tool for querying this kind of data):

> bsondump auditLog.bson | jq -c 'select(.atype == "createDatabase") | select(.param.ns == "prod")'
{"atype":"createDatabase","ts":{"$date":"2017-02-17T12:13:48.142+0100"},"local":{"ip":"127.0.1.1","port":27017},"remote":{"ip":"127.0.0.1","port":47896},"users":[{"user":"prod_app","db":"admin"}],"roles":[{"role":"root","db":"admin"}],"param":{"ns":"prod"},"result":0}

In the second example, we’ll use the mongorestore tool to import data into another instance of mongod, and then just query it like a normal collection:

> mongorestore -d auditdb -c auditcol auditLog.bson
2017-02-17T12:28:56.756+0100    checking for collection data in auditLog.bson
2017-02-17T12:28:56.797+0100    restoring auditdb.auditcol from auditLog.bson
2017-02-17T12:28:56.858+0100    no indexes to restore
2017-02-17T12:28:56.858+0100    finished restoring auditdb.auditcol (142 documents)
2017-02-17T12:28:56.858+0100    done

The import is done, and now we can query the collection for the same data from the MongoDB client:

> use auditdb
switched to db auditdb
> db.auditcol.find({atype: "createDatabase", param: {ns: "prod"}})
{ "_id" : ObjectId("58a6de78bdf080b8e8982a4f"), "atype" : "createDatabase", "ts" : { "$date" : "2017-02-17T12:13:48.142+0100" }, "local" : { "ip" : "127.0.1.1", "port" : 27017 }, "remote" : { "ip" : "127.0.0.1", "port" : 47896 }, "users" : [ { "user" : "prod_app", "db" : "admin" } ], "roles" : [ { "role" : "root", "db" : "admin" } ], "param" : { "ns" : "prod" }, "result" : 0 }

It looks like the audit log in MongoDB/Percona Server for MongoDB is a solid feature. Setting up tracking for information that is valuable to you only depends on your use case.

by Tomislav Plavcic at March 03, 2017 11:24 PM

A Look at MariaDB Subquery Cache

The MariaDB subquery cache feature added in MariaDB 5.3 is not widely known. Let's see what it is and how it works.

What is a subquery cache?

The MariaDB subquery cache optimizes the execution of correlated subqueries. Correlated subqueries refer to a value from the parent query. For example:

SELECT id FROM product WHERE price NOT IN (SELECT MAX(price) FROM product GROUP BY category);

MariaDB only uses this optimization if the parent query is a SELECT, not an UPDATE or a DELETE. The subquery results get cached only for the duration of the parent query.

MariaDB added the subquery cache in v5.3. It is controlled by optimizer_switch, a dynamic variable that contains many flags that enable or disable several optimizations. To disable the subquery cache, run:

SET GLOBAL optimizer_switch='subquery_cache=OFF';

You can also do this at the session level.
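
For example, a session-level version of the same switch would look like this:

SET SESSION optimizer_switch='subquery_cache=off';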

How does subquery cache work?

Let’s see how it works. To make things clearer, we will use an example. Consider these tables:

CREATE TABLE t1 (a INT, b INT);
INSERT INTO t1 VALUES
(1,2),
(3,4),
(1,2),
(3,4),
(3,4),
(3,5),
(3,5),
(5,1),
(5,2),
(3,6),
(1,5);
CREATE TABLE t2 (c INT, d INT);
INSERT INTO t2 VALUES
(1,10),
(2,20),
(3,30),
(4,40);

Now, we issue this query:

SELECT b, (SELECT d FROM t2 WHERE a = c) FROM t1;

The server decides to read t1 first (the bigger table, as expected), and then access t2 using the subquery cache. It creates a MEMORY temporary table to store the results of the subquery, with an index on c (it is used to match the rows). Then it reads the first row from t1, and checks if the search is cached. It is not, so it reads t2 looking for rows with c=1 and copies the results into the cache. The next time it finds the value 1, it will not need to access t2 because the matches are already cached. If you look at the data, you may notice that the value "5" appears twice in t1 (and is absent in t2). But the search is cached anyway, so the server searches for 5 in t2 only once.

I hope that you aren’t blindly accepting what I wrote until now: good DBAs need facts and metrics. Let’s be scientific: we’ll make a prediction, conduct an experiment and check the status variables to verify the prediction. The prediction is the list of rows that will be read from t1, with the running total of hits and misses:

(1,2) -- Miss: 1
(3,4) -- Miss: 2
(1,2) -- Hit:  1
(3,4) -- Hit:  2
(3,4) -- Hit:  3
(3,5) -- Hit:  4
(3,5) -- Hit:  5
(5,1) -- Miss: 3
(5,2) -- Hit:  6
(3,6) -- Hit:  7
(1,5) -- Hit:  8

MariaDB [test]> SHOW STATUS LIKE "subquery_cache%";
+---------------------+-------+
| Variable_name       | Value |
+---------------------+-------+
| Subquery_cache_hit  | 8     |
| Subquery_cache_miss | 3     |
+---------------------+-------+
2 rows in set (0.00 sec)
MariaDB [test]> SHOW STATUS LIKE "handler_tmp_write";
+-------------------+-------+
| Variable_name     | Value |
+-------------------+-------+
| Handler_tmp_write | 3     |
+-------------------+-------+
1 row in set (0.00 sec)

The totals match, and the number of writes to the cache is equal to the misses (after a miss, a table access is done and cached).

The maximum size of an individual table is the minimum of tmp_table_size and max_heap_table_size. If the table size grows over this limit, the table is written to disk. If the MEMORY table creation fails (perhaps because MEMORY does not support BLOB), the subquery is not cached.
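
For instance, to raise those limits for the current session only (the values below are illustrative):

SET SESSION tmp_table_size = 64*1024*1024, max_heap_table_size = 64*1024*1024;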

The total of hits and misses can be seen by reading two status variables: subquery_cache_hit and subquery_cache_miss. After 200 misses, the server checks the hit ratio for that particular subquery. If it is < 20%, it disables the cache for that subquery. If the hit rate is < 70%, the table cannot be written to disk in case it exceeds the size limit. These numbers (200, 0.2, 0.7) are hardcoded and cannot be changed. But if you really want to test how MariaDB behaves with different parameters, you can change these constants in sql/sql_expression_cache.cc and recompile the server.

Isn’t this subquery materialization?

Subquery materialization is another strategy that the optimizer can choose to execute a query. It might look similar, because some data from a subquery are written to a MEMORY table – but this is the only similarity. The purpose and implementation of subquery materialization is different.

Let’s try to explain this with pseudocode.

Subquery materialization is for uncorrelated IN subqueries. Therefore the subquery is executed and materialized altogether:

# Query to optimize:
SELECT ... WHERE col1 IN (subquery)
materialize subquery into a MEMORY table with UNIQUE keys;
foreach (row in outer query) {
	check if col1 current value exists in materialized table
}

The subquery cache is for correlated subqueries. Thus the subquery gets executed only for non-cached values:

# Query to optimize:
SELECT col1, (SELECT ... WHERE ... = col1) ... FROM ...
foreach (outer query row) {
	if (col1 current value is cached) {
		read from cache
	} else {
		read from subquery
		cache col1 current value
	}
}

Some considerations

Despite the similarity in names, the MariaDB subquery cache is not a query cache for subqueries. These features are different, implemented for different purposes. Obviously, the subquery cache doesn’t have the scalability and performance problems of the query cache (global mutex, table invalidation). As mentioned, a subquery cache table only survives for the duration of a statement, so it should be considered an optimizer strategy. For example, in some cases you might use the subquery cache for a WHERE … NOT IN subquery, but not for the WHERE … IN version, because the optimizer prefers to rewrite it as a JOIN.

Of course, not all correlated subqueries automatically benefit from this feature. Consider the example above: it is built to show that the subquery cache is useful. But we can easily build an example to show that it can have a negative impact on performance: add rows to t1, and delete all duplicate values of a. There will be no hits, a temporary table is created, 200 reads and writes are performed, but it won't help. After 200 misses the cache is disabled, yes, but what if this happens for each subquery? The damage may not be huge in a realistic case, but it is still damage. That's why you can disable the MariaDB subquery cache.

by Federico Razzoli at March 03, 2017 02:11 AM

March 02, 2017

Peter Zaitsev

Using Percona Toolkit pt-mongodb-summary

This blog post is another in the series on the Percona Server for MongoDB 3.4 bundle release. In this blog post, we'll look at the pt-mongodb-summary tool in Percona Toolkit.

The pt-mongodb-summary tool from Percona Toolkit provides a quick at-a-glance overview of MongoDB and Percona Server for MongoDB instances. It is equivalent to pt-mysql-summary for MySQL. pt-mongodb-summary also collects information about a MongoDB cluster; it gathers information from several sources to provide an overview of the cluster.

How It Works

The usage for the command is as follows:

pt-mongodb-summary [OPTIONS] [HOST[:PORT]]

Options:

  • -a, --auth-db: Specifies the database used to establish credentials and privileges with a MongoDB server. By default, the admin database is used.
  • -p, --password: Specifies the password to use when connecting to a server with authentication enabled. Do not add a space between the option and its value: -p<password>. If you specify the option without any value, pt-mongodb-summary will ask for the password interactively.
  • -u, --user: Specifies the user name for connecting to a server with authentication enabled.

By default, if you run pt-mongodb-summary without parameters, it tries to connect to the localhost on port 27017. It collects information about the MongoDB instances by running administration commands and formatting the output.
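
For example, a connection to a remote instance with authentication enabled might look like this (the host and credentials are illustrative):

pt-mongodb-summary -u admin -psecret -a admin 10.0.0.5:27017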

Sections

Instances

The first thing the tool does is get the list of hosts connected to the specified MongoDB instance by running the listShards command. It also runs replSetGetStatus on every instance to collect the ID, type, and replica set for each instance.

This host

Next, it gathers information about the host it is connected to by grouping information collected from hostInfo, getCmdLineOpts, serverStatus and the OS process (by process ID). The result provides an overview of the running instance and the underlying OS.

Running ops

This section collects statistics by running the serverStatus command five times at regular intervals (every one second), and provides the minimum, maximum and average operation counters for insert, query, update, delete, getMore and command operations.

Security

This collects information about security configurations by parsing the getCmdLineOpts command output and querying the admin.system.users and admin.system.roles collections.

Oplog

From the MongoDB website:

The oplog (operations log) is a special capped collection that keeps a rolling record of all operations that modify the data stored in your databases. MongoDB applies database operations on the primary and then records the operations on the primary’s oplog. The secondary members then copy and apply these operations in an asynchronous process. All replica set members contain a copy of the oplog, in the local.oplog.rs collection, which allows them to maintain the current state of the database.

How do we get the oplog info? The program collects statistics from the oplog for every host in the cluster, and returns the statistics with the smallest TimeDiffHours value.

Cluster-wide

This section provides information about the number of sharded/unsharded databases, collections and their size. The information is collected by running the listDatabases command, and then running collStats for every collection in every database.

Conditional Sections

You may notice not all sections appear all the time. This is because there are three main patterns:

Sharded Connection to Mongos

  • Instances
  • This host
  • Running ops
  • Security
  • Cluster-wide

ReplicaSet Connection

  • Instances (limited to the current Replica Set)
  • This host
  • Running ops
  • Security
  • Oplog

Standalone Connection

  • Instances (limited to this host)
  • This host
  • Running ops
  • Security

Output Example

The following is an example of the output for pt-mongodb-summary:

./pt-mongodb-summary
# Instances ##############################################################################################
  PID    Host                         Type                      ReplSet                   Engine
 11037 localhost:17001                SHARDSVR/PRIMARY          r1                    wiredTiger
 11065 localhost:17002                SHARDSVR/SECONDARY        r1                    wiredTiger
 11136 localhost:17003                SHARDSVR/SECONDARY        r1                    wiredTiger
 11256 localhost:17004                SHARDSVR/ARBITER          r1                    wiredTiger
 11291 localhost:18001                SHARDSVR/PRIMARY          r2                    wiredTiger
 11362 localhost:18002                SHARDSVR/SECONDARY        r2                    wiredTiger
 11435 localhost:18003                SHARDSVR/SECONDARY        r2                    wiredTiger
 11513 localhost:18004                SHARDSVR/ARBITER          r2                    wiredTiger
 11548 localhost:19001                CONFIGSVR                 -                     wiredTiger
 11571 localhost:19002                CONFIGSVR                 -                     wiredTiger
 11592 localhost:19003                CONFIGSVR                 -                     wiredTiger
# This host
# Mongo Executable #######################################################################################
       Path to executable | /home/karl/tmp/MongoDB32Labs/3.2/bin/mongos
# Report On karl-HP-ENVY ########################################
                     User | karl
                PID Owner | mongos
                 Hostname | karl-HP-ENVY
                  Version | 3.2.4
                 Built On | Linux x86_64
                  Started | 2017-02-22 11:39:20 -0300 ART
                Processes | 12
             Process Type | mongos
# Running Ops ############################################################################################
Type         Min        Max        Avg
Insert           0          0          0/5s
Query            0          0          0/5s
Update           0          0          0/5s
Delete           0          0          0/5s
GetMore          0          0          0/5s
Command          1          1          5/5s
# Security ###############################################################################################
Users  : 0
Roles  : 0
Auth   : disabled
SSL    : disabled
Port   : 0
Bind IP:
# Cluster wide ###########################################################################################
            Databases: 4
          Collections: 21
  Sharded Collections: 5
Unsharded Collections: 16
    Sharded Data Size: 134.87 MB
  Unsharded Data Size: 1.44 GB
          ###  Chunks:
                   5 : samples.col2
                 132 : carlos.sample4
                 400 : carlos.sample3
                  50 : carlos.sample2
                 100 : carlos.sample1
# Balancer (per day)
              Success: 18
               Failed: 0
               Splits: 682
                Drops: 0

 The following is an output example when connected to a secondary in the replica set.

./pt-mongodb-summary localhost:17002
# Instances ##############################################################################################
  PID    Host                         Type                      ReplSet                   Engine
  9247 localhost:17001                SHARDSVR/PRIMARY          r1                    wiredTiger
  9318 localhost:17002                SHARDSVR/SECONDARY        r1                    wiredTiger
  9391 localhost:17003                SHARDSVR/SECONDARY        r1                    wiredTiger
  9466 localhost:17004                SHARDSVR/ARBITER          r1                    wiredTiger
# This host
# Mongo Executable #######################################################################################
       Path to executable | /home/karl/tmp/MongoDB32Labs/3.2/bin/mongod
# Report On karl-HP-ENVY:17002 ########################################
                     User | karl
                PID Owner | mongod
                 Hostname | karl-HP-ENVY:17002
                  Version | 3.2.4
                 Built On | Linux x86_64
                  Started | 2017-02-23 10:26:27 -0300 ART
                  Datadir | labs/r1-2
                Processes | 12
             Process Type | replset
# Running Ops ############################################################################################
Type         Min        Max        Avg
Insert           0          0          0/5s
Query            0          0          0/5s
Update           0          0          0/5s
Delete           0          0          0/5s
GetMore          0          1          1/5s
Command          1          3         13/5s
# Security ###############################################################################################
Users  : 0
Roles  : 0
Auth   : disabled
SSL    : disabled
Port   : 17002
Bind IP:

This next example shows the output when connected to a standalone instance:

/pt-mongodb-summary localhost:27018
# Instances ##############################################################################################
PID Host Type ReplSet Engine
1 localhost:27018 - wiredTiger
# This host
# Report On 2f8862dce6c4 ########################################
PID Owner | mongod
Hostname | 2f8862dce6c4
Version | 3.2.10
Built On | Linux x86_64
Started | 2017-02-23 08:57:36 -0300 ART
Processes | 1
Process Type | mongod
# Running Ops ############################################################################################
Type Min Max Avg
Insert 0 0 0/5s
Query 0 0 0/5s
Update 0 0 0/5s
Delete 0 0 0/5s
GetMore 0 0 0/5s
Command 1 1 5/5s
# Security ###############################################################################################
Users : 0
Roles : 0
Auth : disabled
SSL : disabled
Port : 0
Bind IP:

Conclusion

The pt-mongodb-summary tool is new in Percona Toolkit. In the future, we hope we can make it grow to the size of its MySQL big brother!

by Carlos Salguero at March 02, 2017 11:08 PM

March 01, 2017

Peter Zaitsev

Using Percona Toolkit pt-mongodb-query-digest

This blog post is another in the series on the Percona Server for MongoDB 3.4 bundle release. In this blog post, we'll look at how to use the pt-mongodb-query-digest tool in Percona Toolkit 3.0.

Percona's pt-query-digest is one of our most popular Percona Toolkit MySQL tools. It is used on a daily basis by DBAs and developers to help identify the queries consuming the most resources. It helps in finding bottlenecks and optimizing database usage. pt-mongodb-query-digest is a similar tool for MongoDB.

About the Profiler

Before we start, remember that the MongoDB database profiler is disabled by default, and should be enabled. It can be enabled server-wide, but the full mode that logs all queries is not recommended in production unless you are using Percona Server for MongoDB 3.2 or higher. We added a feature that allows setting a sample rate for non-slow queries (like in MySQL) to limit the overhead this causes.

Additionally, by default, the profiler collection is only 1MB per database. You may want to drop and recreate the profiler collection with a size sufficient to make the results useful. To do this, use:

org_prof_level = db.getProfilingLevel();
//Disable Profiler
db.setProfilingLevel(0);
db.system.profile.drop();
//Setup  a  100M profile  1*Math.pow(1024,2) == 1M
profiler_size = 100 * Math.pow(1024,2);
db.runCommand( { create: "system.profile", capped: true, size: profiler_size } );
db.setProfilingLevel(org_prof_level);

According to the documentation, to check if the profiler is enabled for the samples database, run:

`echo "db.getProfilingStatus();" | mongo localhost:17001/samples`

Remember, you need to connect to a MongoDB instance, not a mongos. The output will be something like this:

MongoDB shell version: 3.2.12
connecting to: localhost:17001/samples
{ "was" : 0, "slowms" : 100 }
bye

The value for the field “was” is 0, which means profiling is disabled. Let’s enable the profiler for the samples database.

You must enable the profiler on all MongoDB instances that could be related to a shard of our database. To check on which instances we should enable the profiler, I am going to use the pt-mongodb-summary tool. It shows us the information we need about our cluster:

./pt-mongodb-summary
# Instances ##############################################################################################
  PID    Host                         Type                      ReplSet                   Engine
 11037 localhost:17001                SHARDSVR/PRIMARY          r1                    wiredTiger
 11065 localhost:17002                SHARDSVR/SECONDARY        r1                    wiredTiger
 11136 localhost:17003                SHARDSVR/SECONDARY        r1                    wiredTiger
 11256 localhost:17004                SHARDSVR/ARBITER          r1                    wiredTiger
 11291 localhost:18001                SHARDSVR/PRIMARY          r2                    wiredTiger
 11362 localhost:18002                SHARDSVR/SECONDARY        r2                    wiredTiger
 11435 localhost:18003                SHARDSVR/SECONDARY        r2                    wiredTiger
 11513 localhost:18004                SHARDSVR/ARBITER          r2                    wiredTiger
 11548 localhost:19001                CONFIGSVR                 -                     wiredTiger
 11571 localhost:19002                CONFIGSVR                 -                     wiredTiger
 11592 localhost:19003                CONFIGSVR                 -                     wiredTiger

We have mongod service running on the localhost on ports 17001~17003 and 18001~18003.

Now, let’s enable the profiler for the samples database on those instances. For this example, I am going to set the profile level to “2”, to collect information about all queries.

for port in 17001 17002 17003 18001 18002 18003; do echo "db.setProfilingLevel(2);" | mongo localhost:${port}/samples; done

Running pt-mongodb-query-digest

Now we are ready to get statistics about our queries. To run pt-mongodb-query-digest, we need to specify at least "host:port/database", like:

./pt-mongodb-query-digest localhost:27017/samples

The output will be something like this (I am showing a section for only one query):

# Query 0:  0.27 QPS, ID 2c0e2f94937d6660f510adeea98618f3
# Ratio    1.00  (docs scanned/returned)
# Time range: 2017-02-22 12:27:21.004 -0300 ART to 2017-02-22 12:28:00.867 -0300 ART
# Attribute            pct     total        min         max        avg         95%        stddev      median
# ==================   ===   ========    ========    ========    ========    ========     =======    ========
# Count (docs)                   845
# Exec Time ms          99      1206           0         697           1           0          29           0
# Docs Scanned           7    594.00        0.00       75.00        0.70        0.00        7.19        0.00
# Docs Returned          7    594.00        0.00       75.00        0.70        0.00        7.19        0.00
# Bytes recv             0      8.60M     215.00        1.06M      10.17K     215.00      101.86K     215.00
# String:
# Namespaces          samples.col1
# Operation           query
# Fingerprint         user_id
# Query               {"user_id":{"$gte":3506196834,"$lt":3206379780}}

From the output, we can see that this query was seen 97 times, and it provides statistics for the number of documents scanned/retrieved by the server, the execution time and size of the results. The tool also provides information regarding the operation type, the fingerprint and a query example to help to identify the source. 

By default, the results are sorted by query count. It can be changed by setting the --order-by parameter to: count, ratio, query-time, docs-scanned or docs-returned.

A “-” in front of the field name denotes the reverse order. Example:

--order-by=-ratio

When considering what ordering to use, ask whether you are looking for the most common queries (-count), the most cache-abusive queries (-docs-scanned), or the worst ratio of documents scanned to returned (-ratio). Please note you may be tempted to use -query-time; however, you will find that this almost always surfaces queries that are affected by issues rather than causing them.
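
For instance, to surface the most cache-abusive queries in the sample database used above, the invocation might look like this:

./pt-mongodb-query-digest localhost:27017/samples --order-by=-docs-scanned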

Conclusion

This is a new tool in the Percona Toolkit. We hope in the future we can make it grow like its big brother for MySQL (pt-query-digest). This tool helps DBAs and developers identify and solve bottlenecks, and keep servers running at top performance.

by Carlos Salguero at March 01, 2017 11:21 PM

Open Source Databases on Big Machines: Disk Speed and innodb_io_capacity

In this blog post, I’ll look for the bottleneck that prevented the performance in my previous post from achieving better results.

The powerful machine I used in the tests in my previous post has a comparatively slow disk, and therefore I expected my tests would hit a point when I couldn’t increase performance further due to the disk speed.

Hardware configuration:

Processors: physical = 4, cores = 72, virtual = 144, hyperthreading = yes
Memory: 3.0T
Disk speed: about 3K IOPS
OS: CentOS 7.1.1503
File system: XFS

Versions tested and configuration: same as in the first post of this series (check the post for specifics).

Even though I expected my tests would stop increasing in performance due to the disk speed, I did not observe high IO rates in the iostat output. I already tested with a full data set that fits in memory. In this case, write performance only affected data flushes and log writes. But we should still see a visible decrease in speed. So I decided to try RW tests totally in memory. I created a ramdisk and put the MySQL datadir on it. Surprisingly, results on the SSD and ramdisk did not differ.
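
For reference, a rough sketch of such a ramdisk setup (the mount point, size and paths are assumptions specific to this illustration, not the exact commands used in the test):

# Create a tmpfs mount large enough to hold the datadir
mkdir -p /mnt/ramdisk
mount -t tmpfs -o size=200G tmpfs /mnt/ramdisk
# Copy the existing datadir onto the ramdisk and fix ownership
rsync -a /var/lib/mysql/ /mnt/ramdisk/mysql/
chown -R mysql:mysql /mnt/ramdisk/mysql
# Then point datadir=/mnt/ramdisk/mysql in my.cnf and restart mysqld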

I asked my colleagues from “Postgres Professional” to test PostgreSQL with the ramdisk. They got similar results:

It's interesting that the value of innodb_io_capacity does not have any effect on this situation. Data for the graph below was taken when I ran tests on ramdisk. I wanted to see if I could control the IO activity of a disk, which is extremely fast by default, using this variable.

This totally contradicts all my past experiences with smaller machines. Percona re-purposed the machine with a faster disk (which I used before, described in this post), so I used a similar one with slower disk speed.

Hardware configuration:

Processors: physical = 2, cores = 12, virtual = 24, hyperthreading = yes
Memory: 47.2G
Disk speed: about 3K IOPS
OS: Ubuntu 14.04.5 LTS (trusty)
File system: ext4

Again, in this case innodb_io_capacity benchmarks with a smaller number of CPU cores showed more predictable results.

Conclusion:

Both MySQL and PostgreSQL on a machine with a large number of CPU cores hit CPU resources limits before disk speed can start affecting performance. We only tested one scenario, however. With other scenarios, the results might be different.

by Sveta Smirnova at March 01, 2017 11:00 PM

February 28, 2017

Peter Zaitsev

Percona Monitoring and Management (PMM) Graphs Explained: MongoDB MMAPv1

This blog post is another in the series on the Percona Server for MongoDB 3.4 bundle release. In this blog post, I hope to cover some areas to watch with Percona Monitoring and Management (PMM) when running MMAPv1. The graph examples from this article are from the MMAPv1 dashboard that will be released for the first time in PMM 1.1.2.

Since the very beginning of MongoDB, the MMAPv1 storage engine has existed. MongoDB 3.0 added a pluggable storage engine API. You could only use MMAPv1 with MongoDB before that. While MMAPv1 often offers good read performance, it has become famous for its poor write performance and fragmentation at scale. This means there are many areas to watch for regarding performance and monitoring.

Percona Monitoring and Management (PMM)

Percona Monitoring and Management (PMM) is an open-source platform for managing and monitoring MySQL and MongoDB. It was developed by Percona on top of open-source technology. Behind the scenes, the graphing features this article covers use Prometheus (a popular time-series data store), Grafana (a popular visualization tool), mongodb_exporter (our MongoDB database metric exporter) plus other technologies to provide database and operating system metric graphs for your database instances.

(Beware of) MMAPv1

mmap() is a system-level call that causes the operating system kernel to map on-disk files to memory while they are being read and written by a program.

As mmap() is a core feature of the Unix/Linux operating system kernel (and not the MongoDB code base), I’ve always felt that calling MMAPv1 a “storage engine” is quite misleading, although it does allow for a simpler explanation. The distinction and drawbacks of the storage logic being in the operating system kernel vs. the actual database code (like most database storage engines) becomes very important when monitoring MMAPv1.

As Unix/Linux are general-purpose operating systems that can have many processes, users and use cases, they offer limited OS-level metrics in terms of activity, latency and performance of mmap(). Those metrics are for the entire operating system, not just for the MongoDB processes.

mmap() uses memory from available OS-level buffers/caches for mapping the MMAPv1 data to RAM — memory that can be “stolen” away by any other operating system process that asks for it. As many deployments “micro-shard” MMAPv1 to reduce write locks, this statement can become exponentially more important. If 3 x MongoDB instances run on a single host, the kernel fights to cache and evict memory pages created by 3 x different instances with no priority or queuing, essentially at random, while creating contention. This causes inefficiencies and less-meaningful monitoring values.

When monitoring MMAPv1, you should consider MongoDB AND the operating system as one “component” more than most engines. Due to this, it is critical that a database host runs a single MongoDB instance with no other processes except database monitoring tools such as PMM’s client. This allows MongoDB to be the only user of the operating system filesystem cache that MMAPv1 relies on. This also makes OS-level memory metrics more accurate because MongoDB is the only user of memory. If you need to “micro-shard” instances, I recommend using containers (Docker or plain cgroups) or virtualization to separate your memory for each MongoDB instance, with just one MongoDB instance per container.

Locking

MMAPv1 has locks for both reads and writes. In the early days the lock was global only. Locking became per-database in v2.2 and per-collection in v3.0.

Locking is the leading cause of the performance issues we see on MMAPv1 systems, particularly write locking. To measure how much locking an MMAPv1 instance is waiting on, first we look at the “MMAPv1 Lock Ratio”:

Another important metric to watch is “MongoDB Lock Wait Time”, which breaks down the amount of time operations spend waiting on locks:

Three factors in combination influence locking:

  1. Data hotspots — if every query hits the same collection or database, locking increases
  2. Query performance — a lock is held for the duration of an operation; if that operation is slow, lock time increases
  3. Volume of queries — self-explanatory

Page Faults

Page faults happen when MMAPv1 data is not available in the cache and needs to be fetched from disk. On systems with data that is smaller than memory, page faults usually only occur on reboot, or if the filesystem cache is dumped. On systems where data exceeds memory, page faults happen more frequently, as MongoDB is asked for data that is not in memory.

How often this happens depends on how your application accesses your data. If it accesses new or frequently-queried data, it is more likely to be in memory. If it accesses old or infrequent data, more page faults occur.

If page faults suddenly start occurring, check to see if your data set has grown beyond the size of memory. You may be able to reduce your data set by removing fragmentation (explained later).

Journaling

As MMAPv1 eventually flushes changes to disk in batches, journaling is essential for running MongoDB with any real data integrity guarantees. As well as being included in the lock statistic graphs mentioned above, there are some good metrics for journaling (which is a heavy consumer of disk writes).

Here we have “MMAPv1 Journal Write Activity”, showing the data rates of journaling (max 19MB/sec):

“MMAPv1 Journal Commit Activity” measures the commits to the journal ops/second:

A very useful metric for write query performance is “MMAPv1 Journaling Time” (there is another graph with 99th percentile times):

This is important to watch, as write operations need to wait for a journal commit. In the above example, “write_to_journal” and “write_to_data_files” are the main metrics I tend to look at. “write_to_journal” is the rate of changes being written to the journal, and “write_to_data_files” is the rate that changes are written to on-disk data.

If you see very high journal write times, you may need faster disks or, in sharding scenarios, more shards, since adding shards spreads out the disk write load.

Background Flushing

“MMAPv1 Background Flushing Time” graphs the background operation that flushes changes to disk:

This process does not block the database, but does cause more disk activity.

Fragmentation

Due to the way MMAPv1 writes to disk, it creates a high rate of fragmentation (or holes) in its data files. Fragmentation slows down scan operations, wastes some filesystem cache memory and can use much more disk space than there is actual data. On many systems I’ve seen, the MMAPv1 data files on disk take up over twice the true data size.

Currently, our Percona Monitoring and Management MMAPv1 support does not track this, but we plan to add it in the future.

To track it manually, look at the output of the “.stats()” command for a given collection (replace “sbtest1” with your collection name):

> 1 - ( db.sbtest1.stats().size / db.sbtest1.stats().storageSize )
0.14085410557184752

Here we can see this collection is about 14% fragmented on disk. The most common fix for fragmentation is dropping and recreating the collection from a backup. Many just remove a replication member, clear its data and let it do a new initial sync.

Operating System Memory

In PMM we have graphed the operating system cached memory as it acts as the primary cache for MMAPv1:

For the most part, “Cached” is the value showing the amount of cached MMAPv1 data (assuming the host is only running MongoDB).

We also graph the dirty memory pages:

It is important that dirty pages do not exceed the hard dirty page limit (which causes pauses). It is also important that dirty pages don’t accumulate (which wastes cache memory). The “soft” dirty page limit is the limit that starts dirty page cleanup without pausing.

On this host, you could probably lower the soft limit to clean up memory faster, assuming the increase in disk activity is acceptable. This topic is covered in this post: https://www.percona.com/blog/2016/08/12/tuning-linux-for-mongodb/.

What’s Missing?

As mentioned earlier, fragmentation rates are missing for MMAPv1 (this would be a useful addition). Due to the limited nature of the metrics offered for MMAPv1, PMM probably won’t provide the same level of graphs for MMAPv1 compared to what we provide for WiredTiger or RocksDB. There will likely be fewer additions to the graphing capabilities going forward.

If you are using a highly concurrent system, we highly recommend you upgrade to WiredTiger or RocksDB (both also covered in this monitoring series). These engines provide several solutions to MMAPv1 headaches: document-level locking, built-in compression, checkpointing that causes near-zero fragmentation on disk and much-improved visibility for monitoring. We just released Percona Server for MongoDB 3.4, and it provides many exciting features (including these engines).

Look out for more monitoring posts from this series!

by Tim Vaillancourt at February 28, 2017 11:28 PM

MariaDB AB

JSON with MariaDB 10.2

JSON with MariaDB 10.2 anderskarlsson4 Tue, 02/28/2017 - 08:43

JSON is fast becoming the standard format for data interchange and for unstructured data, and MariaDB 10.2 adds a range of JSON-supporting functions, even though a JSON datatype isn't implemented yet. There are some reasons why there isn't a JSON datatype, but one is that there are actually not that many advantages to one, as JSON is a text-based format. This blog post aims to describe JSON and the use cases for it, to describe the MariaDB 10.2 JSON functions and their uses, and to show some other additions to MariaDB 10.2 that are useful for JSON processing.

So to begin with then, why do we need JSON? Or to put it differently, why do we not store all data in JSON? Well, the reason as I see it is that some data we work with really is best treated as schemaless whereas some other data really should be handled in a more strict way in a schema. Which means that in my mind mixing relational data with unstructured data is what we really want. And using JSON for unstructured data is rather neat, and JSON is even standardized (see json.org).

There are reasons why this hasn't always been so.  When the sad old git that is writing this stuff started working in this industry, which I think was during the Harding administration, computers were rare, expensive, handled only by experts (so how I got to work with them is a mystery) and built from lego-bricks, meccano and pieces of solid gold (to keep up the price). Also, they were as powerful as a slide rule, except they were fed with punched cards (and probably powered by steam). Anyway, no one in their right mind would have considered storing pictures of cute felines as something to be on a computer, or actually stuff to be stored in the database. The little that would fit was the really important stuff - like price, amount in stock, customer name, billing address and such - and nothing else.  And not only that, stuff that was stored had some kind of value, somehow, which meant it had to follow certain rules (and following rules is something I am good at? I wonder how I ended up in this business. Again). Like, a price had to be a number of some kind, with a value 0 or higher and some other restrictions. As you see, these were hard and relentless times.

And then time moved on and people started buying things on the internet (whatever the internet is. I think it is some kind of glorified, stylish version of punched cards) and stuff such as Facebook and Google came around. The issue with computer storage was now not how to fit all that nicely structured data in it, but rather once we have filled that hard drive on your desktop with all the products, customers and transactions from Amazon (I think Amazon has something to do with Internet, but I am not sure) and a full 17.85% of that drive is now occupied by that old-style structured data, what more do we put in there? Maybe we could put some more information on the items for sale in that database, and some general information on who is buying it? That should fill up that disk nicely, right? Well, yes, but that new data, although related to the structured data I already have, is largely unstructured. Say, for example, that you write a review of a product on Amazon late in the morning after a good deal of heavy "partying" (which is not an Internet thing, I think); the contents of that would hardly be considered "structured". If you didn't like the product (which you probably didn't), then the appropriate terms for large parts of that review would probably be "profanity" or "foul language".

The way to deal with the above is a mix of structured and unstructured data, with some kind of relation between the two. Like a column of unstructured data in each relational database table (or should I say "relation", just to show my age? Or maybe I should pretend to be really young, modern and cool, possibly sporting a hipster beard and all that, by calling it a "collection").

With that out of the way, let's consider an example using structured as well as non-structured JSON data. Assume we have a store selling different types of clothing - pants, jackets, shoes - and we are to create a table to hold the inventory. This table would have some columns that are always there and which have the same meaning for all rows in the table, like name, amount in stock and price. These are items that are well suited for a relational format. On top of this we want to add attributes that have a different meaning for each type of item, or even each instance of an item. Here we have things like colour, width, length and size. These we consider non-relational, as the interpretation of these attributes is different depending on the type of garment (like size M or shoe sizes or a "zebra striped" colour) and some garments might have some unique attribute, like designer or recommended by staff or something. Our table might then look something like this:

MariaDB> CREATE TABLE products(id INTEGER NOT NULL PRIMARY KEY AUTO_INCREMENT,
  name VARCHAR(255) NOT NULL,
  price DECIMAL(9,2) NOT NULL,
  stock INTEGER NOT NULL,
  attr VARCHAR(1024));

In this table we have a few columns that look like columns in any relational database table, and then we have a column, called attr, that can hold any relevant attribute for the garment in question, and we will store that as a JSON string. You probably notice that we aren't using a JSON datatype here, as that is not present in MariaDB; there are JSON functions, but those functions act on a text string with JSON content. These functions are introduced in MariaDB 10.2 (which is in Beta as I write this), but there are a few bugs that mean you should use MariaDB 10.2.4 or higher, so from here on we assume that MariaDB 10.2.4 or higher is being used.

But there is one issue with the above that I don't particularly care for: as the attr column is plain text, any kind of data can be put in it, even invalid JSON. The good thing is that there is a fix for this in MariaDB 10.2, which is CHECK constraints that actually work - a little-discussed feature of MariaDB 10.2. The way this works is that whenever a row is INSERTed or UPDATEd, any CHECK constraint runs and validates the data, and if the validation fails the operation also fails. Before I show an example I just want to mention one JSON function we are to use here, JSON_VALID, which takes a string and checks if it is valid JSON. Note that although CHECK constraints are particularly useful here, they can be used for any kind of data validation.

Armed with this, let's rewrite the statement that creates the table like this:

MariaDB> CREATE TABLE products(id INTEGER NOT NULL PRIMARY KEY AUTO_INCREMENT,
  name VARCHAR(255) NOT NULL,
  price DECIMAL(9,2) NOT NULL,
  stock INTEGER NOT NULL,
  attr VARCHAR(1024),
  CHECK (JSON_VALID(attr)));

Let's give this a try now:

MariaDB> INSERT INTO products VALUES(NULL, 'Jeans', 10.5, 165, NULL);
ERROR 4025 (23000): CONSTRAINT `CONSTRAINT_1` failed for `inventory`.`products`

Ok, that didn't work out. What happens here is that a NULL string isn't a valid JSON value, so we need to rewrite our table definition:

MariaDB> CREATE TABLE products(id INTEGER NOT NULL PRIMARY KEY AUTO_INCREMENT,
  name VARCHAR(255) NOT NULL,
  price DECIMAL(9,2) NOT NULL,
  stock INTEGER NOT NULL,
  attr VARCHAR(1024),
  CHECK (attr IS NULL OR JSON_VALID(attr)));

Following this we can try it again:

MariaDB> INSERT INTO products VALUES(NULL, 'Jeans', 10.5, 165, NULL);
Query OK, 1 row affected (0.01 sec)
MariaDB> INSERT INTO products VALUES(NULL, 'Shirt', 10.5, 78, '{"size": 42, "colour": "white"}');
Query OK, 1 row affected (0.01 sec)
MariaDB> INSERT INTO products VALUES(NULL, 'Blouse', 17, 15, '{"colour": "white}');
ERROR 4025 (23000): CONSTRAINT `CONSTRAINT_1` failed for `inventory`.`products`

That last statement failed because of malformed JSON (a double quote was forgotten about), so let's correct that:

MariaDB> INSERT INTO products VALUES(NULL, 'Blouse', 17, 15, '{"colour": "white"}');
Query OK, 1 row affected (0.01 sec)

One thing that has yet to be discussed is indexes on JSON values. As the attr column in our example is plain text, we can of course index it as usual, but that is probably not what you want to do; rather, what would be neat is to create an index on individual attributes in that JSON string. MariaDB doesn't yet support functional indexes, i.e. indexes not on stored values but on computed values. What MariaDB does have, though, is Virtual Columns, and these can be indexed, and as of MariaDB 10.2 these virtual columns don't have to be persistent (read more on Virtual Columns here: https://mariadb.com/resources/blog/putting-virtual-columns-good-use).

The easiest way to explain this is with an example. Let's say we want an index on the colour attribute, if such a thing exists. For this we need two things: A virtual column that contains the colour attribute as extracted from the attr column, and then an index on that. In this case we will be using the JSON_VALUE function that takes a JSON value and a path, the latter describing the JSON operation to be performed, somewhat like a query language for JSON. 

We end up with something like this:

MariaDB> ALTER TABLE products ADD attr_colour VARCHAR(32) AS (JSON_VALUE(attr, '$.colour'));
MariaDB> CREATE INDEX products_attr_colour_ix ON products(attr_colour);

With that in place, let's see how that works:

MariaDB> SELECT * FROM products WHERE attr_colour = 'white';
+----+--------+-------+-------+---------------------------------+-------------+
| id | name   | price | stock | attr                            | attr_colour |
+----+--------+-------+-------+---------------------------------+-------------+
|  2 | Shirt  | 10.50 |    78 | {"size": 42, "colour": "white"} | white       |
|  3 | Blouse | 17.00 |    15 | {"colour": "white"}             | white       |
+----+--------+-------+-------+---------------------------------+-------------+
2 rows in set (0.00 sec)

And let's see if that index is working as it should:

MariaDB> EXPLAIN SELECT * FROM products WHERE attr_colour = 'white';
+------+-------------+----------+------+-------------------------+-------------------------+---------+-------+------+-------------+
| id   | select_type | table    | type | possible_keys           | key                     | key_len | ref   | rows | Extra       |
+------+-------------+----------+------+-------------------------+-------------------------+---------+-------+------+-------------+
|    1 | SIMPLE      | products | ref  | products_attr_colour_ix | products_attr_colour_ix | 99      | const |    2 | Using where |
+------+-------------+----------+------+-------------------------+-------------------------+---------+-------+------+-------------+
1 row in set (0.00 sec)

And just to show that the column attr_colour is a computed column that depends on the attr column, lets try updating the colour for the blouse and make that red instead of white and then search that. To replace a value in a JSON object MariaDB 10.2 provides the JSON_REPLACE functions (for all JSON functions in MariaDB 10.2 see http://mariadb.com/kb/en/mariadb/json-functions/).

MariaDB> UPDATE products SET attr = JSON_REPLACE(attr, '$.colour', 'red') WHERE name = 'Blouse';
Query OK, 1 row affected (0.01 sec)
Rows matched: 1  Changed: 1  Warnings: 0
MariaDB> SELECT attr_colour FROM products WHERE name = 'blouse';
+-------------+
| attr_colour |
+-------------+
| red         |
+-------------+
1 row in set (0.00 sec)
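
As one more small sketch (statements only, output omitted, and the material attribute is made up for the example): the more generic JSON_EXTRACT function returns the raw JSON fragment (quoted, for strings) rather than the unquoted scalar that JSON_VALUE gives you, and JSON_INSERT adds an attribute that wasn't there before:

MariaDB> SELECT name, JSON_EXTRACT(attr, '$.colour') AS colour_json FROM products;
MariaDB> UPDATE products SET attr = JSON_INSERT(attr, '$.material', 'cotton') WHERE name = 'Shirt';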

There is more to say about JSON in MariaDB 10.2, but I hope you now have a feel for what's on offer.

Happy SQL'ing
/Karlsson

MariaDB 10.2 includes a number of JSON functions, but why would you use JSON at all? Aren't JSON and SQL data contradictory? This blog shows why this isn't so, how these JSON functions can be a very useful addition to MariaDB, and how MariaDB 10.2 has some other tricks that make JSON with MariaDB even more useful.

by anderskarlsson4 at February 28, 2017 01:43 PM

Peter Zaitsev

Webinar Thursday March 2, 2017: MongoDB Query Patterns

Join Percona’s Senior Technical Services Engineer Adamo Tonete on Thursday, March 2, 2017, at 11:00 a.m. PST / 2:00 p.m. EST (UTC-8) as he reviews and discusses MongoDB® query patterns.


MongoDB is a fast and simple-to-query schema-free database. It features a smart query optimizer that tries to use the easiest data retrieval method.

In this webinar, Adamo will discuss common query operators and how to use them effectively. The webinar will cover not only common query operations, but also the best practices for their usage.

Register for the webinar here.

Adamo Tonete, Senior Technical Services Engineer

Adamo joined Percona in 2015, after working as a MongoDB/MySQL Database Administrator for three years. As the main database member of a startup, he was responsible for suggesting the best architecture and data flows for a worldwide company in a 24/7 environment. Before that, he worked as a Microsoft SQL Server DBA for a large e-commerce company, mainly on performance tuning and automation. Adamo has almost eight years of experience working as a DBA, and in the past three years, he has moved to NoSQL technologies without giving up relational databases.

by Dave Avery at February 28, 2017 01:06 AM

February 27, 2017

Peter Zaitsev

MySQL Ransomware: Open Source Database Security Part 3

This blog post examines the recent MySQL® ransomware attacks, and what open source database security best practices could have prevented them.

Unless you’ve been living under a rock, you know that there has been an uptick in ransomware for MongoDB and Elasticsearch deployments. Recently, we’re seeing the same for MySQL.

Let’s look and see if this is MySQL’s fault.

Other Ransomware Targets

Let’s briefly touch on how Elasticsearch and MongoDB became easy targets…

Elasticsearch

Elasticsearch® does not implement any access control: neither authentication nor authorization. For this, you need to deploy the Elastic’s shield offering. As such, if you have an Elasticsearch deployment that is addressable from the Internet, you’re asking for trouble. We see many deployments have some authentication around their access, such as HTTP Basic Auth – though sadly, some don’t employ authentication or network isolation. We already wrote a blog about this here.

MongoDB

MongoDB (< 2.6.0) does allow for access control through account creation. It binds to 0.0.0.0 by default (allowing access from anywhere). This is now changed in /etc/mongod.conf in versions >= 2.6.0. Often administrators don’t realize or don’t know to look for this. (Using MongoDB? My colleague David Murphy wrote a post on this issue here).

We began to see incidents where both Elasticsearch and MongoDB had their datasets removed and replaced with a README/note instructing the user to pay a ransom of 0.2BTC (Bitcoin) to the specified wallet address (if they wanted their data back).

MySQL

So is this latest (and similar) attack on MySQL really MySQL’s fault? We don’t think so. MySQL and Percona Server® for MySQL by default do not accept authentication from everywhere without a password for the root user.
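
A quick way to audit which accounts can connect from where (and to spot any with an empty password) is to query the grant tables directly; a minimal sketch for MySQL/Percona Server 5.7, where the password hash lives in authentication_string:

mysql> SELECT user, host, authentication_string = '' AS empty_password FROM mysql.user ORDER BY user, host;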

Let’s go over the various security options MySQL has, and describe some other best practices in order to protect your environment.

Default bind_address=127.0.0.1 in Percona Server for MySQL

MySQL currently still binds to 0.0.0.0 (listen on all network interfaces) by default. However, Percona Server for MySQL and Percona XtraDB Cluster have different defaults, and only bind on 127.0.0.1:3306 in their default configuration (Github pull request).

Recall, if you will, CVE-2012-2122. This ALONE should be enough to ensure that you as the administrator use best practices, and ONLY allow access to the MySQL service from known good sources. Do not set up root-level or equivalent access from any host (% indicates any host is allowed). Ideally, you should only allow root access from 127.0.0.1 - or if you must, from a subset of a secured network (e.g., 10.10.0.% would only allow access from 10.10.0.0/24).
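
To make the difference concrete, here is a hedged sketch (the account name and password are made up for illustration):

# Avoid: a privileged account reachable from any host
mysql> GRANT ALL PRIVILEGES ON *.* TO 'admin'@'%' IDENTIFIED BY 'S0me-Str0ng-Pass';
# Prefer: the same account restricted to localhost or a known subnet
mysql> GRANT ALL PRIVILEGES ON *.* TO 'admin'@'127.0.0.1' IDENTIFIED BY 'S0me-Str0ng-Pass';
mysql> GRANT ALL PRIVILEGES ON *.* TO 'admin'@'10.10.0.%' IDENTIFIED BY 'S0me-Str0ng-Pass';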

Prevent Access

Also, does the MySQL database really need a publicly accessible IP address? If you do have a valid reason for this, then you should firewall port 3306 and whitelist access only from hosts that need to access the database directly. You can easily use iptables for this.

Default Users

MySQL DOES NOT by default create accounts that can be exploited for access. This comes later through an administrator’s lack of understanding, sadly. More often than not, the grant will look something like the following.

GRANT ALL PRIVILEGES ON *.* TO 'root'@'%' IDENTIFIED BY '123456' WITH GRANT OPTION;

You may scoff at the above (and rightly so). However, don’t discount this just yet: “123456” was the MOST USED password in 2016! So it’s reasonable to assume that somewhere out there this is a reality.

Max Connection Errors

You can deploy max_connect_errors with a suitably low value to help mitigate a direct attack. This will not prevent a distributed attack, where many thousands of hosts are used. Network isolation is the only way to ensure your mitigation against this attack vector.
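
For example (a sketch only; the threshold of 10 is an illustration, not a recommendation, and SET GLOBAL does not survive a restart, so also add it to my.cnf):

mysql> SET GLOBAL max_connect_errors = 10;
mysql> FLUSH HOSTS;  # clears the host cache if a legitimate host ever gets blocked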

MySQL 5.7 Improvements on Security

Default Root Password

Since MySQL 5.7, a random password is generated for the only root user (root@localhost) when you install MySQL for the first time. That password is then written in the error log and has to be changed. Miguel Ángel blogged about this before.

Connection Control Plugin

MySQL 5.7.17 introduced a new open source plugin called Connection Control. When enabled, it delays authentication for users that have failed to log in more than three times (by default). It is also included in Percona Server for MySQL 5.7.17.
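
If you want to try it, the plugins can be loaded at runtime; a minimal sketch (the values shown mirror the defaults described above):

mysql> INSTALL PLUGIN CONNECTION_CONTROL SONAME 'connection_control.so';
mysql> INSTALL PLUGIN CONNECTION_CONTROL_FAILED_LOGIN_ATTEMPTS SONAME 'connection_control.so';
mysql> SET GLOBAL connection_control_failed_connections_threshold = 3;
mysql> SET GLOBAL connection_control_min_connection_delay = 1000;  # in milliseconds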

Here’s an example where the 4th consecutive try caused a one-second delay (default settings were used):

$ time mysql -u bleh2 -pbleh
ERROR 1045 (28000): Access denied for user 'bleh2'@'localhost' (using password: YES)
real	0m0.009s
$ time mysql -u bleh2 -pbleh
ERROR 1045 (28000): Access denied for user 'bleh2'@'localhost' (using password: YES)
real	0m0.008s
$ time mysql -u bleh2 -pbleh
ERROR 1045 (28000): Access denied for user 'bleh2'@'localhost' (using password: YES)
real	0m0.008s
$ time mysql -u bleh2 -pbleh
ERROR 1045 (28000): Access denied for user 'bleh2'@'localhost' (using password: YES)
real	0m1.008s
mysql> SELECT * FROM INFORMATION_SCHEMA.CONNECTION_CONTROL_FAILED_LOGIN_ATTEMPTS;
+---------------------+-----------------+
| USERHOST            | FAILED_ATTEMPTS |
+---------------------+-----------------+
| 'bleh2'@'localhost' |               4 |
+---------------------+-----------------+
1 row in set (0.01 sec)

Password Validation Plugin

MySQL 5.6.6 and later versions also ship with a password validation plugin, which prevents creating users with unsafe passwords (such as 123456) by ensuring passwords meet certain criteria: https://dev.mysql.com/doc/refman/5.7/en/validate-password-plugin.html
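
A hedged sketch of enabling it and what a rejected weak password looks like (the user name is made up for illustration):

mysql> INSTALL PLUGIN validate_password SONAME 'validate_password.so';
mysql> SHOW VARIABLES LIKE 'validate_password%';
mysql> CREATE USER 'app'@'10.10.0.%' IDENTIFIED BY '123456';
ERROR 1819 (HY000): Your password does not satisfy the current policy requirements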

Summary

In order to get stung, one must ignore the best practices mentioned above (which, in today’s world, should take some effort). These best practices include:

  1. Don’t use a publicly accessible IP address with no firewall configured
  2. Don’t use a root@% account, or other equally privileged access account, with poor MySQL isolation
  3. Don’t configure those privileged users with a weak password, allowing for brute force attacks against the MySQL service

Hopefully, these are helpful security tips for MySQL users. Comment below!

by David Busby at February 27, 2017 10:28 PM

Percona Monitoring and Management (PMM) Graphs Explained: WiredTiger and Percona Memory Engine

This blog post is another in the series on the Percona Server for MongoDB 3.4 bundle release. In this blog, we’ll go over some useful metrics WiredTiger outputs and how we visualize them in Percona Monitoring and Management (PMM).

WiredTiger has been the default storage engine for MongoDB since version 3.2. The addition of this full-featured, comprehensive storage engine offered a lot of new, useful metrics that were not available before in MMAPv1.

Percona Monitoring and Management (PMM)

Percona Monitoring and Management (PMM) is an open-source platform for managing and monitoring MySQL and MongoDB, developed by Percona on top of open-source technology. Behind the scenes, the graphing features this article covers use Prometheus (a popular time-series data store), Grafana (a popular visualization tool), mongodb_exporter (our MongoDB database metric exporter) plus other technologies to provide database and operating system metric graphs for your database instances.

Please see a live demo of the MongoDB WiredTiger graphs covered in this article, as of our PMM 1.1.1 release: https://pmmdemo.percona.com/graph/dashboard/db/mongodb-wiredtiger.

You can see a sneak peek demo of the Percona Memory Engine graphs we’ll release in PMM 1.1.2 here: https://pmmdemo.percona.com/graph/dashboard/db/mongodb-inmemory.

WiredTiger and Percona Memory Engine

WiredTiger is a storage engine that was developed outside of MongoDB, and was acquired and integrated into MongoDB in version 3.0. WiredTiger offers document-level locking, inline compression and many other useful storage engine features. WiredTiger writes data to disk in “checkpoints” and internally uses Multi-Version Concurrency Control (MVCC) to create “transactions” or “snapshots” when accessing data in the engine. In WiredTiger’s metrics, you will see the term “transactions” used often. It is important to note, however, that MongoDB does not support transactions at this time (this only occurs within the storage engine).

WiredTiger has an in-heap cache for mostly uncompressed pages (50% RAM by default). Like many other engines, it relies on the performance of the Linux filesystem cache, which ends up caching hot, compressed WiredTiger disk blocks.

Besides supporting WiredTiger, Percona Server for MongoDB also ships with a free, open-source in-memory storage engine: Percona Memory Engine for MongoDB. Since we based the Memory Engine on WiredTiger, all graphs and troubleshooting techniques for in-memory are essentially the same (the database data is not stored on disk, of course).

Checkpointing Graphs

WiredTiger checkpoints data to disk every 60 seconds, or after writing 2GB of journaled data.

PMM graphs the current minimum and maximum checkpoint times for WiredTiger checkpoints in the “WiredTiger Checkpoint Time” graph:

Above I have selected “current,” and we can see an average checkpoint time of 176ms; over a long period it remains flat, not worsening or “snowballing” with each checkpoint (which would indicate a performance issue).

Checkpointing is important to watch because it requires WiredTiger to use system resources, and it can also affect query performance in a possibly unexpected way — WiredTiger Cache dirty pages:

The WiredTiger Cache is an LRU cache of mostly uncompressed pages. Like most caches, it creates dirty pages that can take up useful memory until flushed. The WiredTiger Cache uses checkpointing as the point at which it clears dirty pages, making the relationship between dirty pages and checkpointing important to note. WiredTiger cleans dirty pages less often if checkpoint performance is slow, and they can then slowly consume more and more of the available cache memory.

In the above graph, we can see on average about 8.8% of the cache is dirty pages with spikes up/down aligning with checkpointing. Systems with a very high rate of dirty pages benefit from more RAM to provide more room for “clean” pages. Another option could be improving storage performance, so checkpoints happen faster.

Concurrency Graph

Similar to InnoDB, WiredTiger uses a system of tickets to control concurrency. Where things differ from InnoDB is that both “reads” and “writes” have their own ticket pools with their own maximum-ticket limits. The defaults of “128” tickets for both read and write concurrency are generally enough for even medium-high usage systems. Some systems are capable of more than the default concurrency limit, however (usually systems with very fast storage). Also, concurrency can sometimes reduce overhead on network-based storage.

If you notice higher ticket usage, it can sometimes be due to a lot of single-document locking in WiredTiger. This is something to check if you see high rates alongside storage performance and general query efficiency.

In Percona Monitoring and Management, we have the “WiredTiger Concurrent Transactions” graph to visualize the usage of the tickets. In most cases, tickets shouldn’t reach the limit and you shouldn’t need to tweak this tuneable. If you do require more concurrency, however, PMM’s graphing helps indicate when limits are being reached and whether a new limit will mitigate the problem.

Here we can see a max usage of 8/128 write tickets and 5/128 read tickets. This means this system isn’t having any concurrency issues.

Throughput Graphs

There are several WiredTiger graphs to explain the rate of data moving through the engine. As storage is a common bottleneck, I generally look at “WiredTiger Block Activity” first when investigating storage resource usage. This graph shows the total rates written and read to/from storage by WiredTiger (disk for WiredTiger, memory for in-memory).

For correlation, there are also rates for the amount of data written from and read into the WiredTiger cache, from disk. The “read” metric shows the rate of data added to the cache due to query patterns (e.g.: scanning), while the “written” metric shows the rate of data written out to storage from the WiredTiger cache.

Also, there are rates to explain the IO caused by the WiredTiger Log. The metric “payload” is essentially the write rate of raw BSON pages, and “written” is a combined total of log bytes written (including overhead, likely the frames around the payload, etc.). You should watch changes to the average rate of “read” carefully, as they may indicate changes in query patterns or efficiency.

Detailed Cache Graphs

In addition to the Dirty Pages in the cache graph, “WiredTiger Cache Capacity” graphs the size and usage of the WiredTiger cache:

The rate of cache eviction is graphed in “WiredTiger Cache Eviction,” with a break down of modified vs. unmodified pages:

Very large spikes in eviction can indicate collection scanning or generally poor performing queries. This pushes data out of caches. You should avoid high rates of cache evictions, as they can cause a high overhead to the overall engine.

When increasing the size of the WiredTiger cache it is useful to look at both of the above cache graphs. You should look for more “Used” memory in the “WiredTiger Cache Capacity” graph and less rate of eviction in the “WiredTiger Cache Eviction” graph. If you do not see changes to these metrics, you may see better performance leaving the cache size as-is.

Transactions and Document Operations

The “WiredTiger Transactions” graph shows the overall operations happening inside the engine. All transactions start with a “begin,” and operations that changed data end with a “commit.” Read-only operations show a “rollback” at the time they returned data:

This graph above correlates nicely with the “Mongod – Document Activity” graph, which shows the rate of operations from the MongoDB-layer perspective instead of the storage engine level:

Detailed Log Graphs

The graph “WiredTiger Log Operations” explains activity inside the WiredTiger Log system:

Also, the rate of log record compression is graphed as “WiredTiger Log Records.” WiredTiger only compresses log operations that are greater than 128 bytes, which explains why some log records are not compressed:

In some cases, changes in the ratio of compressed vs. uncompressed pages may help explain changes in CPU% used.

What’s Missing?

As you’ll see in my other blog post “Percona Monitoring and Management (PMM) Graphs Explained: MongoDB with RocksDB” from this series, RocksDB includes read latency metrics and a hit ratio for the RocksDB block cache. These are two things I would like to see added to WiredTiger’s metric output, and thus PMM. I would also like to improve the user-experience of this dashboard. Some areas use linear-scaled graphs when a logarithmic-scaled graph could provide more value. “WiredTiger Concurrent Transactions” is one example of this.

A known-mystery (so-to-speak) is why WiredTiger reports the cache “percentage overhead” always as 8% in “db.serverStatus().cache.” We added this metric to PMM as a graph named “WiredTiger Cache Overhead.” We assumed it provided a variable overhead metric. However, I’ve seen that it returns 8% regardless of usage: it is 8% on a busy system or even on an empty system with no data or traffic. We’re aware of this, and plan to investigate, as a hit ratio for the cache is a very valuable metric:

Also, if you’ve ever seen the full output of the WiredTiger status metrics (‘db.serverStatus().wiredTiger’ in Mongo shell), you’ll know that there are a LOT more WiredTiger metrics than are currently graphed in Percona Monitoring and Management. In our initial release, we’ve aimed to only include high-value graphs to simplify monitoring WiredTiger. A major barrier in our development of monitoring features for WiredTiger has been the little-to-no documentation on the meaning of many status metrics. I hope this improves with time. As we understand more correlations and useful metrics to determine the health of WiredTiger, we plan to integrate those into Percona Monitoring and Management in the future. As always, we appreciate your suggestions.

Lastly, look out for an upcoming blog post from this series regarding creating custom dashboards, graphs and raw data queries with Percona Monitoring and Management!

by Tim Vaillancourt at February 27, 2017 05:34 PM

Henrik Ingo

February 25, 2017

Valeriy Kravchuk

MySQL Support Engineer's Chronicles, Issue #5

A lot of time has passed since my previous post in this series. I was busy with work, participating in FOSDEM, blogging about profilers and sharing various lists of MySQL bugs. But I do not plan to stop writing about my usual weeks of doing a support engineer's job. So, time for the next post in this series, based on my random notes taken during the week here and there.

This week started for me with checking recent MySQL bug reports (actually I do it every day). I noted a recent report by Roel, Bug #85065. Unfortunately it was quickly marked as "Won't fix", and I tend to agree with Laurynas Biveinis that this was probably a wrong decision. See the discussion here. Out-of-memory conditions happen in production, and it would be great for MySQL to process them properly, not just crash randomly. Roel does a favor to all of us by checking debug builds with additional unusual failure conditions introduced, and this work should not be ignored based on some formal approach.

It's a common knowledge already that I try to avoid not only all kind of clusters, but all kinds of connection pools as well, by all means. Sometimes I still fail, and when I find myself in unlucky situation of dealing with connection pool in Apache Tomcat, I consider this reference useful.

This week I had a useful discussion on the need for xtrabackup (version 2.4.6 was released this week) on Windows (customers ask about it once in a while and some even try to use older binaries of version 1.6.x or so from Percona) and any alternatives. I was pointed to this blog post by Anders Karlsson. I remember reading something about using the Volume Snapshot Service on Windows to back up MySQL back at Oracle in 2012, and really have to try how it works based on the blog above and this reference. But I still think that, at least without a command line wrapper like mylvmbackup, this approach is hardly easy to follow for the average Windows user (like me) and is not on par with xtrabackup for ease of use etc.

I spent some time building new releases of Percona Server, MariaDB and MySQL 5.6 from Facebook on my Fedora 25 box (this is also one of the first things I do every morning, for software that got updates on GitHub), so just to remind myself, here is my usual cmake command line for 5.7-based builds:

cmake . -DCMAKE_INSTALL_PREFIX=/home/openxs/dbs/5.7 -DCMAKE_BUILD_TYPE=RelWithDebInfo -DBUILD_CONFIG=mysql_release -DFEATURE_SET=community -DWITH_EMBEDDED_SERVER=OFF -DDOWNLOAD_BOOST=1 -DWITH_BOOST=/home/openxs/boost
I do not have it automated, so sometimes the exact command line gets lost in the history of bash... That never happens with MySQL from Facebook or MariaDB (as we see new commits almost every day), but it easily happens with MySQL 5.7 (because they push commits only after the official releases) or Percona Server (as I have not cared that much about it for a year already...). This time I noted that one actually needs the numactl-devel package to build Percona Server 5.7. That was not the case the last time I tried.

I also paid attention to blog posts on two hot topics this week. The first one is comparing ProxySQL (here and there) to recently released MaxScale 2.1. I probably have to build and test both myself, but ProxySQL seems to rock, still, and I have no reasons NOT to trust my former colleague René Cannaò, who had provided a lot of details about his tests.

Another topic that is hot in February, 2017 is group replication vs Galera. Przemek's recent post on this topic caused quite a hot discussion here. I still consider the post as a good, useful and sincere attempt to compare these technologies and highlight problems typical experienced Galera user may expect, suspect or note in the group replication in MySQL 5.7.17. Make sure you read comments to the blog post, as they help to clarify the implementation details, differences and strong features of group replication. IMHO, this post, along with great series by LeFred and comments from community, really helps to set proper expectations and setup proper understanding of both technologies.

The last but not the least, this week Mark Callaghan had confirmed that InnoDB is still faster than MyRocks on all kinds of read-only all-in-memory workloads (he used sysbench 1.0 tests) on "small server"/older hardware. I've noted this on a specific use case of Bug #68079 almost a month ago...

This week I worked on several interesting customer issues (involving Galera, CONNECT engine, security concerns, MySQL 5.1 and the limit of 64 secondary indexes per InnoDB table etc) that are not yet completely resolved, so I expect a lot of fun and useful links noted next week. Stay tuned!

by Valeriy Kravchuk (noreply@blogger.com) at February 25, 2017 04:37 PM

February 24, 2017

Peter Zaitsev

Installing Percona Monitoring and Management (PMM) for the First Time

This blog post is another in the series on the Percona Server for MongoDB 3.4 bundle release. This post is meant to walk a prospective user through the benefits of Percona Monitoring and Management (PMM), how it’s architected and the simple install process. By the end of this post, you should have a good idea of what PMM is, where it can add value in your environment and how you can get PMM going quickly.

Percona Monitoring and Management (PMM) is Percona’s open-source tool for monitoring and alerting on database performance and the components that contribute to it. PMM monitors MySQL (Percona Server and MySQL CE), Amazon RDS/Aurora, MongoDB (Percona Server and MongoDB CE), Percona XtraDB/Galera Cluster, ProxySQL, and Linux.

What is it?

Percona Monitoring and Management is an amalgamation of exciting, best-in-class, open-source tools and Percona “engineering wizardry,” designed to make it easier to monitor and manage your environment. The real value to our users is the amount of time we’ve spent integrating the tools, plus the pre-built dashboards we’ve constructed that leverage our ten years of performance optimization experience. What you get is a tool that is ready to go out of the box, and installs in minutes. If you’re still not convinced, like ALL Percona software it’s completely FREE!

Sound good? I can hear you nodding your head. Let’s take a quick look at the architecture.

What’s it made of?

PMM, at a high-level, is made up of two basic components: the client and the server. The PMM Client is installed on the database servers themselves and is used to collect metrics. The client contains technology specific exporters (which collect and export data), and an “admin interface” (which makes the management of the PMM platform very simple). The PMM server is a “pre-integrated unit” (Docker, VM or AWS AMI) that contains four components that gather the metrics from the exporters on the PMM client(s). The PMM server contains Consul, Grafana, Prometheus and a Query Analytics Engine that Percona has developed. Here is a graphic from the architecture section of our documentation. In order to keep this post to a manageable length, please refer to that page if you’d like a more “in-depth” explanation.

How do I use it?

PMM is very easy to access once it has been installed (more on the install process below). You simply open the web browser of your choice and connect to the PMM landing page by typing http://<ip_address_of_PMM_server>. From the landing page you can access all of PMM’s tools. If you’d like to get a look at the user experience, we’ve set up a great demo site so you can easily test it out.

Where should I use it?

There’s a good chance that you already have a monitoring/alerting platform for your production workloads. If not, you should set one up immediately and start analyzing trends in your environment. If you’re confident in your production monitoring solution, there is still a use for PMM in an often overlooked area: development and testing.

When speaking with users, we often hear that their development and test environments run their most demanding workloads. This is often due to stress testing and benchmarking. The goal of these workloads is usually to break something. This allows you to set expectations for normal, and thus abnormal, behavior in your production environment. Once you have a good idea of what’s “normal” and the critical factors involved, you can alert around those parameters to identify “abnormal” patterns before they cause user issues in production. The reason that monitoring is critical in your dev/test environment(s) is that you want to easily spot inflection points in your workload, which signal impending disaster. Dashboards are the easiest way for humans to consume and analyze this data.

Are you sold? Let’s get to the easiest part: installation.

How do you install it?

PMM is very easy to install and configure for two main reasons. The first is that the components (mentioned above) take some time to install, so we spent the time to integrate everything and ship it as a unit: one server install and a client install per host. The second is that we’re targeting customers looking to monitor MySQL and MongoDB installations for high-availability and performance. The fact that it’s a targeted solution makes pre-configuring it to monitor for best practices much easier. I believe we’ve all seen a particular solution that tries to do a little of everything, and thus actually does no particular thing well. This is the type of tool that we DO NOT want PMM to be. Now, onto the installation procedure.

There are four basic steps to get PMM monitoring your infrastructure. I do not want to recreate the Deployment Guide in order to maintain the future relevancy of this post. However, I’ll link to the relevant sections of the documentation so you can cut to the chase. Also, underneath each step, I’ll list some key takeaways that will save you time now and in the future.

  1. Install the integrated PMM server in the flavor of your choice (Docker, VM or AWS AMI)
    1. Percona recommends Docker to deploy PMM server as of v1.1
      1. As of right now, using Docker will make the PMM server upgrade experience seamless.
      2. Using the default version of Docker from your package manager may cause unexpected behavior. We recommend using the latest stable version from Docker’s repositories (instructions from Docker).
    2. PMM server AMI and VM are “experimental” in PMM v1.1
    3. When you open the “Metrics Monitor” for the first time, it will ask for credentials (user: admin pwd: admin).
  2. Install the PMM client on every database instance that you want to monitor.
    1. Install with your package manager for easier upgrades when a new version of PMM is released.
  3. Connect the PMM client to the PMM Server.
    1. Think of this step as sending configuration information from the client to the server. This means you are telling the client the address of the PMM server, not the other way around.
  4. Start data collection services on the PMM client.
    1. Collection services are enabled per database technology (MySQL, MongoDB, ProxySQL, etc.) on each database host.
    2. Make sure to set permissions for the PMM client to monitor the database, otherwise you will see errors such as: Cannot connect to MySQL: Error 1045: Access denied for user ‘jon’@’localhost’ (using password: NO). One way to create a suitable monitoring user is sketched after this list.
      1. Setting proper credentials uses this syntax: sudo pmm-admin add <service_type> --user xxxx --password xxxx
    3. There’s good information about PMM client options in the “Managing PMM Client” section of the documentation for advanced configurations/troubleshooting.
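
If you prefer to create that monitoring account yourself rather than passing administrative credentials to pmm-admin, the sketch below shows one reasonable shape for it. The user name, password and exact privilege list here are my own assumptions for illustration, not something mandated by the PMM documentation:

mysql> CREATE USER 'pmm'@'localhost' IDENTIFIED BY 'a-strong-password-here';
mysql> GRANT SELECT, PROCESS, REPLICATION CLIENT ON *.* TO 'pmm'@'localhost';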

What’s next?

That’s really up to you, and what makes sense for your needs. However, here are a few suggestions to get the most out of PMM.

  1. Set up alerting in Grafana on the PMM server. This is still an experimental function in Grafana, but it works. I’d start with Barrett Chambers’ post on setting up email alerting, and refine it with  Peter Zaitsev’s post.
  2. Set up more hosts to test the full functionality of PMM. We have completely free, high-performance versions of MySQL, MongoDB, Percona XtraDB Cluster (PXC) and ProxySQL (for MySQL proxy/load balancing).
  3. Start load testing the database with benchmarking tools to build your troubleshooting skills. Try to break something to learn what troubling trends look like. When you find them, set up alerts to give you enough time to fix them.

by Jon Tobin at February 24, 2017 09:32 PM

Quest for Better Replication in MySQL: Galera vs. Group Replication

UPDATE: Some of the language in the original post was considered overly-critical of Oracle by some community members. This was not my intent, and I’ve modified the language to be less so. I’ve also changed the term “synchronous” (the use of which is inaccurate and misleading) to “virtually synchronous.” This term is more accurate and already used by both technologies’ founders, and should be less misleading.

I also wanted to thank Jean-François Gagné for pointing out the incorrect sentence about multi-threaded slaves in Group Replication, which I also corrected accordingly.

In today’s blog post, I will briefly compare two major virtually synchronous replication technologies available today for MySQL.

More Than Asynchronous Replication

Thanks to the Galera plugin, founded by the Codership team, we’ve had the choice between asynchronous and virtually synchronous replication in the MySQL ecosystem for quite a few years already. Moreover, we can choose between at least three software providers: Codership, MariaDB and Percona, each with its own Galera implementation.

The situation recently became much more interesting when MySQL Group Replication went into GA (stable) stage in December 2016.

Oracle, the upstream MySQL provider, introduced its own replication implementation that is very similar in concept. Unlike the others mentioned above, it isn’t based on Galera. Group Replication was built from the ground up as a new solution. MySQL Group Replication shares many very similar concepts with Galera. This post doesn’t cover MySQL Cluster, another, fully synchronous solution that existed much earlier than Galera; it is a much different solution for different use cases.

In this post, I will point out a couple of interesting differences between Group Replication and Galera, which hopefully will be helpful to those considering switching from one to another (or if they are planning to test them).

This is certainly not a full list of all the differences, but rather things I found interesting during my explorations.

It is also important to know that Group Replication evolved a lot before it went GA (its whole cluster layer was replaced). I won’t discuss how things looked before the GA stage, and will just concentrate on the latest available 5.7.17 version. I will not spend too much time on how Galera implementations looked in the past, and will use Percona XtraDB Cluster 5.7 as a reference.

Multi-Master vs. Master-Slave

Galera has always been multi-master by default, so it does not matter to which node you write. Many users use a single writer due to workload specifics and multi-master limitations, but Galera has no single master mode per se.

Group Replication, on the other hand, promotes just one member as primary (master) by default, and other members are put into read-only mode automatically. This is what happens if we try to change data on non-master node:

mysql> truncate test.t1;
ERROR 1290 (HY000): The MySQL server is running with the --super-read-only option so it cannot execute this statement

To change from single primary mode to multi-primary (multi-master), you have to start group replication with the group_replication_single_primary_mode variable disabled.
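
As a rough sketch of that switch (in 5.7 the mode cannot be changed while Group Replication is running, and the setting must match on every member of the group):

mysql> STOP GROUP_REPLICATION;
mysql> SET GLOBAL group_replication_single_primary_mode = OFF;
mysql> SET GLOBAL group_replication_enforce_update_everywhere_checks = ON;
mysql> START GROUP_REPLICATION;
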
Another interesting fact is you do not have any influence on which cluster member will be the master in single primary mode: the cluster auto-elects it. You can only check it with a query:

mysql> SELECT * FROM performance_schema.global_status WHERE VARIABLE_NAME like 'group_replication%';
+----------------------------------+--------------------------------------+
| VARIABLE_NAME                    | VARIABLE_VALUE                       |
+----------------------------------+--------------------------------------+
| group_replication_primary_member | 329333cd-d6d9-11e6-bdd2-0242ac130002 |
+----------------------------------+--------------------------------------+
1 row in set (0.00 sec)

Or just:

mysql> show status like 'group%';
+----------------------------------+--------------------------------------+
| Variable_name                    | Value                                |
+----------------------------------+--------------------------------------+
| group_replication_primary_member | 329333cd-d6d9-11e6-bdd2-0242ac130002 |
+----------------------------------+--------------------------------------+
1 row in set (0.01 sec)

To show the hostname instead of UUID, here:

mysql> select member_host as "primary master" from performance_schema.global_status join performance_schema.replication_group_members where variable_name='group_replication_primary_member' and member_id=variable_value;
+----------------+
| primary master |
+----------------+
| f18ff539956d   |
+----------------+
1 row in set (0.00 sec)

Replication: Majority vs. All

Galera delivers write transactions synchronously to ALL nodes in the cluster. (Later, applying happens asynchronously in both technologies.) However, Group Replication needs just a majority of the nodes confirming the transaction. This means a transaction commit on the writer succeeds and returns to the client even if a minority of nodes still have not received it.

In the example of a three-node cluster, if one node crashes or loses the network connection, the two others continue to accept writes (or just the primary node in Single-Primary mode) even before a faulty node is removed from the cluster.

If the separated node is the primary one, it denies writes due to the lack of a quorum (it will report the error ERROR 3101 (HY000): Plugin instructed the server to rollback the current transaction.). If one of the remaining nodes has a quorum, it will be elected primary after the faulty node is removed from the cluster, and will then accept writes.

With that said, the “majority” rule in Group Replication means that there isn’t a guarantee that you won’t lose any data if the majority of nodes is lost. There is a chance those nodes applied some transactions that weren’t delivered to the minority at the moment they crashed.

In Galera, a single node network interruption makes the others wait for it, and pending writes can be committed once either the connection is restored or the faulty node removed from cluster after the timeout. So the chance of losing data in a similar scenario is lower, as transactions always reach all nodes. Data can be lost in Percona XtraDB Cluster only in a really bad luck scenario: a network split happens, the remaining majority of nodes form a quorum, the cluster reconfigures and allows new writes, and then shortly after the majority part is damaged.

Schema Requirements

For both technologies, one of the requirements is that all tables must be InnoDB and have a primary key. This requirement is now enforced by default in both Group Replication and Percona XtraDB Cluster 5.7. Let’s look at the differences.

Percona XtraDB Cluster:

mysql> create table nopk (a char(10));
Query OK, 0 rows affected (0.08 sec)
mysql> insert into nopk values ("aaa");
ERROR 1105 (HY000): Percona-XtraDB-Cluster prohibits use of DML command on a table (test.nopk) without an explicit primary key with pxc_strict_mode = ENFORCING or MASTER
mysql> create table m1 (id int primary key) engine=myisam;
Query OK, 0 rows affected (0.02 sec)
mysql> insert into m1 values(1);
ERROR 1105 (HY000): Percona-XtraDB-Cluster prohibits use of DML command on a table (test.m1) that resides in non-transactional storage engine with pxc_strict_mode = ENFORCING or MASTER
mysql> set global pxc_strict_mode=0;
Query OK, 0 rows affected (0.00 sec)
mysql> insert into nopk values ("aaa");
Query OK, 1 row affected (0.00 sec)
mysql> insert into m1 values(1);
Query OK, 1 row affected (0.00 sec)

Before Percona XtraDB Cluster 5.7 (or in other Galera implementations), there were no such enforced restrictions. Users unaware of these requirements often ended up with problems.

Group Replication:

mysql> create table nopk (a char(10));
Query OK, 0 rows affected (0.04 sec)
mysql> insert into nopk values ("aaa");
ERROR 3098 (HY000): The table does not comply with the requirements by an external plugin.
2017-01-15T22:48:25.241119Z 139 [ERROR] Plugin group_replication reported: 'Table nopk does not have any PRIMARY KEY. This is not compatible with Group Replication'
mysql> create table m1 (id int primary key) engine=myisam;
ERROR 3161 (HY000): Storage engine MyISAM is disabled (Table creation is disallowed).

I am not aware of any way to disable these restrictions in Group Replication.

GTID

Galera has its own Global Transaction ID, which has existed since MySQL 5.5, and is independent from MySQL's GTID feature introduced in MySQL 5.6. If MySQL's GTID is enabled on a Galera-based cluster, both numerations exist with their own sequences and UUIDs.

Group Replication is based on a native MySQL GTID feature, and relies on it. Interestingly, a separate sequence block range (initially 1M) is pre-assigned for each cluster member.
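
For illustration, here is a quick way to inspect the two numbering schemes side by side on a Galera node that also has MySQL GTIDs enabled (the values returned are, of course, cluster-specific):

-- Galera's own cluster UUID and last committed sequence number
SHOW STATUS LIKE 'wsrep_cluster_state_uuid';
SHOW STATUS LIKE 'wsrep_last_committed';
-- MySQL's native GTID set, maintained independently
SELECT @@global.gtid_executed;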

WAN Support

The MySQL Group Replication documentation isn't very optimistic on WAN support, claiming that both “Low latency, high bandwidth network connections are a requirement” and “Group Replication is designed to be deployed in a cluster environment where server instances are very close to each other, and is impacted by both network latency as well as network bandwidth.” These statements are found here and here. However, there is a network traffic optimization available: Message Compression.

I don't see group communication level tunings available yet, as we find in the Galera evs.* series of wsrep_provider_options.

Galera founders actually encourage trying it in geo-distributed environments, and some WAN-dedicated settings are available (the most important being WAN segments).
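
As a sketch of the kind of WAN tuning Galera offers, nodes can be grouped into segments per data center through the provider options; the segment number and timeouts below are only illustrative values:

# my.cnf fragment on the nodes of a remote data center
[mysqld]
wsrep_provider_options="gmcast.segment=1; evs.suspect_timeout=PT30S; evs.keepalive_period=PT3S"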

But both technologies need a reliable network for good performance.

State Transfers

Galera has two types of state transfers that allow syncing data to nodes when needed: incremental (IST) and full (SST). Incremental transfer is used when a node has been out of the cluster for some time and, once it rejoins, the other nodes still have the missing write sets in their Galera cache. Full SST is helpful if incremental transfer is not possible, especially when a new node is added to the cluster. SST automatically provisions the node with fresh data taken as a snapshot from one of the running nodes (the donor). The most common SST method is Percona XtraBackup, which takes a fast and non-blocking binary data snapshot (hot backup).
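
As a minimal sketch, the SST method and the size of the Galera cache (which determines for how long IST remains possible) are typically set like this; the values and credentials below are only examples:

# my.cnf fragment on a Percona XtraDB Cluster node
[mysqld]
wsrep_sst_method=xtrabackup-v2
wsrep_sst_auth=sstuser:sstpassword            # hypothetical credentials for the SST user
wsrep_provider_options="gcache.size=2G"       # a larger gcache keeps more write sets available for IST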

In Group Replication, state transfers are fully based on binary logs with GTID positions. If no donor has all of the binary logs needed (including those required by a brand new node), a DBA has to provision the new node with an initial data snapshot first. Otherwise, the joiner will fail with a very familiar error:

2017-01-16T23:01:40.517372Z 50 [ERROR] Slave I/O for channel 'group_replication_recovery': Got fatal error 1236 from master when reading data from binary log: 'The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires.', Error_code: 1236
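
A rough sketch of such manual provisioning, assuming a backup taken from a donor has already been restored on the joiner; the GTID set, user and password below are hypothetical:

-- On the joiner, after restoring the donor's backup:
RESET MASTER;  -- clear any local GTID history
SET GLOBAL gtid_purged = 'ce427319-aaaa-bbbb-cccc-0242ac130002:1-123456';  -- GTIDs already contained in the backup
CHANGE MASTER TO MASTER_USER='rpl_user', MASTER_PASSWORD='rpl_pass'
  FOR CHANNEL 'group_replication_recovery';  -- credentials for the recovery channel
START GROUP_REPLICATION;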

The official documentation mentions that provisioning the node before adding it to the cluster may speed up joining (the recovery stage). Another difference is that in the case of a state transfer failure, a Galera joiner will abort after the first try and shut down its mysqld instance, while the Group Replication joiner falls back to another donor in an attempt to succeed. Here I found something slightly annoying: if no donor can satisfy the joiner's demands, it will still keep trying the same donors over and over, for a fixed number of attempts:

[root@cd81c1dadb18 /]# grep 'Attempt' /var/log/mysqld.log |tail
2017-01-16T22:57:38.329541Z 12 [Note] Plugin group_replication reported: 'Establishing group recovery connection with a possible donor. Attempt 1/10'
2017-01-16T22:57:38.539984Z 12 [Note] Plugin group_replication reported: 'Retrying group recovery connection with another donor. Attempt 2/10'
2017-01-16T22:57:38.806862Z 12 [Note] Plugin group_replication reported: 'Retrying group recovery connection with another donor. Attempt 3/10'
2017-01-16T22:58:39.024568Z 12 [Note] Plugin group_replication reported: 'Retrying group recovery connection with another donor. Attempt 4/10'
2017-01-16T22:58:39.249039Z 12 [Note] Plugin group_replication reported: 'Retrying group recovery connection with another donor. Attempt 5/10'
2017-01-16T22:59:39.503086Z 12 [Note] Plugin group_replication reported: 'Retrying group recovery connection with another donor. Attempt 6/10'
2017-01-16T22:59:39.736605Z 12 [Note] Plugin group_replication reported: 'Retrying group recovery connection with another donor. Attempt 7/10'
2017-01-16T23:00:39.981073Z 12 [Note] Plugin group_replication reported: 'Retrying group recovery connection with another donor. Attempt 8/10'
2017-01-16T23:00:40.176729Z 12 [Note] Plugin group_replication reported: 'Retrying group recovery connection with another donor. Attempt 9/10'
2017-01-16T23:01:40.404785Z 12 [Note] Plugin group_replication reported: 'Retrying group recovery connection with another donor. Attempt 10/10'

After the last try, even though it fails, mysqld keeps running and allows client connections…

Auto Increment Settings

Galera adjusts the auto_increment_increment and auto_increment_offset values according to the number of members in a cluster. So, for a 3-node cluster, auto_increment_increment will be “3” and auto_increment_offset from “1” to “3” (depending on the node). If the number of nodes changes later, these are updated immediately. This feature can be disabled using the wsrep_auto_increment_control setting. If needed, these settings can also be set manually.

Interestingly, in Group Replication the auto_increment_increment seems to be fixed at 7, and only auto_increment_offset is set differently on each node. This is the case even in the default Single-Primary mode! This seems like a waste of available IDs, so make sure that you adjust the group_replication_auto_increment_increment setting to a saner number before you start using Group Replication in production.
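
For example, you could verify and adjust these settings as follows; the value 3 below is just an example matching a three-node group, and the group_replication_auto_increment_increment variable has to be changed while Group Replication is stopped:

SHOW VARIABLES LIKE 'auto_increment%';
SHOW VARIABLES LIKE 'group_replication_auto_increment_increment';
STOP GROUP_REPLICATION;
SET GLOBAL group_replication_auto_increment_increment = 3;
START GROUP_REPLICATION;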

Multi-Threaded Slave Side Applying

Galera developed its own multi-threaded slave feature, even in 5.5 versions, for workloads that include tables in the same database. It is controlled with the wsrep_slave_threads variable. Group Replication uses a feature introduced in MySQL 5.7, where the number of applier threads is controlled with slave_parallel_workers. Galera will do multi-threaded replication based on potential conflicts of changed/locked rows. Group Replication parallelism is based on an improved LOGICAL_CLOCK scheduler, which uses information from write set dependencies. This can allow it to achieve much better results than the normal asynchronous replication MTS mode. More details can be found here: http://mysqlhighavailability.com/zooming-in-on-group-replication-performance/
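
A minimal configuration sketch for enabling parallel applying on each side; the thread counts are illustrative only:

# Galera / Percona XtraDB Cluster node
[mysqld]
wsrep_slave_threads=8

# Group Replication member
[mysqld]
slave_parallel_type=LOGICAL_CLOCK
slave_parallel_workers=8
slave_preserve_commit_order=ON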

Flow Control

Both technologies use a technique to throttle writes when nodes are slow in applying them. Interestingly, the default size of the allowed applier queue is very different in the two: Galera's flow control kicks in at a queue of just 16 write sets by default (gcs.fc_limit), while Group Replication's default applier threshold is 25,000 transactions (group_replication_flow_control_applier_threshold).

Moreover, Group Replication provides a separate certifier queue size that is also eligible as a Flow Control trigger: group_replication_flow_control_certifier_threshold. One thing I found difficult is checking the actual applier queue size, as the only queue exposed via performance_schema.replication_group_member_stats is Count_Transactions_in_queue (which only shows the certifier queue).
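
A sketch of where these thresholds live on each side; the numbers are examples, not recommendations:

# Galera / Percona XtraDB Cluster: flow control engages when the applier queue exceeds fc_limit
[mysqld]
wsrep_provider_options="gcs.fc_limit=500; gcs.fc_factor=0.99"

# Group Replication: separate thresholds for the applier and certifier queues
[mysqld]
group_replication_flow_control_mode=QUOTA
group_replication_flow_control_applier_threshold=25000
group_replication_flow_control_certifier_threshold=25000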

Network Hiccup/Partition Handling

In Galera, when the network connection between nodes is lost, those who still have a quorum will form a new cluster view. Those who lost a quorum keep trying to re-connect to the primary component. Once the connection is restored, separated nodes will sync back using IST and rejoin the cluster automatically.

This doesn’t seem to be the case for Group Replication. Separated nodes that lose the quorum will be expelled from the cluster, and won’t join back automatically once the network connection is restored. In its error log we can see:

2017-01-17T11:12:18.562305Z 0 [ERROR] Plugin group_replication reported: 'Member was expelled from the group due to network failures, changing member status to ERROR.'
2017-01-17T11:12:18.631225Z 0 [Note] Plugin group_replication reported: 'getstart group_id ce427319'
2017-01-17T11:12:21.735374Z 0 [Note] Plugin group_replication reported: 'state 4330 action xa_terminate'
2017-01-17T11:12:21.735519Z 0 [Note] Plugin group_replication reported: 'new state x_start'
2017-01-17T11:12:21.735527Z 0 [Note] Plugin group_replication reported: 'state 4257 action xa_exit'
2017-01-17T11:12:21.735553Z 0 [Note] Plugin group_replication reported: 'Exiting xcom thread'
2017-01-17T11:12:21.735558Z 0 [Note] Plugin group_replication reported: 'new state x_start'

Its status changes to:

mysql> SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+--------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+--------------+-------------+--------------+
| group_replication_applier | 329333cd-d6d9-11e6-bdd2-0242ac130002 | f18ff539956d | 3306 | ERROR |
+---------------------------+--------------------------------------+--------------+-------------+--------------+
1 row in set (0.00 sec)

It seems the only way to bring it back into the cluster is to manually restart Group Replication:

mysql> START GROUP_REPLICATION;
ERROR 3093 (HY000): The START GROUP_REPLICATION command failed since the group is already running.
mysql> STOP GROUP_REPLICATION;
Query OK, 0 rows affected (5.00 sec)
mysql> START GROUP_REPLICATION;
Query OK, 0 rows affected (1.96 sec)
mysql> SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+--------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+--------------+-------------+--------------+
| group_replication_applier | 24d6ef6f-dc3f-11e6-abfa-0242ac130004 | cd81c1dadb18 | 3306 | ONLINE |
| group_replication_applier | 329333cd-d6d9-11e6-bdd2-0242ac130002 | f18ff539956d | 3306 | ONLINE |
| group_replication_applier | ae148d90-d6da-11e6-897e-0242ac130003 | 0af7a73f4d6b | 3306 | ONLINE |
+---------------------------+--------------------------------------+--------------+-------------+--------------+
3 rows in set (0.00 sec

Note that in the above output, after the network failure, Group Replication did not stop. It waits in an error state. Moreover, in Group Replication a partitioned node keeps serving dirty reads as if nothing happened (for non-super users):

cd81c1dadb18 {test} ((none)) > SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+--------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+--------------+-------------+--------------+
| group_replication_applier | 24d6ef6f-dc3f-11e6-abfa-0242ac130004 | cd81c1dadb18 | 3306 | ERROR |
+---------------------------+--------------------------------------+--------------+-------------+--------------+
1 row in set (0.00 sec)
cd81c1dadb18 {test} ((none)) > select * from test1.t1;
+----+-------+
| id | a |
+----+-------+
| 1 | dasda |
| 3 | dasda |
+----+-------+
2 rows in set (0.00 sec)
cd81c1dadb18 {test} ((none)) > show grants;
+-------------------------------------------------------------------------------+
| Grants for test@% |
+-------------------------------------------------------------------------------+
| GRANT SELECT, INSERT, UPDATE, DELETE, REPLICATION CLIENT ON *.* TO 'test'@'%' |
+-------------------------------------------------------------------------------+
1 row in set (0.00 sec)

A privileged user can disable super_read_only, but then it won't be able to write:

cd81c1dadb18 {root} ((none)) > insert into test1.t1 set a="split brain";
ERROR 3100 (HY000): Error on observer while running replication hook 'before_commit'.
cd81c1dadb18 {root} ((none)) > select * from test1.t1;
+----+-------+
| id | a |
+----+-------+
| 1 | dasda |
| 3 | dasda |
+----+-------+
2 rows in set (0.00 sec)

I found an interesting thing here, which I consider to be a bug. In this case, a partitioned node can actually perform DDL, despite the error:

cd81c1dadb18 {root} ((none)) > show tables in test1;
+-----------------+
| Tables_in_test1 |
+-----------------+
| nopk |
| t1 |
+-----------------+
2 rows in set (0.01 sec)
cd81c1dadb18 {root} ((none)) > create table test1.split_brain (id int primary key);
ERROR 3100 (HY000): Error on observer while running replication hook 'before_commit'.
cd81c1dadb18 {root} ((none)) > show tables in test1;
+-----------------+
| Tables_in_test1 |
+-----------------+
| nopk |
| split_brain |
| t1 |
+-----------------+
3 rows in set (0.00 sec)

In a Galera-based cluster, you are automatically protected from this: a partitioned node refuses both reads and writes and throws the error ERROR 1047 (08S01): WSREP has not yet prepared node for application use. You can force dirty reads using the wsrep_dirty_reads variable.
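
If you consciously need stale reads from such a node, it can be enabled per session; this is a trade-off, not a recommendation:

SET SESSION wsrep_dirty_reads = ON;   -- allow reads on a non-primary (partitioned) node
SELECT * FROM test1.t1;               -- returns the node's last known data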

There are many more subtle (and less subtle) differences between these technologies – but this blog post is long enough already. Maybe next time 🙂

Article with Similar Subject

http://lefred.be/content/group-replication-vs-galera/

by Przemysław Malkowski at February 24, 2017 08:46 PM

February 23, 2017

Peter Zaitsev

Percona MongoDB 3.4 Bundle Release: Percona Server for MongoDB 3.4 Features Explored

Percona Server for MongoDB

This blog post continues the series on the Percona MongoDB 3.4 bundle release. This release includes Percona Server for MongoDB, Percona Monitoring and Management, and Percona Toolkit. In this post, we’ll look at the features included in Percona Server for MongoDB.

I apologize for the long blog, but there is a good deal of important information to cover: not just what new features exist, but also why they are so important. I have tried to break this down into clear areas so you can cover as much ground as possible, while also linking to further reading on these topics.

The first and biggest new feature for many people is the addition of collation in MongoDB. Wikipedia says about collation:

Collation is the assembly of written information into a standard order. Many systems of collation are based on numerical order or alphabetical order, or extensions and combinations thereof. Collation is a fundamental element of most office filing systems, library catalogs, and reference books.

What this is saying is that a collation is an ordering of characters for a given character set. Different languages order the alphabet differently, or even use entirely different base characters (as in many Asian, Middle Eastern and other non-English scripts). Collations are critical for multi-language support and for sorting non-English words in index ordering.
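
For instance, a collation can be attached to a collection at creation time so that its indexes and sorts follow that language's rules; the collection name and locale below are just examples:

// Create a collection whose default collation is French, then index and sort with it
db.createCollection("names", { collation: { locale: "fr" } });
db.names.createIndex({ last_name: 1 });       // the index inherits the collection's collation
db.names.find().sort({ last_name: 1 });       // sorted using French ordering rules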

Sharding

General

All members of a cluster are aware of sharding (all members, sharding set name, etc.). Because of this, sharding.clusterRole must be defined on all shard nodes, which is a new requirement.

Mongos processes MUST connect to 3.4 mongod instances (shard and config nodes); connecting to 3.2 and lower is not possible.
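
A minimal mongod.conf sketch for a shard member reflecting this requirement (the replica set name is a placeholder):

# mongod.conf on a shard node
sharding:
  clusterRole: shardsvr        # use "configsvr" on config server members
replication:
  replSetName: rs_shard01      # hypothetical replica set name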

Config Servers

Balancer on Config Server PRIMARY

In MongoDB 3.4, the cluster balancer is moved from the mongos processes (any) to the config server PRIMARY member.

Moving to a config-server-based balancer has the following benefits:

Predictability: the balancer process is always the config server PRIMARY. Before 3.4, any mongos processes could become the balancer, often chosen at random. This made troubleshooting difficult.

Lighter “mongos” process: the mongos/shard router benefits from being as light and thin as possible. This removes some code and potential for breakage from “mongos.”

Efficiency: config servers have dedicated nodes with very low resource utilization and no direct client traffic, for the most part. Moving the balancer to the config server set moves usage away from critical “router” processes.

Reliability: balancing relies on fewer components. Now the balancer can operate on the “config” database metadata locally, without the chance of network interruptions breaking balancing.

Config servers are a more permanent member of a cluster, unlikely to scale up/down or often change, unlike “mongos” processes that may be located on app hosts, etc.

Config Server Replica Set Required

In MongoDB 3.4, the former “mirror” config server strategy (SCCC) is no longer supported. This means all sharded clusters must use a replica-set-based set of config servers.

Using a replica-set based config server set has the following benefits:

Adding and removing config servers is greatly simplified.

Config servers have oplogs (useful for investigations).

Simplicity/Consistency: removing mirrored/SCCC config servers simplifies the high-level and code-level architecture.

Chunk Migration / Balancing Example

(from docs.mongodb.com)

Parallel Migrations

Previous to MongoDB 3.4, the balancer could only perform a single chunk migration at any given time. When a chunk migrates, a “source” shard and a “destination” shard are chosen. The balancer coordinates moving the chunks from the source to the target. In a large cluster with many shards, this is inefficient because a migration only involves two shards and a cluster may contain 10s or 100s of shards.

In MongoDB 3.4, the balancer can now perform many chunk migrations at the same time, in parallel, as long as they do not involve the same source and destination shards. This means that in clusters with more than two shards, many chunk migrations can now occur at the same time when they’re mutually exclusive to one another. The effective outcome is that the maximum number of parallel migrations is (Number of Shards / 2) - 1, which translates into a faster overall migration process.

For example, if you have ten shards, then 10/2 = 5 and  5-1 = 4. So you can have four concurrent moveChunks or balancing actions.

Tags and Zone

Sharding Zones supersede tag-aware sharding. There are mostly no changes to the functionality; this is mainly a naming change plus some new helper functions.

New commands/shell-methods added (a short usage sketch follows the list):

addShardToZone / sh.addShardToZone().

removeShardFromZone / sh.removeShardFromZone().

updateZoneKeyRange / sh.updateZoneKeyRange() + sh.removeRangeFromZone().
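
A brief sketch of how these zone helpers might be used together; the shard name, zone name and key range below are hypothetical:

// Assign a shard to a zone and pin a shard key range to that zone
sh.addShardToZone("shard0000", "EMEA");
sh.updateZoneKeyRange("sales.orders", { region: "EU", _id: MinKey }, { region: "EU", _id: MaxKey }, "EMEA");
// The assignment can later be undone
sh.removeRangeFromZone("sales.orders", { region: "EU", _id: MinKey }, { region: "EU", _id: MaxKey });
sh.removeShardFromZone("shard0000", "EMEA");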

You might recall that MongoDB has for a long time supported the idea of shard and replication tags. They break into two main areas: hardware-aware tags and access pattern tags. The idea behind hardware-aware tags was that you could have one shard with slow disks and, as data ages, a process that moves documents to a collection living on that shard (or tells specific ranges to live on that shard). Your other shards, possibly several of them, could then be faster to better handle the high-speed processing of current data.

The other is a case based more in replication, where you want to allow BI and other reporting systems access to your data without damaging your primary customer interactions. To do this, you could tag a node in a replica set with {reporting: true}, and all reporting queries would use this tag to prevent affecting the same nodes the user-generated work lives on. Zones are this same idea, simplified into a better-understood term. For now, there is no major difference between these areas, but it could be something to look at more in the 3.6 and 3.8 MongoDB versions.

Replication

New “linearizable” Read Concern: reflects all successful writes issued with a “majority” and acknowledged before the start of the read operation.

Adjustable Catchup for Newly Elected Primary: the time limit for a newly elected primary to catch up with the other replica set members that might have more recent writes.

Write Concern Majority Journal Default replset-config option: determines the behavior of the { w: "majority" } write concern if the write concern does not explicitly specify the journal option j.

Initial-sync improvements:

Now the initial sync builds the indexes as the documents are copied.

Improvements to the retry logic make it more resilient to intermittent failures on the network.

Data Types

MongoDB 3.4 adds support for the decimal128 format with the new decimal data type. The decimal128 format supports numbers with up to 34 decimal digits (i.e., significant digits) and an exponent range of −6143 to +6144.

When performing comparisons among different numerical types, MongoDB conducts a comparison of the exact stored numerical values without first converting values to a common type.

Unlike the double data type, which only stores an approximation of the decimal values, the decimal data type stores the exact value. For example, a decimal NumberDecimal("9.99") has a precise value of 9.99, whereas a double 9.99 would have an approximate value of 9.9900000000000002131628….

To test for the decimal type, use the $type operator with the literal “decimal” or 19:

db.inventory.find( { price: { $type: "decimal" } } )

To insert a value using the new NumberDecimal wrapper object type:

db.inventory.insert( { _id: 1, item: "The Scream", price: NumberDecimal("9.99"), quantity: 4 } )

To use the new decimal data type with a MongoDB driver, an upgrade to a driver version that supports the feature is necessary.

Aggregation Changes

Stages

Recursive Search

MongoDB 3.4 introduces a stage to the aggregation pipeline that allows for recursive searches.

  • $graphLookup: Performs a recursive search on a collection. To each output document, adds a new array field that contains the traversal results of the recursive search for that document.
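
A small illustration of $graphLookup traversing a reporting hierarchy; the collection and field names are made up for the example:

// For each employee, collect the full chain of managers above them
db.employees.aggregate([
  { $graphLookup: {
      from: "employees",
      startWith: "$reportsTo",
      connectFromField: "reportsTo",
      connectToField: "name",
      as: "reportingChain"
  } }
]);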

Faceted Search

Faceted search allows for the categorization of documents into classifications. For example, given a collection of inventory documents, you might want to classify items by a single category (such as by the price range), or by multiple groups (such as by price range as well as separately by the departments).

3.4 introduces stages to the aggregation pipeline that allow for faceted search.

  • $bucket: Categorizes or groups incoming documents into buckets that represent a range of values for a specified expression.
  • $bucketAuto: Categorizes or groups incoming documents into a specified number of buckets that constitute a range of values for a specified expression. MongoDB automatically determines the bucket boundaries.
  • $facet: Processes multiple pipelines on the input documents and outputs a document that contains the results of these pipelines. By specifying facet-related stages ($bucket, $bucketAuto, and $sortByCount) in these pipelines, $facet allows for multi-faceted search.
  • $sortByCount: Categorizes or groups incoming documents by a specified expression to compute the count for each group. Output documents are sorted in descending order by the count.
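
A brief sketch of a faceted search over an inventory collection; the field names and bucket boundaries are illustrative:

// Classify items by price range and, separately, count items per department in a single pass
db.inventory.aggregate([
  { $facet: {
      byPrice: [
        { $bucket: { groupBy: "$price", boundaries: [ 0, 50, 100, 500 ], default: "other" } }
      ],
      byDepartment: [
        { $sortByCount: "$department" }
      ]
  } }
]);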

 

Reshaping Documents

MongoDB 3.4 introduces stages to the aggregation pipeline that facilitate replacing documents as well as adding new fields.

  • $addFields: Adds new fields to documents. The stage outputs documents that contain all existing fields from the input documents as well as the newly added fields.
  • $replaceRoot: Replaces a document with the specified document. You can specify a document embedded in the input document to promote the embedded document to the top level.

Count

MongoDB 3.4 introduces a new stage to the aggregation pipeline that facilitates counting documents.

  • $count: Returns a document that contains a count of the number of documents input to the stage.

Operators

Array Operators

  • $in: Returns a boolean that indicates if a specified value is in an array.
  • $indexOfArray: Searches an array for an occurrence of a specified value and returns the array index (zero-based) of the first occurrence.
  • $range: Returns an array whose elements are a generated sequence of numbers.
  • $reverseArray: Returns an output array whose elements are those of the input array but in reverse order.
  • $reduce: Takes an array as input and applies an expression to each item in the array to return the final result of the expression.
  • $zip: Returns an output array where each element is itself an array, consisting of elements of the corresponding array index position from the input arrays.

Date Operators

  • $isoDayOfWeek: Returns the ISO 8601 weekday number, ranging from 1 (for Monday) to 7 (for Sunday).
  • $isoWeek: Returns the ISO 8601 week number, which can range from 1 to 53. Week numbers start at 1 with the week (Monday through Sunday) that contains the year's first Thursday.
  • $isoWeekYear: Returns the ISO 8601 year number, where the year starts on the Monday of week 1 (ISO 8601) and ends with the Sunday of the last week (ISO 8601).

String Operators

  • $indexOfBytes: Searches a string for an occurrence of a substring and returns the UTF-8 byte index (zero-based) of the first occurrence.
  • $indexOfCP: Searches a string for an occurrence of a substring and returns the UTF-8 code point index (zero-based) of the first occurrence.
  • $split: Splits a string by a specified delimiter into string components and returns an array of the string components.
  • $strLenBytes: Returns the number of UTF-8 bytes for a string.
  • $strLenCP: Returns the number of UTF-8 code points for a string.
  • $substrBytes: Returns the substring of a string. The substring starts with the character at the specified UTF-8 byte index (zero-based) in the string for the length specified.
  • $substrCP: Returns the substring of a string. The substring starts with the character at the specified UTF-8 code point index (zero-based) in the string for the length specified.

Others/Misc

Other new operators:

$switch: Evaluates, in sequential order, the case expressions of the specified branches to enter the first branch for which the case expression evaluates to “true”.

$collStats: Returns statistics regarding a collection or view.

$type: Returns a string which specifies the BSON Types of the argument.

$project: Adds support for field exclusion in the output document. Previously, you could only exclude the _id field in the stage.

Views

MongoDB 3.4 adds support for creating read-only views from existing collections or other views. To specify or define a view, MongoDB 3.4 introduces:

    • the viewOn and pipeline options to the existing create command:
      • db.runCommand( { create: <view>, viewOn: <source>, pipeline: <pipeline> } )
    • or if specifying a default collation for the view:
      • db.runCommand( { create: <view>, viewOn: <source>, pipeline: <pipeline>, collation: <collation> } )
    • and a corresponding  mongo shell helper db.createView():
      • db.createView(<view>, <source>, <pipeline>, <collation>)

For more information on creating views, see Views.

by David Murphy at February 23, 2017 09:36 PM

February 22, 2017

MariaDB AB

How MariaDB ColumnStore Handles Big Data Workloads – Data Loading and Manipulation

How MariaDB ColumnStore Handles Big Data Workloads – Data Loading and Manipulation david_thompson_g Wed, 02/22/2017 - 18:34

MariaDB ColumnStore is a massively parallel scale out columnar database. Data loading and modification behaves somewhat differently from how a row based engine works. This article outlines the options available and how these affect performance.

Data Loading and Manipulation Options

MariaDB ColumnStore provides several options for writing data:

  1. DML operations: insert, update, delete
  2. Bulk DML: INSERT INTO … SELECT
  3. MariaDB Server bulk file load: LOAD DATA INFILE
  4. ColumnStore bulk data load: cpimport
  5. ColumnStore bulk delete: ColumnStore partition drop.

 

DML Operations

ColumnStore supports transactionally consistent insert, update, and delete statements using standard syntax. Performance of individual statements will be significantly slower than you’d expect with row-based engines such as InnoDB. This is due to the system being optimized for block writes and the fact that a column-based change must affect multiple underlying files. Updates in general will be faster, since they are performed in place on only the updated columns.

 

Bulk DML

INSERT INTO … SELECT statements where the destination table is a ColumnStore table are optimized by default to internally convert and use the cpimport utility executing in mode 1, which offers greater performance than raw DML operations. This can be a useful capability for migrating a non-ColumnStore table. For further details please refer to the following knowledge base article: https://mariadb.com/kb/en/mariadb/columnstore-batch-insert-mode/

 

LOAD DATA INFILE

The LOAD DATA INFILE command can also be used, and it is similarly optimized by default to utilize cpimport mode 1 rather than DML operations to provide better performance; this can be useful for compatibility purposes. However, greater performance (approximately 2x) and flexibility are provided by utilizing cpimport directly. For further details please refer to the following knowledge base article: https://mariadb.com/kb/en/mariadb/columnstore-load-data-infile/

 

cpimport

The cpimport utility is the fastest and most flexible data loading utility for ColumnStore. It works directly with the PM WriteEngine processes, eliminating many of the overheads of the prior options. cpimport is designed to work with either delimited text files or delimited data provided via stdin. The latter option provides some simple integration capabilities, such as streaming a query result from another database directly into cpimport. Multiple tables can be loaded in parallel, and a separate utility, colxml, is provided to help automate this. For further details please refer to the following knowledge base article: https://mariadb.com/kb/en/mariadb/columnstore-bulk-data-loading/
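
A minimal sketch of loading data with cpimport in the default mode 1; the database, table, host and file names are placeholders, and the exact delimiter flags should be checked against the linked knowledge base article:

# Load a pipe-delimited file into an existing ColumnStore table
cpimport mydb orders /data/orders.tbl -s '|'

# Or stream data in via stdin, e.g. a query result from another MySQL/MariaDB server
mysql -h legacy-host -N --batch -e "SELECT * FROM orders" legacy_db | cpimport mydb orders -s '\t'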

 

The utility can operate in different modes as designated by the -m flag (default 1):

 

Mode 1 - Centralized Trigger, Distributed Loading

The data to be loaded is provided as one input file on a single server. The data is then divided and distributed evenly to each of the PM nodes for loading. The extent map is referenced to aim for even data distribution.  In addition the -P argument can be utilized to send the data to specific PM nodes which allows for the finer grain control of modes 2 and 3 while preserving centralized loading.

Mode 2 - Centralized Trigger, Local Loading

In this mode, the data must be pre-divided and pushed to each PM server. The load on each server is triggered from a central location, which runs a local cpimport on each PM server.

Mode 3 - Local Trigger, Local Loading

This mode allows for loading data individually per PM node across some or all of the nodes. The load is triggered from the PM server and runs cpimport locally on that PM node only.

Modes 2 and 3 allow for more direct control of where data is loaded, and in what order within the corresponding extents; however, care needs to be taken to allow for even distribution across nodes. This direct control does allow for explicit partitioning by PM node: for example, with 3 nodes you could have one node with only the Americas' data, one with EMEA's, and one with APAC's. Local query can be enabled to allow querying a PM directly, limited to that region's data, while still allowing querying of all data from the UM level.

 

Partition Drop

ColumnStore provides a mechanism to support bulk delete by extents. An extent can be dropped by partition id or by using a value range corresponding to the minimum and maximum values for the extents to be dropped. This allows for an effective and fast purging mechanism: if the data has an increasing date-based column, then the minimum and maximum values of that column's extents form a (potentially overlapping) range-based partitioning scheme. Data can be dropped by specifying the range of values to be removed. This can form a very effective information lifecycle management strategy, removing old data by partition range. For further details please refer to the following knowledge base article: https://mariadb.com/kb/en/mariadb/columnstore-partition-management/

 

Transactional Consistency

MariaDB ColumnStore provides read committed transaction isolation. Changes to the data, whether performed through DML or bulk import, are always applied such that reads are not blocked and other transactions maintain a consistent view of the prior data until the new data is successfully committed.

The cpimport utility interacts with a high water mark value for each column. All queries will only read below the high water mark and cpimport will insert new rows above the high water mark. When the load is completed the high water mark is updated atomically.


SQL DML operations utilize a block based MVCC architecture to provide for transactional consistency. Other transactions will read blocks at a particular version while the uncommitted version is maintained in a version buffer.



by david_thompson_g at February 22, 2017 11:34 PM

Peter Zaitsev

Webinar Thursday, February 23, 2017: Troubleshooting MySQL Access Privileges Issues

Troubleshooting MySQL Access Privileges

Please join Sveta Smirnova, Percona’s Principal Technical Services Engineer, as she presents Troubleshooting MySQL Access Privileges Issues on
February 23, 2017 at 11:00 am PST / 2:00 pm EST (UTC-8).

Do you have registered users who can’t connect to the MySQL server? Strangers modifying data to which they shouldn’t have access?

MySQL supports a rich set of user privilege options and allows you to fine tune access to every object in the server. The latest versions support authentication plugins that help to create more access patterns.

However, finding errors in such a big set of options can be problematic. This is especially true for environments with hundreds of users, all with different privileges on multiple objects. In this webinar, I will show you how to decipher error messages and unravel the complicated setups that can lead to access errors. We will also cover network errors that mimic access privileges errors.

In this webinar, we will discuss:

  • Which privileges MySQL supports
  • What GRANT statements are
  • How privileges are stored
  • How to find out why a privilege does not work properly
  • How authentication plugins make a difference
  • What the best access control practices are

To register for this webinar please click here.

Sveta Smirnova, Principal Technical Services Engineer

Sveta joined Percona in 2015. Her main professional interests are problem-solving, working with tricky issues and bugs, finding patterns that can solve typical issues quicker, and teaching others how to deal with MySQL issues, bugs and gotchas effectively. Before joining Percona, Sveta worked as a Support Engineer in the MySQL Bugs Analysis Support Group at MySQL AB-Sun-Oracle. She is the author of the book “MySQL Troubleshooting” and of JSON UDF functions for MySQL.

by Dave Avery at February 22, 2017 08:50 PM

Percona Monitoring and Management (PMM) Graphs Explained: MongoDB with RocksDB

Percona Monitoring and Management (PMM)

This post is part of the series of Percona’s MongoDB 3.4 bundle release blogs. In mid-2016, Percona Monitoring and Management (PMM) added support for RocksDB with MongoDB, also known as “MongoRocks.” In this blog, we will go over the Percona Monitoring and Management (PMM) 1.1.0 version of the MongoDB RocksDB dashboard, how PMM is useful in the day-to-day monitoring of MongoDB and what we plan to add and extend.

Percona Monitoring and Management (PMM)

Percona Monitoring and Management (PMM) is an open-source platform for managing and monitoring MySQL and MongoDB, developed by Percona on top of open-source technology. Behind the scenes, the graphing features this article covers use Prometheus (a popular time-series data store), Grafana (a popular visualization tool), mongodb_exporter (our MongoDB database metric exporter) plus other technologies to provide database and operating system metric graphs for your database instances.

The mongodb_exporter tool, which provides our monitoring platform with MongoDB metrics, uses RocksDB status output and optional counters to provide detailed insight into RocksDB performance. Percona’s MongoDB 3.4 release enables RocksDB’s optional counters by default. On 3.2, however, you must set the following in /etc/mongod.conf to enable this:

storage.rocksdb.counters: true

This article shows a live demo of our MongoDB RocksDB graphs: https://pmmdemo.percona.com/graph/dashboard/db/mongodb-rocksdb.

RocksDB/MongoRocks

RocksDB is a storage engine available since version 3.2 in Percona’s fork of MongoDB: Percona Server for MongoDB.

The first thing to know about monitoring RocksDB is compaction. RocksDB stores its data on disk using several tiered levels of immutable files. Changes written to disk are written to the first RocksDB level (Level0). Later, internal compactions merge the changes down to the next RocksDB level when Level0 fills. Each level before the last essentially holds deltas to the resting data set, which are eventually merged down to the bottom.

We can see the effect of the tiered levels in our “RocksDB Compaction Level Size” graph, which reflects the size of each level in RocksDB on-disk:

RocksDB

Note that most of the database data is in the final level “L6” (Level 6). Levels L0, L4 and L5 hold relatively smaller amounts of data changes. These get merged down to L6 via compaction.

More about this design is explained in detail by the developers of MongoRocks, here: https://www.percona.com/live/plam16/sessions/everything-you-wanted-know-about-mongorocks.

RocksDB Compaction

Most importantly, RocksDB compactions try to happen in the background. They generally do not “block” the database. However, the additional resource usage of compactions can potentially cause some spikes in latency, making compaction important to watch. When compactions occur, between levels L4 and L5 for example, L4 and L5 are read and merged with the result being written out as a new L5.

The memtable in MongoRocks is a 64MB in-memory table. Changes initially get written to the memtable. Reads check the memtable to see if there are unwritten changes to consider. When the memtable has filled to 100%, RocksDB performs a compaction of the memtable data to Level0, the first on-disk level in RocksDB.

In PMM we have added a single-stat panel for the percentage of the memtable usage. This is very useful in indicating when you can expect a memtable-to-level0 compaction to occur:

Above we can see the memtable is 125% used, which means RocksDB is late to finish (or start) a compaction due to high activity. Shortly after taking this screenshot above, however, our test system began a compaction of the memtable and this can be seen at the drop in active memtable entries below:

RocksDB

Following this compaction further through PMM’s graphs, we can see from the (very useful) “RocksDB Compaction Time” graph that this compaction took 5 seconds.

In the graph above, I have singled-out “L0” to show Level0’s compaction time. However, any level can be selected either per-graph (by clicking on the legend-item) or dashboard-wide (by using the RocksDB Level drop-down at the top of the page).

In terms of throughput, we can see from our “RocksDB Write Activity” graph (Read Activity is also graphed) that this compaction required about 33MBps of disk write activity:

On top of additional resource consumption such as the write activity above, compactions cause caches to get cleared. One example is the OS cache due to new level files being written. These factors can cause some increases to read latencies, demonstrated in this example below by the bump in L4 read latency (top graph) caused by the L4 compaction (bottom graph):

This pattern above is one area to check if you see latency spikes in RocksDB.

RocksDB Stalls

When RocksDB is unable to perform compaction promptly, it uses a feature called “stalls” to try and slow down the amount of data coming into the engine. In my experience, stalls almost always mean something below RocksDB is not up to the task (likely the storage system).

Here is the “RocksDB Stall Time” graph of a host experiencing frequent stalls:

PMM can graph the different types of RocksDB stalls in the “RocksDB Stalls” graph. In our case here, we have 0.3-0.5 stalls per second due to “level0_slowdown” and “level0_slowdown_with_compaction.” This happens when Level0 stalls the engine due to slow compaction performance below its level.

Another metric reflecting the poor compaction performance is the pending compactions in “RocksDB Pending Operations”:

As I mentioned earlier, this almost always means something below RocksDB itself cannot keep up. In the top-right of PMM, we have OS-level metrics in a drop-down; I recommend you look at “Disk Performance” in these scenarios:

On the “Disk Performance” dashboard you can see the “sda” disk has an average write time of 212ms, and a max of 1100ms (1.1 seconds). This is fairly slow.

Further, on the same dashboard I can see the CPU is waiting on disk I/O 98.70% of the time on average. This explains why RocksDB needs to stall to hold back some of the load!

The disks seem too busy to keep up! Looking at the “Mongod – Document Activity” graph, it explains the cause of the high disk usage: 10,000-60,000 inserts per second:

Here we can draw the conclusion that this volume of inserts on this system configuration causes some stalling in RocksDB.

RocksDB Block Cache

The RocksDB Block Cache is the in-heap cache RocksDB uses to cache uncompressed pages. Generally, deployments benefit from dedicating most of their memory to the Linux file system cache vs. the RocksDB Block Cache. We recommend using only 20-30% of the host RAM for block cache.
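
As a sketch, assuming Percona Server for MongoDB with the MongoRocks engine, the block cache can be capped in mongod.conf; the 8 GB figure below is just an example for a host with roughly 32 GB of RAM:

# mongod.conf fragment
storage:
  engine: rocksdb
  rocksdb:
    cacheSizeGB: 8        # roughly 20-30% of host RAM, per the guidance above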

PMM can take away some of the guesswork with the “RocksDB Block Cache Hit Ratio” graph, showing the efficiency of the block cache:

It is difficult to define a “good” and “bad” number for this metric, as the number varies for every deployment. However, one important thing to look for is significant changes in this graph. In this example, the Block Cache has a page in cache 3000 times for every 1 time it does not.

If you wanted to test increasing your block cache, this graph becomes very useful. If you increase your block cache and do not see an improvement in the hit ratio after a lengthy period of testing, this usually means more block cache memory is not necessary.

RocksDB Read Latency Graphs

PMM graphs Read Latency metrics for RocksDB in several different graphs, one dedicated to Level0:

And three other graphs display Average, 99th Percentile and Maximum latencies for each RocksDB level. Here is an example from the 99th Percentile latency metrics:

Coming Soon

Percona Monitoring and Management needs to add some more metrics that explain the performance of the engine. The rate of deletes/tombstones in the system affects RocksDB’s performance. Currently, this metric is not something our system can easily gather like other engine metrics. Percona Monitoring and Management can’t easily graph the efficiency of the Bloom filter yet, either. These are currently open feature requests to the MongoRocks (and likely RocksDB) team(s) to add in future versions.

Percona’s release of Percona Server for MongoDB 3.4 includes a new, improved version of MongoRocks and RocksDB. More is available in the release notes!

by Tim Vaillancourt at February 22, 2017 08:36 PM

Percona XtraBackup 2.4.6 is Now Available


Percona announces the GA release of Percona XtraBackup 2.4.6 on February 22, 2017. You can download it from our download site and apt and yum repositories.

Percona XtraBackup enables MySQL backups without blocking user queries, making it ideal for companies with large data sets and mission-critical applications that cannot tolerate long periods of downtime. Offered free as an open source solution, Percona XtraBackup drives down backup costs while providing unique features for MySQL backups.

New features:
  • Percona XtraBackup implemented a new --remove-original option that can be used to remove the encrypted and compressed files once they’ve been decrypted/decompressed.
Bugs Fixed:
  • XtraBackup was using username set for the server in a configuration file even if a different user was defined in the user’s configuration file. Bug fixed #1551706.
  • Incremental backups did not include xtrabackup_binlog_info and xtrabackup_galera_info files. Bug fixed #1643803.
  • In case a warning was written to stdout instead of stderr during the streaming backup, it could cause an assertion in xbstream. Bug fixed #1647340.
  • xtrabackup --move-back option did not always restore out-of-datadir tablespaces to their original directories. Bug fixed #1648322.
  • innobackupex and xtrabackup scripts were showing the password in the ps output when it was passed as a command line argument. Bug fixed #907280.
  • Incremental backup would fail with a path like ~/backup/inc_1 because xtrabackup didn’t properly expand tilde. Bug fixed #1642826.
  • Fixed missing dependency check for Perl Digest::MD5 in rpm packages. This will now require perl-MD5 package to be installed from EPEL repositories on CentOS 5 and CentOS 6 (along with libev). Bug fixed #1644018.
  • Percona XtraBackup now supports -H, -h, -u and -p shortcuts for --hostname, --datadir, --user and --password respectively. Bugs fixed #1655438 and #1652044.

[UPDATE 2017-02-28]: New packages have been pushed to repositories with incremented package version to address the bug #1667610.

Release notes with all the bugfixes for Percona XtraBackup 2.4.6 are available in our online documentation. Please report any bugs to the launchpad bug tracker.

by Hrvoje Matijakovic at February 22, 2017 06:49 PM

Percona XtraBackup 2.3.7 is Now Available


Percona announces the release of Percona XtraBackup 2.3.7 on February 22, 2017. Downloads are available from our download site or Percona Software Repositories.

Percona XtraBackup enables MySQL backups without blocking user queries, making it ideal for companies with large data sets and mission-critical applications that cannot tolerate long periods of downtime. Offered free as an open source solution, Percona XtraBackup drives down backup costs while providing unique features for MySQL backups.

This release is the current GA (Generally Available) stable release in the 2.3 series.

New Features
  • Percona XtraBackup has implemented a new --remove-original option that can be used to remove the encrypted and compressed files once they’ve been decrypted/decompressed.
Bugs Fixed:
  • XtraBackup was using username set for the server in a configuration file even if a different user was defined in the user’s configuration file. Bug fixed #1551706.
  • Incremental backups did not include xtrabackup_binlog_info and xtrabackup_galera_info files. Bug fixed #1643803.
  • Percona XtraBackup would fail to compile with -DWITH_DEBUG and -DWITH_SSL=system options. Bug fixed #1647551.
  • xtrabackup --move-back option did not always restore out-of-datadir tablespaces to their original directories. Bug fixed #1648322.
  • innobackupex and xtrabackup scripts were showing the password in the ps output when it was passed as a command line argument. Bug fixed #907280.
  • Incremental backup would fail with a path like ~/backup/inc_1 because xtrabackup didn’t properly expand tilde. Bug fixed #1642826.
  • Fixed missing dependency check for Perl Digest::MD5 in rpm packages. This will now require perl-MD5 package to be installed from EPEL repositories on CentOS 5 and CentOS 6 (along with libev). Bug fixed #1644018.
  • Percona XtraBackup now supports -H, -h, -u and -p shortcuts for --hostname, --datadir, --user and --password respectively. Bugs fixed #1655438 and #1652044.

Other bugs fixed: #1655278.

[UPDATE 2017-02-28]: New packages have been pushed to repositories with incremented package version to address the bug #1667610.

Release notes with all the bugfixes for Percona XtraBackup 2.3.7 are available in our online documentation. Bugs can be reported on the launchpad bug tracker.

by Hrvoje Matijakovic at February 22, 2017 06:48 PM

February 21, 2017

Peter Zaitsev

Percona Monitoring and Management (PMM) Upgrade Guide

Percona Monitoring and Management

This post is part of a series of Percona’s MongoDB 3.4 bundle release blogs. The purpose of this blog post is to demonstrate current best-practices for an in-place Percona Monitoring and Management (PMM) upgrade. Following this method allows you to retain data previously collected by PMM in your MySQL or MongoDB environment, while upgrading to the latest version.

Step 1: Housekeeping

Before beginning this process, I recommend that you use a package manager that installs directly from Percona’s official software repository. The install instructions vary by distro, but for Ubuntu users the commands are:

wget https://repo.percona.com/apt/percona-release_0.1-4.$(lsb_release -sc)_all.deb

sudo dpkg -i percona-release_0.1-4.$(lsb_release -sc)_all.deb

Step 2: PMM Server Upgrade

Now that we have ensured we’re using Percona’s official software repository, we can continue with the upgrade. To check which version of PMM server is running, execute the following command on your PMM server host:

docker ps

This command shows a list of all running Docker containers. The version of PMM server you are running is found in the image description.

docker_ps

Once you’ve verified you are on an older version, it’s time to upgrade!

The first step is to stop and remove your docker pmm-server container with the following command:

docker stop pmm-server && docker rm pmm-server

Please note that this command may take several seconds to complete.
docker_stop

The next step is to create and run the image with the new version tag. In this case, we are installing version 1.1.0. Please make sure to verify the correct image name in the install instructions.

Run the command below to create and run the new image.

docker run -d \
  -p 80:80 \
  --volumes-from pmm-data \
  --name pmm-server \
  --restart always \
  percona/pmm-server:1.1.0

docker_run

We can confirm our new image is running with the following command:

docker ps

docker_ps

As you can see, the latest version of PMM server is installed. The final step in the process is to update the PMM client on each host to be monitored.

Step 3: PMM Client Upgrade

The GA version of Percona Monitoring and Management supports in-place upgrades. Instructions can be found in our documentation. On the client side, update the local apt cache, and upgrade to the new version of pmm-client by running the following commands:

apt-get update

apt-get_update

apt-get install pmm-client

Congrats! We’ve successfully upgraded to the latest PMM version. As you can tell from the graph below, there is a slight gap in our polling data due to the downtime necessary to upgrade the version. However, we have verified that the data that existed prior to the upgrade is still available and new data is being gathered.

grafana_graph

Conclusion

I hope this blog post has given you the confidence to do an in-place Percona Monitoring and Management upgrade. As always, please submit your feedback on our forums with regards to any PMM-related suggestions or questions. Our goal is to make PMM the best-available open-source MySQL and MongoDB monitoring tool.

by Barrett Chambers at February 21, 2017 10:53 PM

Webinar Wednesday February 22, 2017: Percona Server for MongoDB 3.4 Product Bundle Release

Percona Server for MongoDB

Join Percona’s MongoDB Practice Manager David Murphy on Wednesday, February 22, 2017 at 10:00 am PST / 1:00 pm EST (UTC-8) as he reviews and discusses the Percona Server for MongoDB, Percona Monitoring and Management (PMM) and Percona Toolkit product bundle release.

The webinar covers how this new bundled release ensures a robust, secure database that can be adapted to changing business requirements. It demonstrates how MongoDB, PMM and Percona Toolkit are used together so that organizations benefit from the cost savings and agility provided by free and proven open source software.

Percona Server for MongoDB 3.4 delivers all the latest MongoDB 3.4 Community Edition features, additional Enterprise features and a greater choice of storage engines.

Along with improved insight into the database environment, the solution provides enhanced control options for optimizing a wider range of database workloads with greater reliability and security.

Some of the features that will be discussed are:

  • Percona Server for MongoDB 3.4
    • All the features of MongoDB Community Edition 3.4, which provides an open source, fully compatible, drop-in replacement:
      • Integrated, pluggable authentication with LDAP to provide a centralized enterprise authentication service
      • Open-source auditing for visibility into user and process actions in the database, with the ability to redact sensitive information (such as user names and IP addresses) from log files
      • Hot backups for the WiredTiger engine protect against data loss in the case of a crash or disaster, without impacting performance
      • Two storage engine options not supported by MongoDB Community Edition 3.4:
        • MongoRocks, the RocksDB-powered storage engine, designed for demanding, high-volume data workloads such as in IoT applications, on-premises or in the cloud.
        • Percona Memory Engine is ideal for in-memory computing and other applications demanding very low latency workloads.
  • Percona Monitoring and Management 1.1
    • Support for MongoDB and Percona Server for MongoDB
    • Graphical dashboard information for WiredTiger, MongoRocks and Percona Memory Engine
  • Percona Toolkit 3.0
    • Two new tools for MongoDB:
      • pt-mongodb-summary (the equivalent of pt-mysql-summary) provides a quick, at-a-glance overview of a MongoDB and Percona Server for MongoDB instance.
      • pt-mongodb-query-digest (the equivalent of pt-query-digest for MySQL) offers a query review for troubleshooting.

You can register for the webinar here.

David Murphy, MongoDB Practice Manager

David joined Percona in October 2015 as Practice Manager for MongoDB. Prior to that, David joined the ObjectRocket by Rackspace team as the Lead DBA in September 2013. With the growth involved with any recently acquired startup, David’s role covered a wide range of responsibilities, from evangelism, research, run book development, knowledge base design, consulting, technical account management and mentoring to much more.

Prior to the world of MongoDB, David was a MySQL and NoSQL architect at Electronic Arts. There, he worked with some of the largest titles in the world, like FIFA, SimCity, and Battlefield, with responsibility for tuning, design, and technology choices. David maintains an active interest in database speaking and exploring new technologies.

by Dave Avery at February 21, 2017 09:06 PM

Installing Percona Monitoring and Management (PMM) for the First Time

Percona Monitoring and Management


This post is part of a series of Percona’s MongoDB 3.4 bundle release blogs. In this blog, we’ll look at the process for installing Percona Monitoring and Management (PMM) for the first time.

Installing Percona Monitoring and Management

Percona Monitoring and Management (PMM) is Percona’s open source tool for monitoring databases. You can use it with either MongoDB or MySQL databases.

PMM requires the installation of a server component, plus a client component on each database server to be monitored. You can install the server component on a local or remote server, and monitor any MySQL or MongoDB instance (including Amazon RDS environments).

What is it?

PMM provides a graphical view of the status of monitored databases. You can use it to perform query analytics and metrics review. The graphical component relies on Grafana and uses Prometheus for information processing. It includes a Query Analytics module that allows you to analyze queries over a period of time, and it uses Orchestrator for replication management. Since the integration of these items is the most difficult part, the server is distributed as a preconfigured Docker image.

PMM works with any variety of MongoDB or MySQL.

How do you install it?

As mentioned, there is a server component to PMM. You can install it on any server in your database environment, but be aware that if the server on which it is installed goes down or runs out of space, monitoring also fails. The server should have at least 5G of available disk space for each monitored client.

You must install the client component on each monitored database server. It is available as a package for a variety of Linux distributions.

To install the server

  • Create the Docker data container for PMM. This container is the persistent storage location for all PMM data and should not be altered or removed.
  • Create and run the PMM server container (see the example commands after this list)
  • Verify the installation
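
As a minimal sketch of those three steps, based on the Docker-based procedure in the PMM documentation (the image tag, volume paths and port reflect 1.1-era defaults and may need adjusting for your environment):

    # 1. Create the data container that persists all PMM data.
    docker create \
        -v /opt/prometheus/data \
        -v /opt/consul-data \
        -v /var/lib/mysql \
        -v /var/lib/grafana \
        --name pmm-data \
        percona/pmm-server:1.1.1 /bin/true

    # 2. Create and run the PMM server container, reusing the data container's volumes.
    docker run -d \
        -p 80:80 \
        --volumes-from pmm-data \
        --name pmm-server \
        --restart always \
        percona/pmm-server:1.1.1

    # 3. Verify the installation by opening http://<server address> in a browser.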

Next, install the client on each server to be monitored, using the package appropriate to that server’s Linux distribution.

Last, you connect the client(s) to the PMM server and monitoring begins.
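
As an illustrative sketch, once the pmm-client package is installed, connecting a client and starting monitoring comes down to a couple of commands (the server address is a placeholder):

    # Point the client at the PMM server.
    sudo pmm-admin config --server 192.168.100.1

    # Start monitoring the local database instance.
    sudo pmm-admin add mongodb    # or: sudo pmm-admin add mysql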

The Query Analytics tool monitors and reviews queries run in the environment. It displays the current top 10 most time-intensive queries, and you can click on a query to view a detailed analysis of it.

The Metrics Monitor gives you a historical view of queries. PMM separates time-based graphs by theme for additional clarity.

PMM includes Orchestrator for replication management and visualization. You must enable it separately; you can then view the metrics on the Discover page in Orchestrator.

Ongoing administration and maintenance

PMM includes an administration tool that adds, removes or monitors services. The pmm-admin tool requires a user with root or sudo access.
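
A couple of typical pmm-admin invocations, shown here as a hedged sketch:

    # Show which services and instances this client is monitoring.
    sudo pmm-admin list

    # Check connectivity between this client and the PMM server.
    sudo pmm-admin check-network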

You can enable HTTP password protection to add authentication when accessing the PMM Server web interface or use SSL encryption to secure traffic between PMM Client and PMM Server.
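
As a sketch of the HTTP password option (the user name, password and server address are placeholders, not values from this post), the credentials are set through environment variables when starting PMM Server and then passed to each client:

    # Start PMM Server with HTTP password protection enabled.
    docker run -d -p 80:80 --volumes-from pmm-data --name pmm-server \
        -e SERVER_USER=pmm -e SERVER_PASSWORD=change-me \
        --restart always percona/pmm-server:1.1.1

    # Configure each client with the same credentials.
    sudo pmm-admin config --server 192.168.100.1 \
        --server-user pmm --server-password change-me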

You can edit the config files located in <your Docker container>/etc/grafana to set up alerting. Log files are stored in <your Docker container>/var/log.

by Rick Golba at February 21, 2017 07:22 PM

February 20, 2017

Peter Zaitsev

MongoDB 3.4 Bundle Release: Percona Server for MongoDB 3.4, Percona Monitoring and Management 1.1, Percona Toolkit 3.0 with MongoDB


This blog post is the first in a series on Percona’s MongoDB 3.4 bundle release. This release includes Percona Server for MongoDB, Percona Monitoring and Management, and Percona Toolkit. In this post, we’ll look at the features included in the release.

We have a lot of great MongoDB content coming your way in the next few weeks. First, however, I wanted to give you a quick list of the major things to be on the lookout for.

This new bundled release ensures a robust, secure database that you can adapt to changing business requirements. It helps demonstrate how organizations can use MongoDB (and Percona Server for MongoDB), PMM and Percona Toolkit together to benefit from the cost savings and agility provided by free and proven open source software.

Percona Server for MongoDB 3.4 delivers all the latest MongoDB 3.4 Community Edition features, additional Enterprise features and a greater choice of storage engines.

Some of these new features include:

  • Shard member types. All nodes now need to know what they do – this helps with reporting and architecture planning more than the underlying code, but it’s an important first step.
  • Sharding balancer moved to config server primary
  • Configuration servers must now be a replica set
  • Faster balancing (up to shard count/2 concurrent migrations) – multiple chunk migrations can now happen at the same time!
  • Sharding and replication tags renamed to “zones” – again, an important first step
  • Default write behavior moved to majority – this could significantly impact many workloads, but moving to a default safe write mode is important
  • New decimal data type
  • Graph aggregation functions – we will talk about these more in a later blog, but for now note that graph and faceted searches are added.
  • Collations added to most access patterns for collections and databases
  • . . .and much more

Percona Server for MongoDB includes all the features of MongoDB Community Edition 3.4, providing an open source, fully-compatible, drop-in replacement with many improvements, such as:

  • Integrated, pluggable authentication with LDAP that provides a centralized enterprise authentication service
  • Open-source auditing for visibility into user and process actions in the database, with the ability to redact sensitive information (such as user names and IP addresses) from log files
  • Hot backups for the WiredTiger engine to protect against data loss in the case of a crash or disaster, without impacting performance
  • Two storage engine options not supported by MongoDB Community Edition 3.4 (doubling the number of available storage engine choices):
    • MongoRocks, the RocksDB-powered storage engine, designed for demanding, high-volume data workloads such as IoT applications, on-premises or in the cloud.
    • Percona Memory Engine, ideal for in-memory computing and other applications that demand very low latency.

Percona Monitoring and Management 1.1

  • Support for MongoDB and Percona Server for MongoDB
  • Graphical dashboard information for WiredTiger, MongoRocks and Percona Memory Engine
  • Cluster and replica set wide views
  • Many more graphable metrics available, for both the OS and the database layer, than are currently provided by other tools in the ecosystem

Percona Toolkit 3.0

  • Two new tools for MongoDB are now in Percona’s Toolkit (example invocations follow this list):
    • pt-mongodb-summary (the equivalent of pt-mysql-summary) provides a quick, at-a-glance overview of a MongoDB or Percona Server for MongoDB instance
      • This is useful for any DBA who wants a general idea of what’s happening in the system, what the state of their cluster/replica set is, and more.
    • pt-mongodb-query-digest (the equivalent of pt-query-digest for MySQL) offers a query review for troubleshooting
      • Query digest is one of the most-used Toolkit features ever, and MongoDB is no different. Typically you might only look at your best and worst query times and document scans; this tool also shows 90th percentiles and makes reviewing your top 10 queries a matter of seconds rather than minutes.
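
Both tools are standard command-line programs. As a minimal, hedged sketch of running them against a local instance (the host, port and database name are illustrative, and profiling must already be enabled for the digested database):

    # Quick, at-a-glance overview of the local instance.
    pt-mongodb-summary localhost:27017

    # Query review built from the profiler data of a given database.
    pt-mongodb-query-digest --database=samples localhost:27017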

For all of these topics, you will see more blogs in the next few weeks that cover them in detail. Some people have asked what Percona’s MongoDB commitment looks like. Hopefully, this series of blogs helps show how improving open source databases is central to the Percona vision. We are here to make the world better for developers, DBAs and other MongoDB users.

by David Murphy at February 20, 2017 09:51 PM

Percona Toolkit 3.0.1 is now available


Percona announces the availability of Percona Toolkit 3.0.1 on February 20, 2017. This is the first general availability (GA) release in the 3.0 series, with a focus on adding MongoDB tools.

Downloads are available from the Percona Software Repositories.

NOTE: If you are upgrading using Percona’s yum repositories, make sure that you enable the basearch repo, because Percona Toolkit 3.0 is not available in the noarch repo.

Percona Toolkit is a collection of advanced command-line tools that perform a variety of MySQL and MongoDB server and system tasks too difficult or complex for DBAs to perform manually. Percona Toolkit, like all Percona software, is free and open source.

This release includes changes from the previous 3.0.0 RC and the following additional changes:

  • Added requirement to run pt-mongodb-summary as a user with the clusterAdmin or root built-in roles.

You can find release details in the release notes. Bugs can be reported on Toolkit’s launchpad bug tracker.

by Alexey Zhebel at February 20, 2017 09:50 PM

Percona Monitoring and Management 1.1.1 is now available


Percona announces the release of Percona Monitoring and Management 1.1.1 on February 20, 2017. This is the first general availability (GA) release in the PMM 1.1 series, with a focus on providing alternative deployment options for PMM Server.

NOTE: The AMI and VirtualBox image options are still experimental. For production, it is recommended to run the Docker image.

The instructions for installing Percona Monitoring and Management 1.1.1 are available in the documentation. Detailed release notes are available here.

There are no changes compared to the previous 1.1.0 Beta release, except for small fixes to the MongoDB metrics dashboards.

A live demo of PMM is available at pmmdemo.percona.com.

We welcome your feedback and questions on our PMM forum.

About Percona Monitoring and Management
Percona Monitoring and Management is an open-source platform for managing and monitoring MySQL and MongoDB performance. Percona developed it in collaboration with experts in the field of managed database services, support and consulting.

PMM is a free and open-source solution that you can run in your own environment for maximum security and reliability. It provides thorough time-based analysis for MySQL and MongoDB servers to ensure that your data works as efficiently as possible.

by Alexey Zhebel at February 20, 2017 09:49 PM