Open-sourcing Pinterest MySQL management tools
In the past, we’ve shared why you should love MySQL and how it helped Pinterest scale via sharding . At Oracle Open World today, we announced that we’re open-sourcing the vast majority of our automation that maintains our MySQL infrastructure. In this post, we’ll detail our MySQL environment, the tools used to manage it and how you can implement them to automate your MySQL infrastructure.
Basics of MySQL at Pinterest
We’vehistorically used MySQL to store some of our most important data, including Pins, boards, image metadata and Pinners’ credentials.
Recently, we’ve added the following use cases:
PinLater: Thanks in part tokernel optimizations, MySQL has replaced Redis and is becoming the only supported backend for our asynchronous job execution engine. Zen : MySQL has joined HBase as a supported backend for our graph storage engine.
For all of the MySQL use cases at Pinterest, the environment is identical from an administrative perspective:
A single master with one or two slaves: Historically MySQL was used with multiple writable instances in a replica set. This topology is error prone and has been simplified to a single master with one or more slaves. Zookeeper provides service discovery : The contract between the administrative tools and the MySQL applications is ZooKeeper. With few exceptions, ZooKeeper provides clients with database hostnames, usernames and passwords. The lifecycle of a MySQL instance in the cloud
MySQL servers at Pinterest are launched, live and die with only the rarest of configuration changes. Upgrading kernels, MySQL versions and any other changes that would require a restart of the database are never done in-place. Instead, these actions are always performed through server replacements and failovers/slave promotions as needed. This choice has greatly simplified our automation by removing the need to manage intermediate state. We call this mindset “Operational Buddhism,” meaning we don’t get attached to our servers, because they might be gone tomorrow.
One of our most important scripts is launch_replacement_db_host.py . In the simplest case, the only required argument to launch_replacement_db_host.py is the hostname of a failed slave. The existing instance is examined, all required parameters for a new server are computed and then the new server is launched. For other changes, such as MySQL upgrades, hardware upgrades/downgrades and datacenter migrations, there are optional arguments.
After the new server has booted and received its initial base configuration from our provisioning system, a cron job will notice that the data directory is empty and run mysql_restore_xtrabackup.py . Based on information from our service directory (ZooKeeper), this script will attempt to find a database backup, restore it, set up replication and add the new MySQL instance to the directory. Like launch_replacement_db_host.py , mysql_restore_xtrabackup.py accepts many optional arguments for non-standard uses.
If a MySQL master server requires replacement, the mysql_failover.py script must be run to promote the primary slave to master. This script deals with either living or dead initial masters, and then modifies MySQL replication topology and updates the service directory.
After a server has been removed from ZooKeeper, it’s subject to a retirement queue system. This system has several steps that lead to eventual termination of a server:
Servers that do not exist in ZooKeeper are considered to be not in use, and will have several status counters reset. After a day, the servers will be inspected to see if any activity has caused the status counters to increment. If the counters have incremented, the retirement process is aborted. If the counters have not incremented, the MySQL instance is sent a shutdown command. After another day, the server is subject to termination if its database has not been restarted. Other utilities
We’ve built a variety of other utilities that we’re also open-sourcing:
mysql_replica_mappings.py : This script provides administrators a quick view of what’s in production (we define “in production” as “in ZooKeeper”) in a format that’s easy to use for shell scripting.
mysql_backup_xtrabackup.py : This is our primary backup system for MySQL. These backups are used by
mysql_restore_xtrabackup for building new MySQL instances.
archive_mysql_binlogs.py : We backup MySQL replication logs in order to perform point-in-time recoveries in the case where all servers in a replica set are lost.
mysql_grants.py : This script manages our database users. It’s one of our oldest bits of automation and one of our most limited. It fulfills our needs for the time being but sooner or later will need to be significantly expanded.
mysql_cnf_builder.py : This script builds MySQL configuration files based on global defaults and then overrides for workload type, hardware and MySQL version. Several example configuration files are included.
get_recent_checksums.py : Every day, we run a pt-checksum against a subset of the shards in order to determine if master and slave are out of sync and, if so, by how much. (Most of the time, our replication drift is zero or very near zero.) The
mysql_checksum.py script runs the checksums and stores the results. The
get_recent_checksums.py script will retrieve all recent checksum results and display the data in a user-friendly manner. And a bunch more! Not a panacea
These tools are tightly integrated into our service discovery mechanism and would likely require moderate modification of the code that reads and writes from service discovery. Unlike other technologies we’ve open-sourced, there are some legacy limitations to these utilities, such as the lack of support for more than two slaves. It’s our hope that these tools are useful to others that wish to create automation for their MySQL infrastructure.
As a result of the effectiveness of our tooling, we’re able to maintain the MySQL environment with hundreds of terabyte data with less than two dedicated engineers.
Our public code repository is now live andcan be found with our other open-sourced projects on GitHub at https://github.com/pinterest/mysql_utils
Acknowledgements: Ernie Souhrada also contributed to the MySQL management tools.
Axure汉化版已经发布，版本号Axure 220.127.116.119，下面是截图效果 Axure汉化版文件下载地址：Axure汉化补丁 Axure RP pro 18.104.22.1689 下载地址注册用户名：Axure 序列...
BI中文站 6月7日报道 艾默生·斯帕茨(Emerson Spartz)今年28岁，已婚，是Spartz Inc公司的首席执行官。Spartz Inc是一个网站媒体帝国，旗下的30多家网站专门发布有趣、励志和让人感到不可思议的文章和帖子，其分享量非常...
- MySQL Study之–MySQL schema_information数据库
- JDBC Mysql 批处理注意事项
- Spring + Ibatis + MySql实例详解
- MySQL Study之–MySQL下图形工具的使用（MySQL Administrator）
- 如何使用PDO查询Mysql来避免SQL注入风险？ThinkPHP 3.1中的SQL注入漏洞分析！
- MySQL连不上，报Host is blocked because of many connection errors; unblock with &amp;#39;mysqladmin fl