Why TokuDB hates Transparent HugePages
If you try to install the TokuDB storage engine on a modern Linux distribution, it might fail with the following error message:
2014-07-17 19:02:55 13865 [ERROR] TokuDB will not run with transparent huge pages enabled.
2014-07-17 19:02:55 13865 [ERROR] Please disable them to continue.
2014-07-17 19:02:55 13865 [ERROR] (echo never > /sys/kernel/mm/transparent_hugepage/enabled)
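To see the current setting before installing, you can read the same sysfs file the error message points to. The kernel lists the available modes on one line and marks the active one in square brackets, for example "always madvise [never]". Here is a minimal Python sketch that parses it; the function name active_thp_mode is only illustrative.

```python
from pathlib import Path

# Path taken from the error message above.
THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def active_thp_mode(text):
    """Return the bracketed word from e.g. 'always madvise [never]'."""
    for word in text.split():
        if word.startswith("[") and word.endswith("]"):
            return word[1:-1]
    return None


if __name__ == "__main__":
    if THP_ENABLED.exists():
        print("THP mode:", active_thp_mode(THP_ENABLED.read_text()))
    else:
        print("No THP control file found; the kernel may lack THP support")
```

TokuDB's check passes only when the reported mode is "never". Note that writing to this file with echo does not survive a reboot.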
You might be curious why TokuDB refuses to start with Transparent HugePages enabled. Are they not a good thing, allowing smaller kernel page tables and fewer TLB misses when accessing data in the buffer pool? I was curious, so I asked Tim Callaghan this very question.
This problem originates with TokuDB using the jemalloc memory allocator, which uses a particular trick to deal with memory fragmentation. The classic problem with memory allocators is fragmentation: if you allocate, say, a 2MB chunk from the operating system (typically using mmap), then as the process runs it is likely that some of that 2MB block will become free, but not all of it, so it cannot be given back to the operating system as a whole. jemalloc's trick is to give back just the free portions of such a chunk through a madvise(…, MADV_DONTNEED) call.
Now what happens when you use transparent huge pages? In this case the operating system (and the CPU, really) works with pages of a much larger size, which can only be unmapped from the address space in their entirety. That does not work when smaller objects are freed and leave smaller free “holes.”
As a result, without being able to free memory efficiently, the amount of allocated memory may grow without bound until the process starts to swap and, at least under some workloads, is eventually killed by the “out of memory” killer. This is not behavior you want to see from a database server, so requiring huge pages to be disabled is the better choice.
Having said that, this is a pretty crude requirement/solution: huge pages are disabled for the complete operating system image to make one application work, while others might be negatively impacted. I hope that future jemalloc or kernel releases will bring a solution where jemalloc simply prevents huge page usage for its own allocations.
Using jemalloc and its approach of removing pages from the resident set also makes TokuDB look quite different, in terms of process memory, from typical MySQL instances running InnoDB. With InnoDB, VSZ and RSS are often close. In fact we often monitor VSZ to ensure it is not excessively large, to avoid the danger of the process starting to swap actively or being killed by the OOM killer. TokuDB, however, can often look like this:
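The madvise call itself can be tried from Python 3.8+ on Linux. This is only a sketch of the mechanism, not jemalloc's actual code: it maps a private anonymous 2MB chunk, dirties it, and then hands a 512KB hole in the middle back to the kernel. The constants and the function name release_hole are made up for the illustration.

```python
import mmap

CHUNK = 2 * 1024 * 1024          # the 2MB chunk from the example above
HOLE_START = CHUNK // 4          # page-aligned offset of the freed "hole"
HOLE_LEN = CHUNK // 4            # 512KB in the middle becomes free


def release_hole():
    """Map a private anonymous chunk, dirty it, then give part of it back."""
    chunk = mmap.mmap(-1, CHUNK, flags=mmap.MAP_PRIVATE)
    chunk[:] = b"\xff" * CHUNK   # touch every page so it becomes resident
    # Tell the kernel the hole's contents are no longer needed. The address
    # range stays mapped, but the physical pages behind it can be reclaimed.
    chunk.madvise(mmap.MADV_DONTNEED, HOLE_START, HOLE_LEN)
    # On Linux, a private anonymous range reads back as zeros afterwards,
    # while the rest of the chunk keeps its data.
    hole_zeroed = chunk[HOLE_START] == 0
    rest_intact = chunk[0] == 0xFF and chunk[CHUNK - 1] == 0xFF
    chunk.close()
    return hole_zeroed, rest_intact
```

The virtual address range never shrinks here; only the resident pages behind the hole are dropped. That is why a process using this technique can show a VSZ far larger than its RSS.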
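For reference, the Linux kernel does expose a per-range opt-out, madvise with MADV_NOHUGEPAGE, which asks that a mapping not be backed by transparent huge pages. Whether and how jemalloc uses it is not covered here, and it would not by itself satisfy TokuDB's startup check, which looks at the system-wide setting. A hedged Python 3.8+ sketch of the call, with an illustrative function name:

```python
import mmap

CHUNK = 2 * 1024 * 1024  # illustrative size, matching the 2MB example


def map_without_thp():
    """Map a private anonymous region and ask the kernel not to back it
    with transparent huge pages. Returns (mapping, opted_out)."""
    region = mmap.mmap(-1, CHUNK, flags=mmap.MAP_PRIVATE)
    advice = getattr(mmap, "MADV_NOHUGEPAGE", None)  # Linux-only constant
    if advice is None:
        return region, False
    try:
        region.madvise(advice)   # applies to the whole mapping
        return region, True
    except OSError:
        # Kernels built without THP support reject this advice.
        return region, False
```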
[[email protected] mysql]# ps aux | grep mysqld
mysql 14604 21.8 50.6 12922416 4083016 pts/0 Sl Jul17 1453:27 /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib64/mysql/plugin --user=mysql --log-error=/var/lib/mysql/smt1.pz.percona.com.err --pid-file=/var/lib/mysql/smt1.pz.percona.com.pid
root 28937 0.0 0.0 103244 852 pts/2 S+ 10:38 0:00 grep mysqld
In this case TokuDB is running with defaults on an 8GB system. It takes approximately 50% of memory in terms of RSS, yet the VSZ of the process is over 12GB, which is a lot more than the memory available.
This is completely fine for TokuDB. If I had not disabled Transparent HugePages, though, my RSS would be a lot closer to VSZ, causing intense swapping or even the process being killed by the OOM killer.
In addition to explaining this to me, Tim Callaghan was also kind enough to share some links on this issue from other companies such as Oracle, NuoDB, Splunk, SAP, and SAP (2), which provide more background information on this topic.
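The gap between the two figures can be pulled straight out of the ps output shown earlier. ps aux reports VSZ and RSS in KiB as the fifth and sixth columns; the small Python helper below (its name is just illustrative) extracts them from the mysqld line, with the command line shortened for readability.

```python
def vsz_rss_kib(ps_line):
    """Return (VSZ, RSS) in KiB from one line of `ps aux` output."""
    fields = ps_line.split()
    # Columns: USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
    return int(fields[4]), int(fields[5])


line = "mysql 14604 21.8 50.6 12922416 4083016 pts/0 Sl Jul17 1453:27 /usr/sbin/mysqld"
vsz, rss = vsz_rss_kib(line)
gib = 1024 * 1024  # KiB per GiB
print(f"VSZ {vsz / gib:.1f} GiB, RSS {rss / gib:.1f} GiB, ratio {vsz / rss:.1f}")
# prints: VSZ 12.3 GiB, RSS 3.9 GiB, ratio 3.2
```

For a TokuDB instance, then, RSS is the figure worth alerting on; a VSZ several times larger is expected.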