breakdown caused by swap in MySQL

Posted by Vincent on October 22, 2012

One case about breakdown caused by swap in MySQL.
一个因为swap导致故障解决案例。
MySQL version: 5.1.60
A lot of queries( more then 500) were blocked, the applations could not get more connections.
The old connections could not be released, and the new connections could not be established.
大量连接被阻塞,sql的状态都是openging tables/closing tables。应用的端的直接反馈就是连接数不够,老的连接没释放,新的连接无法创建。
[cc lang='sql']
Query 246 Opening tables DELETE FROM XXXXXXX_DRAFT WHERE id=645643
Query 234 Opening tables DELETE FROM XXXXXXX_DRAFT WHERE id=645643
Query 222 Opening tables DELETE FROM XXXXXXX_DRAFT WHERE id=645643
Query 282 closing tables update xdual set x=now() where id = 5
Query 282 closing tables update xdual set x=now() where id = 5
Query 282 closing tables update xdual set x=now() where id = 5
....[/cc]

The load of DB rise instant, and the phenomenon was last 4 minutes.
数据库的load出现瞬间上涨,现象持续3-4分钟

In consideration of the phenomenon, opening of closing the tables. So I adjust the parameter about the "open", but it not take effect.
考虑到看到的sql都在打开或者关闭表,调整了如下参数,但是收效甚微
table_cache=2048 (old 1024)
table_definition_cache=2048 (300)
innodb_open_files=65536 (2000)

Then I used the MySQL monitor tool 'orzdba' for a long time survey.
后来使用orzdba 进行持续关注,发现了如下现象

When the load rised up, the os swap is serious. When swap out, the qps fall down from 2000 to 100, the thread running rised from 5 to 700. I considered the phenomenon was caused by swap.
在发生load上涨时,发生了严重的swap。swap out发生时,qps急剧下降(从2000详见到100以内),thread runing从5上涨到700. 所以基本可以认为是swap导致的

Check the Memory Info.
查看内存情况
[cc lang='sql']$free -m
total used free
Mem: 48289 47737 551
-/+ buffers/cache: 44670 3618
Swap: 15633 2905 12727[/cc]

I got the information that the volume of business was growth 5 times from the developers. I check the MySQL parameters which related the memory, these values were too high and the OS had been set the Hugepages.
和开发沟通,知道最近业务量异常增长5倍以上,大量的查询导致了swap,同时我在检查mysql参数时,几个消耗os内存的参数设置的比较高,而且系统设置了大页内存。

Adjust the parameter in MySQL,
在数据库段进行了如下调整
[cc lang='sql']vi my.cnf
sort_buffer_size=256K (old:2M)
join_buffer_size=2M (old:4M)
read_buffer_size=1M (old:2M)[/cc]

Close the hugepages,
关闭大页
[cc lang='sql']vi sysctl.conf
#vm.nr_hugepages = 8704
#vm.hugetlb_shm_group = 601

--do it in mysql my.cnf
#large-pages[/cc]

We also checked the applications, adjust the unconscionable query from database. It is very important in dbatabase's tuning!
应用调用量的不合理增长也进行了细致排查,将不合理的调用取消掉。应用的配合优化在db优化中非常重要。

The database was well from then on.

--eof


This work is licensed under a CC A-S 4.0 International License.