One case about breakdown caused by swap in MySQL.
一个因为swap导致故障解决案例。
MySQL version: 5.1.60
A lot of queries( more then 500) were blocked, the applations could not get more connections.
The old connections could not be released, and the new connections could not be established.
大量连接被阻塞,sql的状态都是openging tables/closing tables。应用的端的直接反馈就是连接数不够,老的连接没释放,新的连接无法创建。
[cc lang='sql']
Query   246     Opening tables  DELETE FROM XXXXXXX_DRAFT WHERE id=645643
Query   234     Opening tables  DELETE FROM XXXXXXX_DRAFT WHERE id=645643
Query   222     Opening tables  DELETE FROM XXXXXXX_DRAFT WHERE id=645643
Query   282     closing tables  update xdual set x=now() where id = 5
Query   282     closing tables  update xdual set x=now() where id = 5
Query   282     closing tables  update xdual set x=now() where id = 5
....[/cc]
The load of DB rise instant, and the phenomenon was last 4 minutes.
数据库的load出现瞬间上涨,现象持续3-4分钟
In consideration of the phenomenon, opening of closing the tables. So I adjust the parameter about the "open", but it not take effect.
考虑到看到的sql都在打开或者关闭表,调整了如下参数,但是收效甚微
table_cache=2048   (old 1024)
table_definition_cache=2048   (300)
innodb_open_files=65536   (2000)
Then I used the MySQL monitor tool 'orzdba' for a long time survey.
后来使用orzdba  进行持续关注,发现了如下现象

When the load rised up, the os swap is serious. When swap out, the qps fall down from 2000 to 100, the thread running rised from 5 to 700. I considered the phenomenon was caused by swap.
在发生load上涨时,发生了严重的swap。swap out发生时,qps急剧下降(从2000详见到100以内),thread runing从5上涨到700. 所以基本可以认为是swap导致的
Check the Memory Info.
查看内存情况
[cc lang='sql']$free -m
             total       used       free
Mem:         48289      47737        551
-/+ buffers/cache:      44670       3618
Swap:        15633       2905      12727[/cc]
I got the information that the volume of business was growth 5 times from the developers. I check the MySQL parameters which related the memory, these values were too high and the OS had been set the Hugepages.
和开发沟通,知道最近业务量异常增长5倍以上,大量的查询导致了swap,同时我在检查mysql参数时,几个消耗os内存的参数设置的比较高,而且系统设置了大页内存。
Adjust the parameter in MySQL,
在数据库段进行了如下调整
[cc lang='sql']vi my.cnf
sort_buffer_size=256K  (old:2M)
join_buffer_size=2M    (old:4M)
read_buffer_size=1M    (old:2M)[/cc]
Close the hugepages,
关闭大页
[cc lang='sql']vi sysctl.conf
#vm.nr_hugepages = 8704
#vm.hugetlb_shm_group = 601
--do it in mysql my.cnf
#large-pages[/cc]
We also checked the applications, adjust the unconscionable query from database. It is very important in dbatabase's tuning!
应用调用量的不合理增长也进行了细致排查,将不合理的调用取消掉。应用的配合优化在db优化中非常重要。
The database was well from then on.
--eof
This work is licensed under a CC A-S 4.0 International License.