1. Symptoms
A project team reported that their system was extremely sluggish. After logging into MySQL we found a large number of SQL statements in the Opening tables state. The problem was intermittent: it would go away for a while and then come back, which was rather odd.
We had run into a similar symptom before. With AHI (adaptive hash index) enabled, when a large table is dropped or truncated MySQL also has to remove that table's AHI entries from the buffer pool, and during that process it holds the data dictionary lock (a very heavyweight lock), so other user threads sit in the Opening tables state because they cannot acquire it.
Here, however, we checked all the session SQL and found no DROP or TRUNCATE statements, so it should not be that problem; something else was producing a similar symptom.
One thing worth suspecting: this MySQL server's disk write performance is very poor. Could poor disk writes be causing this?
Since we could not identify the exact cause right away, we took a snapshot of the mysqld process for detailed analysis afterwards.
2. Analyzing the Stack Frames
After processing the snapshot with the pt-pmp tool the output is quite long, so only the more important stacks are shown here:
1402 pthread_cond_wait,wait(os0event.cc:165),os_event::wait_low(os0event.cc:165),sync_array_wait_event(sync0arr.cc:475),TTASEventMutex::wait(ut0mutex.ic:89),spin_and_try_lock(ib0mutex.h:850),enter(ib0mutex.h:850),PolicyMutex<TTASEventMutex<GenericPolicy>(ib0mutex.h:850),dict_table_open_on_name(dict0dict.cc:1238),ha_innobase::open_dict_table(ha_innodb.cc:6250),ha_innobase::open(ha_innodb.cc:5888),handler::ha_open(handler.cc:2759),open_table_from_share(table.cc:3353),open_table(sql_base.cc:3559),open_and_process_table(sql_base.cc:5145),open_tables(sql_base.cc:5145),open_tables_for_query(sql_base.cc:6531),execute_sqlcom_select(sql_parse.cc:5127),mysql_execute_command(sql_parse.cc:2792),mysql_parse(sql_parse.cc:5582),dispatch_command(sql_parse.cc:1458),do_command(sql_parse.cc:999),handle_connection(connection_handler_per_thread.cc:300),pfs_spawn_thread(pfs.cc:2190),start_thread(libpthread.so.0),clone(libc.so.6)
36 nanosleep(libpthread.so.0),os_thread_sleep(os0thread.cc:279),buf_flush_wait_flushed(buf0flu.cc:2074),log_preflush_pool_modified_pages(log0log.cc:1531),log_checkpoint_margin(log0log.cc:1973),log_check_margins(log0log.cc:1973),log_free_check(log0log.ic:491),row_ins_sec_index_entry(log0log.ic:491),row_ins_index_entry(row0ins.cc:3460),row_ins_index_entry_step(row0ins.cc:3460),row_ins(row0ins.cc:3460),row_ins_step(row0ins.cc:3460),row_insert_for_mysql_using_ins_graph(row0mysql.cc:1738),ha_innobase::write_row(ha_innodb.cc:7566),handler::ha_write_row(handler.cc:7991),write_record(sql_insert.cc:1873),Sql_cmd_insert::mysql_insert(sql_insert.cc:769),Sql_cmd_insert::execute(sql_insert.cc:3105),mysql_execute_command(sql_parse.cc:3566),mysql_parse(sql_parse.cc:5582),dispatch_command(sql_parse.cc:1458),do_command(sql_parse.cc:999),handle_connection(connection_handler_per_thread.cc:300),pfs_spawn_thread(pfs.cc:2190),start_thread(libpthread.so.0),clone(libc.so.6)
1 pthread_cond_wait,wait(os0event.cc:165),os_event::wait_low(os0event.cc:165),buf_dblwr_flush_buffered_writes(buf0dblwr.cc:979),buf_dblwr_add_to_batch(buf0dblwr.cc:1154),buf_flush_write_block_low(buf0flu.cc:1099),buf_flush_page(buf0flu.cc:1099),buf_flush_try_neighbors(buf0flu.cc:1453),buf_flush_page_and_try_neighbors(buf0flu.cc:1453),buf_do_flush_list_batch(buf0flu.cc:1780),buf_flush_batch(buf0flu.cc:1780),buf_flush_do_batch(buf0flu.cc:1780),pc_flush_slot(buf0flu.cc:2874),buf_flush_page_cleaner_worker(buf0flu.cc:3504),start_thread(libpthread.so.0),clone(libc.so.6)
1 nanosleep(libpthread.so.0),os_thread_sleep(os0thread.cc:279),buf_flush_wait_flushed(buf0flu.cc:2074),log_preflush_pool_modified_pages(log0log.cc:1531),log_checkpoint_margin(log0log.cc:1973),log_check_margins(log0log.cc:1973),log_free_check(log0log.ic:491),que_run_threads_low(log0log.ic:491),que_run_threads(log0log.ic:491),trx_purge(trx0purge.cc:1883),srv_do_purge(srv0srv.cc:2631),srv_purge_coordinator_thread(srv0srv.cc:2631),start_thread(libpthread.so.0),clone(libc.so.6)
1 nanosleep(libpthread.so.0),os_thread_sleep(os0thread.cc:279),buf_flush_wait_flushed(buf0flu.cc:2074),log_preflush_pool_modified_pages(log0log.cc:1531),log_checkpoint_margin(log0log.cc:1973),log_check_margins(log0log.cc:1973),log_free_check(log0log.ic:491),que_run_threads_low(log0log.ic:491),que_run_threads(log0log.ic:491),que_eval_sql(que0que.cc:1232),dict_stats_exec_sql(dict0stats.cc:311),dict_stats_save(dict0stats.cc:2415),dict_stats_update(dict0stats.cc:3112),dict_stats_process_entry_from_recalc_pool(dict0stats_bg.cc:356),dict_stats_thread(dict0stats_bg.cc:356),start_thread(libpthread.so.0),clone(libc.so.6)
The 1402 threads are spinning: they are busy-waiting because they cannot obtain a mutex. These should be the many Opening tables threads we saw, but to know exactly which mutex it is we need to trace the MySQL source code.
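As an aside, the spin-then-block pattern visible in this stack (spin_and_try_lock, then sync_array_wait_event / os_event::wait ending in pthread_cond_wait) can be sketched with the minimal, self-contained example below. The class, constants, and main() are illustrative stand-ins I made up, not InnoDB's actual TTASEventMutex/sync-array code: the thread first spins on an atomic flag for a bounded number of rounds, and only then parks itself on a condition variable.

#include <atomic>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a spin-then-block mutex (loosely modeled on the
// behaviour seen in the stack; NOT InnoDB's real TTASEventMutex).
class SpinThenBlockMutex {
 public:
  void lock() {
    // Phase 1: spin for a bounded number of rounds, trying to grab the flag.
    constexpr int kSpinRounds = 30;
    for (int i = 0; i < kSpinRounds; ++i) {
      if (!locked_.exchange(true, std::memory_order_acquire)) return;
      std::this_thread::yield();
    }
    // Phase 2: give up spinning and sleep until the holder signals release
    // (the rough equivalent of parking the thread in the sync array).
    std::unique_lock<std::mutex> guard(waiters_mutex_);
    released_.wait(guard, [this] {
      return !locked_.exchange(true, std::memory_order_acquire);
    });
  }

  void unlock() {
    locked_.store(false, std::memory_order_release);
    std::lock_guard<std::mutex> guard(waiters_mutex_);
    released_.notify_one();  // wake one parked waiter, if any
  }

 private:
  std::atomic<bool> locked_{false};
  std::mutex waiters_mutex_;
  std::condition_variable released_;
};

int main() {
  SpinThenBlockMutex m;
  long counter = 0;
  auto worker = [&] {
    for (int i = 0; i < 100000; ++i) { m.lock(); ++counter; m.unlock(); }
  };
  std::thread t1(worker), t2(worker);
  t1.join();
  t2.join();
  return counter == 200000 ? 0 : 1;  // both threads were serialized by the mutex
}

With 1402 sessions, the pain is not the spinning itself but whoever holds the flag for a long time, which is what the rest of the analysis chases down.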
The 36 threads are all running INSERTs and all end up waiting in buf_flush_wait_flushed, which looks like waiting for a flush to disk. The last three threads are:
1. page_cleaner: the background dirty-page flushing thread, currently in buf_dblwr_flush_buffered_writes.
2. purge_thread: the purge thread (which cleans up undo), currently in buf_flush_wait_flushed.
3. dict_stats_thread: the statistics update thread, currently in buf_flush_wait_flushed.
A quick look at these threads shows that the insert threads, the purge thread, and dict_stats_thread all share the same tail of the call stack:
log_free_check -> log_check_margins -> log_checkpoint_margin -> log_preflush_pool_modified_pages -> buf_flush_wait_flushed -> os_thread_sleep
3. Source Code Tracing
Server version: 5.7.22-log MySQL Community Server (GPL)
I split the source analysis into two parts: the first walks the stack of the 1402 threads to see which lock resource they are waiting for; the second figures out which thread is holding that lock and why it is not releasing it.
For the first part, pick one such thread from the snapshot (some arguments are omitted here) and follow its function calls through the MySQL source step by step.
#0 0x00007efc61303965 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x0000000000ffb03b in wait at /export/home/pb2/build/sb_0-27500212-1520171728.22/mysql-5.7.22/storage/innobase/os/os0event.cc:165
#2 os_event::wait_low at /export/home/pb2/build/sb_0-27500212-1520171728.22/mysql-5.7.22/storage/innobase/os/os0event.cc:335
#3 0x00000000010a0ce9 in sync_array_wait_event at /export/home/pb2/build/sb_0-27500212-1520171728.22/mysql-5.7.22/storage/innobase/sync/sync0arr.cc:475
#4 0x0000000000f8c1c4 in TTASEventMutex<GenericPolicy>::wait at /export/home/pb2/build/sb_0-27500212-1520171728.22/mysql-5.7.22/storage/innobase/include/ut0mutex.ic:89
#5 0x0000000000f8c33b in spin_and_try_lock at /export/home/pb2/build/sb_0-27500212-1520171728.22/mysql-5.7.22/storage/innobase/include/ib0mutex.h:850
#6 enter at /export/home/pb2/build/sb_0-27500212-1520171728.22/mysql-5.7.22/storage/innobase/include/ib0mutex.h:707
#7 PolicyMutex<TTASEventMutex<GenericPolicy> >::enter at /export/home/pb2/build/sb_0-27500212-1520171728.22/mysql-5.7.22/storage/innobase/include/ib0mutex.h:987
#8 0x0000000001152f41 in dict_table_open_on_name at /export/home/pb2/build/sb_0-27500212-1520171728.22/mysql-5.7.22/storage/innobase/dict/dict0dict.cc:1238
#9 0x0000000000f72a73 in ha_innobase::open_dict_table at /export/home/pb2/build/sb_0-27500212-1520171728.22/mysql-5.7.22/storage/innobase/handler/ha_innodb.cc:6250
#10 0x0000000000f8273b in ha_innobase::open at /export/home/pb2/build/sb_0-27500212-1520171728.22/mysql-5.7.22/storage/innobase/handler/ha_innodb.cc:5888
#11 0x000000000081b33e in handler::ha_open at /export/home/pb2/build/sb_0-27500212-1520171728.22/mysql-5.7.22/sql/handler.cc:2759
#12 0x0000000000dc239a in open_table_from_share at /export/home/pb2/build/sb_0-27500212-1520171728.22/mysql-5.7.22/sql/table.cc:3353
#13 0x0000000000cc18b9 in open_table at /export/home/pb2/build/sb_0-27500212-1520171728.22/mysql-5.7.22/sql/sql_base.cc:3559
#14 0x0000000000cc52b6 in open_and_process_table at /export/home/pb2/build/sb_0-27500212-1520171728.22/mysql-5.7.22/sql/sql_base.cc:5145
#15 open_tables at /export/home/pb2/build/sb_0-27500212-1520171728.22/mysql-5.7.22/sql/sql_base.cc:5756
#16 0x0000000000cc5e62 in open_tables_for_query at /export/home/pb2/build/sb_0-27500212-1520171728.22/mysql-5.7.22/sql/sql_base.cc:6531
#17 0x0000000000d14ff6 in execute_sqlcom_select at /export/home/pb2/build/sb_0-27500212-1520171728.22/mysql-5.7.22/sql/sql_parse.cc:5127
#18 0x0000000000d18bce in mysql_execute_command at /export/home/pb2/build/sb_0-27500212-1520171728.22/mysql-5.7.22/sql/sql_parse.cc:2792
#19 0x0000000000d1aaad in mysql_parse at /export/home/pb2/build/sb_0-27500212-1520171728.22/mysql-5.7.22/sql/sql_parse.cc:5582
#20 0x0000000000d1bcca in dispatch_command at /export/home/pb2/build/sb_0-27500212-1520171728.22/mysql-5.7.22/sql/sql_parse.cc:1458
#21 0x0000000000d1cb74 in do_command at /export/home/pb2/build/sb_0-27500212-1520171728.22/mysql-5.7.22/sql/sql_parse.cc:999
#22 0x0000000000dedaec in handle_connection at /export/home/pb2/build/sb_0-27500212-1520171728.22/mysql-5.7.22/sql/conn_handler/connection_handler_per_thread.cc:300
#23 0x0000000001256a94 in pfs_spawn_thread at /export/home/pb2/build/sb_0-27500212-1520171728.22/mysql-5.7.22/storage/perfschema/pfs.cc:2190
#24 0x00007efc612ffdd5 in start_thread () from /lib64/libpthread.so.0
#25 0x00007efc5fdb8ead in clone () from /lib64/libc.so.6
Walking through these calls, we find that such threads end up in dict_table_open_on_name (note the line number 1238, which matches the stack) and spin-wait because they cannot acquire dict_sys->mutex, the data dictionary lock.
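As a rough, hedged illustration of why this single mutex matters (my own simplified stand-in, not the actual dict_table_open_on_name code): every table open has to pass through one global dictionary mutex to look the table up in the dictionary cache, so one long-running holder stalls every open on the instance.

#include <mutex>
#include <string>
#include <unordered_map>

// Illustrative stand-ins for the dictionary cache and dict_sys->mutex.
struct DictTable { int ref_count = 0; };

struct DictSys {
  std::mutex mutex;                                   // plays the role of dict_sys->mutex
  std::unordered_map<std::string, DictTable> tables;  // table cache keyed by "db/table" name
};

// Sketch of the dict_table_open_on_name pattern: every open, from any session,
// has to take the same global mutex before touching the cache.
DictTable* open_table_on_name(DictSys& dict_sys, const std::string& name) {
  std::lock_guard<std::mutex> guard(dict_sys.mutex);  // blocks while another thread holds it
  auto it = dict_sys.tables.find(name);
  if (it == dict_sys.tables.end()) return nullptr;    // the real code would load it from disk here
  ++it->second.ref_count;                             // pin the table while it is in use
  return &it->second;
}

int main() {
  DictSys dict_sys;
  dict_sys.tables["test/t1"];  // pre-populate the cache with one table
  DictTable* t = open_table_on_name(dict_sys, "test/t1");
  return (t != nullptr && t->ref_count == 1) ? 0 : 1;
}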
Now for the second part: which thread is holding dict_sys->mutex, and why does it not release it? A small trick here is to only look at the stack groups whose count is 1, because the data dictionary lock can be held by only one thread at a time; that leaves mainly the last three threads. Start with the most suspicious one, dict_stats_thread, since its name also contains a dict~
#0 0x00007efc61306e3d in nanosleep () from /lib64/libpthread.so.0
#1 0x0000000000ffb7a0 in os_thread_sleep at /export/home/pb2/build/sb_0-27500212-1520171728.22/mysql-5.7.22/storage/innobase/os/os0thread.cc:279
#2 0x0000000001135779 in buf_flush_wait_flushed at /export/home/pb2/build/sb_0-27500212-1520171728.22/mysql-5.7.22/storage/innobase/buf/buf0flu.cc:2074
#3 0x0000000000fd4589 in log_preflush_pool_modified_pages at /export/home/pb2/build/sb_0-27500212-1520171728.22/mysql-5.7.22/storage/innobase/log/log0log.cc:1531
#4 0x0000000000fd7a69 in log_checkpoint_margin at /export/home/pb2/build/sb_0-27500212-1520171728.22/mysql-5.7.22/storage/innobase/log/log0log.cc:1973
#5 log_check_margins at /export/home/pb2/build/sb_0-27500212-1520171728.22/mysql-5.7.22/storage/innobase/log/log0log.cc:2065
#6 0x000000000101af4d in log_free_check at /export/home/pb2/build/sb_0-27500212-1520171728.22/mysql-5.7.22/storage/innobase/include/log0log.ic:491
#7 que_run_threads_low at /export/home/pb2/build/sb_0-27500212-1520171728.22/mysql-5.7.22/storage/innobase/que/que0que.cc:1109
#8 que_run_threads at /export/home/pb2/build/sb_0-27500212-1520171728.22/mysql-5.7.22/storage/innobase/que/que0que.cc:1155
#9 0x000000000101b86e in que_eval_sql at /export/home/pb2/build/sb_0-27500212-1520171728.22/mysql-5.7.22/storage/innobase/que/que0que.cc:1232
#10 0x0000000001176b9c in dict_stats_exec_sql at /export/home/pb2/build/sb_0-27500212-1520171728.22/mysql-5.7.22/storage/innobase/dict/dict0stats.cc:311
#11 0x0000000001179346 in dict_stats_save at /export/home/pb2/build/sb_0-27500212-1520171728.22/mysql-5.7.22/storage/innobase/dict/dict0stats.cc:2415
#12 0x000000000117a9ca in dict_stats_update at /export/home/pb2/build/sb_0-27500212-1520171728.22/mysql-5.7.22/storage/innobase/dict/dict0stats.cc:3112
#13 0x000000000117cc3a in dict_stats_process_entry_from_recalc_pool at /export/home/pb2/build/sb_0-27500212-1520171728.22/mysql-5.7.22/storage/innobase/dict/dict0stats_bg.cc:356
#14 dict_stats_thread at /export/home/pb2/build/sb_0-27500212-1520171728.22/mysql-5.7.22/storage/innobase/dict/dict0stats_bg.cc:445
#15 0x00007efc612ffdd5 in start_thread () from /lib64/libpthread.so.0
#16 0x00007efc5fdb8ead in clone () from /lib64/libc.so.6
Again we follow this thread's calls through the MySQL source to see whether it holds the data dictionary lock. Fortunately, in dict_stats_save we can see that it takes the data dictionary lock and then enters dict_stats_exec_sql to run the SQL that updates the statistics, and it does not release the lock in between (verifying this means checking the function's branches; there are quite a few, so the details are omitted here). A simplified sketch of this pattern follows.
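The original source excerpt is not reproduced here; the following is only a hedged, self-contained sketch of the pattern just described, with made-up names (dict_sys_mutex, exec_stats_sql_with_log_free_check, dict_stats_save_sketch) standing in for the real InnoDB code: the dictionary mutex stays held for the entire statistics update, including any flush wait that happens inside it, so anything else that wants the mutex (such as a table open) has to wait just as long.

#include <chrono>
#include <mutex>
#include <thread>

std::mutex dict_sys_mutex;  // stand-in for dict_sys->mutex

// Stand-in for dict_stats_exec_sql: in this incident the statistics SQL ended
// up inside log_free_check() and waited a long time for a synchronous flush.
void exec_stats_sql_with_log_free_check() {
  std::this_thread::sleep_for(std::chrono::seconds(5));  // simulates the long flush wait
}

// Sketch of the dict_stats_save pattern: the dictionary mutex is held across
// the whole statistics update, including the flush wait inside it.
void dict_stats_save_sketch() {
  std::lock_guard<std::mutex> guard(dict_sys_mutex);  // acquired here ...
  exec_stats_sql_with_log_free_check();               // ... still held while this blocks ...
}                                                     // ... and released only here

int main() {
  std::thread stats(dict_stats_save_sketch);
  std::this_thread::sleep_for(std::chrono::milliseconds(100));  // let the stats thread grab the mutex first
  auto start = std::chrono::steady_clock::now();
  {
    std::lock_guard<std::mutex> guard(dict_sys_mutex);  // a "table open" wanting the same mutex
  }
  auto waited = std::chrono::duration_cast<std::chrono::seconds>(
      std::chrono::steady_clock::now() - start);
  stats.join();
  return waited.count() >= 4 ? 0 : 1;  // the open was stalled for roughly the whole flush
}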
Having found that the data dictionary lock is held by dict_stats_thread, the next question is why it does not release it. Continuing down the stack, the part to focus on is everything from log_free_check downward; as mentioned earlier, the insert threads, the purge thread, and dict_stats_thread all end up in this same code path.
dict_stats_save
 -> dict_stats_exec_sql
 -> que_eval_sql
 -> que_run_threads
 -> que_run_threads_low
 -> log_free_check /* Check that there is enough space in the log to accommodate possible log entries by this query step. */
 -> log_check_margins /* Checks that there is enough free space in the log to start a new query step. Flushes the log buffer or makes a new checkpoint if necessary. */
 -> log_checkpoint_margin /* Tries to establish a big enough margin of free space in the log groups, such that a new log entry can be catenated without an immediate need for a checkpoint. */
 -> log_preflush_pool_modified_pages /* Advances the smallest lsn for which there are unflushed dirty blocks in the buffer pool. */
 -> buf_flush_wait_flushed /* Waits until a flush batch of the given lsn ends */
 -> os_thread_sleep
This logic first checks whether the log buffer has filled past its threshold (log->buf_free > log->max_buf_free); if so, the contents of the log buffer must first be written out to the redo log.
It then checks whether the dirty pages in the InnoDB buffer pool have fallen too far behind, i.e. (log->lsn - oldest_lsn) > log->max_modified_age_sync; if so, a synchronous dirty-page flush is required, which can take quite a while and blocks the other threads that are currently writing.
Finally it checks whether a checkpoint is due, comparing log->lsn - log->last_checkpoint_lsn against max_checkpoint_age and max_checkpoint_age_async, and accordingly performs a synchronous checkpoint, an asynchronous checkpoint, or no checkpoint at all. A simplified sketch of these three checks follows.
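A minimal sketch of the decision order just described, under the assumption of a simplified snapshot struct whose field names loosely mirror the ones mentioned above. This is not InnoDB's actual code (the real log_check_margins / log_checkpoint_margin can take several of these actions in one pass and loop); it only shows which comparison triggers which action:

#include <cstdint>

using lsn_t = std::uint64_t;

// Simplified snapshot of the fields the three checks look at; the names loosely
// mirror the log_t fields mentioned in the text above (illustrative only).
struct LogSnapshot {
  lsn_t buf_free;                  // bytes already used in the log buffer
  lsn_t max_buf_free;              // threshold for writing the log buffer out
  lsn_t lsn;                       // current LSN
  lsn_t oldest_modified_lsn;       // oldest LSN of a dirty page in the buffer pool
  lsn_t last_checkpoint_lsn;       // LSN of the last checkpoint
  lsn_t max_modified_age_sync;     // triggers a synchronous dirty-page preflush
  lsn_t max_checkpoint_age_async;  // triggers an asynchronous checkpoint
  lsn_t max_checkpoint_age;        // triggers a synchronous checkpoint
};

enum class Action {
  kNone,
  kFlushLogBuffer,
  kSyncPreflushDirtyPages,
  kAsyncCheckpoint,
  kSyncCheckpoint
};

// Returns only the first matching action for clarity; the real code may do more.
Action log_free_check_sketch(const LogSnapshot& log) {
  if (log.buf_free > log.max_buf_free)
    return Action::kFlushLogBuffer;          // write the log buffer to the redo log files

  const lsn_t modified_age = log.lsn - log.oldest_modified_lsn;
  const lsn_t checkpoint_age = log.lsn - log.last_checkpoint_lsn;

  if (modified_age > log.max_modified_age_sync)
    return Action::kSyncPreflushDirtyPages;  // the blocking path hit in this incident

  if (checkpoint_age > log.max_checkpoint_age)
    return Action::kSyncCheckpoint;
  if (checkpoint_age > log.max_checkpoint_age_async)
    return Action::kAsyncCheckpoint;
  return Action::kNone;
}

// Example: dirty pages lag far behind the current LSN, so the synchronous
// preflush path is chosen (the situation described in this article).
int main() {
  LogSnapshot log{1 << 20, 4 << 20, 10000000, 1000000, 2000000,
                  6000000, 7000000, 8000000};
  return log_free_check_sketch(log) == Action::kSyncPreflushDirtyPages ? 0 : 1;
}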
Here the code ends up in log_preflush_pool_modified_pages, i.e. the synchronous dirty-page flush. Flushing the dirty pages means waking up the page cleaner thread to do the writes, while the caller itself sits in buf_flush_wait_flushed waiting for the flush to complete, exactly as we saw in the stacks earlier: the insert threads, the purge thread, and dict_stats_thread are all waiting for the page cleaner thread to flush the dirty pages.
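That waiting behaviour (the caller does no I/O itself; it just sleeps in a loop until the page cleaner has flushed far enough) can be sketched as below; flushed_to_lsn, wait_flushed_sketch, and the millisecond numbers are made-up stand-ins, not the real buf_flush_wait_flushed implementation.

#include <atomic>
#include <chrono>
#include <cstdint>
#include <thread>

using lsn_t = std::uint64_t;

// Stand-in for the buffer pool's "flushed up to this LSN" progress marker,
// which the page cleaner advances as it writes dirty pages out.
std::atomic<lsn_t> flushed_to_lsn{0};

// Sketch of the buf_flush_wait_flushed pattern: poll and sleep (the
// os_thread_sleep in the stacks) until flushing has reached the target LSN.
void wait_flushed_sketch(lsn_t target_lsn) {
  while (flushed_to_lsn.load(std::memory_order_acquire) < target_lsn) {
    std::this_thread::sleep_for(std::chrono::milliseconds(1));
  }
}

int main() {
  std::thread page_cleaner([] {
    for (lsn_t lsn = 1; lsn <= 100; ++lsn) {
      std::this_thread::sleep_for(std::chrono::milliseconds(5));  // simulates slow disk writes
      flushed_to_lsn.store(lsn, std::memory_order_release);
    }
  });
  wait_flushed_sketch(100);  // what the insert/purge/stats threads were doing
  page_cleaner.join();
  return 0;
}

The slower the disk, the longer every caller of log_free_check sits in this loop, which is how poor write performance propagates all the way up to user sessions.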
I may not fully understand every detail of this part~ If you are interested, see MySQL · 引擎特性 · InnoDB redo log漫遊, which covers it in great depth.
MySQL · 引擎特性 · InnoDB redo log漫遊
http://mysql.taobao.org/monthly/2015/05/01/
4. Summary
Putting the stacks together, the root cause is still the poor disk write performance: the dirty-page LSN in the InnoDB buffer pool fell too far behind the current LSN, and once the gap hit the threshold, MySQL's synchronous flush mechanism kicked in. While that runs, every thread that is currently writing gets stalled, so the many INSERT statements, the purge thread, and dict_stats_thread (the statistics update thread) were all blocked.
The many sessions we initially saw in the Opening tables state were, in a way, an unlucky coincidence. When the amount of changed data in a table reaches a threshold (10% by default), the background dict_stats_thread is triggered; it takes dict_sys->mutex (the data dictionary lock) to update that table's statistics, but it got stuck in the synchronous flush step and could not release dict_sys->mutex in time, so many SQL statements ended up in the Opening tables state because they could not acquire dict_sys->mutex.
Appendix:
http://mysql.taobao.org/monthly/2015/05/01/
http://blog.itpub.net/7728585/viewspace-2149106/
https://my.oschina.net/yejr/blog/4594471
https://developer.aliyun.com/article/41043
http://mysql.taobao.org/monthly/2020/03/08/