摘要:線上最近出現(xiàn)了批量的死鎖,百思不得姐。死鎖記錄如下第一反應是批量,的順序不一樣導致的死鎖。什么是隱式鎖隱式鎖的意思就是沒有鎖在插入記錄時,是不加鎖的。
線上最近出現(xiàn)了批量insert的死鎖,百思不得姐。死鎖記錄如下
2018-10-26T11:04:41.759589Z 8530809 [Note] InnoDB:
*** (1) TRANSACTION:
TRANSACTION 1202026765, ACTIVE 0 sec inserting
mysql tables in use 1, locked 1
LOCK WAIT 3 lock struct(s), heap size 1136, 3 row lock(s), undo log entries 1
MySQL thread id 8532863, OS thread handle 139858337453824, query id 16231472122 10.111.10.143 seewo update
INSERT IGNORE INTO xx_performance_type_label_relation(label_id, performance_type_id, type, create_time)
VALUES
("bb0394e670644168a998a93a3ed521bc", "06b96ee0bab84d71bb17bf9645d3aa54", 1, now())
,
("bb0394e670644168a998a93a3ed521bc", "27d82e2331b241e1a9c9c0a74ec21099", -1, now())
,
("bb0394e670644168a998a93a3ed521bc", "3100b5978fb24f56b327d25732a7d7a7", 1, now())
,
("bb0394e670644168a998a93a3ed521bc", "435a1e19ce6e4e5bbb84240b3b34cf03", 1, now())
,
("bb0394e670644168a998a93a3ed521bc", "447fe27199ca40e289ef2834469d9a78", 1, now())
,
("bb0394e670644168a998a93a3ed521bc", "87a52c4d00844b5bb9eb75e8fe34202a", 1, now())
,
("bb0394e670644168a998a93a3ed521bc", "c6a0e26983bd4fae837d5ee2f4efeef8", 1, now())
2018-10-26T11:04:41.759635Z 8530809 [Note] InnoDB: *** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 505 page no 9912 n bits 288 index uk_performance_type_id_label_id of table `masaike`.`xx_performance_type_label_relation` trx id 1202026765 lock_mode X locks gap before rec insert intention waiting
2018-10-26T11:04:41.759674Z 8530809 [Note] InnoDB: *** (2) TRANSACTION:
TRANSACTION 1202026764, ACTIVE 0 sec inserting
mysql tables in use 1, locked 1
3 lock struct(s), heap size 1136, 3 row lock(s), undo log entries 1
MySQL thread id 8530809, OS thread handle 139858469242624, query id 16231472119 10.111.10.153 seewo update
INSERT IGNORE INTO xx_performance_type_label_relation(label_id, performance_type_id, type, create_time)
VALUES
("bb0394e670644168a998a93a3ed521bc", "06b96ee0bab84d71bb17bf9645d3aa54", 1, now())
,
("bb0394e670644168a998a93a3ed521bc", "27d82e2331b241e1a9c9c0a74ec21099", -1, now())
,
("bb0394e670644168a998a93a3ed521bc", "3100b5978fb24f56b327d25732a7d7a7", 1, now())
,
("bb0394e670644168a998a93a3ed521bc", "435a1e19ce6e4e5bbb84240b3b34cf03", 1, now())
,
("bb0394e670644168a998a93a3ed521bc", "447fe27199ca40e289ef2834469d9a78", 1, now())
,
("bb0394e670644168a998a93a3ed521bc", "87a52c4d00844b5bb9eb75e8fe34202a", 1, now())
,
("bb0394e670644168a998a93a3ed521bc", "c6a0e26983bd4fae837d5ee2f4efeef8", 1, now())
2018-10-26T11:04:41.759713Z 8530809 [Note] InnoDB: *** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 505 page no 9912 n bits 288 index uk_performance_type_id_label_id of table `masaike`.`xx_performance_type_label_relation` trx id 1202026764 lock mode S
2018-10-26T11:04:41.759753Z 8530809 [Note] InnoDB: *** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 505 page no 9912 n bits 288 index uk_performance_type_id_label_id of table `masaike`.`xx_performance_type_label_relation` trx id 1202026764 lock_mode X locks gap before rec insert intention waiting
2018-10-26T11:04:41.759784Z 8530809 [Note] InnoDB: *** WE ROLL BACK TRANSACTION (2)
第一反應是批量insert,insert的順序不一樣導致的死鎖。但是這個在這里是不成立的。原因有兩點
出現(xiàn)問題的批量插入SQL中順序是一模一樣的,在順序一樣的情況下,只會進行插入等待(implicit lock轉(zhuǎn)explicit X鎖)下面有實驗
如果是因為批量插入順序不一致帶來的死鎖日志,打印的結(jié)果不是等待插入意向鎖(insert intention waiting),下面有實驗
現(xiàn)在采用一個簡化的表,做實驗
CREATE TABLE `t1` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`a` varchar(5) NOT NULL DEFAULT "",
`b` varchar(5) NOT NULL DEFAULT "",
PRIMARY KEY (`id`),
UNIQUE KEY `uk_name` (`a`,`b`)
) ENGINE=InnoDB;
實驗 01
在記錄不存在的情況下,兩個同樣順序的批量insert同時執(zhí)行,第二個會進行鎖等待狀態(tài)
首先truncate t1;
t1 | t2 | |
---|---|---|
begin; | begin; | |
insert ignore into t1(a, b)values("1", "1"); | 成功 | |
insert ignore into t1(a, b)values("1", "1"); | 鎖等待狀態(tài) |
可以看到目前鎖的狀態(tài)
mysql> select * from information_schema.innodb_locks;
+-------------+-------------+-----------+-----------+------------+------------+------------+-----------+----------+-----------+
| lock_id | lock_trx_id | lock_mode | lock_type | lock_table | lock_index | lock_space | lock_page | lock_rec | lock_data |
+-------------+-------------+-----------+-----------+------------+------------+------------+-----------+----------+-----------+
| 31AE:54:4:2 | 31AE | S | RECORD | `d1`.`t1` | `uk_name` | 54 | 4 | 2 | "1", "1" |
| 31AD:54:4:2 | 31AD | X | RECORD | `d1`.`t1` | `uk_name` | 54 | 4 | 2 | "1", "1" |
+-------------+-------------+-----------+-----------+------------+------------+------------+-----------+----------+-----------+
在我們執(zhí)行事務t1的insert時,沒有在任何鎖的斷點處出現(xiàn),這跟MySQL插入的原理有關系
insert 加的是隱式鎖。什么是隱式鎖?隱式鎖的意思就是沒有鎖
在t1插入記錄時,是不加鎖的。這個時候事務t1還未提交的情況下,事務t2嘗試插入的時候,發(fā)現(xiàn)有這條記錄,t2嘗試獲取S鎖,會判定記錄上的事務id是否活躍,如果活躍的話,說明事務未結(jié)束,會幫t1把它的隱式鎖提升為顯式鎖(X鎖)
源碼如下
t2獲取S鎖的結(jié)果:DB_LOCK_WAIT
實驗02
批量插入順序不一致的導致的死鎖日志不是等待插入意向鎖
t1 | t2 | |
---|---|---|
begin | ||
insert into t1(a, b)values("1", "1"); | 成功 | |
insert into t1(a, b)values("2", "2"); | 成功 | |
insert into t1(a, b)values("2", "2"); | t1嘗試獲取S鎖,把t2的隱式鎖提升為顯式X鎖,進入DB_LOCK_WAIT | |
insert into t1(a, b)values("1", "1"); | t2嘗試獲取S鎖,把t1的隱式鎖提升為顯式X鎖,產(chǎn)生死鎖 |
------------------------
LATEST DETECTED DEADLOCK
------------------------
181101 9:48:36
*** (1) TRANSACTION:
TRANSACTION 3309, ACTIVE 215 sec inserting
mysql tables in use 1, locked 1
LOCK WAIT 3 lock struct(s), heap size 376, 2 row lock(s), undo log entries 2
MySQL thread id 2, OS thread handle 0x70000a845000, query id 58 localhost root update
insert into t1(a, b)values("2", "2")
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 55 page no 4 n bits 72 index `uk_name` of table `d1`.`t1` trx id 3309 lock mode S waiting
Record lock, heap no 3 PHYSICAL RECORD: n_fields 3; compact format; info bits 0
0: len 1; hex 32; asc 2;;
1: len 1; hex 32; asc 2;;
2: len 4; hex 80000002; asc ;;
*** (2) TRANSACTION:
TRANSACTION 330A, ACTIVE 163 sec inserting
mysql tables in use 1, locked 1
3 lock struct(s), heap size 376, 2 row lock(s), undo log entries 2
MySQL thread id 3, OS thread handle 0x70000a888000, query id 59 localhost root update
insert into t1(a, b)values("1", "1")
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 55 page no 4 n bits 72 index `uk_name` of table `d1`.`t1` trx id 330A lock_mode X locks rec but not gap
Record lock, heap no 3 PHYSICAL RECORD: n_fields 3; compact format; info bits 0
0: len 1; hex 32; asc 2;;
1: len 1; hex 32; asc 2;;
2: len 4; hex 80000002; asc ;;
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 55 page no 4 n bits 72 index `uk_name` of table `d1`.`t1` trx id 330A lock mode S waiting
Record lock, heap no 2 PHYSICAL RECORD: n_fields 3; compact format; info bits 0
0: len 1; hex 31; asc 1;;
1: len 1; hex 31; asc 1;;
2: len 4; hex 80000001; asc ;;
*** WE ROLL BACK TRANSACTION (2)
到目前為止,已經(jīng)陷入了僵局,完全沒法復現(xiàn)死鎖的情況??戳舜a,發(fā)現(xiàn)在insert之前有一個delete,但是delete與insert不在一個事務里面,也就是delete提交以后,才進行批量insert,真正出問題的地方在批量insert的地方。一開始就排除了delete對后面的影響,難道不在一個事務,也會有影響?
寫了一個代碼去模擬,有很大概率會復現(xiàn)
fun test() {
dao.delete() // 對應delete from
// sleep for 10ms
dao.insert() // 對應insert ignore
}
對應的 SQL 如下,注意是兩個事務
begin;
delete from t1 where a = "25"
commit;
begin;
INSERT ignore INTO `t1` (`a`, `b`) VALUES("25","1")
commit;
這個代碼在兩個線程同時調(diào)用的時候,非常容易死鎖。
后來翻遍了網(wǎng)上相關的死鎖案例,有一個關于purge刪除的過程可能跟這個有關系。
如果標記為刪除,說明事務已經(jīng)提交,還沒來得及 purge,這時后面的事務加S鎖等待;
在源碼中打印一些日志。 1.在storage/innobase/row/row0ins.c的row_ins_set_shared_rec_lock增加日志,可以看到對唯一索引增加S鎖的過程
if (dict_index_is_clust(index)) {
err = lock_clust_rec_read_check_and_lock(
0, block, rec, index, offsets, LOCK_S, type, thr);
} else {
err = lock_sec_rec_read_check_and_lock(
0, block, rec, index, offsets, LOCK_S, type, thr);
// 增加如下日志
fprintf(stderr, "row_ins_set_shared_rec_lock %s %lu %d
" , index->name, type, err);
}
2.在lock_rec_enqueue_waiting增加日志,可以看到鎖等待的情況
static
enum db_err
lock_rec_enqueue_waiting(
{
fprintf(stderr, "lock_rec_enqueue_waiting::::: %s %lu
" , index->name, type_mode);
}
日志大概如下
row_ins_set_shared_rec_lock uk_name 0 9 (t1獲取S鎖成功) row_ins_set_shared_rec_lock uk_name 0 9 (t2獲取S鎖成功) lock_rec_enqueue_waiting::::: uk_name 2563(t1 X鎖進如鎖等待) lock_rec_enqueue_waiting::::: uk_name 2563(t2 X鎖進如鎖等待)
其中2563=2048+512+3=LOCK_INSERT_INTENTION+LOCK_GAP+LOCK_X
這個過程跟非常經(jīng)典的三個事務同時insert,一個回滾,剩下的兩個事務一個成功,一個死鎖,其實是一模一樣的原理。
實驗03三個 insert ignore,一個回滾造成的死鎖
insert語句都是insert ignore into t1(a, b)values("1", "1");以下省略
t1 | t2 | t3 | 備注 |
---|---|---|---|
begin | begin | begin | |
insert | 成功 | ||
insert | 把t1的隱式鎖提升為X鎖,t2進入進入S鎖等待 | ||
insert | t3進入進入S鎖等待 | ||
rollback; | t1回滾以后,釋放X鎖,t2和t3同時拿到了S鎖 | ||
ok | deadlock | t2和t3都想拿插入意向鎖X鎖,造成死鎖條件 |
死鎖日志,跟我們案例中的一模一樣
------------------------
LATEST DETECTED DEADLOCK
------------------------
181101 23:22:59
*** (1) TRANSACTION:
TRANSACTION 5032, ACTIVE 11 sec inserting
mysql tables in use 1, locked 1
LOCK WAIT 4 lock struct(s), heap size 1248, 2 row lock(s), undo log entries 1
MySQL thread id 5, OS thread handle 0x70000d736000, query id 125 localhost root update
insert ignore into t1(a, b)values("1", "1")
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 56 page no 4 n bits 584 index `uk_name` of table `d1`.`t1` trx id 5032 lock_mode X locks gap before rec insert intention waiting
Record lock, heap no 139 PHYSICAL RECORD: n_fields 3; compact format; info bits 0
0: len 3; hex 313031; asc 101;;
1: len 3; hex 313031; asc 101;;
2: len 4; hex 800007b1; asc ;;
*** (2) TRANSACTION:
TRANSACTION 5033, ACTIVE 6 sec inserting
mysql tables in use 1, locked 1
4 lock struct(s), heap size 1248, 2 row lock(s), undo log entries 1
MySQL thread id 6, OS thread handle 0x70000d779000, query id 126 localhost root update
insert ignore into t1(a, b)values("1", "1")
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 56 page no 4 n bits 584 index `uk_name` of table `d1`.`t1` trx id 5033 lock mode S locks gap before rec
Record lock, heap no 139 PHYSICAL RECORD: n_fields 3; compact format; info bits 0
0: len 3; hex 313031; asc 101;;
1: len 3; hex 313031; asc 101;;
2: len 4; hex 800007b1; asc ;;
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 56 page no 4 n bits 584 index `uk_name` of table `d1`.`t1` trx id 5033 lock_mode X locks gap before rec insert intention waiting
Record lock, heap no 139 PHYSICAL RECORD: n_fields 3; compact format; info bits 0
0: len 3; hex 313031; asc 101;;
1: len 3; hex 313031; asc 101;;
2: len 4; hex 800007b1; asc ;;
*** WE ROLL BACK TRANSACTION (2)
目前來看,得到的結(jié)論是:
一個已提交但是未purge掉的記錄會造成后續(xù)插入獲取S共享鎖,兩個事務同時獲取S鎖,然后嘗試獲取插入意向鎖,造成死鎖
網(wǎng)上大神梳理的insert流程
首先對插入的間隙加插入意向鎖(Insert Intension Locks)
如果該間隙已被加上了 GAP 鎖或 Next-Key 鎖,則加鎖失敗進入等待;
如果沒有,則加鎖成功,表示可以插入;
然后判斷插入記錄是否有唯一鍵,如果有,則進行唯一性約束檢查
如果不存在相同鍵值,則完成插入
如果存在相同鍵值,則判斷該鍵值是否加鎖
如果沒有鎖, 判斷該記錄是否被標記為刪除
如果標記為刪除,說明事務已經(jīng)提交,還沒來得及 purge,這時加 S 鎖等待;
如果沒有標記刪除,則報 1062 duplicate key 錯誤;
如果有鎖,說明該記錄正在處理(新增、刪除或更新),且事務還未提交,加 S 鎖等待;
插入記錄并對記錄加 X 記錄鎖;
上面是我 debug 源碼得到的一些結(jié)論,如果我的理解有誤的話,記得留言告訴我
以下為參考文檔
www.aneasystone.com/archives/20…
www.aneasystone.com/archives/20…
hedengcheng.com/");
文章版權歸作者所有,未經(jīng)允許請勿轉(zhuǎn)載,若此文章存在違規(guī)行為,您可以聯(lián)系管理員刪除。
轉(zhuǎn)載請注明本文地址:http://systransis.cn/yun/7869.html
摘要:死鎖的本質(zhì)是資源競爭,批量插入如果順序不一致很容易導致死鎖,我們來分析一下這個情況。為了方便演示,把批量插入改寫為了多條。死鎖的本質(zhì)是資源競爭,批量插入如果順序不一致很容易導致死鎖,我們來分析一下這個情況。為了方便演示,把批量插入改寫為了多條 insert。 先來做幾個小實驗,簡化的表結(jié)構(gòu)如下 CREATE TABLE `t1` ( `id` int(11) NOT NULL AUTO_...
摘要:小明馬上開發(fā)完畢,成功上線。下班過后,小明回想大紅說的話,什么是間隙鎖,什么是插入意向鎖,看來作為開發(fā)者對數(shù)據(jù)庫不應該只會寫啊,不然遇到一些疑難雜癥完全沒法解決啊。破壞了數(shù)據(jù)庫中的隔離性。 1.鎖? 1.1何為鎖 鎖在現(xiàn)實中的意義為:封閉的器物,以鑰匙或暗碼開啟。在計算機中的鎖一般用來管理對共享資源的并發(fā)訪問,比如我們java同學熟悉的Lock,synchronized等都是我們常見的...
閱讀 2855·2023-04-25 17:59
閱讀 685·2023-04-25 15:05
閱讀 675·2021-11-25 09:43
閱讀 3038·2021-10-12 10:13
閱讀 3545·2021-09-27 13:59
閱讀 3589·2021-09-23 11:21
閱讀 3889·2021-09-08 09:35
閱讀 571·2019-08-29 17:12