开发的同事反应系统特别慢,基本是hang住的状态。 SQL> select * from v$version; BANNER -------------------------------------------------------------------------------- Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Producti
开发的同事反应系统特别慢,基本是hang住的状态。
SQL> select * from v$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
PL/SQL Release 11.2.0.1.0 - Production
CORE 11.2.0.1.0 Production
TNS for Linux: Version 11.2.0.1.0 - Production
NLSRTL Version 11.2.0.1.0 - Production
SQL> select event,count(*) from v$session where wait_class
'Idle' group by event;
EVENT COUNT(*)
---------------------------------------- ----------
asynch descriptor resize 1
cursor: mutex S 1
library cache: mutex X 2
library cache lock 150
SQL> !uptime
13:58:46 up 275 days, 14:54, 20 users, load average: 73.46, 69.75, 67.47
SQL> !vmstat -n 3 5
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
r b swpd free buff cache si so bi bo in cs us sy id wa st
21 0 179128 21864 19620 4764460 0 0 219 59 0 0 3 0 94 3 0
9 0 179128 20936 19636 4764480 0 0 11 41 2686 8128 91 1 8 0 0
9 0 179128 17012 19636 4761136 0 0 0 64 2240 7314 89 2 9 0 0
51 2 179128 20020 19676 4755980 0 0 535 156 2550 6650 63 3 31 2 0
59 3 179128 17240 19700 4755192 0 0 696 27 1840 7343 71 4 23 2 0
Extra information that will be dumped at higher levels:
[level 4] : 4 node dumps -- [LEAF] [LEAF_NW]
[level 5] : 156 node dumps -- [NO_WAIT] [INVOL_WT] [SINGLE_NODE] [NLEAF] [SINGLE_NODE_NW]
State of ALL nodes
([nodenum]/cnode/sid/sess_srno/session/ospid/state/[adjlist]):
[12]/1/13/30534/0x145dc5c70/6571/NLEAF/[22*]
[20]/1/21/49911/0x145dba2b0/32225/NLEAF/[22*]
[22]/1/23/59451/0x145db7440/29667/NLEAF/[1390][1376][397][986]
[27]/1/28/51814/0x142df11f8/3185/NLEAF/[22*]
[37]/1/38/5525/0x142de29c8/3818/NLEAF/[22*]
[38]/1/39/61201/0x145da00c0/29405/NLEAF/[22*]
[41]/1/42/48315/0x142ddcce8/3888/NLEAF/[22*]
[43]/1/44/53494/0x142dd9e78/2915/NLEAF/[22*]
[45]/1/46/7972/0x142dd7008/3782/NLEAF/[22*]
[46]/1/47/55253/0x145d94700/29419/NLEAF/[22*]
[48]/1/49/2059/0x145d91890/3874/NLEAF/[22*]
[53]/1/54/232/0x142dcb648/3806/NLEAF/[22*]
[57]/1/58/1001/0x142dc5968/3946/NLEAF/[22*]
[60]/1/61/536/0x145d801f0/29571/NLEAF/[22*]
[61]/1/62/18036/0x142dbfc88/29477/NLEAF/[22*]
[62]/1/63/33331/0x145d7d380/29495/NLEAF/[22*]
[63]/1/64/172/0x142dbce18/3932/NLEAF/[22*]
[66]/1/67/295/0x145d776a0/32698/NLEAF/[22*]
[67]/1/68/1178/0x142db7138/3194/NLEAF/[22*]
[209]/1/210/37201/0x142f135b8/3864/NLEAF/[22*]
[210]/1/211/1116/0x145ed0cb0/6567/NLEAF/[22*]
[214]/1/215/43363/0x145ecafd0/3780/NLEAF/[22*]
[216]/1/217/52931/0x145ec8160/3778/NLEAF/[22*]
[217]/1/218/57913/0x142f07bf8/3850/NLEAF/[22*]
[219]/1/220/43817/0x142f04d88/3842/NLEAF/[22*]
[223]/1/224/1162/0x142eff0a8/2489/NLEAF/[22*]
[225]/1/226/1793/0x142efc238/3948/NLEAF/[22*]
[226]/1/227/63937/0x145eb9930/3903/NLEAF/[22*]
[229]/1/230/54968/0x142ef6558/6549/NLEAF/[22*]
[232]/1/233/43293/0x145eb0de0/3934/NLEAF/[22*]
[233]/1/234/65252/0x142ef0878/3936/NLEAF/[22*]
[235]/1/236/10319/0x142eeda08/3798/NLEAF/[22*]
[238]/1/239/10942/0x145ea8290/3804/NLEAF/[22*]
[239]/1/240/31890/0x142ee7d28/3163/NLEAF/[22*]
[240]/1/241/1880/0x145ea5420/3854/NLEAF/[22*]
[242]/1/243/1693/0x145ea25b0/3820/NLEAF/[22*]
[243]/1/244/157/0x142ee2048/29497/NLEAF/[22*]
[248]/1/249/666/0x145e99a60/2460/NLEAF/[22*]
[249]/1/250/39835/0x142ed94f8/3970/NLEAF/[22*]
[250]/1/251/199/0x145e96bf0/3816/NLEAF/[22*]
[252]/1/253/141/0x145e93d80/3972/NLEAF/[22*]
[253]/1/254/832/0x142ed3818/3890/NLEAF/[22*]
…
[1399]/1/1400/1510/0x1416211c0/3814/NLEAF/[22*]
[1401]/1/1402/463/0x14161e350/3145/NLEAF/[22*]
*** 2014-07-28 13:52:17.778
===============================================================================
END OF HANG ANALYSIS
===============================================================================
*** 2014-07-28 13:52:17.785
===============================================================================
HANG ANALYSIS DUMPS:
oradebug_node_dump_level: 3
===============================================================================
State of LOCAL nodes
Os层面kill掉部blocking spid后,又出现了新的blocking spid,这个过程持续了几个小时,我们来看看到底是什么object引起的library cache lock。
查看了这个父游标的version_count达到了1278个
SQL> select version_count from v$sqlarea where sql_id='90qwy5xcku4v5';
VERSION_COUNT
-------------
1278
sql_id='90qwy5xcku4v5'的sql_text:
SELECT message1_.MESSAGEID AS MESSAGEID,
message1_.CHANNELID AS CHANNELID,
message1_.KEYWORD AS KEYWORD,
message1_.MESSAGETOPIC AS MESSAGET4_,
message1_.MESSAGEBODY AS MESSAGEB5_,
message1_.MESSAGEURL AS MESSAGEURL,
message1_.MESSAGEACTION AS MESSAGEA7_,
message1_.MESSAGETYPE AS MESSAGET8_,
message1_.MESSAGELEVEL AS MESSAGEL9_,
message1_.MESSAGESTATUS AS MESSAGE10_,
message1_.CREATER AS CREATER,
message1_.BEGINDATE AS BEGINDATE,
message1_.ENDDATE AS ENDDATE,
message1_.CREATEDATE AS CREATEDATE,
message1_.ATTACHEMENTNAME AS ATTACHE15_,
message1_.READFLAG AS READFLAG,
message1_.BUSINESSID AS BUSINESSID
FROM TBL_MSG_USER_MESSAGE usermessag0_, TBL_MSG_MESSAGE message1_
WHERE (usermessag0_.READSTATUS = :"SYS_B_0")
AND (usermessag0_.CUSTOMERNO = :"SYS_B_1")
AND (usermessag0_.MESSAGEID = message1_.MESSAGEID)
AND (message1_.BEGINDATE <= :"SYS_B_2")
AND (message1_.ENDDATE > :"SYS_B_3")
[oracle@zrdb-2 ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.1.0 Production on Mon Jul 28 14:33:30 2014
Copyright (c) 1982, 2009, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL> show parameter cursor_sharing;
NAME
------------------------------------
TYPE
----------------------------------------------------------------
VALUE
------------------------------
cursor_sharing
string
SIMILAR
问了下开发人员,有没有写成上述绑定变量的sql语句,开发人员反馈没有,那么这个问题可能是这样引起的,因为修改参数cursor_sharing从exact变为了similar后,导致了sql语句被重新改写成bind value的形式,按理说这个应该是为了减少硬解析,但是确造成了version_count high现象,也就是一个父游标下多个子游标,造成了library cache latch、library cache lock,library cache pin等等待时间。
High Version Count with CURSOR_SHARING = SIMILAR or FORCE (文档 ID 261020.1)
? Significant database time spent waiting for library cache latch
? High parse rates in AWR/Statspack reports
? High version counts in AWR/Statspack reports
? cursor_sharing = SIMILAR
? Intermittent database-wide slowdowns
? STATSPACK, AWR or V$SQLAREA shows a high version count on cursors that have bind replacement done due to CURSOR_SHARING=SIMILAR or FORCE.
? V$SQL_SHARED_CURSOR does not show any reason why this is happening.
? the number of children could keep on increasing until the shared pool is filled. In some cases, if the number gets past 1024 a new hash value is created for the next set of 1024.
High version counts can easily cause high contention for library cache latches. A process parsing a SQL statement with many versions (children cursors) will need to scan
through all these children while holding on to a library cache latch. This means that other processes needing the same latch will have to wait and can lead to significant database-wide performance degradation.
Changes:
CURSOR_SHARING has been changed to SIMILAR or FORCE.
Cause:
SELECT sql_text,version_count,address
FROM V$SQLAREA
WHERE sql_text like 'select /* TEST */%';
SQL_TEXT
---------------------------------------------------------------------------
VERSION_COUNT ADDRESS
------------- --------
select /* TEST */ * from emp where sal > :"SYS_B_0"
5 80EE4BF0
SELECT * FROM V$SQL_SHARED_CURSOR WHERE kglhdpar = '80EE4BF0';
ADDRESS KGLHDPAR U S O O S L S E B P I S T A B D L T R I I R L I O S M U T
-------- -------- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
80FBD50C 80EE4BF0 N N N N N N N N N N N N N N N N N N N N N N N N N N N N N
80EE816C 80EE4BF0 N N N N N N N N N N N N N N N N N N N N N N N N N N N N N
....
If CURSOR_SHARING=SIMILAR, this could be expected behavior.
The difference between SIMILAR and FORCE is that SIMILAR forces similar statements to share the SQL area without deteriorating execution plans.
SIMILAR causes statements that may differ in some literals, but are otherwise identical, to share a cursor, unless the literals affect either the meaning of the statement or the degree to which the plan is optimized. In this case the literal is marked "unsafe" to be shared and the cursor will not be shared.
In the above example, the literal which is being replaced in the predicate:
sal > :"SYS_B_0"
is unsafe because the execution plan depends on its value.
Why Do Statements Using Literals Not Share When Cursor_Sharing=Similar? (文档 ID 364845.1)
Inequality & cursor_sharing = SIMILAR example
? Starting with cursor_sharing = similar run a SQL with a literal.
The literal will be replaces with a bind due to 'similar':
alter session set cursor_sharing = similar;
Session altered.
select count(*) from t1 where i > 10;
COUNT(*)
----------
9990
select sql_text,version_count
from v$sqlarea
where sql_text like 'select count(*) from t1 where%';
SQL_TEXT VERSION_COUNT
-------------------------------------------- ---------------
select count(*) from t1 where i > :"SYS_B_0" 1
Query results in a single version in the shared pool with a generated bind for the value: :"SYS_B_0".
? Same query run with a different literal:
select count(*) from t1 where i > 20;
COUNT(*)
----------
9980
SQL_TEXT VERSION_COUNT
-------------------------------------------- ---------------
select count(*) from t1 where i > :"SYS_B_0" 2
2
Notice we now have a second version of the cursor for the second value
? If we try with another couple of different literals:
select count(*) from t1 where i > 30;
COUNT(*)
----------
9970
SQL_TEXT VERSION_COUNT
-------------------------------------------- ---------------
select count(*) from t1 where i > :"SYS_B_0" 3
select count(*) from t1 where i > 40;
COUNT(*)
----------
9960
SQL_TEXT VERSION_COUNT
-------------------------------------------- ---------------
select count(*) from t1 where i > :"SYS_B_0" 4
? So cursor_sharing = SIMILAR does not share cursor when inequality predicates are used
Inequality & cursor_sharing = FORCE example
? Set cursor_sharing = FORCE
Restart Database
SQL> alter session set cursor_sharing = force;
Session altered.
SQL> select count(*) from t1 where i > 10;
COUNT(*)
----------
9990
SQL> select sql_text,version_count from v$sqlarea where
2 sql_text like 'select count(*) from t1 where%';
SQL_TEXT VERSION_COUNT
-------------------------------------------- ---------------
select count(*) from t1 where i > :"SYS_B_0" 1
The first time the version count is the same as with the SIMILAR example
? However with subsequent different values the same cursor is used:
SQL> select count(*) from t1 where i > 20;
COUNT(*)
----------
9980
SQL_TEXT VERSION_COUNT
-------------------------------------------- ---------------
select count(*) from t1 where i > :"SYS_B_0" 1
SQL> select count(*) from t1 where i > 30;
COUNT(*)
----------
9970
SQL_TEXT VERSION_COUNT
-------------------------------------------- ---------------
select count(*) from t1 where i > :"SYS_B_0" 1
SQL> select count(*) from t1 where i > 40;
COUNT(*)
----------
9960
SQL_TEXT VERSION_COUNT
-------------------------------------------- ---------------
select count(*) from t1 where i > :"SYS_B_0" 1
Note that with FORCE, the version count remains at 1 meaning that the cursor is shared even with inequality predicate when cursor_sharing = FORCE .
通过上面两篇mos的文章我们基本清楚了,当cursor_sharing=similar时,对于非等的谓词比较,优化器会产生多个子游标(version_count high),也就造成了每次解析sql时都需要遍历这个父游标下的子游标的,其实也就造成了library cache latch、library cache pin、library cache lock等等待事件。
下面对cursor_sharing参数进行一些简单的测试:
SQL> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bi
PL/SQL Release 10.2.0.4.0 - Production
CORE 10.2.0.4.0 Production
TNS for Linux: Version 10.2.0.4.0 - Production
NLSRTL Version 10.2.0.4.0 – Production
SQL> create table t as select * from dba_objects;
Table created.
SQL> alter session set cursor_sharing='similar';
Session altered.
SQL> select /*TEST*/ object_name from t where object_id=10;
OBJECT_NAME
--------------------------------------------------------------------------------
C_USER#
SQL> select /*TEST*/ object_name from t where object_id=11;
OBJECT_NAME
--------------------------------------------------------------------------------
I_USER#
SQL> select /*TEST*/ object_name from t where object_id=12;
OBJECT_NAME
--------------------------------------------------------------------------------
FET$
SQL> select sql_text,version_count,address from v$sqlarea where sql_text like 'select /*TEST*/ object_name from t%';
SQL_TEXT
--------------------------------------------------------------------------------
VERSION_COUNT ADDRESS
------------- ----------------
select /*TEST*/ object_name from t where object_id=:"SYS_B_0"
3 000000008F8982C8
SQL> alter system flush shared_pool;
System altered.
SQL> select /*TEST*/ object_name from t where object_id<10;
OBJECT_NAME
--------------------------------------------------------------------------------
I_OBJ#
I_FILE#_BLOCK#
I_TS#
C_TS#
CLU$
C_FILE#_BLOCK#
C_OBJ#
TAB$
8 rows selected.
SQL> select /*TEST*/ object_name from t where object_id<9;
OBJECT_NAME
--------------------------------------------------------------------------------
I_OBJ#
I_TS#
C_TS#
CLU$
C_FILE#_BLOCK#
C_OBJ#
TAB$
7 rows selected.
SQL> select /*TEST*/ object_name from t where object_id<8;
OBJECT_NAME
--------------------------------------------------------------------------------
I_OBJ#
I_TS#
C_TS#
CLU$
C_OBJ#
TAB$
6 rows selected.
SQL> select sql_text,version_count,address from v$sqlarea where sql_text like 'select /*TEST*/ object_name from t%';
SQL_TEXT
--------------------------------------------------------------------------------
VERSION_COUNT ADDRESS
------------- ----------------
select /*TEST*/ object_name from t where object_id<:"SYS_B_0"
3 000000008F5051F0
这里发现在oracle 10.2.0.4版本中无论是谓词等值还是非等值比较,即使谓词的列的值不影响sql的执行计划,也会产生多个子游标。
而如果到11.2.0.3的版本中:
SQL> select * from v$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
PL/SQL Release 11.2.0.3.0 - Production
CORE 11.2.0.3.0 Production
TNS for Linux: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production
SQL> show parameter cursor_sharing;
NAME TYPE
------------------------------------ --------------------------------
VALUE
------------------------------
cursor_sharing string
similar
SQL> select /*TEST*/ object_name from t_ora11g where object_id=10;
OBJECT_NAME
--------------------------------------------------------------------------------
C_USER#
SQL> select /*TEST*/ object_name from t_ora11g where object_id=11;
OBJECT_NAME
--------------------------------------------------------------------------------
I_USER#
SQL> select /*TEST*/ object_name from t_ora11g where object_id=12;
OBJECT_NAME
--------------------------------------------------------------------------------
FET$
SQL> select sql_text,version_count,address from v$sqlarea where sql_text like 'select /*TEST*/ object_name from t_ora11g%';
SQL_TEXT
--------------------------------------------------------------------------------
VERSION_COUNT ADDRESS
------------- ----------------
select /*TEST*/ object_name from t_ora11g where object_id=:"SYS_B_0"
1 00000000903487D8
SQL> alter system flush shared_pool;
System altered.
SQL> create table t_ora11g as select * from dba_objects;
Table created.
SQL> select /*TEST*/ object_name from t_ora11g where object_id<10;
OBJECT_NAME
--------------------------------------------------------------------------------
I_OBJ#
I_FILE#_BLOCK#
I_TS#
C_TS#
CLU$
C_FILE#_BLOCK#
C_OBJ#
TAB$
8 rows selected.
SQL> select /*TEST*/ object_name from t_ora11g where object_id<9;
OBJECT_NAME
--------------------------------------------------------------------------------
I_OBJ#
I_TS#
C_TS#
CLU$
C_FILE#_BLOCK#
C_OBJ#
TAB$
7 rows selected.
SQL> select /*TEST*/ object_name from t_ora11g where object_id<8;
OBJECT_NAME
--------------------------------------------------------------------------------
I_OBJ#
I_TS#
C_TS#
CLU$
C_OBJ#
TAB$
6 rows selected.
SQL> select sql_text,version_count,address from v$sqlarea where sql_text like 'select /*TEST*/ object_name from t_ora11g%';
SQL_TEXT
--------------------------------------------------------------------------------
VERSION_COUNT ADDRESS
------------- ----------------
select /*TEST*/ object_name from t_ora11g where object_id<:"SYS_B_0"
1 000000009914E3E8
在11.2.0.3版本的cursor_sharing=similar情况下,oracle对于等值和非等值的谓词比较对应的父游标的version_count都为1。
由于上述生产系统出现version_count较高的版本是oracle 11.2.0.1的版本,又找了一个11.2.0.1的版本进行测试
SQL> create table t as select * from dba_objects;
Table created.
SQL> alter session set cursor_sharing='similar';
Session altered.
SQL> select /*TEST*/ object_name from t where object_id=10;
OBJECT_NAME
--------------------------------------------------------------------------------
C_USER#
SQL> select /*TEST*/ object_name from t where object_id=11;
OBJECT_NAME
--------------------------------------------------------------------------------
I_USER#
SQL> select /*TEST*/ object_name from t where object_id=12;
OBJECT_NAME
--------------------------------------------------------------------------------
FET$
SQL> select sql_text,version_count,address from v$sqlarea where sql_text like 'select /*TEST*/ object_name from t%';
SQL_TEXT
--------------------------------------------------------------------------------
VERSION_COUNT ADDRESS
------------- ----------------
select /*TEST*/ object_name from t where object_id=:"SYS_B_0"
3 0000000117B10050
SQL> create table t_inequ as select * from dba_Objects;
Table created.
SQL> select /*TEST*/ object_name from t_inequ where object_id<10;
OBJECT_NAME
--------------------------------------------------------------------------------
I_OBJ#
I_FILE#_BLOCK#
I_TS#
C_TS#
CLU$
C_FILE#_BLOCK#
C_OBJ#
TAB$
8 rows selected.
SQL> select /*TEST*/ object_name from t_inequ where object_id<11;
OBJECT_NAME
--------------------------------------------------------------------------------
I_OBJ#
I_FILE#_BLOCK#
I_TS#
C_TS#
CLU$
C_FILE#_BLOCK#
C_USER#
C_OBJ#
TAB$
9 rows selected.
SQL> select /*TEST*/ object_name from t_inequ where object_id<12;
OBJECT_NAME
--------------------------------------------------------------------------------
I_OBJ#
I_FILE#_BLOCK#
I_TS#
C_TS#
CLU$
C_FILE#_BLOCK#
C_USER#
C_OBJ#
TAB$
I_USER#
10 rows selected.
SQL> select sql_text,version_count,address from v$sqlarea where sql_text like 'select /*TEST*/ object_name from t_inequ%';
SQL_TEXT
--------------------------------------------------------------------------------
VERSION_COUNT ADDRESS
------------- ----------------
select /*TEST*/ object_name from t_inequ where object_id<:"SYS_B_0"
3 00000000FFF17EB8
看来这个和oracle 10.2.0.4版本一样,都存在上述的问题,这个也是我们在网络上看见很多文章或者案例提到cursor_sharing设置为similar和force带来的隐患,有很多的类似的oracle bug,所以这里我们需要将参数cursor_sharing设置为exact或者force即可,而关于为什么子游标不共享,可以参考v$sql_shared_cursor视图。
原文地址:cursor_sharing=similar参数引起version_count high|libra, 感谢原作者分享。