Author: jajajaja幸福_348 | Source: Internet | 2023-08-14 08:21
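This note started life as a nutshell until I realised it was going to be more of a coconut than a hazelnut, so I decided to turn it into a short series. Over the next two weeks I will publish the following four parts:
1. Introduction (this part)
2. Disk and Tablespace Fragmentation
3. Table Fragmentation
4. Index Fragmentation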
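I. Introduction
The word "fragmentation" implies that something has been broken into pieces, but it also carries an emotional overtone suggesting lots of small pieces. In an Oracle context you need to consider what you mean by "pieces", the granularity of those pieces, and the possible impact on performance. Because we can discuss fragmentation at the (logical) disk level, the file level, the tablespace level, the segment level, the extent level, and the block level, you need to be very clear about what you mean when you say something like "my tablespace is fragmented" or "my index is fragmented".
Let's start with an example: I created a new tablespace and moved a table into it. When I check DBA_EXTENTS I find the table has 100 extents. Clearly it is "fragmented" in the literal sense, since it consists of 100 separate pieces. On the other hand, because the table was the first object I created in the tablespace, I can see that all the extents are adjacent. So you could say the table is "logically fragmented" but "physically contiguous".
Does the fragmentation in this example affect the performance of your system? Most of Oracle's I/O operates at the block level (we read data blocks into the db cache and write data blocks to files), and the position of a block within a particular extent is irrelevant, so the answer is probably no. But there are times when we try to read several adjacent blocks in a single read request (tablescans and index fast full scans, for example); does it matter then that our "physically contiguous" table is "logically fragmented" into many extents?
What if each extent is only 64KB: does this limit the size of the "db file multiblock read" requests, or can those requests cross extent boundaries? What if the tablespace is made up of two (or more) files, so that extents are generally allocated alternately across the files: does this affect the way the reads operate? What if we try to do a parallel tablescan: are the restrictions on direct path reads different? If you run a data warehouse that spends a lot of its time on this type of operation, these are some of the questions you need to answer. (See, for example, a note I wrote three years ago about anomalies in I/O sizes when running parallel query: https://jonathanlewis.wordpress.com/2007/05/29/autoallocate-and-px/ and a related enhancement in 11g described by Christian Antognini a couple of years later: http://antognini.ch/2009/08/system-managed-extent-size-11g-improvements/ )
Only once you start to think clearly about what you mean by "fragmentation" can you begin to understand the problems it may cause, and why it may or may not have an impact on your system. In part two I will make some comments on how to think about fragmentation at the disk level and the tablespace level.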
The original text and its link follow:
https://jonathanlewis.wordpress.com/2010/07/13/fragmentation-1/
Fragmentation 1
Filed under: fragmentation,Infrastructure,Oracle — Jonathan Lewis @ 8:33 pm BST Jul 13,2010
This note started life as a nutshell until I realised that it was going to be more of a coconut than a hazel nut and decided to turn it into a short series instead. I should manage to post four parts over the next two weeks:
1. Introduction (this bit)
2. Disk and Tablespace Fragmentation
3. Table Fragmentation
4. Index Fragmentation
Introduction
The implication of the word “fragmentation” is that something is broken into pieces, but it also carries an emotional overtone that suggests it’s lots of little pieces. In an Oracle context you need to consider what you mean by “pieces”, the granularity of the pieces, and the possible impact on performance. Since it’s possible to talk about fragmentation at the (logical) disk level, the file level, the tablespace level, the segment level, the extent level, and the block level, it’s necessary to think very clearly about what you’re trying to say when you make a comment like “my tablespace is fragmented” or “my index is fragmented”.
Let’s start with an example: I have created a new tablespace and moved a table into it. When I check dba_extents the table has 100 extents. Clearly it is “fragmented” in the basic sense of the word since it is made of 100 different pieces. On the other hand, because the table was the first thing I created in the tablespace, I can see that all the extents are adjacent – so you could say the table is “logically fragmented” but “physically contiguous”.
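To make the distinction between "logically fragmented" and "physically contiguous" concrete, here is a minimal Python sketch. It assumes you have already fetched the file_id, block_id and blocks columns of dba_extents for one segment into a list; the Extent type, the contiguous_runs helper and the sample numbers (file 5, starting block 128, 8 blocks per extent) are hypothetical and chosen only for illustration.

```python
from typing import List, NamedTuple


class Extent(NamedTuple):
    """One row of extent data; fields mirror columns of dba_extents."""
    file_id: int
    block_id: int  # first block of the extent
    blocks: int    # number of blocks in the extent


def contiguous_runs(extents: List[Extent]) -> List[List[Extent]]:
    """Group extents into runs that are physically adjacent in one file."""
    runs: List[List[Extent]] = []
    for ext in sorted(extents, key=lambda e: (e.file_id, e.block_id)):
        if runs:
            last = runs[-1][-1]
            # Adjacent means: same file, and this extent starts exactly
            # where the previous one ends.
            if (last.file_id == ext.file_id
                    and last.block_id + last.blocks == ext.block_id):
                runs[-1].append(ext)
                continue
        runs.append([ext])
    return runs


# Hypothetical sample: 100 extents of 8 blocks each, laid end to end.
table_extents = [Extent(5, 128 + 8 * i, 8) for i in range(100)]
runs = contiguous_runs(table_extents)
print(len(table_extents), "extents in", len(runs), "contiguous run(s)")
```

With this sample the 100 extents collapse into a single run, which is the situation described above: many pieces in the data dictionary, one unbroken range of blocks on disk.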
Does this example of fragmentation have any impact on the performance of your system? Since most I/O done by Oracle operates at the block level (we read data blocks into the db cache, we write data blocks to files), and the location of the block within any particular extent is irrelevant, the answer is probably no. But there are times when we try to read multiple adjacent blocks with a single read request (tablescans and index fast full scans); does it matter that our “physically contiguous” table is “logically fragmented” into lots of extents?
What if the extents are (say) only 64KB each: does this limit the size of the “db file multiblock read” requests that we will be making, or can those reads cross extent boundaries? What if the tablespace is made up of two (or more) files so that the extents generally “round-robin” between files: does this affect the way the reads can operate? What if we try to do a parallel tablescan: are the restrictions on “direct path reads” different? If you’re running a data warehouse that spends a lot of its time doing this type of operation then these are just some of the questions you need to answer. (See, for example, a note I wrote three years ago about some of the anomalies of I/O sizes when running parallel query, and a related enhancement in 11g described by Christian Antognini a couple of years later.)
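The arithmetic behind the 64KB question can be sketched in Python. This is not a statement of how Oracle behaves, since the text deliberately leaves that open; it simply counts read requests under two hypothetical models, one where a multiblock read must stop at an extent boundary and one where it may cross. The 8KB block size, the 128-block maximum read and the function names are assumptions made for the example, and effects such as already-cached blocks are ignored.

```python
import math


def reads_if_bounded_by_extent(extent_blocks: int, n_extents: int,
                               max_read_blocks: int) -> int:
    """Read count if every read must stop at an extent boundary."""
    return n_extents * math.ceil(extent_blocks / max_read_blocks)


def reads_if_crossing_extents(extent_blocks: int, n_extents: int,
                              max_read_blocks: int) -> int:
    """Read count if reads may span adjacent extents (contiguous table)."""
    return math.ceil(extent_blocks * n_extents / max_read_blocks)


BLOCK_SIZE = 8 * 1024      # assumed block size, in bytes
EXTENT_BYTES = 64 * 1024   # the 64KB extents from the text
N_EXTENTS = 100            # the 100-extent table from the earlier example
MAX_READ_BLOCKS = 128      # assumed largest multiblock read, in blocks

extent_blocks = EXTENT_BYTES // BLOCK_SIZE  # 8 blocks per extent

bounded = reads_if_bounded_by_extent(extent_blocks, N_EXTENTS, MAX_READ_BLOCKS)
crossing = reads_if_crossing_extents(extent_blocks, N_EXTENTS, MAX_READ_BLOCKS)
print("bounded by extent:", bounded, "reads")    # 100
print("crossing extents:", crossing, "reads")    # 7
```

Under these assumed numbers the gap between the two models is 100 requests against 7, which is why it matters to know which model applies before deciding whether many small extents are a problem for tablescans.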
It’s only after you start to think clearly about what you mean by “fragmentation” that you can begin to understand the possible problems that it can cause and the reasons why it may, or may not, have an impact on your system. In part two I’ll make some comments about the way you should think about fragmentation at the disk level and the tablespace level.